Bug 329612

Summary: Incorrect handling of AT_BASE for image execution
Product: [Developer tools] valgrind Reporter: Dale Weiler <cube2killfield>
Component: general Assignee: Philippe Waroquiers <philippe.waroquiers>
Status: RESOLVED FIXED    
Severity: grave CC: cube2killfield, philippe.waroquiers, tom
Priority: NOR    
Version: 3.9.0   
Target Milestone: ---   
Platform: unspecified   
OS: Other   
Latest Commit: Version Fixed In:
Sentry Crash Report:

Description Dale Weiler 2014-01-05 09:23:34 UTC
Image execution incorrectly replaces AT_BASE with AT_IGNORE, effectively breaking dynamic linking in existing libc implementations.

I spent some time digging through the Valgrind sources, various libc implementations, and the kernel documentation, so this is going to be a lengthy post. Essentially, the libc I use, musl (http://www.musl-libc.org/), depends on AT_BASE from the auxiliary vector the kernel prepares to get the base address of the dynamic linker (the ELF interpreter).

The relevant handling in Valgrind can be found in m_initimg/initimg-linux.c around lines 664-700, as of revision 13658. As you can see, there is already a workaround for Android 4.1 in there, and I don't think anyone understands it. It turns out Bionic (the libc for Android) also depends on AT_BASE for dynamic linking. Omitting AT_BASE like this is illegal for dynamically linked libraries.
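To make the failure mode concrete, here is a minimal sketch of an auxiliary vector lookup and of what retagging an entry as AT_IGNORE does to it. This is not Valgrind's or musl's actual code: `struct aux_entry`, `aux_lookup` and `aux_hide` are hypothetical names made up for illustration, and only the AT_* constants come from <elf.h>.

```c
#include <elf.h>     /* AT_NULL, AT_IGNORE, AT_BASE, AT_PHDR */
#include <stddef.h>

/* One auxiliary vector entry: a (type, value) pair of native words.
 * Hypothetical stand-in for the real Elf32_auxv_t / Elf64_auxv_t. */
struct aux_entry {
    unsigned long type;
    unsigned long value;
};

/* Scan an AT_NULL-terminated vector and return the value stored for
 * `type`, or 0 if no entry of that type exists. */
unsigned long aux_lookup(const struct aux_entry *auxv, unsigned long type)
{
    for (; auxv->type != AT_NULL; auxv++)
        if (auxv->type == type)
            return auxv->value;
    return 0;
}

/* Retag every entry of `type` as AT_IGNORE. The value is left in place,
 * but a consumer searching for `type` will no longer find it. */
void aux_hide(struct aux_entry *auxv, unsigned long type)
{
    for (; auxv->type != AT_NULL; auxv++)
        if (auxv->type == type)
            auxv->type = AT_IGNORE;
}
```

A dynamic linker that relies on aux_lookup(auxv, AT_BASE) to find its own load address gets 0 back once the entry has been hidden, and everything it computes from that base is then wrong.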

The relevant pieces of kernel documentation that show how to properly parse the auxiliary vector can be found in kernel/Documentation/vDSO and kernel/Documentation/ABI/stable/vdso; it's illegal to ignore AT_BASE like this. I looked into some other popular libc implementations, including glibc and uClibc, to see why Valgrind was working with them at all. Much to my surprise, they use a different (slightly more hackish) way of reading the base address. Essentially, they take the PC-relative address of a global and a pointer to it that still needs a relocation performed, then take the difference to get the base address. This works because the relocation hasn't been performed yet, so Bob's your uncle.
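The arithmetic behind that trick can be modelled in a few lines. This is only a simulation under stated assumptions, not glibc's or uClibc's real code (which does this in startup assembly): `struct image_model` and the functions below are hypothetical, and the model assumes the pointer slot holds the symbol's link-time address until its relocation is applied.

```c
#include <stdint.h>

/* Hypothetical model of a position-independent image linked at address 0. */
struct image_model {
    uintptr_t load_base;     /* where the image was actually mapped */
    uintptr_t symbol_offset; /* link-time address of some global */
    uintptr_t pointer_slot;  /* data word that stores the global's address */
};

/* What PC-relative addressing yields: the global's true run-time address,
 * with no relocation needed. */
uintptr_t pc_relative_address(const struct image_model *img)
{
    return img->load_base + img->symbol_offset;
}

/* Before relocation the slot still holds the link-time address, so the
 * difference between the two views of the same symbol is the load base. */
uintptr_t base_from_difference(const struct image_model *img)
{
    return pc_relative_address(img) - img->pointer_slot;
}

/* Applying the relocation adds the load base to the slot. After this the
 * two views agree and the difference collapses to 0. */
void apply_relocation(struct image_model *img)
{
    img->pointer_slot += img->load_base;
}
```

The model also shows why the trick only works early in startup: once apply_relocation has run, base_from_difference no longer recovers the base.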

It seems that ignoring AT_BASE has badly broken two libcs, one of which is only working thanks to a terrible hack. If you continue reading the bizarre comment, it seems to rationalize the hack as a workaround for GDB doing weird things with symbol offsets. I would think the correct course of action is to remove that crap and file a bug report against GDB, since they're the ones at fault here.

Reproducible: Always

Steps to Reproduce:
You can reproduce this on any x86, x86-64, MIPS, or ARM system (does Valgrind even support all of these architectures? I don't know) if you obtain musl and compile it.
 This is trivial and, with a bit of work, can be done without damaging your system.

1.
$ git clone git://git.musl-libc.org/musl
$ cd musl
We'll compile with debug info for now.
$ CFLAGS="-ggdb3 -O0" ./configure --prefix=/usr/musl --exec-prefix=/usr --syslibdir=/usr/lib && make install
$ mkdir /usr/etc
$ echo "/usr/musl/lib" > /usr/etc/ld-musl-$ARCH.path # where $ARCH is i386 or x86_64
This produces a musl build with a musl-gcc wrapper for compiling code and linking it against musl. Create a simple test file and compile it with the musl-gcc wrapper.

2. musl-gcc test.c -o foo
3. valgrind ./foo # observe it crash

Now let's see where this is hitching up.

4. valgrind --track-origins=yes --vgdb=yes --vgdb-error=0 ./foo

Switch to a new tty and connect to it with gdb.

5. gdb ./foo
(gdb) target remote | /usr/lib/valgrind/../../bin/vgdb --pid=$PID_OF_FOO
(gdb) c
(gdb) # at this point a segmentation fault has occurred and Valgrind has trapped it
(gdb) bt full
Actual Results:  
Program received signal SIGTRAP, Trace/breakpoint trap.
0x0000000004027213 in decode_vec (v=0xa0b1e8, a=0xfff000680, cnt=34) at src/ldso/dynlink.c:120
120	src/ldso/dynlink.c: No such file or directory.
(gdb) bt full
#0  0x0000000004027213 in decode_vec (v=0xa0b1e8, a=0xfff000680, cnt=34) at src/ldso/dynlink.c:120
No locals.
#1  0x0000000004028805 in decode_dyn (p=0x42ba980 <builtin_dsos.4516+256>) at src/ldso/dynlink.c:512
        dyn = {0 <repeats 34 times>}
#2  0x000000000402a548 in __dynlink (argc=1, argv=0xfff000998) at src/ldso/dynlink.c:1005
        aux = {2189687674, 67108864, 0, 0, 0, 0, 4096, 4194304, 0, 4207936, 0, 0, 0, 0, 0, 68702703608, 395049983, 100, 0, 0, 0, 0, 0, 0, 0, 68702703582, 0, 0, 0, 0, 0, 68702703598, 0, 0, 0, 0, 0, 0}
        i = 37
        phdr = 0x0
        ehdr = 0x400000
        builtin_dsos = {{base = 0x0, name = 0x0, dynv = 0x0, next = 0x0, prev = 0x0, phdr = 0x0, phnum = 0, refcnt = 0, syms = 0x0, hashtab = 0x0, ghashtab = 0x0, versym = 0x0, strings = 0x0, map = 0x0, map_len = 0, 
            dev = 0, ino = 0, global = 0 '\000', relocated = 0 '\000', constructed = 0 '\000', kernel_mapped = 0 '\000', deps = 0x0, needed_by = 0x0, rpath_orig = 0x0, rpath = 0x0, tls_image = 0x0, tls_len = 0, 
            tls_size = 0, tls_align = 0, tls_id = 0, tls_offset = 0, new_dtv = 0x0, new_tls = 0x0, new_dtv_idx = 0, new_tls_idx = 0, fini_next = 0x0, shortname = 0x0, buf = 0x42ba980 <builtin_dsos.4516+256> ""}, {
            base = 0x400000 "\177ELF\002\001\001", name = 0x40913e0 "libc.so", dynv = 0xa0b1e8, next = 0x0, prev = 0x0, phdr = 0x400040, phnum = 6, refcnt = 0, syms = 0x0, hashtab = 0x0, ghashtab = 0x0, versym = 0x0, 
            strings = 0x0, map = 0x800000 <Address 0x800000 out of bounds>, map_len = 2146304, dev = 0, ino = 0, global = 1 '\001', relocated = 0 '\000', constructed = 0 '\000', kernel_mapped = 1 '\001', deps = 0x0, 
            needed_by = 0x0, rpath_orig = 0x0, rpath = 0x0, tls_image = 0x0, tls_len = 0, tls_size = 0, tls_align = 0, tls_id = 0, tls_offset = 0, new_dtv = 0x0, new_tls = 0x0, new_dtv_idx = 0, new_tls_idx = 0, 
            fini_next = 0x0, shortname = 0x40913e0 "libc.so", buf = 0x42baa80 <builtin_dsos.4516+512> ""}, {base = 0x0, name = 0x0, dynv = 0x0, next = 0x0, prev = 0x0, phdr = 0x0, phnum = 0, refcnt = 0, syms = 0x0, 
            hashtab = 0x0, ghashtab = 0x0, versym = 0x0, strings = 0x0, map = 0x0, map_len = 0, dev = 0, ino = 0, global = 0 '\000', relocated = 0 '\000', constructed = 0 '\000', kernel_mapped = 0 '\000', deps = 0x0, 
            needed_by = 0x0, rpath_orig = 0x0, rpath = 0x0, tls_image = 0x0, tls_len = 0, tls_size = 0, tls_align = 0, tls_id = 0, tls_offset = 0, new_dtv = 0x0, new_tls = 0x0, new_dtv_idx = 0, new_tls_idx = 0, 
            fini_next = 0x0, shortname = 0x0, buf = 0x42bab80 <password.1998> ""}}
        app = 0x42ba880 <builtin_dsos.4516>
        lib = 0x42ba980 <builtin_dsos.4516+256>
        vdso = 0x42baa80 <builtin_dsos.4516+512>
        env_preload = 0xfff000f7a "/usr/lib/valgrind/vgpreload_core-amd64-linux.so:/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so"
        vdso_base = 0
        auxv = 0xfff000ac8
        envp = 0xfff0009a8
#3  0x000000000402bd41 in _start () at src/ldso/x86_64/start.s:6
No locals.
#4  0x0000000000000001 in ?? ()
No symbol table info available.
#5  0x0000000fff000c03 in ?? ()
No symbol table info available.
#6  0x0000000000000000 in ?? ()
No symbol table info available.
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x0000000004027213 in decode_vec (v=0xa0b1e8, a=0xfff000680, cnt=34) at src/ldso/dynlink.c:120
120	in src/ldso/dynlink.c
(gdb) q
A debugging session is active.
	Inferior 1 [Remote target] will be detached.

Expected Results:  
The software is simply unusable with AT_BASE being ignored, so I'm marking this Grave. The expected result is for Valgrind to properly provide AT_BASE.

The relevant sources are provided here to make developers' lives a little easier:

Documentation of vDSO in the kernel:
https://www.kernel.org/doc/Documentation/ABI/stable/vdso

A primer on how to do vDSO:
http://www2.comp.ufscar.br/lxr/source/Documentation/vDSO/parse_vdso.c

A small vDSO test:
http://www2.comp.ufscar.br/lxr/source/Documentation/vDSO/vdso_test.c

The sources that depend on AT_BASE:
  musl:
  http://git.musl-libc.org/cgit/musl/tree/src/ldso/dynlink.c#n995

  bionic:
  https://github.com/android/platform_bionic/blob/master/linker/linker.cpp#L2208
Comment 1 Dale Weiler 2014-01-05 09:29:21 UTC
(In reply to comment #0)
Comment 2 Tom Hughes 2014-01-05 10:39:50 UTC
Well, if the comment in the source is to be believed, AT_BASE only actually needs to be patched out in the copy of the auxv that gdbserver sends to gdb, so I'm not sure why it is also being patched out in the version we provide to the client program.

In any case, as it's related to gdbserver, I guess Philippe needs to look at it.
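As a rough sketch of that idea (patching only the copy handed to gdb and leaving the client's auxv alone), something like the following would do. It is not Valgrind's gdbserver code: `struct aux_entry` and `aux_copy_hiding` are hypothetical names, and only the AT_* constants come from <elf.h>.

```c
#include <elf.h>     /* AT_NULL, AT_IGNORE, AT_BASE */
#include <stddef.h>

/* One auxiliary vector entry: a (type, value) pair of native words.
 * Hypothetical stand-in for the real Elf32_auxv_t / Elf64_auxv_t. */
struct aux_entry {
    unsigned long type;
    unsigned long value;
};

/* Copy the AT_NULL-terminated vector `src` into `dst` (room for `cap`
 * entries), retagging entries of type `hidden` as AT_IGNORE in the copy
 * only. Returns the number of entries written, terminator included, or
 * 0 if `dst` is too small. `src` is never modified. */
size_t aux_copy_hiding(const struct aux_entry *src, struct aux_entry *dst,
                       size_t cap, unsigned long hidden)
{
    size_t n = 0;

    for (;; src++) {
        if (n == cap)
            return 0;                  /* no room left in dst */
        dst[n] = *src;
        if (src->type == AT_NULL)
            return n + 1;              /* terminator copied, done */
        if (src->type == hidden)
            dst[n].type = AT_IGNORE;   /* hide it from the debugger only */
        n++;
    }
}
```

With this split, the client program still sees its real AT_BASE, while the debugger is served the patched copy.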
Comment 3 Philippe Waroquiers 2014-01-07 20:17:56 UTC
(In reply to comment #0)

> It seems that ignoring AT_BASE has badly broken two libcs, one of
> which is only working thanks to a terrible hack. If you continue reading the
> bizarre comment, it seems to rationalize the hack as a
> workaround for GDB doing weird things with symbol offsets. I would think
> the correct course of action is to remove that crap and file a bug report
> against GDB, since they're the ones at fault here.
Thanks for the detailed look at all this. I have to admit I have not mastered
much of this AT_BASE business, which is what resulted in the hack
in initimg-linux.c. Some platforms had a problem with AT_BASE and
others did not. The hack (at that time) solved the problem on all problematic
platforms and was not known to create other problems.
We then saw a first problem with Android, and now with other libs.

At least on my old Fedora 12/x86 box, not ignoring AT_BASE gives bad gdbsrv test results.
I believe the same happens on RHEL 5.5 in both 32 and 64 bits (I need more time to double check,
and I need to check on some other platforms).

So no longer ignoring AT_BASE looks likely to create problems on several setups,
and I am not persuaded (yet?) that it is really gdb's fault.
As for why it is ignored for the client as well (and not only when
auxv is sent by gdbserver): that was just simpler to implement and, as explained
above, was not known to create problems initially.

So now it seems it is time to bite the bullet. I will analyse this in more
depth (time permitting in the evenings, also taking into account the time
needed to prepare for FOSDEM and do my real paid job during the day :).

I guess that in the meantime you can bypass the problem by compiling your own
Valgrind with the AT_IGNORE assignment commented out.
Comment 4 Tom Hughes 2014-01-07 22:28:24 UTC
Fix committed as r13768.