Bug 253519

Summary: Memcheck reports auxv pointer accesses as invalid reads.
Product: [Developer tools] valgrind Reporter: Jacob Bramley <Jacob.Bramley+kde>
Component: memcheckAssignee: Julian Seward <jseward>
Status: RESOLVED FIXED    
Severity: minor CC: flo2030, joao.m.santos.silva, mark, peter.maydell, tom
Priority: NOR    
Version: 3.6 SVN   
Target Milestone: ---   
Platform: Unlisted Binaries   
OS: Linux   
Latest Commit: Version Fixed In:
Sentry Crash Report:
Attachments: Intercept /proc/self/auxv

Description Jacob Bramley 2010-10-07 16:51:30 UTC
Version:           3.6 SVN
OS:                Linux

Reading /proc/self/auxv on ARM Linux gives you Elf32_auxv_t structures, which consist of a 32-bit "a_type" field and a 32-bit "a_val" (payload) field. In some cases, such as when a_type is AT_PLATFORM, the payload is actually a (32-bit) pointer to some user-accessible memory region that is set up for the application, presumably by the kernel. Dereferencing this pointer causes Valgrind to give errors like this:

==14623== Invalid read of size 1
==14623==    at 0x12AE48: js::arm_read_auxv() (in /home/jacbra01/moz/tm.r/shell/js)
==14623==  Address 0xbed3f6d6 is not stack'd, malloc'd or (recently) free'd

The error above came from a release build of Mozilla's Trace Monkey shell. (The VM runs auxv checks on initialization, so any script or test will trigger the problem.)

Reproducible: Always

Steps to Reproduce:
If building the Mozilla shell is undesirable, a trivial test program can be constructed:

// This code excludes necessary headers and error catching, but is otherwise sufficient to trigger the bug.

int main(void)
{
    Elf32_auxv_t aux;
    int fd = open("/proc/self/auxv", O_RDONLY);
    while(read(fd, &aux. sizeof(Elf32_auxv_t))) {
        if (aux.a_type == AT_PLATFORM) {
            printf("Platform: %s\n", (char const *) aux.a_un.a_val);
        }
    }
    close(fd);
    return 0;
}


Actual Results:  
The program executes correctly but Valgrind complains about the memory access.

Expected Results:  
Accept memory accesses that were advised by auxv. More generally, accept memory accesses advised by the system (if possible). For example, there are some helper routines provided by the kernel in the 0xb....... memory address range (I think).

I'm putting severity as minor because Valgrind doesn't distrupt program execution so this is just an irritation, but it might not be a "trivial bug to find and fix".
Comment 1 Tom Hughes 2010-10-07 17:00:18 UTC
Part of the problem here is that when you access /proc/self/auxv you will be seeing the auxv that the kernel setup for valgrind, rather than the one that valgrind setup for your program.

Short of intercepting and emulating all the access to that file there isn't much we can do to fix that.

I'm not sure why you're accessing auxv like that though - the usual way to access it is just to search up the stack past the environment data as shown here:

  http://articles.manugarg.com/aboutelfauxiliaryvectors.html

That's what valgrind does to find it anyway.

If you do that then I think it will work because you will see the client's version of auxv which should be readable by you.
Comment 2 Jacob Bramley 2010-10-07 17:32:59 UTC
(In reply to comment #1)
> I'm not sure why you're accessing auxv like that though - the usual way to
> access it is just to search up the stack past the environment data as shown
> here:

I think Trace Monkey does it because the code that reads auxv is well below main() in the call stack. Hmm. It sounds like perhaps Trace Monkey is just doing the wrong thing.
Comment 3 Peter Maydell 2010-10-07 17:42:48 UTC
Pixman also uses the "read /proc/self/auxv" approach:
http://cgit.freedesktop.org/pixman/tree/pixman/pixman-cpu.c#n266

This was actually the way that I had recommended to me for accessing the auxv. In the absence of more definite documentation deprecating it I don't think I'd classify TraceMonkey as wrong here.
Comment 4 Julian Seward 2010-10-07 18:43:20 UTC
(In reply to comment #1)
> I'm not sure why you're accessing auxv like that though - the usual way to
> access it is just to search up the stack past the environment data as shown
> here:

That's all very well if you're a complete executable, and you have
some way to figure out where the frame for main() is, so you can work
upwards from that.  But what if you're just a library and you want to
find out?  That's the use case that Jacob is referring to, plus there
is another such case somewhere in Firefox -- one of the innumerable
graphics libraries needs to figure out whether Neon is supported.
Comment 5 Julian Seward 2010-10-07 18:58:08 UTC
(In reply to comment #4)
That said, I agree with Tom -- I can't see an easy way to shut
Memcheck up if the program insists on using /proc/self/auxv to look
behind the curtains.  (We already have special-case hacks for at least
one other of the /proc/self/ files, I can't remember which).
Comment 6 Jacob Bramley 2010-10-08 08:53:25 UTC
Hmm. I suppose it could matter if Valgrind ever wanted to lie to a program about the available features.
Comment 7 Mark Wielaard 2012-09-29 21:04:19 UTC
Created attachment 74237 [details]
Intercept /proc/self/auxv

This also is a problem for any program using librpm, which also reads /proc/self/auxv.

(In reply to comment #5)
> I can't see an easy way to shut
> Memcheck up if the program insists on using /proc/self/auxv to look
> behind the curtains.  (We already have special-case hacks for at least
> one other of the /proc/self/ files, I can't remember which).

That file is /proc/self/cmdline. Since valgrind already sets up auxv for the process anyway it was easy to just take that and put it in a fake proc file like cmdline. The attached patch implements that. It also included is a small testcase that shows various issues before the fix and passes after it.
Comment 8 Tom Hughes 2012-10-04 20:28:00 UTC
Committed as r13019.
Comment 9 Florian Krohm 2012-10-05 03:40:26 UTC
Older kernels on s390 do not provide AT_PLATFORM. So platform0 == platform1 == NULL and
if (strcmp (platform0, platform1) == 0) dies. 
Is there another AT_  value that can be used instead to get the same effect you seek?
Comment 10 Mark Wielaard 2012-10-05 07:05:43 UTC
(In reply to comment #9)
> Older kernels on s390 do not provide AT_PLATFORM. So platform0 == platform1
> == NULL and
> if (strcmp (platform0, platform1) == 0) dies. 
> Is there another AT_  value that can be used instead to get the same effect
> you seek?

Sorry about that. The test just wants to make sure that a) values reported in the auxv record are the same, for which it picks AT_ENTRY, and b) that pointers provided through auxv records can be followed (and point to the same thing), for which it picks AT_PLATFORM.

So for the second test any AT_ value that points to a string can be used.
If AT_EXECFN is available (see with LD_SHOW_AUXV=1 /bin/echo) we can use that.
Alternatively, if there isn't a common string AT_ value on all arches we can just keep using
AT_PLATFORM and allow:
(platform0 == NULL && platform1 == NULL) || (strcmp (platform0, platform1) == 0)
Comment 11 Florian Krohm 2012-10-05 23:32:19 UTC
(In reply to comment #10)
> 
> Sorry about that. The test just wants to make sure that a) values reported
> in the auxv record are the same, for which it picks AT_ENTRY, and b) that
> pointers provided through auxv records can be followed (and point to the
> same thing), for which it picks AT_PLATFORM.
> 
> So for the second test any AT_ value that points to a string can be used.
> If AT_EXECFN is available (see with LD_SHOW_AUXV=1 /bin/echo) we can use
> that

I've disabled the test for those platforms that don't have AT_PLATFORM. Should be rare anyhow.
Comment 12 Tom Hughes 2014-09-03 09:31:50 UTC
*** Bug 327583 has been marked as a duplicate of this bug. ***