Bug 321960

Summary: pthread_create() then alloca() causing invalid stack write errors.
Product: [Developer tools] valgrind Reporter: Daniel Stodden <daniel>
Component: memcheck Assignee: Julian Seward <jseward>
Status: RESOLVED FIXED    
Severity: normal CC: dimhen, philippe.waroquiers
Priority: NOR    
Version: 3.9.0.SVN   
Target Milestone: ---   
Platform: Ubuntu   
OS: Linux   
Latest Commit: Version Fixed In:
Attachments: Demo test prog.

Description Daniel Stodden 2013-07-04 18:38:02 UTC
Created attachment 80949
Demo test prog.

Running the attached test2.c, valgrind typically fails as follows:

$ gcc -O0 -g test2.c -lpthread && /var/tmp/valgrind/bin/valgrind --vgdb-error=1 ./a.out
==23755== Memcheck, a memory error detector
==23755== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==23755== Using Valgrind-3.9.0.SVN and LibVEX; rerun with -h for copyright info
==23755== Command: ./a.out
==23755== 
==23755== 
==23755== TO DEBUG THIS PROCESS USING GDB: start GDB like this
==23755==   /path/to/gdb ./a.out
==23755== and then give GDB the following command
==23755==   target remote | /var/tmp/valgrind/lib/valgrind/../../bin/vgdb --pid=23755
==23755== --pid is optional if only one valgrind process is running
==23755== 
==23755== Invalid write of size 8
==23755==    at 0x4006B7: __yell (test2.c:16)
==23755==    by 0x40076C: main (test2.c:30)
==23755==  Address 0xffeffeed0 is on thread 1's stack
==23755== 
==23755== (action on error) vgdb me ... 

(gdb) print &buf
$6 = (char (*)[256]) 0xffeffeed0

(gdb) monitor get_vbits 0xffeffeed0 256
________ ________ ________ ________ ________ ________ ________ ________
________ ________ ________ ________ ________ ________ ________ ________
________ ________ ________ ________ ________ ________ ________ ________
________ ________ ________ ________ ________ ________ ________ ________
________ ________ ________ ________ ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
Address 0xFFEFFEED0 len 256 has 144 bytes unaddressable
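
For reading dumps like the one above, here is a minimal sketch of a helper that counts unaddressable bytes in get_vbits output. It assumes, as documented for memcheck's monitor commands, that each unaddressable byte is printed as "__" in place of two hex digits. The function name count_unaddressable is hypothetical and not part of valgrind.

```c
#include <stddef.h>

/* Hypothetical helper, not part of valgrind.
 * Counts bytes printed as "__" in a get_vbits dump. Assumption: each
 * unaddressable byte appears as exactly two underscores, and each
 * addressable byte as two hex digits of V bits. */
static size_t count_unaddressable(const char *dump)
{
    size_t n = 0;
    const char *p = dump;

    while (p[0] != '\0' && p[1] != '\0') {
        if (p[0] == '_' && p[1] == '_') {
            n++;        /* one "__" pair is one unaddressable byte */
            p += 2;
        } else {
            p++;        /* hex digit, space, or newline: skip it */
        }
    }
    return n;
}
```

The dump above contains 36 groups of "________", four bytes each, so the helper would count 144, which agrees with the summary line printed by get_vbits.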

It seems to depend on:
 - Some (small) number of threads being spawned.
 - A > page-sized alloca().
 - Reasonably sized memset on top.
 - It's always the main thread which suffers.
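
The conditions listed above can be sketched as follows. This is a hypothetical reconstruction for illustration, not the attached test2.c: the names idle_thread and repro_run, the 64-thread cap, and the thread body are assumptions, and __yell is spelled yell here to avoid a reserved identifier. Only the alloca(4096) call, the 256-byte buf and the spawning of threads come from the report.

```c
#include <alloca.h>
#include <pthread.h>
#include <string.h>

#define MAX_THREADS 64   /* assumed cap, chosen for this sketch */

/* Assumed thread body: returns immediately. */
static void *idle_thread(void *arg)
{
    return arg;
}

/* Writes into a fresh frame that lies below the alloca()'d region. */
static int yell(void)
{
    char buf[256];

    memset(buf, 0, sizeof(buf));
    return buf[0];       /* read back so buf is not optimised away */
}

/* Spawns nthreads threads, then grows the calling thread's stack by one
 * page with alloca() and calls yell(). Returns 0 on success, -1 on a bad
 * argument or if a thread could not be created. Call it from main() so
 * that the main thread's stack is the one being extended. */
static int repro_run(int nthreads)
{
    pthread_t thr[MAX_THREADS];
    int started = 0;
    int rc;
    int i;

    if (nthreads < 0 || nthreads > MAX_THREADS)
        return -1;

    for (i = 0; i < nthreads; i++) {
        if (pthread_create(&thr[i], NULL, idle_thread, NULL) != 0)
            break;
        started++;
    }

    /* volatile keeps the otherwise unused allocation from being dropped */
    void *volatile pad = alloca(4096);
    (void)pad;

    rc = yell();

    for (i = 0; i < started; i++)
        pthread_join(thr[i], NULL);

    return (started == nthreads) ? rc : -1;
}
```

A main() that simply returns repro_run(32), built with gcc -O0 -g and -lpthread, gives a program of the same shape as the one described here. Whether it triggers the error on a given system is not guaranteed, since the failure depends on timing.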

Seen with valgrind 3.7, 3.8.1 and yesterday's SVN. 

Got one comment from email:

--snip--
From: 	John Reiser <jreiser@...>
To: 	valgrind-users@lists.sourceforge.net
Subject: 	Re: [Valgrind-users] threads vs main and invalid stack writes
Date: 	Thu, 04 Jul 2013 11:07:30 -0700

>     alloca(4096);
>     __yell();

> 
> (gdb) monitor get_vbits 0xffeffeed0 256
> ________ ________ ________ ________ ________ ________ ________ ________
> ________ ________ ________ ________ ________ ________ ________ ________
> ________ ________ ________ ________ ________ ________ ________ ________
> ________ ________ ________ ________ ________ ________ ________ ________
> ________ ________ ________ ________ ffffffff ffffffff ffffffff ffffffff
> ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
> ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
> ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
> Address 0xFFEFFEED0 len 256 has 144 bytes unaddressable
> 
> Any ideas? It seems to depend on:
> 
>  - Some (small) number of threads being spawned.
>  - A > page-sized alloca().
>  - Reasonably sized memset on top.
>  - It's always the main thread which suffers.

Thank you for the small, reproducible testcase!

It's a bug in valgrind, so please file a bug report, and include the testcase.
See the "Bug Reports" entry in the left column of the main page
http://www.valgrind.org/ .

Meanwhile the trivial workaround is to memset every result of alloca.
[..]
--snip--
Comment 1 Philippe Waroquiers 2013-07-08 15:26:36 UTC
(In reply to comment #0)
> 
> ==23755== Invalid write of size 8
> ==23755==    at 0x4006B7: __yell (test2.c:16)
> ==23755==    by 0x40076C: main (test2.c:30)
> ==23755==  Address 0xffeffeed0 is on thread 1's stack
> ==23755== 
> ==23755== (action on error) vgdb me ... 
Testing on Ubuntu 12.10 on amd64 and x86, no such error is reported by
Valgrind 3.8.1 or by the last 3.9.0 SVN.
Also tested on some other systems/platforms (e.g. f12/x86, debian/ppc); none of them
gives an error.
Comment 2 Philippe Waroquiers 2013-07-08 15:29:24 UTC
(In reply to comment #0)

> Address 0xFFEFFEED0 len 256 has 144 bytes unaddressable
> 
> Meanwhile the trivial workaround is to memset every result of alloca.
Also, it is not clear how a memset of the alloca result would solve an "unaddressable"
error. It would, however, solve a "use of uninitialised value" error.
Comment 3 Daniel Stodden 2013-07-09 08:07:22 UTC
(In reply to comment #1)
> (In reply to comment #0)
> > 
> > ==23755== Invalid write of size 8
> > ==23755==    at 0x4006B7: __yell (test2.c:16)
> > ==23755==    by 0x40076C: main (test2.c:30)
> > ==23755==  Address 0xffeffeed0 is on thread 1's stack
> > ==23755== 
> > ==23755== (action on error) vgdb me ... 
> Testing on Ubuntu 12.10 on amd64 and x86, no such error is reported by
> Valgrind 3.8.1 or by the last 3.9.0 SVN.
> Also tested on some others systems/platforms (e.g. f12/x86, debian/ppc),
> none of them
> gives an error.

You need to hit a race between the threads spawning and main()'s eventual entry
into __yell(). I agree that's not as deterministic as desirable for a simple test/demo.

I'd suggest bumping up the thread count, e.g. to thr[32].
That's where I got it to repro on the notebook I'm looking at. Raring/amd64:

ii  libc6:amd64         2.17-0ubuntu5  amd64          Embedded GNU C Library: Shared libraries
ii  valgrind            1:3.8.1-1ubunt amd64          instrumentation framework for building dyna
Comment 4 Philippe Waroquiers 2013-07-21 16:06:28 UTC
Fixed in revision 13467.
Thanks for the small reproducer (used as the basis of the regression test).