Bug 321960 - pthread_create() then alloca() causing invalid stack write errors.
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: memcheck
Version: 3.9.0.SVN
Platform: Ubuntu Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
 
Reported: 2013-07-04 18:38 UTC by Daniel Stodden
Modified: 2013-07-21 16:06 UTC



Attachments
Demo test prog. (554 bytes, text/x-csrc)
2013-07-04 18:38 UTC, Daniel Stodden

Description Daniel Stodden 2013-07-04 18:38:02 UTC
Created attachment 80949
Demo test prog.

Running the attached test2.c, valgrind typically fails as follows:

$ gcc -O0 -g test2.c -lpthread && /var/tmp/valgrind/bin/valgrind --vgdb-error=1 ./a.out
==23755== Memcheck, a memory error detector
==23755== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==23755== Using Valgrind-3.9.0.SVN and LibVEX; rerun with -h for copyright info
==23755== Command: ./a.out
==23755== 
==23755== 
==23755== TO DEBUG THIS PROCESS USING GDB: start GDB like this
==23755==   /path/to/gdb ./a.out
==23755== and then give GDB the following command
==23755==   target remote | /var/tmp/valgrind/lib/valgrind/../../bin/vgdb --pid=23755
==23755== --pid is optional if only one valgrind process is running
==23755== 
==23755== Invalid write of size 8
==23755==    at 0x4006B7: __yell (test2.c:16)
==23755==    by 0x40076C: main (test2.c:30)
==23755==  Address 0xffeffeed0 is on thread 1's stack
==23755== 
==23755== (action on error) vgdb me ... 

(gdb) print &buf
$6 = (char (*)[256]) 0xffeffeed0

(gdb) monitor get_vbits 0xffeffeed0 256
________ ________ ________ ________ ________ ________ ________ ________
________ ________ ________ ________ ________ ________ ________ ________
________ ________ ________ ________ ________ ________ ________ ________
________ ________ ________ ________ ________ ________ ________ ________
________ ________ ________ ________ ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
Address 0xFFEFFEED0 len 256 has 144 bytes unaddressable

It seems to depend on:
 - Some (small) number of threads being spawned.
 - A > page-sized alloca().
 - Reasonably sized memset on top.
 - It's always the main thread which suffers.
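The attachment itself is not reproduced in this report. A minimal sketch of a program matching the conditions above might look like the following. It assumes a no-op thread body and a 256-byte stack buffer (the size gdb shows for buf); the names worker, yell and run_repro are hypothetical, and whether Valgrind reports the error depends on thread timing.

```c
#include <alloca.h>
#include <pthread.h>
#include <string.h>

#define NTHREADS 32   /* assumption: comment 3 suggests trying thr[32] */

/* Hypothetical worker: the threads only need to be spawned. */
static void *worker(void *arg)
{
    return arg;
}

/* Stand-in for __yell() in the report: fill a 256-byte stack buffer. */
static int yell(void)
{
    char buf[256];

    memset(buf, 'x', sizeof buf);
    return buf[sizeof buf - 1];
}

/* Spawn the threads, alloca() one page, then call yell().
 * Returns 0 if all threads started and the buffer was written. */
static int run_repro(void)
{
    pthread_t thr[NTHREADS];
    int started = 0;

    for (int i = 0; i < NTHREADS; i++) {
        if (pthread_create(&thr[i], NULL, worker, NULL) != 0)
            break;
        started++;
    }

    (void)alloca(4096);   /* same size as in the quoted test case */
    int last = yell();

    for (int i = 0; i < started; i++)
        pthread_join(thr[i], NULL);

    return (started == NTHREADS && last == 'x') ? 0 : 1;
}
```

Call run_repro() from main(), build with gcc -O0 -g and -lpthread as in the command line above, and run the result under valgrind.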

Seen with valgrind 3.7, 3.8.1 and yesterday's SVN. 

Got one comment from email:

--snip--
From: 	John Reiser <jreiser@...>
To: 	valgrind-users@lists.sourceforge.net
Subject: 	Re: [Valgrind-users] threads vs main and invalid stack writes
Date: 	Thu, 04 Jul 2013 11:07:30 -0700

>     alloca(4096);
>     __yell();

> 
> (gdb) monitor get_vbits 0xffeffeed0 256
> ________ ________ ________ ________ ________ ________ ________ ________
> ________ ________ ________ ________ ________ ________ ________ ________
> ________ ________ ________ ________ ________ ________ ________ ________
> ________ ________ ________ ________ ________ ________ ________ ________
> ________ ________ ________ ________ ffffffff ffffffff ffffffff ffffffff
> ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
> ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
> ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
> Address 0xFFEFFEED0 len 256 has 144 bytes unaddressable
> 
> Any ideas? It seems to depend on:
> 
>  - Some (small) number of threads being spawned.
>  - A > page-sized alloca().
>  - Reasonably sized memset on top.
>  - It's always the main thread which suffers.

Thank you for the small, reproducible testcase!

It's a bug in valgrind, so please file a bug report, and include the testcase.
See the "Bug Reports" entry in the left column of the main page
http://www.valgrind.org/ .

Meanwhile the trivial workaround is to memset every result of alloca.
[..]
--snip--
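The workaround suggested in the email could be sketched as follows. This is only an illustration of "memset every result of alloca"; the function name sum_zeroed is hypothetical, and comment 2 questions whether this helps with an unaddressable error at all.

```c
#include <alloca.h>
#include <stddef.h>
#include <string.h>

/* Allocate n bytes on the stack, zero them right away, then read them back.
 * The alloca() and memset() calls stay in the same function because memory
 * from alloca() is released when that function returns. */
static unsigned long sum_zeroed(size_t n)
{
    unsigned char *p = alloca(n);
    unsigned long sum = 0;

    memset(p, 0, n);      /* the suggested workaround */

    for (size_t i = 0; i < n; i++)
        sum += p[i];
    return sum;
}
```

Every byte is written before it is read, so the sum is always 0.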
Comment 1 Philippe Waroquiers 2013-07-08 15:26:36 UTC
(In reply to comment #0)
> 
> ==23755== Invalid write of size 8
> ==23755==    at 0x4006B7: __yell (test2.c:16)
> ==23755==    by 0x40076C: main (test2.c:30)
> ==23755==  Address 0xffeffeed0 is on thread 1's stack
> ==23755== 
> ==23755== (action on error) vgdb me ... 
Testing on Ubuntu 12.10 on amd64 and x86, no such error is reported by
Valgrind 3.8.1 or by the latest 3.9.0 SVN.
Also tested on some other systems/platforms (e.g. f12/x86, debian/ppc); none of them
gives an error.
Comment 2 Philippe Waroquiers 2013-07-08 15:29:24 UTC
(In reply to comment #0)

> Address 0xFFEFFEED0 len 256 has 144 bytes unaddressable
> 
> Meanwhile the trivial workaround is to memset every result of alloca.
Also, it is not very clear how a memset of the alloca result would solve an "unaddressable"
error. It would, however, solve a "use of uninitialised value" error.
Comment 3 Daniel Stodden 2013-07-09 08:07:22 UTC
(In reply to comment #1)
> (In reply to comment #0)
> > 
> > ==23755== Invalid write of size 8
> > ==23755==    at 0x4006B7: __yell (test2.c:16)
> > ==23755==    by 0x40076C: main (test2.c:30)
> > ==23755==  Address 0xffeffeed0 is on thread 1's stack
> > ==23755== 
> > ==23755== (action on error) vgdb me ... 
> Testing on Ubuntu 12.10 on amd64 and x86, no such error is reported by
> Valgrind 3.8.1 or by the last 3.9.0 SVN.
> Also tested on some others systems/platforms (e.g. f12/x86, debian/ppc),
> none of them
> gives an error.

You need to hit a race between the threads spawning and main()'s eventual entry
into __yell(). I agree that's not as deterministic as desirable for a simple test/demo.

I'd suggest bumping up the thread count, e.g. to thr[32].
That's where I got it to repro on the notebook I'm looking at. Raring/amd64:

ii  libc6:amd64         2.17-0ubuntu5  amd64          Embedded GNU C Library: Shared libraries
ii  valgrind            1:3.8.1-1ubunt amd64          instrumentation framework for building dyna
Comment 4 Philippe Waroquiers 2013-07-21 16:06:28 UTC
Fixed in revision 13467.
Thanks for the small reproducer (used as the basis of the regression test).