Bug 365273

Summary: Invalid write to stack location reported after signal handler runs
Product: [Developer tools] valgrind
Reporter: earl_chew
Component: memcheck
Assignee: Julian Seward <jseward>
Status: RESOLVED FIXED
Severity: normal
CC: philippe.waroquiers
Priority: NOR
Version First Reported In: 3.11 SVN   
Target Milestone: ---   
Platform: Compiled Sources   
OS: Linux   
URL: http://thread.gmane.org/gmane.comp.debugging.valgrind.devel/32601
Attachments:
This test program causes memcheck to generate an "Invalid write" warning. (attachment 99958)
This is the proposed patch to register the extended stack region when delivering a signal. (attachment 99959, text broken)
This is the proposed patch to register the extended stack region when delivering a signal. (attachment 99960)
This is a log of the test run using plain 3.11 (attachment 99971)
This is the log of the run with the proposed patch. (attachment 99972)

Description earl_chew 2016-07-09 07:51:35 UTC
A complete description of the observed failure is available in the linked email thread.

In summary, I believe that the problem is set up as follows:

o Main thread consumes nearly all of its registered stack region
o Signal is handled in the main thread (Linux will prefer to deliver to the main thread)
o The signal frame causes the stack region to grow, but m_signal.c incorrectly records the base of the grown region
o Another thread runs while the signal handler runs in the main thread, causing memcheck to become a little confused
o The signal handler returns
o An invalid write is observed in the main thread

Reproducible: Always

Steps to Reproduce:
The attached test program (attachment 99958) produces the "Invalid write" message, though in a different context from the one observed in the email thread.

Actual Results:  
==12529== Invalid write of size 4
==12529==    at 0x400E09C: _dl_fixup (dl-runtime.c:69)
==12529==    by 0x40144BF: _dl_runtime_resolve (dl-trampoline.S:36)
==12529==    by 0x8048768: main (in /home/earl/Development/valgrind/test)
==12529==  Address 0xbeeccfb0 is on thread 1's stack
==12529==  in frame #0, created by _dl_fixup (dl-runtime.c:66)


Expected Results:  
The proposed patch silences the warnings from memcheck.
Comment 1 earl_chew 2016-07-09 07:54:59 UTC
Created attachment 99958 [details]
This test program causes memcheck to generate an "Invalid write" warning.

The test program was compiled on:

Linux bambi 3.13.0-52-generic #86-Ubuntu SMP Mon May 4 04:32:15 UTC 2015 i686 i686 i686 GNU/Linux

Distributor ID:	LinuxMint
Description:	Linux Mint 17 Qiana
Release:	17
Codename:	qiana

gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
Copyright (C) 2013 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Comment 2 earl_chew 2016-07-09 07:57:41 UTC
Created attachment 99959 [details]
This is the proposed patch to register the extended stack region when delivering a signal.
Comment 3 earl_chew 2016-07-09 08:02:18 UTC
Comment on attachment 99959 [details]
This is the proposed patch to register the extended stack region when delivering a signal.

Sorry, I copied this from gmane, and the text is broken.
Comment 4 earl_chew 2016-07-09 08:04:10 UTC
Created attachment 99960 [details]
This is the proposed patch to register the extended stack region when delivering a signal.
Comment 5 Philippe Waroquiers 2016-07-09 09:53:01 UTC
Compiled the program on an x86 debian 8.5 and an amd64 debian 7.9.
Tried with several valgrind versions, but could not reproduce any write error.

Can you run with the options -v -v -v -d -d -d --trace-signals=yes
and attach the resulting trace?

Can you attach the trace both with and without your patch?

Thanks
Comment 6 Philippe Waroquiers 2016-07-09 09:58:55 UTC
Just to be sure to be complete, please also add --trace-signals=yes

Thanks
Comment 7 Philippe Waroquiers 2016-07-09 09:59:32 UTC
Humph, I mean:
Just to be sure to be complete, please also add --trace-syscalls=yes
Comment 8 earl_chew 2016-07-09 20:51:34 UTC
I compiled and ran the test program like this:

gcc -O0 -o test test.c -lpthread
taskset 0x03 valgrind -v -v -v -d -d -d --trace-signals=yes --trace-syscalls=yes ./test

Using gcc: Version: 4:4.8.2-1ubuntu6
Using libc: 2.19-0ubuntu6.9

I also discovered that the valgrind diagnostic is not issued if taskset 0x01 is used.
Comment 9 earl_chew 2016-07-09 20:54:13 UTC
Created attachment 99971 [details]
This is a log of the test run using plain 3.11
Comment 10 earl_chew 2016-07-09 20:56:19 UTC
Created attachment 99972 [details]
This is the log of the run with the proposed patch.
Comment 11 Philippe Waroquiers 2016-07-10 21:18:56 UTC
Thanks for the analysis and the patch, this was a tricky problem.

Patch (slightly modified) committed in revision 15902.