| Summary: | valgrind (--tool=none) hangs after main has exited if there are starting threads | ||
|---|---|---|---|
| Product: | [Developer tools] valgrind | Reporter: | Konstantin Serebryany <konstantin.s.serebryany> |
| Component: | general | Assignee: | Julian Seward <jseward> |
| Status: | RESOLVED FIXED | ||
| Severity: | normal | CC: | bart.vanassche+kde, dank, tom |
| Priority: | HI | ||
| Version First Reported In: | 3.6 SVN | ||
| Target Milestone: | --- | ||
| Platform: | Unlisted Binaries | ||
| OS: | Linux | ||
| Latest Commit: | Version Fixed/Implemented In: | ||
| Sentry Crash Report: | |||
Description
Konstantin Serebryany
2010-02-10 08:51:02 UTC
Confirming. The script
i=1
while true
do
/usr/local/valgrind-11038/bin/valgrind --tool=none --trace-syscalls=yes --trace-signals=yes -q ./a.out > hang$i.log 2>&1
i=`expr $i + 1`
done
and valgrind r11038 with no patches, on an 8 core monster (hp Z600) running
32 bit karmic, hung at i=27, i=7, i=34 on three of five tries, and crashed with
valgrind: m_libcprint.c:398 (add_to__vmessage_buf): Assertion 'b->buf_used >= 0 && b->buf_used < sizeof(b->buf)-128' failed.
on my third and fourth tries at i=4 and i=5.
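The test program behind `./a.out` is not shown in this report. As a rough sketch of the kind of program the summary describes (threads still starting while main exits), something like the following could be used. The names `idle_thread` and `spawn_and_forget` are hypothetical and are not taken from the reporter's test case.

```c
#include <assert.h>
#include <pthread.h>

/* Thread body: does nothing and returns at once. */
static void *idle_thread(void *arg)
{
    (void)arg;
    return NULL;
}

/* Start up to n detached threads without waiting for any of them.
   Returns the number of threads that pthread_create accepted.
   Hypothetical helper, not part of the original test case. */
static int spawn_and_forget(int n)
{
    assert(n >= 0);
    int started = 0;
    for (int i = 0; i < n; i++) {
        pthread_t tid;
        if (pthread_create(&tid, NULL, idle_thread, NULL) != 0)
            break;              /* stop at the first failure */
        pthread_detach(tid);    /* nobody will join these threads */
        started++;
    }
    return started;
}
```

Calling `spawn_and_forget(8)` from `main` and returning immediately means the process issues its exit while some threads may still be in the middle of starting up, which is the window the summary refers to. Build with `-pthread`.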
Definitely need to fix this.
Kostya, Dan, can you try the following patch? It appears to cause
thread exiting to work reliably for me, even with the delay loop in
place. (Note, this is for amd64-linux only; Dan, if you want to try
on 32-bit, you'll need to make the equivalent change in
syswrap-x86-linux instead.)
If this works for you, I'll put up a proper patch for more extensive
testing -- I noticed something else w.r.t. thread creation that needs
to be fixed really, but the patch below doesn't include that fix.
This is all a bit hairy so your multicore/multiprocess bashing on it
is appreciated.
===================================================================
--- coregrind/m_syswrap/syswrap-amd64-linux.c (revision 11050)
+++ coregrind/m_syswrap/syswrap-amd64-linux.c (working copy)
@@ -251,6 +251,13 @@
ctst->sig_mask = ptst->sig_mask;
ctst->tmp_sig_mask = ptst->sig_mask;
+ // PROVISIONAL FIX: start with my threadgroup being the same
+ // as my parents, so that any exit_group calls that happen before
+ // this thread actually sets its threadgroup for real (which
+ // happens in thread_wrapper in syswrap-linux.c) will kill
+ // the new thread.
+ ctst->os_state.threadgroup = ptst->os_state.threadgroup;
+
/* We don't really know where the client stack is, because its
allocated by the client. The best we can do is look at the
memory mappings and try to derive some useful information. We
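To illustrate why inheriting the parent's threadgroup closes the window, here is a simplified model, not Valgrind's actual code. It assumes that exit_group handling only marks threads whose recorded threadgroup matches the caller's, and that a freshly cloned thread has no threadgroup recorded (0 here) until it sets one itself. `ModelThread`, `model_clone` and `model_exit_group` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for per-thread state. */
typedef struct {
    bool in_use;       /* slot is allocated to a thread */
    int  threadgroup;  /* 0 means "not set yet" in this model */
    bool exiting;      /* marked for exit by model_exit_group */
} ModelThread;

/* Model of clone: fill in the child slot. inherit_group selects
   between the old behaviour (false) and the provisional fix (true). */
static void model_clone(ModelThread *threads, size_t parent, size_t child,
                        bool inherit_group)
{
    threads[child].in_use = true;
    threads[child].exiting = false;
    threads[child].threadgroup =
        inherit_group ? threads[parent].threadgroup : 0;
}

/* Model of exit_group: mark every live thread in the caller's
   thread group as exiting. Returns how many threads were marked. */
static size_t model_exit_group(ModelThread *threads, size_t n, size_t caller)
{
    assert(caller < n);
    size_t marked = 0;
    int group = threads[caller].threadgroup;
    for (size_t i = 0; i < n; i++) {
        if (!threads[i].in_use || threads[i].threadgroup != group)
            continue;  /* empty slot or different group: left alone */
        threads[i].exiting = true;
        marked++;
    }
    return marked;
}
```

In this model, a child created with `inherit_group` false is skipped by `model_exit_group` if the parent exits before the child has recorded its own threadgroup, so it is never told to exit. With `inherit_group` true the child is marked along with the parent.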
With this patch the test did not hang after 10k runs (while w/o this patch it hangs after 20-40 runs).
I don't have access to the 8 core machine anymore, but thestig does; ask him if you want it tortured.
I've run this patch on 4-way and 8-way machines (64-bit). Works.
Committed (r11053).