Summary: | Potential false positive when posting to stack allocated semaphore | ||
---|---|---|---|
Product: | [Developer tools] valgrind | Reporter: | Godmar Back <godmar> |
Component: | helgrind | Assignee: | Julian Seward <jseward> |
Status: | REOPENED --- | ||
Severity: | normal | CC: | godmar, pjfloyd |
Priority: | NOR | ||
Version First Reported In: | 3.18.1 | ||
Target Milestone: | --- | ||
Platform: | Compiled Sources | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: | |||
Attachments: |
semtest.c that shows the potential false positive
test program that shows spurious warning even when threads are joined |
I'll see if I can reproduce this with clang. Does joining the thread change anything? I'm not a threading expert, but I would expect the thread to be created detached or joined. Answering my own question. On Linux, if I add pthread_join: static void use_stack_allocated_semaphore(void) { sem_t sem; sem_init(&sem, 0, 0); pthread_t t; pthread_create(&t, NULL, post_thread, &sem); sem_wait(&sem); sem_destroy(&sem); pthread_join(t, NULL); } then I get no error ==17867== Helgrind, a thread error detector ==17867== Copyright (C) 2007-2017, and GNU GPL'd, by OpenWorks LLP et al. ==17867== Using Valgrind-3.20.0.GIT and LibVEX; rerun with -h for copyright info ==17867== Command: ./451327 ==17867== ==17867== ==17867== Use --history-level=approx or =none to gain increased speed, at ==17867== the cost of reduced accuracy of conflicting-access information ==17867== For lists of detected and suppressed errors, rerun with: -s ==17867== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) And the manual says https://valgrind.org/docs/manual/hg-manual.html 7.5. Hints and Tips for Effective Use of Helgrind 6. Round up all finished threads using pthread_join. Avoid detaching threads: don't create threads in the detached state, and don't call pthread_detach on existing threads. The actual application in which the false positive occurs uses pthread_join properly. I created a minimum reproducible test case here for you to see the issue. Adding pthread_join introduces additional edges in the happens-before graph, which influences Helgrind's race detection. I'll prepare a larger example that uses pthread_join. Created attachment 148434 [details]
test program that shows spurious warning even when threads are joined
I've attached a larger example from the actual code to show the potential issue.
OK, I've reopened this. The error is intermittent for me so I assume that there is some race. It's more reliable if you `NTHREADS` and `n` (In reply to Godmar Back from comment #7) > It's more reliable if you `NTHREADS` and `n` if you *increase* |
Created attachment 147408 [details] semtest.c that shows the potential false positive SUMMARY Helgrind 3.18.1 appears to flag what I believe to be a false positive when a stack-allocated semaphore is signaled to and the stack space on which it is allocated is then reused. STEPS TO REPRODUCE 1. Compile the attached program with gcc -g -pthread semtest.c -o semtest (gcc 9.4.0 on Ubuntu 20.04 with GNU libc 2.2.5 I believe) 2. Run it with valgrind --tool=helgrind ./semtest OBSERVED RESULT ==2965601== Helgrind, a thread error detector ==2965601== Copyright (C) 2007-2017, and GNU GPL'd, by OpenWorks LLP et al. ==2965601== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info ==2965601== Command: ./semtest ==2965601== ==2965601== ---Thread-Announcement------------------------------------------ ==2965601== ==2965601== Thread #1 is the program's root thread ==2965601== ==2965601== ---Thread-Announcement------------------------------------------ ==2965601== ==2965601== Thread #2 was created ==2965601== at 0x49BC152: clone (clone.S:71) ==2965601== by 0x48812EB: create_thread (createthread.c:101) ==2965601== by 0x4882E0F: pthread_create@@GLIBC_2.2.5 (pthread_create.c:817) ==2965601== by 0x4846FBA: pthread_create_WRK (hg_intercepts.c:445) ==2965601== by 0x48480BD: pthread_create@* (hg_intercepts.c:478) ==2965601== by 0x1092C9: use_stack_allocated_semaphore (in /home/gback/cs3214/dutchblitz.sem/tests/semtest) ==2965601== by 0x10931C: main (in /home/gback/cs3214/dutchblitz.sem/tests/semtest) ==2965601== ==2965601== ---------------------------------------------------------------- ==2965601== ==2965601== Possible data race during write of size 8 at 0x1FFEFFFDE0 by thread #1 ==2965601== Locks held: none ==2965601== at 0x484BCD9: memset (vg_replace_strmem.c:1358) ==2965601== by 0x10923F: touch_the_stack (in /home/gback/cs3214/dutchblitz.sem/tests/semtest) ==2965601== by 0x109317: main (in /home/gback/cs3214/dutchblitz.sem/tests/semtest) ==2965601== ==2965601== This conflicts with a previous read of size 8 by thread #2 ==2965601== Locks held: none ==2965601== at 0x488CA37: sem_post@@GLIBC_2.2.5 (sem_post.c:44) ==2965601== by 0x4847ED1: sem_post_WRK (hg_intercepts.c:2993) ==2965601== by 0x4848AB6: sem_post@* (hg_intercepts.c:3013) ==2965601== by 0x109272: post_thread (in /home/gback/cs3214/dutchblitz.sem/tests/semtest) ==2965601== by 0x48471B2: mythread_wrapper (hg_intercepts.c:406) ==2965601== by 0x4882608: start_thread (pthread_create.c:477) ==2965601== by 0x49BC162: clone (clone.S:95) ==2965601== Address 0x1ffefffde0 is on thread #1's stack ==2965601== in frame #1, created by touch_the_stack (???:) ==2965601== ==2965601== ---------------------------------------------------------------- ==2965601== ==2965601== Possible data race during write of size 4 at 0x1FFEFFFDE8 by thread #1 ==2965601== Locks held: none ==2965601== at 0x484BCDC: memset (vg_replace_strmem.c:1358) ==2965601== by 0x10923F: touch_the_stack (in /home/gback/cs3214/dutchblitz.sem/tests/semtest) ==2965601== by 0x109317: main (in /home/gback/cs3214/dutchblitz.sem/tests/semtest) ==2965601== ==2965601== This conflicts with a previous read of size 4 by thread #2 ==2965601== Locks held: none ==2965601== at 0x488CA34: sem_post@@GLIBC_2.2.5 (sem_post.c:36) ==2965601== by 0x4847ED1: sem_post_WRK (hg_intercepts.c:2993) ==2965601== by 0x4848AB6: sem_post@* (hg_intercepts.c:3013) ==2965601== by 0x109272: post_thread (in /home/gback/cs3214/dutchblitz.sem/tests/semtest) ==2965601== by 0x48471B2: mythread_wrapper (hg_intercepts.c:406) ==2965601== by 0x4882608: start_thread (pthread_create.c:477) ==2965601== by 0x49BC162: clone (clone.S:95) ==2965601== Address 0x1ffefffde8 is on thread #1's stack ==2965601== in frame #1, created by touch_the_stack (???:) ==2965601== ==2965601== ==2965601== Use --history-level=approx or =none to gain increased speed, at ==2965601== the cost of reduced accuracy of conflicting-access information ==2965601== For lists of detected and suppressed errors, rerun with: -s ==2965601== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0) EXPECTED RESULT There should be no errors shown. SOFTWARE/OS VERSIONS - Valgrind 3.18.1 (built from source) - gcc 9.4.0 - Ubuntu 20.04.4 ADDITIONAL INFORMATION The error is not shown if the semaphore is dynamically allocated.