| Summary: | Potential false positive when posting to stack allocated semaphore | ||
|---|---|---|---|
| Product: | [Developer tools] valgrind | Reporter: | Godmar Back <godmar> |
| Component: | helgrind | Assignee: | Julian Seward <jseward> |
| Status: | REOPENED --- | ||
| Severity: | normal | CC: | godmar, pjfloyd |
| Priority: | NOR | ||
| Version First Reported In: | 3.18.1 | ||
| Target Milestone: | --- | ||
| Platform: | Compiled Sources | ||
| OS: | Linux | ||
| Latest Commit: | Version Fixed/Implemented In: | ||
| Sentry Crash Report: | |||
| Attachments: |
semtest.c that shows the potential false positive
test program that shows spurious warning even when threads are joined |
||
I'll see if I can reproduce this with clang. Does joining the thread change anything? I'm not a threading expert, but I would expect the thread to be created detached or joined. Answering my own question. On Linux, if I add pthread_join:
static void
use_stack_allocated_semaphore(void)
{
sem_t sem;
sem_init(&sem, 0, 0);
pthread_t t;
pthread_create(&t, NULL, post_thread, &sem);
sem_wait(&sem);
sem_destroy(&sem);
pthread_join(t, NULL);
}
then I get no error
==17867== Helgrind, a thread error detector
==17867== Copyright (C) 2007-2017, and GNU GPL'd, by OpenWorks LLP et al.
==17867== Using Valgrind-3.20.0.GIT and LibVEX; rerun with -h for copyright info
==17867== Command: ./451327
==17867==
==17867==
==17867== Use --history-level=approx or =none to gain increased speed, at
==17867== the cost of reduced accuracy of conflicting-access information
==17867== For lists of detected and suppressed errors, rerun with: -s
==17867== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
And the manual says https://valgrind.org/docs/manual/hg-manual.html 7.5. Hints and Tips for Effective Use of Helgrind 6. Round up all finished threads using pthread_join. Avoid detaching threads: don't create threads in the detached state, and don't call pthread_detach on existing threads. The actual application in which the false positive occurs uses pthread_join properly. I created a minimum reproducible test case here for you to see the issue. Adding pthread_join introduces additional edges in the happens-before graph, which influences Helgrind's race detection. I'll prepare a larger example that uses pthread_join. Created attachment 148434 [details]
test program that shows spurious warning even when threads are joined
I've attached a larger example from the actual code to show the potential issue.
OK, I've reopened this. The error is intermittent for me so I assume that there is some race. It's more reliable if you `NTHREADS` and `n` (In reply to Godmar Back from comment #7) > It's more reliable if you `NTHREADS` and `n` if you *increase* |
Created attachment 147408 [details] semtest.c that shows the potential false positive SUMMARY Helgrind 3.18.1 appears to flag what I believe to be a false positive when a stack-allocated semaphore is signaled to and the stack space on which it is allocated is then reused. STEPS TO REPRODUCE 1. Compile the attached program with gcc -g -pthread semtest.c -o semtest (gcc 9.4.0 on Ubuntu 20.04 with GNU libc 2.2.5 I believe) 2. Run it with valgrind --tool=helgrind ./semtest OBSERVED RESULT ==2965601== Helgrind, a thread error detector ==2965601== Copyright (C) 2007-2017, and GNU GPL'd, by OpenWorks LLP et al. ==2965601== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info ==2965601== Command: ./semtest ==2965601== ==2965601== ---Thread-Announcement------------------------------------------ ==2965601== ==2965601== Thread #1 is the program's root thread ==2965601== ==2965601== ---Thread-Announcement------------------------------------------ ==2965601== ==2965601== Thread #2 was created ==2965601== at 0x49BC152: clone (clone.S:71) ==2965601== by 0x48812EB: create_thread (createthread.c:101) ==2965601== by 0x4882E0F: pthread_create@@GLIBC_2.2.5 (pthread_create.c:817) ==2965601== by 0x4846FBA: pthread_create_WRK (hg_intercepts.c:445) ==2965601== by 0x48480BD: pthread_create@* (hg_intercepts.c:478) ==2965601== by 0x1092C9: use_stack_allocated_semaphore (in /home/gback/cs3214/dutchblitz.sem/tests/semtest) ==2965601== by 0x10931C: main (in /home/gback/cs3214/dutchblitz.sem/tests/semtest) ==2965601== ==2965601== ---------------------------------------------------------------- ==2965601== ==2965601== Possible data race during write of size 8 at 0x1FFEFFFDE0 by thread #1 ==2965601== Locks held: none ==2965601== at 0x484BCD9: memset (vg_replace_strmem.c:1358) ==2965601== by 0x10923F: touch_the_stack (in /home/gback/cs3214/dutchblitz.sem/tests/semtest) ==2965601== by 0x109317: main (in /home/gback/cs3214/dutchblitz.sem/tests/semtest) ==2965601== ==2965601== This conflicts with a previous read of size 8 by thread #2 ==2965601== Locks held: none ==2965601== at 0x488CA37: sem_post@@GLIBC_2.2.5 (sem_post.c:44) ==2965601== by 0x4847ED1: sem_post_WRK (hg_intercepts.c:2993) ==2965601== by 0x4848AB6: sem_post@* (hg_intercepts.c:3013) ==2965601== by 0x109272: post_thread (in /home/gback/cs3214/dutchblitz.sem/tests/semtest) ==2965601== by 0x48471B2: mythread_wrapper (hg_intercepts.c:406) ==2965601== by 0x4882608: start_thread (pthread_create.c:477) ==2965601== by 0x49BC162: clone (clone.S:95) ==2965601== Address 0x1ffefffde0 is on thread #1's stack ==2965601== in frame #1, created by touch_the_stack (???:) ==2965601== ==2965601== ---------------------------------------------------------------- ==2965601== ==2965601== Possible data race during write of size 4 at 0x1FFEFFFDE8 by thread #1 ==2965601== Locks held: none ==2965601== at 0x484BCDC: memset (vg_replace_strmem.c:1358) ==2965601== by 0x10923F: touch_the_stack (in /home/gback/cs3214/dutchblitz.sem/tests/semtest) ==2965601== by 0x109317: main (in /home/gback/cs3214/dutchblitz.sem/tests/semtest) ==2965601== ==2965601== This conflicts with a previous read of size 4 by thread #2 ==2965601== Locks held: none ==2965601== at 0x488CA34: sem_post@@GLIBC_2.2.5 (sem_post.c:36) ==2965601== by 0x4847ED1: sem_post_WRK (hg_intercepts.c:2993) ==2965601== by 0x4848AB6: sem_post@* (hg_intercepts.c:3013) ==2965601== by 0x109272: post_thread (in /home/gback/cs3214/dutchblitz.sem/tests/semtest) ==2965601== by 0x48471B2: mythread_wrapper (hg_intercepts.c:406) ==2965601== by 0x4882608: start_thread (pthread_create.c:477) ==2965601== by 0x49BC162: clone (clone.S:95) ==2965601== Address 0x1ffefffde8 is on thread #1's stack ==2965601== in frame #1, created by touch_the_stack (???:) ==2965601== ==2965601== ==2965601== Use --history-level=approx or =none to gain increased speed, at ==2965601== the cost of reduced accuracy of conflicting-access information ==2965601== For lists of detected and suppressed errors, rerun with: -s ==2965601== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0) EXPECTED RESULT There should be no errors shown. SOFTWARE/OS VERSIONS - Valgrind 3.18.1 (built from source) - gcc 9.4.0 - Ubuntu 20.04.4 ADDITIONAL INFORMATION The error is not shown if the semaphore is dynamically allocated.