Bug 374414 - unlocking mutex before calling notify_one/all on condition variable triggers an error
Summary: unlocking mutex before calling notify_one/all on condition variable triggers ...
Status: RESOLVED NOT A BUG
Alias: None
Product: valgrind
Classification: Developer tools
Component: helgrind (show other bugs)
Version: 3.10.0
Platform: Debian stable Linux
: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-01-01 11:20 UTC by Yuval Lifshitz
Modified: 2017-01-04 15:52 UTC (History)
2 users (show)

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
file causing the error (1.31 KB, text/x-csrc)
2017-01-01 11:20 UTC, Yuval Lifshitz
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Yuval Lifshitz 2017-01-01 11:20:04 UTC
Created attachment 103121 [details]
file causing the error

When compiling the example from here (also attached file): http://en.cppreference.com/w/cpp/thread/condition_variable
for using a C++11 std::condition_variable.
compilation:
 g++ -Wall -std=c++11 -g -O0 -Wall -o condition_variable condition_variable.cpp -lpthread

It is specifically recommended to unlock the mutex before notification. However, helgrind reports an error. If code is modified to send notification when mutex is locked, the error disappears.

$ valgrind --tool=helgrind ./condition_variable
==11932== Helgrind, a thread error detector
==11932== Copyright (C) 2007-2013, and GNU GPL'd, by OpenWorks LLP et al.
==11932== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==11932== Command: ./condition_variable
==11932==
main() signals data ready for processing
==11932== ---Thread-Announcement------------------------------------------
==11932==
==11932== Thread #1 is the program's root thread
==11932==
==11932== ----------------------------------------------------------------
==11932==
==11932== Thread #1: pthread_cond_{signal,broadcast}: dubious: associated lock is not held by any thread
==11932==    at 0x4C2D195: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==11932==    by 0x510AC98: std::condition_variable::notify_one() (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.20)
==11932==    by 0x401562: main (condition_variable.cpp:44)
==11932==
Worker thread is processing data
Worker thread signals data processing completed
==11932== ---Thread-Announcement------------------------------------------
==11932==
==11932== Thread #2 was created
==11932==    at 0x59625EE: clone (clone.S:74)
==11932==    by 0x4E422B9: do_clone.constprop.3 (createthread.c:75)
==11932==    by 0x4E43762: create_thread (createthread.c:245)
==11932==    by 0x4E43762: pthread_create@@GLIBC_2.2.5 (pthread_create.c:606)
==11932==    by 0x4C2EEBD: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==11932==    by 0x510EA90: std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.20)
==11932==    by 0x401AFC: std::thread::thread<void (&)()>(void (&)()) (thread:138)
==11932==    by 0x401516: main (condition_variable.cpp:35)
==11932==
==11932== ----------------------------------------------------------------
==11932==
==11932== Thread #2: pthread_cond_{signal,broadcast}: dubious: associated lock is not held by any thread
==11932==    at 0x4C2D195: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==11932==    by 0x510AC98: std::condition_variable::notify_one() (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.20)
==11932==    by 0x4014BC: worker_thread() (condition_variable.cpp:30)
==11932==    by 0x402C22: void std::_Bind_simple<void (*())()>::_M_invoke<>(std::_Index_tuple<>) (functional:1700)
==11932==    by 0x402B6A: std::_Bind_simple<void (*())()>::operator()() (functional:1688)
==11932==    by 0x402AE7: std::thread::_Impl<std::_Bind_simple<void (*())()> >::_M_run() (thread:115)
==11932==    by 0x510E96F: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.20)
==11932==    by 0x4C2F056: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==11932==    by 0x4E430A3: start_thread (pthread_create.c:309)
==11932==    by 0x596262C: clone (clone.S:111)
==11932==
Back in main(), data = Example data after processing
==11932==
==11932== For counts of detected and suppressed errors, rerun with: -v
==11932== Use --history-level=approx or =none to gain increased speed, at
==11932== the cost of reduced accuracy of conflicting-access information
==11932== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 33 from 33)
Comment 1 Ivo Raisr 2017-01-01 13:04:04 UTC
POSIX [1] also states that the lock does not need to be held:

"The pthread_cond_broadcast() or pthread_cond_signal() functions may be called by a thread whether or not it currently owns the mutex that threads calling pthread_cond_wait() or pthread_cond_timedwait() have associated with the condition variable during their waits; however, if predictable scheduling behavior is required, then that mutex shall be locked by the thread calling pthread_cond_broadcast() or pthread_cond_signal()."

The above applies to mutexes that have priorities associated
with them, that is, either ceiling mutexes (PTHREAD_PRIO_PROTECT) or
priority-inheritance mutexes (PTHREAD_PRIO_INHERIT). Normal mutexes
do not contain any priority scheduling; therefore there is usually a (tiny) performance benefit to call mutex_unlock() before calling cond_signal(). 

[1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_cond_signal.html
Comment 2 Bart Van Assche 2017-01-01 13:22:43 UTC
Why is this behavior considered to be a bug? Not holding the associated mutex around pthread_cond_signal() may lead to missed wakeups. It is easy to see in the example on http://en.cppreference.com/w/cpp/thread/condition_variable that a wakeup may be missed. Inserting a call to sleep() after std::unique_lock<std::mutex> lk(m) and before cv.wait(lk, []{return processed;}) will cause the wakeup not to be noticed.
Comment 3 Yuval Lifshitz 2017-01-04 14:37:13 UTC
what about the following case:
we change: 
cv.wait(lk, []{return ready;});
to (which should be similar):
while (!ready) cv.wait(lk);
in this case, as long as there is only one thread waiting on the condition the program would always be correct - even if we add sleep before the waiting.
the waiter cannot be sleeping while "ready" is modified, because it is mutex protected. this mean that either "ready" was modified before the waiter takes its mutex, and then it does not even wait. or it is waiting and then it is getting notified.
note that even after the code is modified helgrind still gives the same error
Comment 4 Bart Van Assche 2017-01-04 15:02:32 UTC
I do not agree that a loop like "while (!ready) cv.wait(lk);" would be a correct way to wait for cv to be signaled. If a sleep() statement would be inserted before or inside that loop the wake-up could still be missed.
Comment 5 Yuval Lifshitz 2017-01-04 15:52:22 UTC
didn't say that wake-up is not missed. but the program would still perform correctly, since the protected variables "ready" and "processed" are either being checked inside the mutex or being waited on using the condition variable.
downside for signaling when not locked is explained here: http://www.domaigne.com/blog/computing/condvars-signal-with-mutex-locked-or-not/
but this is not the case in the sample from cppreference