Created attachment 43089 [details] test program to cause the helgrind error Using valgrind from SVN @r11106. This is an OS X 10.5 machine. "uname -a" output: Darwin mcswl176.mcs.anl.gov 9.8.0 Darwin Kernel Version 9.8.0: Wed Jul 15 16:55:01 PDT 2009; root:xnu-1228.15.4~1/RELEASE_I386 i386 i386 MacBookPro3,1 Darwin When using helgrind on simple code that uses pthread_cond_wait (and related functionality) in a typical manner, it complains that the non-recursive mutex associated with the cond var is recursively locked. The message is as follows: --------8<-------- ==58084== Thread #2 was created ==58084== at 0x227FDE: __bsdthread_create (in /usr/lib/libSystem.B.dylib) ==58084== by 0x17738: pthread_create_WRK (hg_intercepts.c:257) ==58084== by 0x1780C: pthread_create (hg_intercepts.c:294) ==58084== by 0x1D43: main (foo.c:49) ==58084== ==58084== Thread #2: Bug in libpthread: recursive write lock granted on mutex/wrlock which does not support recursion ==58084== at 0x1825F: pthread_cond_wait* (hg_intercepts.c:694) ==58084== by 0x1B34: run_fn (foo.c:23) ==58084== by 0x1766C: mythread_wrapper (hg_intercepts.c:221) ==58084== by 0x228154: _pthread_start (in /usr/lib/libSystem.B.dylib) ==58084== by 0x228011: thread_start (in /usr/lib/libSystem.B.dylib) ==58084== --------8<-------- A simple test case (foo.c) is attached that will elicit the error. My understanding of helgrind's implementation is limited, but the error appears to be caused by two invocations of _VG_USERREQ__HG_PTHREAD_MUTEX_LOCK_POST. One comes directly from the cond_wait wrapper (hg_intercepts.c:685). But the other one comes just before, apparently from the wrapper for mutex_lock (hg_intercepts.c:496). I think that the actual system pthread_cond_wait implementation is calling pthread_mutex_lock under the hood, which is intercepted by valgrind. Oddly, even though it's listed in [1], my test case doesn't seem to cause pthread_cond_wait to call pthread_mutex_unlock. I'm not sure why at this point. This might be a Darwin-only bug. A co-worker tested with 3.5.0 on Linux/x86_64 and didn't get this error from helgrind. I've also attached a patch that makes the problem go away, although it's entirely possible that it's not the best fix. Perhaps a configure-time or run-time test for whether cond_wait uses the mutex functions under the hood is better? Also, I don't know why a similar guard isn't needed on the unlock_pre client request. A similar problem probably exists for pthread_cond_timedwait, but I haven't tested it yet. [1] http://www.opensource.apple.com/source/Libc/Libc-498/pthreads/pthread_cond.c
Created attachment 43090 [details] a hack-ish patch to fix the bug