I get the following 4 failures:

drd/tests/fork-parallel (stderr)
drd/tests/fork-serial (stderr)
drd/tests/threaded-fork-vcs (stderr)
drd/tests/threaded-fork (stderr)

They all look like lock reinitialization issues. The diff for fork-serial is

+Thread 2:
+Reader-writer lock reinitialization: rwlock 0x.........
+   at 0x........: pthread_rwlock_init (drd_pthread_intercepts.c:?)
+   by 0x........: fork (in /...libc...)
+   by 0x........: startproc (fork.c:?)
+   by 0x........: vgDrd_thread_wrapper (drd_pthread_intercepts.c:?)
+   by 0x........: start_thread
+   by 0x........: clone (in /...libc...)
+rwlock 0x........ was first observed at:
+   at 0x........: pthread_rwlock_rdlock (drd_pthread_intercepts.c:?)
+   by 0x........: _Fork (in /...libc...)
+   by 0x........: fork (in /...libc...)
+   by 0x........: startproc (fork.c:?)
+   by 0x........: vgDrd_thread_wrapper (drd_pthread_intercepts.c:?)
+   by 0x........: start_thread
+   by 0x........: clone (in /...libc...)
+
 /bin/ls
+Thread 3:
+Reader-writer lock reinitialization: rwlock 0x.........
+   at 0x........: pthread_rwlock_init (drd_pthread_intercepts.c:?)
+   by 0x........: fork (in /...libc...)
+   by 0x........: startproc (fork.c:?)
+   by 0x........: vgDrd_thread_wrapper (drd_pthread_intercepts.c:?)
+   by 0x........: start_thread
+   by 0x........: clone (in /...libc...)
+rwlock 0x........ was first observed at:
+   at 0x........: pthread_rwlock_rdlock (drd_pthread_intercepts.c:?)
+   by 0x........: _Fork (in /...libc...)
+   by 0x........: fork (in /...libc...)
+   by 0x........: startproc (fork.c:?)
+   by 0x........: vgDrd_thread_wrapper (drd_pthread_intercepts.c:?)
+   by 0x........: start_thread
+   by 0x........: clone (in /...libc...)
+

My guess is that this is a glibc 2.41 issue.
Looking at this, it looks like a regression in glibc. Version 2.41 added some locking around fork(). Without filtering, the errors look like

==101957== Thread 3:
==101957== Reader-writer lock reinitialization: rwlock 0x4a71b60.
==101957==    at 0x486191A: pthread_rwlock_init_intercept (drd_pthread_intercepts.c:1734)
==101957==    by 0x486191A: pthread_rwlock_init@* (drd_pthread_intercepts.c:1742)
==101957==    by 0x494CA6F: fork (fork.c:88)
==101957==    by 0x4004EE: startproc (fork.c:18)
==101957==    by 0x4848562: vgDrd_thread_wrapper (drd_pthread_intercepts.c:512)
==101957==    by 0x48F91D3: start_thread (pthread_create.c:448)
==101957==    by 0x497BB13: clone (clone.S:100)
==101957== rwlock 0x4a71b60 was first observed at:
==101957==    at 0x4862ABA: pthread_rwlock_rdlock_intercept (drd_pthread_intercepts.c:1798)
==101957==    by 0x4862ABA: pthread_rwlock_rdlock@* (drd_pthread_intercepts.c:1812)
==101957==    by 0x4946CE4: _Fork (_Fork.c:31)
==101957==    by 0x494CA1F: fork (fork.c:75)
==101957==    by 0x4004EE: startproc (fork.c:18)
==101957==    by 0x4848562: vgDrd_thread_wrapper (drd_pthread_intercepts.c:512)
==101957==    by 0x48F91D3: start_thread (pthread_create.c:448)
==101957==    by 0x497BB13: clone (clone.S:100)

Looking at the source for that, the first use of the pthread_rwlock (where DRD first observes it) looks like

  internal_sigset_t original_sigmask;
  __abort_lock_rdlock (&original_sigmask);

which calls into this code

__libc_rwlock_define_initialized (static, lock);

void
__abort_lock_rdlock (internal_sigset_t *set)
{
  internal_signal_block_all (set);
  __libc_rwlock_rdlock (lock);
}

As always, navigating the macros is non-obvious, but I guess that comes from

#define __libc_rwlock_define_initialized(CLASS,NAME) \
  CLASS __libc_rwlock_t NAME = PTHREAD_RWLOCK_INITIALIZER;

So the summary here is that pthread_rwlock_rdlock is being called on a file static that is correctly initialized.
Then, where the error occurs, the code is

  call_function_static_weak (__abort_fork_reset_child);

which does

void
__abort_fork_reset_child (void)
{
  __libc_rwlock_init (lock);
}

Note that I don't see any intervening calls to pthread_rwlock_destroy. The man page says

  Results are undefined if pthread_rwlock_init() is called specifying an already initialized read-write lock.
I've opened a glibc bugzilla item for this: https://sourceware.org/bugzilla/show_bug.cgi?id=32994
This looks like a glibc bug. At least the proposed fix for glibc in https://sourceware.org/bugzilla/show_bug.cgi?id=32994 works for me.
(In reply to Mark Wielaard from comment #3)
> This looks like a glibc bug. At least the proposed fix for glibc in
> https://sourceware.org/bugzilla/show_bug.cgi?id=32994 works for me.

That patch is now in glibc git and on the 2.41 and 2.42 release branches, so once distros pick that up these drd failures should be gone.
With Fedora 42 I now get

-- Finished tests in drd/tests (in 241 sec) ----------------------------

== 136 tests, 0 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==