| Summary: | Helgrind Assertion 'lk->kind == LK_rdwr' failed. | | |
|---|---|---|---|
| Product: | [Developer tools] valgrind | Reporter: | Aaron Merey <amerey> |
| Component: | helgrind | Assignee: | Paul Floyd <pjfloyd> |
| Status: | REPORTED --- | | |
| Severity: | normal | CC: | amerey, pjfloyd |
| Priority: | NOR | | |
| Version First Reported In: | 3.25.1 | | |
| Target Milestone: | --- | | |
| Platform: | Other | | |
| OS: | Linux | | |
| Latest Commit: | | Version Fixed/Implemented In: | |
| Sentry Crash Report: | | | |
I should also note that the test triggering this helgrind error is exclusively single-threaded.
Here is a more informative valgrind backtrace:
==360952==    at 0x580275EF: show_sched_status_wrk.lto_priv.0 (m_libcassert.c:426)
==360952==    by 0x5802762F: report_and_quit (m_libcassert.c:497)
==360952==    by 0x580277AA: vgPlain_assert_fail (m_libcassert.c:563)
==360952==    by 0x580038C9: lockN_acquire_reader (hg_main.c:309)
==360952==    by 0x580043CA: evhH__post_thread_r_acquires_lock (hg_main.c:1256)
==360952==    by 0x58013464: hg_handle_client_request (hg_main.c:5581)
==360952==    by 0x5809492B: UnknownInlinedFun (scheduler.c:2280)
==360952==    by 0x5809492B: vgPlain_scheduler (scheduler.c:1573)
==360952==    by 0x580EFAE7: UnknownInlinedFun (syswrap-linux.c:102)
==360952==    by 0x580EFAE7: run_a_thread_NORETURN.lto_priv.0 (syswrap-linux.c:155)
The line of code that is firing the assert is
tl_assert(lk->kind == LK_rdwr);
where kind is an enum
enum {
   LK_mbRec=1001, /* normal mutex, possibly recursive */
   LK_nonRec,     /* normal mutex, definitely non recursive */
   LK_rdwr        /* reader-writer lock */
}
LockKind;
Aaron, could you build Valgrind with a small change in the source so that we can see what the input kind is?
Before the assert on line 309 put something like
if (lk->kind != LK_rdwr)
   VG_(printf)("Unexpected lock acquire type is %d\n", lk->kind);
I encountered a failed assert while running an elfutils test under helgrind on Fedora 43 with kernel 6.17.11-300.fc43.x86_64. Helgrind believes that pthread_rwlock_{rd,rw}lock is called with a pthread_mutex_t* argument and the following assert fails:
Helgrind: hg_main.c:309 (lockN_acquire_reader): Assertion 'lk->kind == LK_rdwr' failed.
I took a look at the chain of mutex and rwlock inits, locks and unlocks and I can't see any mixing of mutexes and rwlocks. Memcheck, UBSan and ASan show no errors and I haven't found any other evidence of memory corruption, overwriting or unexpected memcpy.
Strangely, the error disappears if I move the declaration of the lock to a different place within the containing struct in libdw. I also had a very similar case a few months ago with a different lock in a separate internal libdw struct, where moving the lock declarations prevented the helgrind error. In that case as well, all other testing with memcheck, UBSan, etc. didn't find any errors.
Here is the full helgrind output:
==276459== ---Thread-Announcement------------------------------------------
==276459==
==276459== Thread #1 is the program's root thread
==276459==
==276459== ----------------------------------------------------------------
==276459==
==276459== Thread #1: pthread_rwlock_{rd,rw}lock with a pthread_mutex_t* argument
==276459==    at 0x484C01F: pthread_rwlock_rdlock_WRK (hg_intercepts.c:2552)
==276459==    by 0x484FCFE: pthread_rwlock_rdlock (hg_intercepts.c:2573)
==276459==    by 0x48D2E1B: eu_tfind (eu-search.c:48)
==276459==    by 0x48AF5E5: ____libdw_findcu (libdw_findcu.c:250)
==276459==    by 0x48AF704: __libdw_findcu (libdw_findcu.c:298)
==276459==    by 0x489E54A: __libdw_offdie (dwarf_offdie.c:61)
==276459==    by 0x489E591: dwarf_offdie (dwarf_offdie.c:76)
==276459==    by 0x404259: collect_sourcefiles(Dwfl_Module*, void**, char const*, unsigned long, void*) (srcfiles.cxx:237)
==276459==    by 0x48BDA1A: dwfl_getmodules (dwfl_getmodules.c:86)
==276459==    by 0x403DF4: main (srcfiles.cxx:441)
==276459==
Helgrind: hg_main.c:309 (lockN_acquire_reader): Assertion 'lk->kind == LK_rdwr' failed.
host stacktrace:
==276459==    at 0x580275EF: ??? (in /usr/libexec/valgrind/helgrind-amd64-linux)
==276459==    by 0x5802762F: ??? (in /usr/libexec/valgrind/helgrind-amd64-linux)
==276459==    by 0x580277AA: ??? (in /usr/libexec/valgrind/helgrind-amd64-linux)
==276459==    by 0x580038C9: ??? (in /usr/libexec/valgrind/helgrind-amd64-linux)
==276459==    by 0x580043CA: ??? (in /usr/libexec/valgrind/helgrind-amd64-linux)
==276459==    by 0x58013464: ??? (in /usr/libexec/valgrind/helgrind-amd64-linux)
==276459==    by 0x5809492B: ??? (in /usr/libexec/valgrind/helgrind-amd64-linux)
==276459==    by 0x580EFAE7: ??? (in /usr/libexec/valgrind/helgrind-amd64-linux)
sched status:
  running_tid=1
Thread 1: status = VgTs_Runnable (lwpid 276459)
==276459==    at 0x484C0F6: pthread_rwlock_rdlock_WRK (hg_intercepts.c:2558)
==276459==    by 0x484FCFE: pthread_rwlock_rdlock (hg_intercepts.c:2573)
==276459==    by 0x48D2E1B: eu_tfind (eu-search.c:48)
==276459==    by 0x48AF5E5: ____libdw_findcu (libdw_findcu.c:250)
==276459==    by 0x48AF704: __libdw_findcu (libdw_findcu.c:298)
==276459==    by 0x489E54A: __libdw_offdie (dwarf_offdie.c:61)
==276459==    by 0x489E591: dwarf_offdie (dwarf_offdie.c:76)
==276459==    by 0x404259: collect_sourcefiles(Dwfl_Module*, void**, char const*, unsigned long, void*) (srcfiles.cxx:237)
==276459==    by 0x48BDA1A: dwfl_getmodules (dwfl_getmodules.c:86)
==276459==    by 0x403DF4: main (srcfiles.cxx:441)
client stack range: [0x1FFEFFA000 0x1FFF000FFF] client SP: 0x1FFEFFBB80
valgrind stack range: [0x1002E8E000 0x1002F8DFFF] top usage: 9712 of 1048576
Note: see also the FAQ in the source distribution. It contains workarounds to several common problems. In particular, if Valgrind aborted or crashed after identifying problems in your program, there's a good chance that fixing those problems will prevent Valgrind aborting or crashing, especially if it happened in m_mallocfree.c.
If that doesn't help, please report this bug to: www.valgrind.org