While attempting to run "kwrite" (from KDE), valgrind crashes with the message:

valgrind: m_debuginfo/image.c:517 (realloc_CEnt): Assertion 'szB >= CACHE_ENTRY_SIZE' failed.

host stacktrace:
==15349==    at 0x58087103: show_sched_status_wrk (m_libcassert.c:355)
==15349==    by 0x58087204: report_and_quit (m_libcassert.c:426)
==15349==    by 0x58087399: vgPlain_assert_fail (m_libcassert.c:492)
==15349==    by 0x58128C0A: realloc_CEnt (image.c:517)
==15349==    by 0x58128C0A: get_slowcase (image.c:773)
==15349==    by 0x58128DD7: get (image.c:816)
==15349==    by 0x58128DD7: vgModuleLocal_img_get (image.c:1088)
==15349==    by 0x58128EC8: vgModuleLocal_img_get_ULong (image.c:1188)
==15349==    by 0x581303B3: get_ULong (readdwarf3.c:285)
==15349==    by 0x581307D2: get_UWord (readdwarf3.c:352)
==15349==    by 0x581307D2: make_general_GX (readdwarf3.c:691)
==15349==    by 0x58136212: parse_var_DIE (readdwarf3.c:2296)
==15349==    by 0x58136CF6: read_DIE (readdwarf3.c:4219)
==15349==    by 0x58136E96: read_DIE (readdwarf3.c:4280)
==15349==    by 0x58136E96: read_DIE (readdwarf3.c:4280)
==15349==    by 0x5813781D: new_dwarf3_reader_wrk.constprop.31 (readdwarf3.c:4757)
==15349==    by 0x5813993F: vgModuleLocal_new_dwarf3_reader (readdwarf3.c:5200)
==15349==    by 0x580C5D6B: vgModuleLocal_read_elf_debug_info (readelf.c:3111)
==15349==    by 0x580B91BA: di_notify_ACHIEVE_ACCEPT_STATE (debuginfo.c:748)
==15349==    by 0x580B91BA: vgPlain_di_notify_mmap (debuginfo.c:1063)
==15349==    by 0x580E2FBD: vgModuleLocal_generic_PRE_sys_mmap (syswrap-generic.c:2388)
==15349==    by 0x58117BF1: vgSysWrap_amd64_linux_sys_mmap_before (syswrap-amd64-linux.c:400)
==15349==    by 0x580DFA5A: vgPlain_client_syscall (syswrap-main.c:1857)
==15349==    by 0x580DC61A: handle_syscall (scheduler.c:1126)
==15349==    by 0x580DDB2E: vgPlain_scheduler (scheduler.c:1443)
==15349==    by 0x580ED146: thread_wrapper (syswrap-linux.c:103)
==15349==    by 0x580ED146: run_a_thread_NORETURN (syswrap-linux.c:156)

Kwrite (and a couple other components) were built with -O0 through Gentoo's emerge (I was looking for a separate bug). I'm using Gentoo "Valgrind-3.13.0 and LibVEX" on x86_64.
Just prior to the crash, the last log message was:

--15349-- Reading syms from /usr/lib64/libQt5Qml.so.5.7.1
--15349--   Considering /usr/lib/debug/usr/lib64/libQt5Qml.so.5.7.1.debug ..
--15349--   .. CRC is valid

Would attaching the debug symbols file help?
I can't imagine how this failed. Can you still reproduce it?
Despite my attempts, I am no longer able to trigger this. I do not recall what bug I was looking at when I stumbled onto this, and thus can't retrace my steps. Also, Gentoo has upgraded GCC since this was originally reported, so that might have had an effect on this as well. I was confident I had saved the debugging symbols file somewhere in case it would be required, but can't find it. *sigh* I suppose I am unable to provide you with additional information at present :(
We are able to consistently reproduce this with Valgrind-3.15.0-608cb11914-20190413 (Different application, not kwrite)
When the assertion fails, the values are: szB=424, CACHE_ENTRY_SIZE=8192
We found that the assertion is no longer hit when we converted our application from compressed to uncompressed debug symbols.
This seems to be a logic bug in the realloc_CEnt function that was never adjusted for compressed symbol support. alloc_CEnt has this logic:

    if (fromC) {
       // szB can be arbitrary
    } else {
       vg_assert(szB == CACHE_ENTRY_SIZE);
    }

However, realloc_CEnt does not have such a fromC argument and unconditionally checks

    vg_assert(szB >= CACHE_ENTRY_SIZE);

Shouldn't these simply be aligned in behaviour?

Unfortunately I can't share any examples, but I would greatly appreciate it if someone could check my logic and consider a patch based on that. I think it requires a rather large binary with lots of debug symbols, as the cache re-uses compressed entries last, and that is when this bug happens.
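To make the mismatch concrete, here is a small standalone model of the two size checks. It is only a sketch, not Valgrind code: the three helper functions are hypothetical names invented for illustration, CACHE_ENTRY_SIZE is hard-coded to the 8192 reported above, and it assumes the 424-byte request came from a compressed slice (which the report suggests but does not prove).

```c
#include <stdbool.h>
#include <stddef.h>

/* Value reported in the failing run; hard-coded here for illustration. */
#define CACHE_ENTRY_SIZE 8192

/* Hypothetical helper: the size rule alloc_CEnt applies.
   Entries decompressed from a compressed slice (fromC) may have any size;
   all other entries must be exactly CACHE_ENTRY_SIZE. */
static bool alloc_size_ok(size_t szB, bool fromC)
{
    return fromC ? true : (szB == CACHE_ENTRY_SIZE);
}

/* Hypothetical helper: the rule realloc_CEnt applies in the failing
   version. It has no fromC argument, so small compressed-slice sizes
   are always rejected. */
static bool realloc_size_ok_current(size_t szB)
{
    return szB >= CACHE_ENTRY_SIZE;
}

/* Hypothetical helper: one way to align realloc_CEnt with alloc_CEnt,
   by skipping the size check for compressed-slice entries. */
static bool realloc_size_ok_aligned(size_t szB, bool fromC)
{
    return fromC || szB >= CACHE_ENTRY_SIZE;
}
```

With szB=424, alloc_size_ok(424, true) accepts the size, while realloc_size_ok_current(424) rejects it, which corresponds to the failing assertion. The aligned variant accepts 424 only when fromC is true and still rejects it for ordinary entries.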
I can confirm that something as trivial as the change below fixes it:

--- a/coregrind/m_debuginfo/image.c
+++ b/coregrind/m_debuginfo/image.c
@@ -509,10 +509,10 @@ static UInt alloc_CEnt ( DiImage* img, SizeT szB, Bool fromC )
    return entNo;
 }
 
-static void realloc_CEnt ( DiImage* img, UInt entNo, SizeT szB )
+static void realloc_CEnt ( DiImage* img, UInt entNo, SizeT szB, Bool fromC )
 {
    vg_assert(img != NULL);
-   vg_assert(szB >= CACHE_ENTRY_SIZE);
+   vg_assert(fromC || szB >= CACHE_ENTRY_SIZE);
    vg_assert(is_sane_CEnt("realloc_CEnt-pre", img, entNo));
    img->ces[entNo] = ML_(dinfo_realloc)("di.realloc_CEnt.1",
                                         img->ces[entNo],
@@ -768,7 +768,7 @@ static UChar get_slowcase ( DiImage* img, DiOffT off )
    }
 
    vg_assert(i >= 0 && i < CACHE_N_ENTRIES);
-   realloc_CEnt(img, i, size);
+   realloc_CEnt(img, i, size, /*fromC?*/cslc != NULL);
    img->ces[i]->size = size;
    img->ces[i]->used = 0;
    if (cslc == NULL) {
Committed, 3542be5bdc706b1a7d5d080ea01e81d4791e20b4. Thank you for the patch and the analysis.