I got this kind of error with my KDE application, and it occurs with kwrite too: Reproducible: Always Steps to Reproduce: 1. valgrind kwrite Actual Results: ==5570== Memcheck, a memory error detector ==5570== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al. ==5570== Using Valgrind-3.8.0 and LibVEX; rerun with -h for copyright info ==5570== Command: kwrite ==5570== --5570-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting --5570-- si_code=1; Faulting address: 0x403796000; sp: 0x4030d1018 valgrind: the 'impossible' happened: Killed by fatal signal ==5570== at 0x380D3B00: read_leb128 (readdwarf.c:221) ==5570== by 0x380D3B96: read_leb128U (readdwarf.c:247) ==5570== by 0x380D6294: vgModuleLocal_read_debuginfo_dwarf3 (readdwarf.c:956) ==5570== by 0x38085FEF: vgModuleLocal_read_elf_debug_info (readelf.c:2682) ==5570== by 0x3807EED5: vgPlain_di_notify_mmap (debuginfo.c:628) ==5570== by 0x380A0E68: vgModuleLocal_generic_PRE_sys_mmap (syswrap-generic.c:2066) ==5570== by 0x380CA0C4: vgSysWrap_amd64_linux_sys_mmap_before (syswrap-amd64-linux.c:1012) ==5570== by 0x3809D9B2: vgPlain_client_syscall (syswrap-main.c:1464) ==5570== by 0x3809A6FF: handle_syscall (scheduler.c:1057) ==5570== by 0x3809BC36: vgPlain_scheduler (scheduler.c:1335) ==5570== by 0x380AB739: run_a_thread_NORETURN (syswrap-linux.c:103) sched status: running_tid=1 Thread 1: status = VgTs_Runnable ==5570== at 0x30C8217A0A: mmap (syscall-template.S:81) ==5570== by 0x30C82068DB: _dl_map_object_from_fd (dl-load.c:1344) ==5570== by 0x30C82083C2: _dl_map_object (dl-load.c:2359) ==5570== by 0x30C820CCE1: openaux (dl-deps.c:63) ==5570== by 0x30C820EDC5: _dl_catch_error (dl-error.c:177) ==5570== by 0x30C820D3C1: _dl_map_object_deps (dl-deps.c:256) ==5570== by 0x30C820377B: dl_main (rtld.c:1834) ==5570== by 0x30C821529A: _dl_sysdep_start (dl-sysdep.c:242) ==5570== by 0x30C8204FC1: _dl_start (rtld.c:336) ==5570== by 0x30C8201597: ??? (in /usr/lib64/ld-2.16.so) Note: see also the FAQ in the source distribution. It contains workarounds to several common problems. In particular, if Valgrind aborted or crashed after identifying problems in your program, there's a good chance that fixing those problems will prevent Valgrind aborting or crashing, especially if it happened in m_mallocfree.c. If that doesn't help, please report this bug to: www.valgrind.org In the bug report, send all the above text, the valgrind version, and what OS and version you are using. Thanks. Expected Results: No crash
Created attachment 73360 [details] Correctly skip block forms This bug is also https://bugzilla.redhat.com/show_bug.cgi?id=849783 Block forms aren't correctly skipped. p += read_leb128U( &p ); results in undefined behaviour. The underlying issue is that with dwz there are partial_units and import_units which valgrind doesn't understand yet, so it keeps looking for a "proper" compile_unit TAG which it never finds and so needs to process all the DIEs. (Which should get its own bug/fix.)
Created attachment 73363 [details] make read_unitinfo_dwarf2 more robust jseward asked if we could make things a little bit more robust here. This patch makes the loops bound by checking we don't go beyond end_img. There is still some potential for bad things to happen in for example the read_leb128 functions, which don't know how far they can read. But this patch alone would have prevented the crash seen (without the other patch). With only this patch applied the the testcase would spew some warnings/errors, but continue running.
Created attachment 73368 [details] make read_unitinfo_dwarf2 more robust (In reply to comment #2) > With only this patch applied the the testcase would spew some warnings/errors, but > continue running. Which isn't too surprising since I messed up the check and no abbrev was read anymore... doh. This new patch checks the correct bounds.
Created attachment 73411 [details] GDB trace with those patches Even with those patches, "valgrind rpm -q" still segfaults on Mageia Cauldron when rpm-debug is installed (equivalent of rpm-debuginfo on RH/FC)
Created attachment 73412 [details] GDB trace with those patches with more debug info
(In reply to comment #4) > Created attachment 73411 [details] > GDB trace with those patches > > Even with those patches, "valgrind rpm -q" still segfaults on Mageia > Cauldron when rpm-debug is installed (equivalent of rpm-debuginfo on RH/FC) I am not sure that is the same bug, it happens later. I also cannot replicate it (against fedora rpm). You are still seeing ### unknown abbrev 0x124 messages. That indicates either that you didn't apply patch 1, or there is some other DWARF parsing bug in there. Could you run with -v to get some more info, rebuild with CFLAGS="-g -O0" to help your gdb a bit (or maybe install a newer gdb) and change the if (0) to if (1) inside read_unitinfo_dwarf2() so the VG_(printf)s are triggered to get a bit more info about what is going on.
Created attachment 73435 [details] valgrind output with s/if (0/if (1/ Here's the ouput you asked for
Created attachment 73436 [details] GDB trace with those patches with more debug info GDB trace with -O0 (& -g3 like previous trace) Note that I only "unknown abbrev" on the first run, I never saw them anymore in all the next runs
BTW, you can see our packages spec file & sources at http://svnweb.mageia.org/packages/cauldron/valgrind/current/
Created attachment 73504 [details] GDB trace with those patches with debug info (+vg printff enabled) I've upgraded binutils & dwz, rebuild rpm but still, once rpm-debug is installed, valgrind segfaults (same error but it now parses quite a lot more info before crashing). As before, the last open files are those of rpm-debug
Created attachment 73505 [details] GDB trace with those patches with debug info (vg printff disabled)
Created attachment 73507 [details] prevent valgrind to segfault For the record, both ui.name, ui.compdir results in segfault. With that patch, it no more segfaults.
Found an issue that would explain why It has been fixed in r13367 "read_unitinfo_dwarf2 DW_FORM_ref_addr is address size in DWARF version 2." The reason the read_unitinfo_dwarf2 DWARF reader is giving some trouble now is that it is only really used to pick up the debug_line info from the compile unit DIE. That used to be simple since it is just the first DIE of a CU. But with DWZ there can also be partial units. Which read_unitinfo_dwarf2 completely scans. So now a lot more of the parser is exercised. But I wonder if we shouldn't simplify read_unitinfo_dwarf2 to at least skip partial units. Or maybe even just parse the debug_line section directly.
Created attachment 78833 [details] Simplify read_unitinfo_dwarf2. Only try to read the first DIE. > But I wonder if we shouldn't simplify read_unitinfo_dwarf2 to at least skip > partial units. We can do even better by just checking the first DIE tag. It must be DW_TAG_compile_unit. If it isn't then we really don't need to scan through the whole DIE tree. The children will not have the information we need anyway (only the first top-level DW_TAG_compule_unit will have a DW_AT_stmt_list). Quote DWARF spec: "For each compilation unit compiled with a DWARF producer, a contribution is made to the .debug_info section of the object file. Each such contribution consists of a compilation unit header (see Section 7.5.1.1) followed by a single DW_TAG_compile_unit or DW_TAG_partial_unit debugging information entry, together with its children."
BTW. attachment 73360 [details] (" Correctly skip block forms") was already committed as r12892 Fix skipping of block forms when this code is compiled by gcc 4.8 -- it has always been incorrect, modifying 'p' twice between sequence points. Fixes #305513. (Mark Wielaard, mjw@redhat.com)
13369(In reply to comment #14) > Created attachment 78833 [details] > Simplify read_unitinfo_dwarf2. Only try to read the first DIE. After a brief discussion on irc nobody seemed too concerned about this change and I think it simplifies the code. So committed as r13369. I think we can close this bug since the root cause of SIGSEGV has probably been dealt with. If there are new/other crashes in this area it would be better to get new bug reports about them.
Fix committed by mjw as r13369.