When reading the debuginfo (DWARF) of a file that uses dwz debug altfiles, DW_FORM_GNU_ref_alt references are not handled as expected when --read-inline-info=yes is used. Errors include:

--27307-- WARNING: Serious error when reading debug info
--27307-- When reading debug info from /opt/local/src/valgrind/inlinfo1:
--27307-- abbv_code not found in ht_abbvs table

and/or

--27254-- WARNING: Serious error when reading debug info
--27254-- When reading debug info from /usr/lib64/liblzma.so.5.0.99:
--27254-- get_inlFnName: absori not a subprogram

Reproducible: Always

Seeing this with non-system binaries/libraries partly depends on solving bug #338791 "alt dwz files can be relative of debug/main file".

Philippe analysed it and the issue has at least two parts:
- The inline function absorigin pointing to alt debug info is wrongly used as if it were in the normal debug info (passing around DIEs means "cooking/uncooking" them, which get_inlFnName doesn't do).
- The abbreviation used in the absori is then wrongly interpreted as an abbreviation coming from the normal debug info, while it should be looked up in the alt debug info.

The fix will very probably require 2 abbrev hash tables in the cc: one for the normal info and one for the alt info.

[Note: "absori" refers to the DIE referenced by the DW_AT_abstract_origin attribute of a DW_TAG_inlined_subroutine DIE.]
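To illustrate the "cooking/uncooking" mentioned above, here is a minimal sketch, not Valgrind's actual code. It assumes that a cooked DIE index is a single number space in which the main .debug_info comes first, then .debug_types, then the alt .debug_info. The type and function names (SectionKind, SectionSizes, cook_die, uncook_die) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical section kinds; the names are illustrative only. */
typedef enum { SEC_INFO, SEC_TYPES, SEC_ALT_INFO } SectionKind;

typedef struct {
    uint64_t info_size;   /* size of the main .debug_info */
    uint64_t types_size;  /* size of .debug_types */
    uint64_t alt_size;    /* size of the alt .debug_info */
} SectionSizes;

/* Map a section-relative DIE offset to one global "cooked" index. */
static uint64_t cook_die(const SectionSizes *s, SectionKind kind, uint64_t off)
{
    assert(s != NULL);
    switch (kind) {
    case SEC_INFO:     return off;
    case SEC_TYPES:    return s->info_size + off;
    case SEC_ALT_INFO: return s->info_size + s->types_size + off;
    }
    return off;
}

/* Inverse mapping: recover the section and the section-relative offset. */
static uint64_t uncook_die(const SectionSizes *s, uint64_t cooked,
                           SectionKind *kind)
{
    assert(s != NULL && kind != NULL);
    if (cooked < s->info_size) {
        *kind = SEC_INFO;
        return cooked;
    }
    cooked -= s->info_size;
    if (cooked < s->types_size) {
        *kind = SEC_TYPES;
        return cooked;
    }
    *kind = SEC_ALT_INFO;
    return cooked - s->types_size;
}
```

Under this assumption, a reader that receives a cooked index must uncook it first, so that both the section to read from and the abbrev table to use match the place where the DIE really lives.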
Created attachment 88565
Testcase for bug 338803. Handling of dwz debug alt files is broken.

If dwz is installed, create inlinfoalt and inlinfoalt.dwz from the original inlinfo testcase. The expected output should be the same as from the inlinfo testcase. Depends on the fix for bug #338791. Currently fails.
Created attachment 88566
Partial fix

Partial fix based on code from Philippe. This no longer produces any warnings with the new inlinfoalt testcase, but the stacktrace is not correct/complete, probably because get_abbv () returns the wrong result in the alt case.
Created attachment 88567
complete patch to fix inline info reading in alternate debug info

Solves all known problems. And contains an ugly kludge.
Created attachment 88578
Updated Testcase for bug 338803. Handling of dwz debug alt files is broken.

If dwz is installed, create inlinfoalt and inlinfoalt.dwz from the original inlinfo testcase. The expected output should be the same as from the inlinfo testcase. Updated to include the new symlinked exp files in EXTRA_DIST and to add stderr_filter_args: inlinfo to inlinfoalt.vgtest, so that the filters work exactly as they do for inlinfo. This now fails without the proposed fix and passes with it.
(In reply to Mark Wielaard from comment #4)
> This now fails without the proposed fix and passes with.

But I am afraid the proposed fix still isn't completely correct. I can still get the wrong abbrev being handled with larger programs. The issue, as far as I can see, is that the abbrev cache is only valid for the "current CU", but a DIE ref can point into a completely different CU (either full or partial). I'll try to get a smaller testcase.
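The problem described above can be sketched as follows. This is only an illustration under simplified assumptions, not the readdwarf3 code: the Unit and Abbrev types and the functions unit_containing and tag_for are hypothetical, and each unit carries its own small abbrev array instead of a hash table. Only the two DW_TAG values are taken from the DWARF standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard DWARF tag values. */
enum { DW_TAG_inlined_subroutine = 0x1d, DW_TAG_subprogram = 0x2e };

/* Hypothetical, simplified abbreviation entry: code -> tag. */
typedef struct {
    uint32_t code;
    uint32_t tag;
} Abbrev;

/* Hypothetical unit descriptor with its own abbrev table. */
typedef struct {
    uint64_t start;         /* offset of the unit header */
    uint64_t end;           /* one past the last byte of the unit */
    const Abbrev *abbrevs;
    size_t n_abbrevs;
} Unit;

/* Find the unit that really contains a DIE offset, or NULL. */
static const Unit *unit_containing(const Unit *units, size_t n,
                                   uint64_t die_off)
{
    for (size_t i = 0; i < n; i++)
        if (die_off >= units[i].start && die_off < units[i].end)
            return &units[i];
    return NULL;
}

/* Look up an abbrev code in one specific unit; 0 means "not found". */
static uint32_t tag_for(const Unit *u, uint32_t code)
{
    assert(u != NULL);
    for (size_t i = 0; i < u->n_abbrevs; i++)
        if (u->abbrevs[i].code == code)
            return u->abbrevs[i].tag;
    return 0;
}
```

The point of the sketch is that the same abbrev code can map to different tags in different units, so a referenced DIE has to be decoded with the table of the unit returned by unit_containing, not with the table of the unit that holds the reference.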
To show what seems to go wrong with the larger example. First we see this CU that contains the subprogram definition:

Compilation Unit @ offset 0x5e4:
   Length:        41
   Version:       4
   Abbrev Offset: 20884
   Pointer Size:  8
Adding abbv_code 1 TAG DW_TAG_formal_parameter [no children] nf 3 [8,0] [4,0] [0,0]
[...]
Adding abbv_code 98 TAG DW_TAG_subprogram [has children] nf 7 [11,0] [7,0] [6,0] [5,0] [5,0] [1,0] [0,0]
[...]
 <0><5ef>: Abbrev Number: 25 (DW_TAG_partial_unit)
     DW_AT_stmt_list   : 921
     DW_AT_comp_dir    : (indirect alt string, offset: 0x1d77): /usr/src/debug/xz-5.1.2alpha/src/liblzma
The Directory Table:
  common
  /usr/include
  ../../src/liblzma/api/lzma
read_filename_table: 1 fndn_ix 158 common block_util.c
read_filename_table: 2 fndn_ix 159 common index.h
read_filename_table: 3 fndn_ix 2 /usr/include stdint.h
read_filename_table: 4 fndn_ix 148 ../../src/liblzma/api/lzma base.h
read_filename_table: 5 fndn_ix 155 ../../src/liblzma/api/lzma vli.h
read_filename_table: 6 fndn_ix 156 ../../src/liblzma/api/lzma check.h
read_filename_table: 7 fndn_ix 157 ../../src/liblzma/api/lzma filter.h
read_filename_table: 8 fndn_ix 160 ../../src/liblzma/api/lzma block.h
 <1><5f8>: Abbrev Number: 98 (DW_TAG_subprogram)
     DW_AT_name        : (indirect alt string, offset: 0x774): vli_ceil4
     DW_AT_decl_file   : 2
     DW_AT_decl_line   : 39
     DW_AT_prototyped  : 1
     DW_AT_type        : 0x30
     DW_AT_inline      : 3
uninteresting DIE -> skipping ...
[...]

Then some time later we see this CU that contains the inlined_subroutine:

Compilation Unit @ offset 0x3321:
   Length:        384
   Version:       4
   Abbrev Offset: 3753
   Pointer Size:  8
Adding abbv_code 1 TAG DW_TAG_formal_parameter [no children] nf 6 [14,0] [10,0] [9,0] [8,0] [4,0] [0,0]
[...]
Adding abbv_code 98 TAG DW_TAG_inlined_subroutine [has children] nf 6 [12,2] [8,2] [4294967295,3] [2,0] [1,0] [0,0]
[...]
 <0><332c>: Abbrev Number: 80 (DW_TAG_compile_unit)
     DW_AT_producer    : (indirect alt string, offset: 0x58a): GNU C 4.8.2 20140120 (Red Hat 4.8.2-12) -m64 -mtune=generic -march=x86-64 -g -O2 -std=gnu99 -fvisibility=hidden -fexceptions -fstack-protector-strong -fPIC --param ssp-buffer-size=4
     DW_AT_language    : 1
     DW_AT_name        : (indirect alt string, offset: 0x1f4f): common/block_util.c
     DW_AT_comp_dir    : (indirect alt string, offset: 0x1d77): /usr/src/debug/xz-5.1.2alpha/src/liblzma
     DW_AT_low_pc      : 0x3720
     DW_AT_high_pc     : 268
     DW_AT_stmt_list   : 921
The Directory Table:
  common
  /usr/include
  ../../src/liblzma/api/lzma
[...]
 <2><3475>: Abbrev Number: 87 (DW_TAG_inlined_subroutine)
     DW_AT_abstract_ori: 0x5f8
     DW_AT_low_pc      : 0x381f
     DW_AT_high_pc     : 8
     DW_AT_call_file   : 1
     DW_AT_call_line   : 87
     DW_AT_sibling     : <3491>
 <get_inlFnName><5f8>: Abbrev Number: 98 (DW_TAG_inlined_subroutine)
------ .debug_info reading failed ------
--9061-- WARNING: Serious error when reading debug info
--9061-- When reading debug info from /usr/lib64/liblzma.so.5.0.99:
--9061-- get_inlFnName: absori not a subprogram

Oops, we used the abbrev table cache from this CU, but the subprogram DIE that the abstract_origin points to is in another CU, so we misinterpret Abbrev Number: 98.
Created attachment 88584
disable dereferencing of cross-CU inlined fn name

After more in-depth analysis and discussion of the problematic cases, it became clear that any cross-CU reference is broken. For the inline info, this can happen for the inlined function name. For var info, other problems seem to happen, but these have not been analysed.

The attached patch bypasses the problem for inlined info by detecting a cross-CU reference and giving UnknownInlinedFun in this case. A proper solution might be implemented later.

Note that such cross-CU references are only known to appear when using an alternate debug dwz file and/or executables whose DWARF info has been optimised by dwz.
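The detection described above can be sketched roughly like this. It is a simplified illustration, not the attached patch: the field names cu_start_offset and unit_length are taken from the patch context quoted later in this thread, while CuBounds, is_cross_cu and name_or_unknown are hypothetical. The sketch assumes unit_length counts all bytes of the unit starting at cu_start_offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the part of the CU state needed here. */
typedef struct {
    uint64_t cu_start_offset;  /* where the current unit begins */
    uint64_t unit_length;      /* assumed: total size of the unit in bytes */
} CuBounds;

/* True if posn lies outside the unit currently being parsed. */
static bool is_cross_cu(const CuBounds *cc, uint64_t posn)
{
    assert(cc != NULL);
    return posn < cc->cu_start_offset
        || posn >= cc->cu_start_offset + cc->unit_length;
}

/* same_cu_name stands for the name that would be read from the DIE
   at posn when that DIE is inside the current unit. */
static const char *name_or_unknown(const CuBounds *cc, uint64_t posn,
                                   const char *same_cu_name)
{
    return is_cross_cu(cc, posn) ? "UnknownInlinedFun" : same_cu_name;
}
```

This trades completeness for safety: a reference that leaves the current unit is never decoded with the wrong abbrev table, at the cost of losing the function name.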
After review by Mark, committed a slightly revised version of the disabling patch in revision 14476. That allows inlined info to be read (but inlined function names might be reported as unknown). A better solution is needed.
I like the workaround/solution, but I am not a fan of the warning, which shows up even with -q. On a system (Fedora) where lots of system libraries have been compressed by dwz, this shows up a lot. Would you be fine with something like this to suppress it with -q:

diff --git a/coregrind/m_debuginfo/readdwarf3.c b/coregrind/m_debuginfo/readdwarf3.c
index 825df53..8453d3d 100644
--- a/coregrind/m_debuginfo/readdwarf3.c
+++ b/coregrind/m_debuginfo/readdwarf3.c
@@ -2558,7 +2558,7 @@ static HChar* get_inlFnName (Int absori, CUConst* cc, Bool td3)
        || posn < cc->cu_start_offset
        || posn >= cc->cu_start_offset + cc->unit_length) {
       static Bool reported = False;
-      if (!reported) {
+      if (!reported && VG_(clo_verbosity) > 0) {
          VG_(message)(Vg_DebugMsg,
                       "Warning: cross-CU LIMITATION: some inlined fn names\n"
                       "might be shown as UnknownInlinedFun\n");
(In reply to Mark Wielaard from comment #9)
> I like the workaround/solution, but I am not a fan of the warning which
> shows up even with -q.
> On a system (fedora) with lots of system libraries having been compressed by
> DWZ this shows up a lot. Would you be fine with something like this to
> suppress it with -q:
>
> diff --git a/coregrind/m_debuginfo/readdwarf3.c b/coregrind/m_debuginfo/readdwarf3.c
> index 825df53..8453d3d 100644
> --- a/coregrind/m_debuginfo/readdwarf3.c
> +++ b/coregrind/m_debuginfo/readdwarf3.c
> @@ -2558,7 +2558,7 @@ static HChar* get_inlFnName (Int absori, CUConst* cc, Bool td3)
>         || posn < cc->cu_start_offset
>         || posn >= cc->cu_start_offset + cc->unit_length) {
>        static Bool reported = False;
> -      if (!reported) {
> +      if (!reported && VG_(clo_verbosity) > 0) {
>           VG_(message)(Vg_DebugMsg,
>                        "Warning: cross-CU LIMITATION: some inlined fn names\n"
>                        "might be shown as UnknownInlinedFun\n");

Yes, fine. Maybe we could even use '> 1', so that it is shown only when the user asks for non-default verbosity, i.e. specifies -v?
(In reply to Philippe Waroquiers from comment #10)
> Yes, fine.
> Maybe we might even use '> 1' to have it shown only when the user asks for
> the non default verbosity ?
> user specifies -v

Did that as valgrind svn r14492. Having the warning with -v is actually nice, since it will immediately follow the first "Reading syms from ..." message that prints which debug files were considered. That gives the user a hint about which file contains the problematic/compressed DWARF.
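For illustration, the gating discussed above amounts to something like the following sketch. It is not the committed code: should_report is a hypothetical helper, the verbosity is passed in as a plain int instead of reading VG_(clo_verbosity), and it assumes that -q gives 0, the default is 1 and -v gives 2. Whether the real code marks the warning as reported at low verbosity is not shown in the quoted diff; this sketch only marks it once it has actually been shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed mapping: -q gives 0, the default is 1, -v gives 2. */
static bool should_report(int verbosity, bool *reported)
{
    assert(reported != NULL);
    if (*reported || verbosity <= 1)
        return false;      /* already shown, or the user did not ask for -v */
    *reported = true;      /* remember that the warning has been shown */
    return true;
}
```

With this shape the limitation warning appears at most once per run and only for users who asked for verbose output.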
Created attachment 182310
Rewrite DWARF inlined subroutine handling to work cross CU

https://code.wildebeest.org/git/user/mjw/valgrind/commit/?h=inline-backtrace-post

The readdwarf3 parsers cannot read DIEs across CUs. An inlined subroutine refers to a subprogram which has a name (or refers to a declaration of a subprogram that has a name). These subprograms can be (and often are, when dwz has been used to compress the DWARF) in a different CU. So a lot of inlined subroutines in backtraces are just called "UnknownInlinedFun".

To work around not being able to read DIEs across CUs directly, we don't try to immediately resolve the name of the inlined subroutine by following the abstract origin reference to the subprogram, but just record it in the DiInlLoc. We also record all subprogram indexes while parsing in a new DiSubprogram structure, together with whether the subprogram had a name or had a reference to another subprogram (specification).

We have to look under a couple more DIEs. We normally want to skip any DIE that doesn't have an address range when looking for inlined subroutines, but there are various other DIEs that can contain a subprogram (specification).

We also want to walk the DIEs from low to high (cooked DIE) index, so we first pass over the main .debug_info, then the .debug_types, and finally the alt .debug_info. That way we can store the DiSubprograms in an array from low to high index and use a binary search to connect the inlined subroutines to the subprogram that contains the name.

The code also tracks whether the subprogram is artificial. This isn't used yet, but it should make it possible for a followup patch to remove artificial inlined subroutines from a backtrace.

Tested against emacs and libreoffice as packaged in Fedora, where the programs and all shared libraries used are processed with dwz. The new code gives a name to every inlined subroutine, except when the DWARF produced is bad: the DW_TAG_inlined_subroutine didn't contain a DW_AT_abstract_origin, and so no DW_TAG_subprogram can be found.
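The lookup scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions and not the code from the patch: the Subprogram struct only loosely mirrors what the description says DiSubprogram records, and its field names, the functions find_subprogram and inlined_name, the hop limit and the "UnknownInlinedFun" fallback are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one subprogram DIE seen while parsing. */
typedef struct {
    uint64_t index;      /* cooked DIE index; the array is sorted by this */
    const char *name;    /* NULL if the DIE has no name of its own */
    uint64_t ref;        /* cooked index of a referenced subprogram, 0 if none */
    bool is_artificial;  /* recorded, but not used for the lookup */
} Subprogram;

/* Binary search for an exact cooked index; NULL if it is not recorded. */
static const Subprogram *find_subprogram(const Subprogram *subs, size_t n,
                                         uint64_t index)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (subs[mid].index == index)
            return &subs[mid];
        if (subs[mid].index < index)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}

/* Resolve the name for an inlined subroutine from its recorded abstract
   origin, following references to other subprograms when needed. */
static const char *inlined_name(const Subprogram *subs, size_t n,
                                uint64_t abstract_origin)
{
    assert(subs != NULL || n == 0);
    uint64_t index = abstract_origin;
    for (int hops = 0; hops < 8; hops++) {  /* bound guards against cycles */
        const Subprogram *sp = find_subprogram(subs, n, index);
        if (sp == NULL)
            break;
        if (sp->name != NULL)
            return sp->name;
        if (sp->ref == 0)
            break;
        index = sp->ref;
    }
    return "UnknownInlinedFun";  /* assumed fallback for bad DWARF */
}
```

Because resolution happens only after all sections have been walked, it no longer matters which CU the referenced subprogram lives in; the cooked index alone identifies it.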
commit f7dccaab11b8dc1af2bbcd31dea5bb7a50c6f811
Author: Mark Wielaard <mark@klomp.org>
Date:   Thu May 29 23:41:52 2025 +0200

    Rewrite DWARF inlined subroutine handling to work cross CU