Bug 452758 - Valgrind does not read properly DWARF5 as generated by Clang14
Summary: Valgrind does not read properly DWARF5 as generated by Clang14
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general (show other bugs)
Version: 3.19 GIT
Platform: Other Linux
: NOR normal
Target Milestone: ---
Assignee: Mark Wielaard
URL:
Keywords:
: 457412 (view as bug list)
Depends on:
Blocks:
 
Reported: 2022-04-19 11:04 UTC by Lubos Lunak
Modified: 2023-01-31 15:41 UTC (History)
10 users (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments
valgrind patch (3.43 KB, patch)
2022-04-19 11:05 UTC, Lubos Lunak
Details
valgrind patch (21.89 KB, patch)
2022-04-19 11:06 UTC, Lubos Lunak
Details
valgrind patch (6.43 KB, patch)
2022-04-19 11:06 UTC, Lubos Lunak
Details
read dwarf5 DW_FORM_rnglistx and DW_FORM_loclistx (7.44 KB, patch)
2022-04-19 13:47 UTC, Lubos Lunak
Details
avoid warning about missing DW_AT_*_base in skip_DIE() (1.53 KB, patch)
2022-04-19 13:48 UTC, Lubos Lunak
Details
read dwarf5 DW_FORM_rnglistx and DW_FORM_loclistx (7.33 KB, patch)
2022-04-19 14:16 UTC, Lubos Lunak
Details
implement support for missing DW_LLE_* and DW_RLE_* values (5.16 KB, patch)
2022-04-19 14:17 UTC, Lubos Lunak
Details
read dwarf5 DW_FORM_addrx* and DW_FORM_strx* as generated by Clang14 (21.88 KB, patch)
2022-04-25 20:21 UTC, Lubos Lunak
Details
read dwarf5 DW_FORM_rnglistx and DW_FORM_loclistx (7.43 KB, patch)
2022-04-25 20:22 UTC, Lubos Lunak
Details
implement support for missing DW_LLE_* and DW_RLE_* values (5.15 KB, patch)
2022-04-25 20:23 UTC, Lubos Lunak
Details
read properly unit headers depending on dwarf5 unit_type (3.43 KB, patch)
2022-04-26 06:41 UTC, Lubos Lunak
Details
avoid warning about missing DW_AT_*_base in skip_DIE() (1.53 KB, patch)
2022-04-26 06:42 UTC, Lubos Lunak
Details
treat DW_TAG_skeleton_unit like DW_TAG_compile_unit (3.09 KB, patch)
2022-04-26 06:46 UTC, Lubos Lunak
Details
support DW_FORM_addrx3 and DW_FORM_strx3 (2.51 KB, patch)
2022-04-26 12:00 UTC, Lubos Lunak
Details
implement support for missing DW_LLE_* and DW_RLE_* values (8.73 KB, patch)
2022-06-14 06:16 UTC, Lubos Lunak
Details
support DW_FORM_addrx3 and DW_FORM_strx3 (2.45 KB, patch)
2022-06-14 06:16 UTC, Lubos Lunak
Details
implement support for missing DW_LLE_* and DW_RLE_* values (8.20 KB, patch)
2022-06-14 16:38 UTC, Lubos Lunak
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Lubos Lunak 2022-04-19 11:04:37 UTC
Trying to use Valgrind with a binary compiled by Clang14 with DWARF5 enabled results in Valgrind complaing about possibly corrupted debuginfo and aborting.

Attached are 3 patches that improve the situation. I'm not very familiar with Valgrind internals or DWARF, but with these patches Valgrind is now able to run Clang14-compiled programs without complaining and generally seems to work. The one thing I've noticed that still does not work is backtraces with split-dwarf, they report the binary name instead of source file location, I don't know if the 3rd patch is incorrect in the assumption that rnglistx/loclistx may be ignored, or if it's something else missing.
Comment 1 Lubos Lunak 2022-04-19 11:05:43 UTC
Created attachment 148233 [details]
valgrind patch
Comment 2 Lubos Lunak 2022-04-19 11:06:01 UTC
Created attachment 148234 [details]
valgrind patch
Comment 3 Lubos Lunak 2022-04-19 11:06:20 UTC
Created attachment 148235 [details]
valgrind patch
Comment 4 Lubos Lunak 2022-04-19 13:47:51 UTC
Created attachment 148239 [details]
read dwarf5 DW_FORM_rnglistx and DW_FORM_loclistx
Comment 5 Lubos Lunak 2022-04-19 13:48:30 UTC
Created attachment 148240 [details]
avoid warning about missing DW_AT_*_base in skip_DIE()
Comment 6 Lubos Lunak 2022-04-19 14:16:22 UTC
Created attachment 148242 [details]
read dwarf5 DW_FORM_rnglistx and DW_FORM_loclistx

Removed left-over debug statement.
Comment 7 Lubos Lunak 2022-04-19 14:17:23 UTC
Created attachment 148243 [details]
implement support for missing DW_LLE_* and DW_RLE_*  values
Comment 8 Paul Floyd 2022-04-19 19:59:48 UTC
Here are my results with clang-devel (15.0.0) on FreeBSD 13.0.


== 757 tests, 15 stderr failures, 4 stdout failures, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==
memcheck/tests/amd64/insn-pmovmskb       (stderr)
memcheck/tests/clientperm                (stderr)
memcheck/tests/freebsd/static_allocs     (stderr)
memcheck/tests/gone_abrt_xml             (stderr)
memcheck/tests/origin5-bz2               (stderr)
memcheck/tests/varinfo2                  (stderr)
memcheck/tests/varinfo5                  (stderr)
memcheck/tests/varinfo6                  (stderr)
memcheck/tests/wrap6                     (stdout)
helgrind/tests/tls_threads               (stderr)
drd/tests/atomic_var                     (stderr)
drd/tests/omp_matinv                     (stdout)
drd/tests/omp_matinv                     (stderr)
drd/tests/omp_matinv_racy                (stdout)
drd/tests/omp_matinv_racy                (stderr)
drd/tests/omp_prime_racy                 (stderr)
drd/tests/omp_printf                     (stderr)
drd/tests/pth_mutex_signal               (stderr)
none/tests/amd64/amd64locked             (stdout)

Most of those fail with clang 11.0.1. There are a few extra OMP failes, but I doubt that they are related.

The extra failures are
memcheck/tests/freebsd/static_allocs
^^^ this does look like it could be dwarf related, seems to be failing to read the client stack for the errror

memcheck/tests/wrap6
^^^ missing wrap output

none/tests/amd64/amd64locked
^^^ difference in crc calculation in guest

However, none of the above uses dwarf5.

A small leaking C++ exe gives me
paulf> ../vg_llvm15/vg-in-place ./leak                         
==60775== Memcheck, a memory error detector
==60775== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==60775== Using Valgrind-3.20.0.GIT and LibVEX; rerun with -h for copyright info
==60775== Command: ./leak
==60775== 
### unhandled dwarf2 abbrev form code 0x25
### unhandled dwarf2 abbrev form code 0x25
### unhandled dwarf2 abbrev form code 0x25
### unhandled dwarf2 abbrev form code 0x1b
==60775== Valgrind: debuginfo reader: ensure_valid failed:
==60775== Valgrind:   during call to ML_(img_get)
==60775== Valgrind:   request for range [86779272, +4) exceeds
==60775== Valgrind:   valid image size of 15768 for image:
==60775== Valgrind:   "/usr/home/paulf/scratch/vg_examples/leak"
==60775== 
==60775== Valgrind: debuginfo reader: Possibly corrupted debuginfo file.
==60775== Valgrind: I can't recover.  Giving up.  Sorry.
Comment 9 Paul Floyd 2022-04-19 20:07:36 UTC
I'll try the patches tomorrow.
Comment 10 Lubos Lunak 2022-04-20 09:18:12 UTC
The only remaining issue I can see is that DWARF5 + -gsplit-dwarf does not output source locations in backtraces. So far I have not been able to find out which part of Valgrind code is responsible, the furthest I know so far is that search_one_loctab() does not find the location, possibly because di->loctab_used is slightly smaller than in non-split case. I also don't see any difference in DWARF info when compared to split DWARF4 or non-split DWARF5 that would seem relevant (llvm-dwarfdump claims .debug_line are the same in both cases when generated by GCC 11).

Testcase:
$ cat a.cpp 
static int func(int a, int* b)
{
    if(*b)
        return a + 2;
    else
        return a - 2;
}

int main()
{
    int v;
    return func(4, &v);
}

$ g++-11 -g -gdwarf-5 a.cpp -gsplit-dwarf
$ valgrind ./a.out
...
==22491== Conditional jump or move depends on uninitialised value(s)
==22491==    at 0x400559: func(int, int*) (in /home/seli/tmp/a.out)
==22491==    by 0x400583: main (in /home/seli/tmp/a.out)

$ g++-11 -g -gdwarf-5 a.cpp
$ valgrind ./a.out
...
==22512== Conditional jump or move depends on uninitialised value(s)
==22512==    at 0x400559: func(int, int*) (a.cpp:3)
==22512==    by 0x400583: main (a.cpp:12)
Comment 11 Paul Floyd 2022-04-21 12:01:42 UTC
On FreeBSD 13 and64 with clang and clang devel (11 and 15 respectively) no problems building, no new regression test failures.

This will need some new regression tests.

I know little about DWARF so I'll leave that to Mark.
Comment 12 Lubos Lunak 2022-04-25 20:21:38 UTC
Created attachment 148366 [details]
read dwarf5 DW_FORM_addrx* and DW_FORM_strx* as generated  by Clang14

Removed forgotten todo note in 2nd patch.
Comment 13 Lubos Lunak 2022-04-25 20:22:52 UTC
Created attachment 148367 [details]
read dwarf5 DW_FORM_rnglistx and DW_FORM_loclistx

Added one forgotten switch case in 4th patch.
Comment 14 Lubos Lunak 2022-04-25 20:23:47 UTC
Created attachment 148368 [details]
implement support for missing DW_LLE_* and DW_RLE_*  values

Removed forgotten todo comment in 5th patch.
Comment 15 Lubos Lunak 2022-04-26 06:41:54 UTC
Created attachment 148374 [details]
read properly unit headers depending on dwarf5 unit_type
Comment 16 Lubos Lunak 2022-04-26 06:42:28 UTC
Created attachment 148375 [details]
avoid warning about missing DW_AT_*_base in skip_DIE()
Comment 17 Lubos Lunak 2022-04-26 06:46:13 UTC
Created attachment 148376 [details]
treat DW_TAG_skeleton_unit like DW_TAG_compile_unit

Ok, I've figured out the split-dwarf debug line problem too, this fixes it for both GCC and Clang.

So these patches should be all.
Comment 18 Lubos Lunak 2022-04-26 12:00:57 UTC
Created attachment 148380 [details]
support DW_FORM_addrx3 and DW_FORM_strx3
Comment 19 Paul Floyd 2022-04-27 14:43:38 UTC
Hi Lubos

Do all these patches only affect llvm 14+?
Or will there be any improvement for older llvm versions?
Comment 20 Lubos Lunak 2022-04-27 21:06:15 UTC
They affect compilations does using Clang when it generates DWARF5. So presumably they also affect older versions if explicitly asked to generate DWARF5, but otherwise no.

BTW, since you're not going to actually handle this, maybe you shouldn't be the assignee for the bugreport?
Comment 21 Paul Floyd 2022-04-27 21:13:07 UTC
I'm keen to get these changes in and if Mark approves I'll push the changes.
Comment 22 Mark Wielaard 2022-04-27 21:20:31 UTC
I am really happy to see these patches. But have not read them yet. Also I am on vacation till next week. Sorry for being slow,
Comment 23 Nick Nethercote 2022-05-09 20:22:48 UTC
FYI: several people are experiencing this when running Valgrind on code compiled with the current nightly Rust compiler.
Comment 24 Mark Wielaard 2022-06-09 21:12:09 UTC
Patch 1 looks good, pushed as:

commit 61ddbc1fc395c787192e569d8f2238f713bdfd8e
Author: Luboš Luňák <l.lunak@centrum.cz>
Date:   Tue Apr 19 10:58:44 2022 +0200

    read properly unit headers depending on dwarf5 unit_type
    
    There may be additional fields that need to be skipped over, otherwise
    further reading will interpret these incorrectly.

Thanks
Comment 25 Nick Nethercote 2022-06-10 03:44:48 UTC
(In reply to Nick Nethercote from comment #23)
> FYI: several people are experiencing this when running Valgrind on code
> compiled with the current nightly Rust compiler.

Unfortunately, this is still happening. E.g if you have an up-to-date rustc (e.g. updated via `rustup` with `rustup update nightly`).

> [gulf:~/dev/rustc-perf] vg1 --trace-children=yes rustc +nightly --version
> ==787430== Memcheck, a memory error detector
> ==787430== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
> ==787430== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
> ==787430== Command: /home/njn/.rustup/toolchains/532be942ddf8f40d086e54d157453434b16e9647/bin/rustc
> ==787430==
> ### unhandled dwarf2 abbrev form code 0x25
> ### unhandled dwarf2 abbrev form code 0x25
> ### unhandled dwarf2 abbrev form code 0x25
> ### unhandled dwarf2 abbrev form code 0x23
> ### unhandled dwarf2 abbrev form code 0x25
> ### unhandled dwarf2 abbrev form code 0x25
> ### unhandled dwarf2 abbrev form code 0x25
> ### unhandled dwarf2 abbrev form code 0x23
> ==787430== Valgrind: debuginfo reader: ensure_valid failed:
> ==787430== Valgrind:   during call to ML_(img_get)
> ==787430== Valgrind:   request for range [1903633023, +4) exceeds
> ==787430== Valgrind:   valid image size of 1932688 for image:
> ==787430== Valgrind:   "/home/njn/.rustup/toolchains/532be942ddf8f40d086e54d157453434b16e9647/bin/rustc"
> ==787430==
> ==787430== Valgrind: debuginfo reader: Possibly corrupted debuginfo file.
> ==787430== Valgrind: I can't recover.  Giving up.  Sorry.
> ==787430==
Comment 26 Mark Wielaard 2022-06-13 14:48:52 UTC
Patch 2 doesn't handle DW_AT_GNU_addr_base, DW_FORM_GNU_addr_index and DW_FORM_GNU_str_index. But those are only used as extensions to DWARF4 with split-dwarf. Support could be added later if people really use pre-DWARF5 split-dwarf.

setup_cu_bases could check for cc->version >= 5 to prevent some work for older DWARF, but it is probably not noticeable.

Looks good. Pushed as:

commit 61dfba4232d795c00603c74e6d0573f2d16641e2
Author: Luboš Luňák <l.lunak@centrum.cz>
Date:   Tue Apr 19 11:14:52 2022 +0200

    read dwarf5 DW_FORM_addrx* and DW_FORM_strx* as generated by Clang14
    
    DW_FORM_addrx* are offsets into .debug_addr containing addresses.
    DW_FORM_strx* are offsets into .debug_str_offsets, which contain
    offsets into .debug_str. Support for these also requires reading
    DW_AT_addr_base and DW_AT_str_offsets_base before any other field
    in the abbrev table entry, as those may use this form.
Comment 27 Mark Wielaard 2022-06-13 16:19:58 UTC
Patch3 looks correct on its own. So I'll push it as is:

commit 383f36462fdc25a0e849391cf845e22703a95c08
Author: Luboš Luňák <l.lunak@centrum.cz>
Date:   Mon Apr 25 22:11:27 2022 +0200

    avoid warning about missing DW_AT_*_base in skip_DIE()
    
    Similarly to setup_cu_bases(), DW_FORM_addrx etc. may depend
    on DW_AT_addr_base etc. that have not been read yet.

But this also handles DW_FORM_rnglistx and DW_FORM_loclistx.
Shouldn't setup_cu_bases also handle these?
And if so, should we have an skip_Form_contents function that can be used in both places?
Comment 28 Lubos Lunak 2022-06-13 16:45:32 UTC
> But this also handles DW_FORM_rnglistx and DW_FORM_loclistx.
> Shouldn't setup_cu_bases also handle these?

The 2nd patch does that. The 7 patches should be applied in the order they are attached. You appear to be applying them based on their subject line, but those got meaningless as I was adding and updating the patches and I didn't realize somebody might look at those (I don't know if I'd ever submitted several patches as actual patches before).

> And if so, should we have an skip_Form_contents function that can be used in both places?

I don't know what you mean, I see no need for such a function.
Comment 29 Mark Wielaard 2022-06-13 17:10:03 UTC
(In reply to Mark Wielaard from comment #27)
> But this also handles DW_FORM_rnglistx and DW_FORM_loclistx.
> Shouldn't setup_cu_bases also handle these?

And that is what patch4 does.

> And if so, should we have an skip_Form_contents function that can be used in
> both places?

Still a good idea, but not super urgent.

Found once small issue with patch4:

+      case DW_FORM_rnglistx:
+      case DW_FORM_loclistx:
+         return VARSZ_FORM;
+         return VARSZ_FORM;

Removed the duplicate return and pushed as:

commit d19bbdf1200685ddbb6976cf915fba25f2097162
Author: Luboš Luňák <l.lunak@centrum.cz>
Date:   Tue Apr 19 12:20:16 2022 +0200

    read dwarf5 DW_FORM_rnglistx and DW_FORM_loclistx
    
    The .debug_rnglists/.debug_loclists sections first have a list of offsets
    and then a list of the actual data that those offsets point to.
Comment 30 Mark Wielaard 2022-06-13 17:17:03 UTC
(In reply to Lubos Lunak from comment #28)
> The 7 patches should be applied in the order they
> are attached. You appear to be applying them based on their subject line,
> but those got meaningless as I was adding and updating the patches and I
> didn't realize somebody might look at those (I don't know if I'd ever
> submitted several patches as actual patches before).

Ah, sorry, I assumed because they were numbered one to seven that was the order to review them.
So I reviewed, tested and applied the first four numbered ones. What would be correct order for the next three patches/attachements?

> > And if so, should we have an skip_Form_contents function that can be used in both places?
> 
> I don't know what you mean, I see no need for such a function.

Just to make sure that whenever we skip an attribute we always do it the same way.
So skip_Form_contents would basically be:

         if(form == DW_FORM_addrx || form == DW_FORM_strx
            || form == DW_FORM_rnglistx || form == DW_FORM_loclistx) {
            /* Skip without interpreting them, they may depend on e.g.
               DW_AT_addr_base that has not been read yet. */
            (void) get_ULEB128(c_die);
         } else
            get_Form_contents( &cts, cc, c_die, False /*td3*/,
                               &abbv->nf[nf_i] );

And it could then also have a simplified get_Form_contents inlined that really just skips the values.
Comment 31 Lubos Lunak 2022-06-13 17:26:21 UTC
> So I reviewed, tested and applied the first four numbered ones. What would
> be correct order for the next three patches/attachements?

In the order they are listed in the attachment list here. If you managed to apply those 4 ones without conflicts, then apparently they can all be applied independently.

> Just to make sure that whenever we skip an attribute we always do it the
> same way.
> So skip_Form_contents would basically be:

I see. I did not find any need to do that while writing the patches.
Comment 32 Mark Wielaard 2022-06-13 20:00:34 UTC
For "implement support for missing DW_LLE_* and DW_RLE_* values" the code looks good.
But the error handling in get_debug_addr_entry looks off.
The second argument is DW_FORM form. But it is given either a DW_FORM, DW_LLE or DW_LRE.
I think it would be better if the second argument would be a const HChar*.
And we introduce new functions:
const HChar* ML_(pp_DW_LLE) ( DW_LLE loc )
const HChar* ML_(pp_DW_RLE) ( DW_RLE range )
So get_debug_addr_entry can print the actual operator in the error message.
Comment 33 Mark Wielaard 2022-06-13 20:38:31 UTC
treat DW_TAG_skeleton_unit like DW_TAG_compile_unit looks good, pushed as:

commit e95904b99c87606eae1372ee4661e9db03833f08
Author: Luboš Luňák <l.lunak@centrum.cz>
Date:   Tue Apr 26 08:35:16 2022 +0200

    treat DW_TAG_skeleton_unit like DW_TAG_compile_unit
    
    It's basically the same, except for being split-dwarf. Handling
    it is required e.g. for reading line info.
Comment 34 Mark Wielaard 2022-06-13 21:10:29 UTC
For  support DW_FORM_addrx3 and DW_FORM_strx3  it would be easier/clearer imho to use #if defined(VG_BIGENDIAN) instead of doing a runtime trick.

Is the big endian variant correct?

return c1 << 16 | c2 << 8 | c3;

Should that be:

return c1 << 32 | c2 << 16 | c3 << 8;
Comment 35 Nick Nethercote 2022-06-14 00:30:51 UTC
With the additional commits things are working again for Rust code, thanks!
Comment 36 Lubos Lunak 2022-06-14 06:16:20 UTC
Created attachment 149672 [details]
implement support for missing DW_LLE_* and DW_RLE_*  values
Comment 37 Lubos Lunak 2022-06-14 06:16:42 UTC
Created attachment 149673 [details]
support DW_FORM_addrx3 and DW_FORM_strx3
Comment 38 Lubos Lunak 2022-06-14 06:21:27 UTC
> But the error handling in get_debug_addr_entry looks off.
> The second argument is DW_FORM form. But it is given either a DW_FORM,
> DW_LLE or DW_LRE.
> I think it would be better if the second argument would be a const HChar*.

You're right, too much copy&paste. Updated (but not using const HChar*, as I find it inefficient to go through a large switch for every call even though it normally wouldn't be used).

> imho to use #if defined(VG_BIGENDIAN) instead of doing a runtime trick.

Updated.

> Is the big endian variant correct?

I don't have actual big endian HW, but I think it is. The input is 3-byte bigendian that needs byte-by-byte handling, not the output.
Comment 39 Mark Wielaard 2022-06-14 12:25:01 UTC
(In reply to Lubos Lunak from comment #38)
> > But the error handling in get_debug_addr_entry looks off.
> > The second argument is DW_FORM form. But it is given either a DW_FORM,
> > DW_LLE or DW_LRE.
> > I think it would be better if the second argument would be a const HChar*.
> 
> You're right, too much copy&paste. Updated (but not using const HChar*, as I
> find it inefficient to go through a large switch for every call even though
> it normally wouldn't be used).

Yes, your introduction of an intermediary access function for the 3 cases is nicer than what I proposed. Thanks.

It looks almost correct. I would just remove the following debug (TD3) part from get_debug_addr_entry_common:

+   if (TD3) {
+      HChar* tmp = ML_(cur_read_strdup)(get_DiCursor_from_Cursor(&cur), "di.getFC.1");
+      TRACE_D3("(indirect address, offset: 0x%lx): %s", addr_pos, tmp);
+      ML_(dinfo_free)(tmp);
+   }

That only makes sense for strings, not addresses. We could instead print the address, but I think that is too noisy even for a debug run.

I missed this in the "read dwarf5 DW_FORM_addrx* and DW_FORM_strx* as generated by Clang14" where it was introduced for the get_Form_contents_addr function. sorry.
Comment 40 Mark Wielaard 2022-06-14 12:32:48 UTC
(In reply to Lubos Lunak from comment #38)
> > imho to use #if defined(VG_BIGENDIAN) instead of doing a runtime trick.
> 
> Updated.

Thanks, looks good.
 
> > Is the big endian variant correct?
> 
> I don't have actual big endian HW, but I think it is. The input is 3-byte
> bigendian that needs byte-by-byte handling, not the output.

It is a little tricky to know how to interpret endianness for these "3-byte" values.
But what I mean is that I believe others interpret this as a 4-byte value where the 4th byte is zero.
Which doesn't matter for little endian, but does for big endian.
With the assumption of the 4th byte is zero the code would look like:

   c1 = ML_(img_get_UChar)(c->sli.img, c->sli_next);
   c2 = ML_(img_get_UChar)(c->sli.img, c->sli_next+1);
   c3 = ML_(img_get_UChar)(c->sli.img, c->sli_next+2);
   c4 = 0;
   c->sli_next += 3;
#if defined(VG_BIGENDIAN)
   return c1 << 24 | c2 << 16 | c3 << 8 | c4;
#else
   return c1 | c2 << 8 | c3 << 16 | c4 << 24;
#endif
Comment 41 Lubos Lunak 2022-06-14 16:38:30 UTC
Created attachment 149692 [details]
implement support for missing DW_LLE_* and DW_RLE_*  values
Comment 42 Lubos Lunak 2022-06-14 16:49:40 UTC
> It looks almost correct. I would just remove the following debug (TD3) part
> from get_debug_addr_entry_common:

Updated.

> It is a little tricky to know how to interpret endianness for these "3-byte"
> values.
> But what I mean is that I believe others interpret this as a 4-byte value
> where the 4th byte is zero.
> Which doesn't matter for little endian, but does for big endian.
> With the assumption of the 4th byte is zero the code would look like:
> 
>    c1 = ML_(img_get_UChar)(c->sli.img, c->sli_next);
>    c2 = ML_(img_get_UChar)(c->sli.img, c->sli_next+1);
>    c3 = ML_(img_get_UChar)(c->sli.img, c->sli_next+2);
>    c4 = 0;
>    c->sli_next += 3;
> #if defined(VG_BIGENDIAN)
>    return c1 << 24 | c2 << 16 | c3 << 8 | c4;
> #else
>    return c1 | c2 << 8 | c3 << 16 | c4 << 24;
> #endif

I don't see the dwarf spec being specific on how to interpret this, but then I think my way is the only logical one. This code would be weird/broken, for the following reasons:
- It's inconsistent between little and big endian. On LE this would encode offsets 0-16777215, on BE it would encode offsets in multiples of 256 in the range 0-4278190080.
- It's inconsistent with the 1,2,4 variants. Those encode 1,2,4 lowest bytes (e.g. str2 doesn't mean 3rd and 4th byte being zero).
- It doesn't make much sense to encode multiples of 256. Although it's not spelled out, I think it's clear that it's meant to be a binary representation of a value using the minimal number of bytes needed.
Comment 43 Mark Wielaard 2022-06-14 19:10:55 UTC
(In reply to Lubos Lunak from comment #42)
> > It looks almost correct. I would just remove the following debug (TD3) part
> > from get_debug_addr_entry_common:
> 
> Updated.

Thanks, pushed as:

commit 4bb0164e6b2067376f085452a40137f13436384c
Author: Luboš Luňák <l.lunak@centrum.cz>
Date:   Mon Apr 25 22:11:27 2022 +0200

    implement support for missing DW_LLE_* and DW_RLE_* values
Comment 44 Mark Wielaard 2022-06-14 19:24:59 UTC
(In reply to Lubos Lunak from comment #42)
> I don't see the dwarf spec being specific on how to interpret this, but then
> I think my way is the only logical one.

Yes, I keep getting confused. We only want the read the value in.
So pushed as is:

commit 026cda6c8111ef6ff79bd5c6e083ab50ccf3ded6
Author: Luboš Luňák <l.lunak@centrum.cz>
Date:   Tue Apr 26 13:53:14 2022 +0200

    support DW_FORM_addrx3 and DW_FORM_strx3
    
    Apparently these may get used after all with large enough binaries,
    despite being somewhat tricky with regard to endianess.
Comment 45 Mark Wielaard 2022-06-14 19:27:23 UTC
(In reply to Nick Nethercote from comment #35)
> With the additional commits things are working again for Rust code, thanks!

And that was without some of the patches been pushed yet :)
Could you test again with top of tree and see whether --read-var-info=yes also works OK?
Comment 46 Nick Nethercote 2022-06-14 22:15:03 UTC
> Could you test again with top of tree and see whether --read-var-info=yes
> also works OK?

My normal workflow works well. It uses Cachegrind and DHAT, which don't need `--read-var-info=yes`.

I also tested Memcheck on a trivial Rust program and `--read-var-info=yes` had problems:

> ==19085== Memcheck, a memory error detector
> ==19085== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
> ==19085== Using Valgrind-3.20.0.GIT and LibVEX; rerun with -h for copyright info
> ==19085== Command: ./a
> ==19085==
>
> parse DIE(readdwarf3.c:3994): confused by:
>  <0><b>: Abbrev Number: 1 (DW_TAG_compile_unit)
>      DW_AT_producer    : (indirect string, offset: 0x0): clang LLVM (rustc version 1.63.0-nightly (420c970cb 2022-06-09))
>      DW_AT_language    : 28
>      DW_AT_name        : (indirect string, offset: 0x41): library/std/src/lib.rs/@/std.85a61a41-cgu.0
>      DW_AT_stmt_list   : 0
>      DW_AT_comp_dir    : (indirect string, offset: 0x6d): /rustc/420c970cb1edccbf60ff2aeb51ca01e2300b09ef
>      DW_AT_GNU_pubnames: 1
>      DW_AT_low_pc      : 0x0
>      DW_AT_ranges      : 322832
> parse_type_DIE:
> --19085-- WARNING: Serious error when reading debug info
> --19085-- When reading debug info from /home/njn/dev/rustc-perf/a:
> --19085-- confused by the above DIE
> hello world
> ==19085==
> ==19085== HEAP SUMMARY:
> ==19085==     in use at exit: 0 bytes in 0 blocks
> ==19085==   total heap usage: 11 allocs, 11 frees, 3,181 bytes allocated
> ==19085==
> ==19085== All heap blocks were freed -- no leaks are possible
> ==19085==
> ==19085== For lists of detected and suppressed errors, rerun with: -s
> ==19085== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Comment 47 Nick Nethercote 2022-06-14 22:16:25 UTC
My test Rust program was this, in `a.rs`:

> fn main() {
>     println!("hello world");
> }

And I compiled with `rustc +nightly a.rs` to produce the executable `a`.
Comment 48 Mark Wielaard 2022-08-03 17:12:29 UTC
*** Bug 457412 has been marked as a duplicate of this bug. ***
Comment 49 Mark Wielaard 2023-01-31 15:41:22 UTC
Looks like I never properly closed this. Apologies.

All Luboš have been committed and made it into valgrind 3.20.0

commit 026cda6c8111ef6ff79bd5c6e083ab50ccf3ded6
Author: Luboš Luňák <l.lunak@centrum.cz>
Date:   Tue Apr 26 13:53:14 2022 +0200

    support DW_FORM_addrx3 and DW_FORM_strx3
    
    Apparently these may get used after all with large enough binaries,
    despite being somewhat tricky with regard to endianess.

commit 4bb0164e6b2067376f085452a40137f13436384c
Author: Luboš Luňák <l.lunak@centrum.cz>
Date:   Mon Apr 25 22:11:27 2022 +0200

    implement support for missing DW_LLE_* and DW_RLE_* values

commit e95904b99c87606eae1372ee4661e9db03833f08
Author: Luboš Luňák <l.lunak@centrum.cz>
Date:   Tue Apr 26 08:35:16 2022 +0200

    treat DW_TAG_skeleton_unit like DW_TAG_compile_unit
    
    It's basically the same, except for being split-dwarf. Handling
    it is required e.g. for reading line info.

commit d19bbdf1200685ddbb6976cf915fba25f2097162
Author: Luboš Luňák <l.lunak@centrum.cz>
Date:   Tue Apr 19 12:20:16 2022 +0200

    read dwarf5 DW_FORM_rnglistx and DW_FORM_loclistx
    
    The .debug_rnglists/.debug_loclists sections first have a list of offsets
    and then a list of the actual data that those offsets point to.

commit 383f36462fdc25a0e849391cf845e22703a95c08
Author: Luboš Luňák <l.lunak@centrum.cz>
Date:   Mon Apr 25 22:11:27 2022 +0200

    avoid warning about missing DW_AT_*_base in skip_DIE()
    
    Similarly to setup_cu_bases(), DW_FORM_addrx etc. may depend
    on DW_AT_addr_base etc. that have not been read yet.

commit 61dfba4232d795c00603c74e6d0573f2d16641e2
Author: Luboš Luňák <l.lunak@centrum.cz>
Date:   Tue Apr 19 11:14:52 2022 +0200

    read dwarf5 DW_FORM_addrx* and DW_FORM_strx* as generated by Clang14
    
    DW_FORM_addrx* are offsets into .debug_addr containing addresses.
    DW_FORM_strx* are offsets into .debug_str_offsets, which contain
    offsets into .debug_str. Support for these also requires reading
    DW_AT_addr_base and DW_AT_str_offsets_base before any other field
    in the abbrev table entry, as those may use this form.

commit 61ddbc1fc395c787192e569d8f2238f713bdfd8e
Author: Luboš Luňák <l.lunak@centrum.cz>
Date:   Tue Apr 19 10:58:44 2022 +0200

    read properly unit headers depending on dwarf5 unit_type
    
    There may be additional fields that need to be skipped over, otherwise
    further reading will interpret these incorrectly.