Bug 445916

Summary: Demangle Rust v0 symbols with .llvm suffix
Product: [Developer tools] valgrind
Reporter: Mark Wielaard <mark>
Component: general
Assignee: Julian Seward <jseward>
Status: RESOLVED FIXED
Severity: normal
CC: n.nethercote
Priority: NOR
Version: 3.19 GIT
Target Milestone: ---
Platform: Other
OS: Linux
See Also: https://bugs.kde.org/show_bug.cgi?id=445668
Latest Commit:
Version Fixed In:

Description Mark Wielaard 2021-11-22 12:22:02 UTC
This was discussed as a side issue in bug #445668.
It is also discussed upstream in rustc here: https://github.com/rust-lang/rust/issues/60705

Relevant comments from bug #445668:

"_RNvMs0_NtCs5l0EXMQXRMU_21rustc_data_structures17obligation_forestINtB5_16ObligationForestNtNtNtCsdozMG8X9FIu_21rustc_trait_selection6traits7fulfill26PendingPredicateObligationE22register_obligation_atB1v_.llvm.8517020237817239694" (*)

(*) BTW. What is this? It looks like a mangled Rust v0 symbol, but it isn't, because it cannot be demangled (I also checked with c++filt).

The Rust compiler currently sometimes generates symbols with a `.llvm.<numbers>` suffix. These violate the v0 spec, which doesn't allow '.' chars, and the libiberty/Valgrind demangler doesn't demangle them. It's a problem that needs to be fixed on the Rust side.

Going back to this (and I know it's a tangent), there's now some disagreement about whether this needs to be fixed on the Rust side, or the libiberty/Valgrind side: https://github.com/rust-lang/rust/issues/60705#issuecomment-974011409  :(

Interesting. I see gcc "fixed" this (for C++ symbols) by translating those suffixes to "[clone .constprop.2]", "[clone .isra.3]" or "[clone ._omp_fn.2]".
That is nice if you want to know why there is a new (local) symbol for a code range. But I think it would actually be fine to simply cut off anything after a '.' or '$' and just demangle the part before it. For the user the important thing is knowing which (source) function an address is associated with, not which compiler transformation has been applied to it.
Comment 1 Nick Nethercote 2021-11-22 22:34:01 UTC
I agree that cutting off after `.` or `$` is fine for Rust v0 symbols. I personally never find that suffix useful.

AFAIK Valgrind uses the libiberty demangler. How does that work? Is there scope for a Valgrind-specific fix, or would it be better to get the suffix trimmed in libiberty and then import updated code into Valgrind?
Comment 2 Nick Nethercote 2021-11-23 08:09:30 UTC
https://github.com/rust-lang/rust/issues/60705#issuecomment-976064467 suggests that we should only cut off what comes after '.', *not* '$'.
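
A minimal sketch of the trimming discussed in comments 1 and 2, written in Python purely for illustration (libiberty itself is C). The helper name `trim_suffix` and its `separators` parameter are hypothetical. `SYMBOL` is the symbol quoted in the description; `WITH_DOLLAR` is a made-up string used only to show how a '$' is treated.

```python
def trim_suffix(symbol: str, separators: str = ".") -> str:
    """Return symbol up to, but not including, the first separator character.

    By default only '.' is a separator, following comment 2; passing ".$"
    gives the broader behaviour first proposed in the description.
    """
    cut = len(symbol)
    for sep in separators:
        pos = symbol.find(sep)
        if pos != -1:
            cut = min(cut, pos)
    return symbol[:cut]


# Symbol quoted in the bug description.
SYMBOL = (
    "_RNvMs0_NtCs5l0EXMQXRMU_21rustc_data_structures17obligation_forest"
    "INtB5_16ObligationForestNtNtNtCsdozMG8X9FIu_21rustc_trait_selection"
    "6traits7fulfill26PendingPredicateObligationE22register_obligation_at"
    "B1v_.llvm.8517020237817239694"
)

# Hypothetical input, not a real rustc symbol.
WITH_DOLLAR = "_RNvC3foo3bar$x.llvm.1"

if __name__ == "__main__":
    print(trim_suffix(SYMBOL))             # '.llvm.<numbers>' removed
    print(trim_suffix(WITH_DOLLAR))        # '$x' kept, '.llvm.1' removed
    print(trim_suffix(WITH_DOLLAR, ".$"))  # cut at the '$' as well
```

The trimmed string is what would then be handed to the demangler; the sketch does no demangling itself.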
Comment 3 Mark Wielaard 2021-11-23 09:30:51 UTC
(In reply to Nick Nethercote from comment #1)
> AFAIK Valgrind uses the libiberty demangler. How does that work? Is there
> scope for a Valgrind-specific fix, or would it be better to get the suffix
> trimmed in libiberty and then import updated code into Valgrind?

Best would be to fix it in libiberty (part of gcc). Then it also gets picked up by binutils/c++filt, gdb, perf, etc.
For valgrind we have the ./auxprogs/update-demangler script to help import a new version.
See the comments at the top of the script for how to run it (and which vars/refs to tweak).

Last libiberty demangler update was:

commit a3d42a88a6ad7bdca47b4553cfa7a7a058aac186
Author: Mark Wielaard <mark@klomp.org>
Date:   Sun Sep 26 14:47:17 2021 +0200

    Update libiberty demangler
    
    Update the libiberty demangler using the auxprogs/update-demangler
    script to gcc git commit b3585c0836e729bed56b9afd4292177673a25ca0.
    
    This update includes:
    
    - prevent null dereferencing on dlang_type
    - prevent buffer overflow when decoding user input
    - Add support for demangling local D template declarations
    - Add support for demangling D function literals as template
      value parameters
    - Add support for D `typeof(*null)' types
    - Fix -Wundef warnings in ansidecl.h
    - Fix endian bug in rust demangler
    - Adjust mangling of __alignof__
    - Avoid -Wstringop-truncation
Comment 4 Mark Wielaard 2021-12-02 17:21:13 UTC
Posted a patch to libiberty upstream:
https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586058.html
Comment 5 Mark Wielaard 2022-02-17 23:10:12 UTC
Added to gcc/libiberty as:

commit d3b2ead595467166c849950ecd3710501a5094d9
Author: Mark Wielaard <mark@klomp.org>
Date:   Thu Dec 2 18:00:39 2021 +0100

    libiberty rust-demangle, ignore .suffix
    
    Rust symbols can have a .suffix because of compiler transformations.
    These can be ignored in the demangled name. Which is what this patch
    implements. By stopping at the first dot for v0 symbols and searching
    backwards to the ending 'E' for legacy symbols.
    
    An alternative implementation could be to follow what C++ does and
    represent these as [clone .suffix] tagged onto the demangled name.
    But this seems somewhat confusing since it results in a demangled
    name that cannot be mangled again. And it would mean trying to
    decode compiler internal naming.
    
    https://bugs.kde.org/show_bug.cgi?id=445916
    https://github.com/rust-lang/rust/issues/60705
    
    libiberty/Changelog
    
            * rust-demangle.c (rust_demangle_callback): Ignore everything
            after '.' char in sym for v0. For legacy symbols search
            backwards to find the last 'E' before any '.'.
            * testsuite/rust-demangle-expected: Add new .suffix testcases.
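
The behaviour described in the commit message above can be sketched as follows. This is an illustrative Python sketch, not the C code from rust-demangle.c. The function names are hypothetical, and the example symbols in the comments are made up. Reading "the last 'E' before any '.'" as "scan from the end for an 'E' that either ends the symbol or is directly followed by a dot" is an assumption about the commit's intent.

```python
def v0_input(symbol: str) -> str:
    """v0 symbols ("_R..."): ignore everything from the first '.' onwards."""
    return symbol.split(".", 1)[0]


def legacy_input(symbol: str) -> str:
    """Legacy symbols ("_ZN...E"): scan backwards for the closing 'E'.

    An 'E' is accepted only if it is the last character or is immediately
    followed by '.', so dots inside the mangled name itself are not
    mistaken for the start of a suffix (assumed interpretation).
    """
    last = len(symbol) - 1
    for i in range(last, -1, -1):
        if symbol[i] == "E" and (i == last or symbol[i + 1] == "."):
            return symbol[: i + 1]
    return symbol  # no closing 'E' found; leave the input unchanged


def demangler_input(symbol: str) -> str:
    """Pick the trimming rule from the mangling prefix."""
    if symbol.startswith("_R"):
        return v0_input(symbol)
    if symbol.startswith("_ZN"):
        return legacy_input(symbol)
    return symbol


if __name__ == "__main__":
    # Hypothetical symbols, for illustration only.
    print(demangler_input("_RNvC3foo3bar.llvm.123"))
    print(demangler_input("_ZN3foo8bar..baz17h0123456789abcdefE.llvm.123"))
```

Scanning backwards matters only for the legacy case, where the sketch assumes a '.' may also appear inside the name; for v0 the first dot is enough because, as noted in the description, the v0 spec does not allow '.' characters.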


Merged into valgrind as:

commit e0b62fe05559003b731b4d786f3b71e9a66fb94d (origin/master, origin/HEAD)
Author: Mark Wielaard <mark@klomp.org>
Date:   Thu Feb 17 18:35:38 2022 +0100

    Update libiberty demangler
    
    Update the libiberty demangler using the auxprogs/update-demangler
    script to gcc git commit d3b2ead595467166c849950ecd3710501a5094d9.
    
    This update includes:
    
    - libiberty rust-demangle, ignore .suffix
    - libiberty: Fix infinite recursion in rust demangler
    - Update copyright years
    - libiberty: support digits in cpp mangled clone names
    - d-demangle: properly skip anonymous symbols
    - d-demangle: remove parenthesis where it is not needed