Bug 431306 - Update demangler to support Rust v0 name mangling.
Summary: Update demangler to support Rust v0 name mangling.
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general (show other bugs)
Version: unspecified
Platform: Other Linux
: NOR normal
Target Milestone: ---
Assignee: Mark Wielaard
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-01-08 13:34 UTC by Amanieu d'Antras
Modified: 2021-09-26 13:10 UTC (History)
2 users (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments
Update demangler to GCC git 01d92cfd79872e4cffc78bf233bb9b767336beb8 (320.26 KB, patch)
2021-01-08 13:34 UTC, Amanieu d'Antras
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Amanieu d'Antras 2021-01-08 13:34:20 UTC
Created attachment 134658 [details]
Update demangler to GCC git 01d92cfd79872e4cffc78bf233bb9b767336beb8

Rust is moving to a new name mangling scheme: https://github.com/rust-lang/rust/issues/60705. The libiberty demangler in GCC has already been updated to support the new scheme so just update the copy in Valgrind to match.

The update-demangler script has also been updated to use git instead of svn for GCC.

There is a slight regression in that very long symbol names (> 512 bytes) aren't demangled any more, but I don't feel that this is a big issue since all other tools (binutils, c++filt, anything using libiberty) also suffers from this issue.
Comment 1 Mark Wielaard 2021-04-26 13:00:27 UTC
Thanks for the update.

Would it be possible to split the libiberty update itself from the update of the valgrind update-demangler.sh script and m_demangle/demangle.c sources?

It isn't strictly necessary, but might make it easier to see what changes are just from syncing and which are needed for the Rust demangler update itself.
Comment 2 Amanieu d'Antras 2021-04-26 16:36:00 UTC
The only changes I made myself were:
- Updating the update-demangler script
- Updating memcheck/tests/long_namespace_xml.stderr.exp

Everything else (including the Rust demangling support) comes from running the script.
Comment 3 Michael Woerister 2021-04-27 14:02:26 UTC
I'd like to endorse getting Rust's new symbol mangling scheme supported in valgrind. I think people will be very interested in it because the new scheme finally allows to have demangled names that do not lose information about generic parameters. In the past, when running valgrind with name demangling enabled, all instances of a given generic function would be collapsed into a single function (e.g. "foo<u32>", "foo<String>", "foo<SomeStruct>" would all show up as just "foo"). In order to get rid of this inaccuracy one had to disable demangling, which can make things quite unreadable.
Comment 4 Mark Wielaard 2021-09-22 13:46:22 UTC
OK, so this patch contains:

- An update to the update-demangler script to use gcc git instead svn
- The result of running the updated script to get an updated demangler
- A change to long_namespace_xml.stderr.exp because two overly long symbols aren't demangled anymore, but just returned as is.

I have some trouble running the script:

$ auxprogs/update-demangler
Updating the demangler in /tmp/demangle
Initialized empty Git repository in /tmp/demangle/gcc/.git/
error: Server does not allow request for unadvertised object 01d92cfd79872e4cffc78bf233bb9b767336beb8
error: Server does not allow request for unadvertised object 7a312bbd41e190379d80b47e308a32c8bc4575c3

Which is odd. Both commits are in my local tree.
I am investigating what is going on.
Comment 5 Mark Wielaard 2021-09-22 15:17:58 UTC
So it seems the gcc.git server doesn't like doing such a shallow checkout.
It does work when doing "git fetch origin master" instead of "git fetch --depth 1 origin $old_gcc_revision $new_gcc_revision" But that certainly is not very shallow...
Comment 6 Mark Wielaard 2021-09-22 17:38:12 UTC
(In reply to Mark Wielaard from comment #4)
> OK, so this patch contains:
> 
> - An update to the update-demangler script to use gcc git instead svn
> - The result of running the updated script to get an updated demangler
> - A change to long_namespace_xml.stderr.exp because two overly long symbols
> aren't demangled anymore, but just returned as is.

And an update to the m_demangle/demangle.c source to deal with Rust demangling. Specificly these two changes:

-   Rust demangling (0) is only done if C++ demangling (1) succeeds
-   because Rust demangling is performed in-place, and it is difficult
-   to prove that we "own" the storage -- hence, that the in-place
-   operation is safe -- unless it is clear that it has come from the
-   C++ demangler, which returns its output in a heap-allocated buffer
-   which we can be sure we own.  In practice (Nov 2016) this does not
-   seem to be a problem, since the Rust compiler appears to apply C++
-   mangling after Rust mangling, so we never encounter symbols that
-   require Rust demangling but not C++ demangling.

-      if (demangled) {
-         /* Possibly undo (0).  This is the only place where it is
-            safe, from a storage management perspective, to
-            Rust-demangle the symbol.  That's because Rust demangling
-            happens in place, so we need to be sure that the storage
-            it is happening in is actually owned by us, and non-const.
-            In this case, the value returned by ML_(cplus_demangle)
-            does have that property. */
-         if (rust_is_mangled(demangled)) {
-            rust_demangle_sym(demangled);
-         }
-         *result = demangled;
-      } else {
-         *result = orig;
-      }

So I assume cplus_demangle now directly demangles old and new style rust symbols?
Comment 7 Amanieu d'Antras 2021-09-22 17:47:23 UTC
> So I assume cplus_demangle now directly demangles old and new style rust symbols?

Yes, cplus_demangle will automatically handle old and new style rust symbols.
Comment 8 Mark Wielaard 2021-09-22 17:59:51 UTC
(In reply to Amanieu d'Antras from comment #7)
> > So I assume cplus_demangle now directly demangles old and new style rust symbols?
> 
> Yes, cplus_demangle will automatically handle old and new style rust symbols.

Aha, it does indeed, but only when options include DGML_AUTO. We currently only include DMGL_ANSI | DMGL_PARAMS. Should DGML_AUTO be added?
Comment 9 Amanieu d'Antras 2021-09-22 23:00:52 UTC
I'm not sure why we're not using that in the first place. It might be safer to just add DMGL_RUST.
Comment 10 Mark Wielaard 2021-09-24 22:05:09 UTC
(In reply to Amanieu d'Antras from comment #9)
> I'm not sure why we're not using that in the first place.

It took me a while to realize how this really works.

If options doesn't contain any "style" then the current_demangling_style is added to the options in cplus_demangle:

  if ((options & DMGL_STYLE_MASK) == 0)
    options |= (int) current_demangling_style & DMGL_STYLE_MASK;

And by default current_demangling_style is DMGL_AUTO:
enum demangling_styles current_demangling_style = auto_demangling;
Comment 11 Mark Wielaard 2021-09-26 13:10:19 UTC
(In reply to Mark Wielaard from comment #5)
> So it seems the gcc.git server doesn't like doing such a shallow checkout.
> It does work when doing "git fetch origin master" instead of "git fetch
> --depth 1 origin $old_gcc_revision $new_gcc_revision" But that certainly is
> not very shallow...

It did work with a newer git client (version 2.31.1)

Pushed as:

commit c2e0ab8694f44b18a90ca9139506ffaf55d05a52
Author: Amanieu d'Antras <amanieu@gmail.com>
Date:   Sun Sep 26 12:42:26 2021 +0200

    Update libiberty demangler to support Rust v0 name mangling
    
    Update the libiberty demangler using the auxprogs/update-demangler
    script to the gcc git 01d92cfd79872e4cffc78bf233bb9b767336beb8.
    Updates rust demangling to support the new v0 mangling scheme.
    
    This includes the following changes:
    - Update the update-demangler script to use gcc git instead of svn.
    - The result of running the updated script to get an updated
      demangler and resolving the merge conflicts.
    - A change to long_namespace_xml.stderr.exp because two overly long
      symbols aren't demangled anymore, but just returned as is.
    - an update to the m_demangle/demangle.c source to deal with Rust
      demangling in cp_demangle, which now directly demangles old and
      new style rust symbols.

Then used the script again to updated to latest libiberty demangler:

commit a3d42a88a6ad7bdca47b4553cfa7a7a058aac186 (HEAD -> master)
Author: Mark Wielaard <mark@klomp.org>
Date:   Sun Sep 26 14:47:17 2021 +0200

    Update libiberty demangler
    
    Update the libiberty demangler using the auxprogs/update-demangler
    script to gcc git commit b3585c0836e729bed56b9afd4292177673a25ca0.
    
    This update includes:
    
    - prevent null dereferencing on dlang_type
    - prevent buffer overflow when decoding user input
    - Add support for demangling local D template declarations
    - Add support for demangling D function literals as template
      value parameters
    - Add support for D `typeof(*null)' types
    - Fix -Wundef warnings in ansidecl.h
    - Fix endian bug in rust demangler
    - Adjust mangling of __alignof__
    - Avoid -Wstringop-truncation