Bug 473745 - must-be-redirected function - strlen - for valgrind 3.22 but not 3.21
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general (other bugs)
Version First Reported In: 3.22 GIT
Platform: Ubuntu Linux
Importance: NOR crash
Target Milestone: ---
Assignee: Julian Seward
 
Reported: 2023-08-25 10:28 UTC by Filip Jorissen
Modified: 2023-09-02 21:43 UTC (History)


Attachments
Explicitly load libc and any sonames that contain mandatory specs (6.79 KB, text/plain)
2023-09-01 17:45 UTC, Mark Wielaard

Description Filip Jorissen 2023-08-25 10:28:16 UTC
SUMMARY

Valgrind 3.22 (built from git) aborts at startup in a Docker image based on Ubuntu 22.04, with the error shown under OBSERVED RESULT.

STEPS TO REPRODUCE

Relevant parts of the Dockerfile that created the image that produces this error:

FROM ubuntu:22.04
RUN apt-get update && apt-get -y install g++ autoconf libtool libtool-bin 
RUN git clone git://sourceware.org/git/valgrind.git -b VALGRIND_3_21_0 && cd valgrind && ./autogen.sh && ./configure --prefix=/usr/local && make -j8 && make install -j7
RUN apt-get update && apt-get install -y libc6-dbg


When the clone step above is replaced with

RUN git clone git://sourceware.org/git/valgrind.git -b VALGRIND_3_21_0

the error is not produced and valgrind works correctly. This suggests a regression introduced after 3.21.


OBSERVED RESULT


==2287== Memcheck, a memory error detector
==2287== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==2287== Using Valgrind-3.22.0.GIT and LibVEX; rerun with -h for copyright info
==2287== Command: ./standalone -m ZoneWithAlgebraicLoopPostProcessing -s 0.000000 -e 1.000000 -t 5.000000e-04 -i20 -w
==2287== 
valgrind:  Fatal error at startup: a function redirection
valgrind:  which is mandatory for this platform-tool combination
valgrind:  cannot be set up.  Details of the redirection are:
valgrind:  
valgrind:  A must-be-redirected function
valgrind:  whose name matches the pattern:      strlen
valgrind:  in an object with soname matching:   ld-linux-x86-64.so.2
valgrind:  was not found whilst processing
valgrind:  symbols from the object with soname: ld-linux-x86-64.so.2
valgrind:  
valgrind:  Possible fixes: (1, short term): install glibc's debuginfo
valgrind:  package on this machine.  (2, longer term): ask the packagers
valgrind:  for your Linux distribution to please in future ship a non-
valgrind:  stripped ld.so (or whatever the dynamic linker .so is called)
valgrind:  that exports the above-named function using the standard
valgrind:  calling conventions for this platform.  The package you need
valgrind:  to install for fix (1) is called
valgrind:  
valgrind:    On Debian, Ubuntu:                 libc6-dbg
valgrind:    On SuSE, openSuSE, Fedora, RHEL:   glibc-debuginfo
valgrind:  
valgrind:  Note that if you are debugging a 32 bit process on a
valgrind:  64 bit system, you will need a corresponding 32 bit debuginfo
valgrind:  package (e.g. libc6-dbg:i386).
valgrind:  
valgrind:  Cannot continue -- exiting now.  Sorry.

EXPECTED RESULT
No error

SOFTWARE/OS VERSIONS
Docker image ubuntu:22.04
Comment 1 Filip Jorissen 2023-08-25 10:33:17 UTC
There was an error in my original post. The clone command that fails is "git clone git://sourceware.org/git/valgrind.git", not "git clone git://sourceware.org/git/valgrind.git -b VALGRIND_3_21_0". At the time of writing, that checks out commit dc6669cee7b557945fd41417bf531c7f5c9f1093.
Comment 2 Paul Floyd 2023-08-25 14:40:26 UTC
The only thing that I can think of that could affect that is the delayed loading of debuginfo. What output do you get with --trace-redir=yes for the link loader?
For instance, on RHEL 7.9, just running pwd gives me this:

==24139== Command: pwd
==24139== 
--24139-- <<
--24139--    ------ REDIR STATE after VG_(redir_initialise) ------
--24139--    TOPSPECS of soname (hardwired)
--24139--      ld-linux-x86-64.so.2      strlen                         RL-> (0000.0) 0x580b6302
--24139--      ld-linux-x86-64.so.2      index                          RL-> (0000.0) 0x580b631c
--24139--    ------ ACTIVE ------
--24139--     0xffffffffff600000 (???                 ) R-> (0000.0) 0x580b62e4 ???
--24139--     0xffffffffff600400 (???                 ) R-> (0000.0) 0x580b62ee ???
--24139--     0xffffffffff600800 (???                 ) R-> (0000.0) 0x580b62f8 ???
--24139-- >>
Comment 3 Filip Jorissen 2023-08-28 17:43:16 UTC
I'm getting:

==15== Memcheck, a memory error detector
==15== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==15== Using Valgrind-3.22.0.GIT and LibVEX; rerun with -h for copyright info
==15== Command: pwd
==15== 
--15-- <<
--15--    ------ REDIR STATE after VG_(redir_initialise) ------
--15--    TOPSPECS of soname (hardwired)
--15--      ld-linux-x86-64.so.2      strlen                         RL-> (0000.0) 0x580bfe72
--15--      ld-linux-x86-64.so.2      index                          RL-> (0000.0) 0x580bfe8c
--15--    ------ ACTIVE ------
--15--     0xffffffffff600000 (???                 ) R-> (0000.0) 0x580bfe54 ???
--15--     0xffffffffff600400 (???                 ) R-> (0000.0) 0x580bfe5e ???
--15--     0xffffffffff600800 (???                 ) R-> (0000.0) 0x580bfe68 ???
--15-- >>
--15-- Reading syms from /usr/bin/pwd
--15--    svma 0x0000002710, avma 0x000010a710
--15-- <<
--15--    ------ REDIR STATE after VG_(redir_notify_new_DebugInfo) ------
--15--    TOPSPECS of soname NONE filename /usr/bin/pwd
--15--    TOPSPECS of soname (hardwired)
--15--      ld-linux-x86-64.so.2      strlen                         RL-> (0000.0) 0x580bfe72
--15--      ld-linux-x86-64.so.2      index                          RL-> (0000.0) 0x580bfe8c
--15--    ------ ACTIVE ------
--15--     0xffffffffff600000 (???                 ) R-> (0000.0) 0x580bfe54 ???
--15--     0xffffffffff600400 (???                 ) R-> (0000.0) 0x580bfe5e ???
--15--     0xffffffffff600800 (???                 ) R-> (0000.0) 0x580bfe68 ???
--15-- >>
--15-- Reading syms from /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
--15--    svma 0x0000002090, avma 0x0004002090


MWE for future reference:
echo FROM ubuntu:22.04 > Dockerfile
echo "RUN apt-get update && apt-get -y install g++ autoconf libtool libtool-bin git subversion gfortran cmake ant" >> Dockerfile
echo "RUN git clone git://sourceware.org/git/valgrind.git && cd valgrind && ./autogen.sh && ./configure --prefix=/usr/local && make -j8 && make install -j7" >> Dockerfile
echo "RUN apt-get update && apt-get install -y libc6-dbg" >> Dockerfile
docker image build -t testvalgrind .
docker run -t testvalgrind valgrind --trace-redir=yes pwd
Comment 4 Paul Floyd 2023-08-29 06:24:58 UTC
Hmm I need to think about this a bit more.

I thought that

--15--      ld-linux-x86-64.so.2      strlen                         RL-> (0000.0) 0x580bfe72

would mean it is OK.
Comment 5 Mark Wielaard 2023-08-29 13:49:10 UTC
Would you be able to bisect this?
It would really help to know which commit between 
VALGRIND_3_21_0 and dc6669cee7b557945fd41417bf531c7f5c9f1093 caused this.
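
In case it helps, here is a minimal sketch of a predicate that could be used with git bisect run. The function name try_commit is made up for illustration. The sketch assumes it is run from the top of a valgrind checkout, that the build dependencies and libc6-dbg are already installed, and that valgrind exits with a non-zero status when it hits the fatal startup error.

```shell
# Hypothetical predicate for `git bisect run`.
# Return codes follow the git bisect convention: 0 = good, 1 = bad, 125 = skip.
try_commit() {
    # A commit that does not build cannot be judged, so ask bisect to skip it.
    { ./autogen.sh && ./configure && make -j8; } >/dev/null 2>&1 || return 125
    # vg-in-place runs the freshly built tree without installing it.
    # Assumption: the startup failure from this report gives a non-zero exit.
    ./vg-in-place pwd >/dev/null 2>&1 || return 1
    return 0
}
```

Saved as, say, try_commit.sh (a hypothetical file name) in the checkout, this could be driven with `git bisect start dc6669cee7b557945fd41417bf531c7f5c9f1093 VALGRIND_3_21_0` followed by `git bisect run sh -c '. ./try_commit.sh && try_commit'`. Stale build artifacts between steps may need a `make distclean`, which the sketch leaves out.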
Comment 6 Mark Wielaard 2023-08-29 13:52:38 UTC
(In reply to Paul Floyd from comment #2)
> The only thing that I can think of that could affect that is the delayed
> loading of debuginfo. 

That might be it. If I remember correctly, Debian/Ubuntu don't keep the symtab in the glibc binaries, so the glibc dbg package always needs to be installed.

If so, this might also be resolved if you follow the packaging recommendation for ld.so from README_PACKAGERS.

-- Do not ship your Linux distro with a completely stripped
   /lib/ld.so.  At least leave the debugging symbol names on -- line
   number info isn't necessary.  If you don't want to leave symbols on
   ld.so, alternatively you can have your distro install ld.so's
   debuginfo package by default, or make ld.so.debuginfo be a
   requirement of your Valgrind RPM/DEB/whatever.

   Reason for this is that Valgrind's Memcheck tool needs to intercept
   calls to, and provide replacements for, some symbols in ld.so at
   startup (most importantly strlen).  If it cannot do that, Memcheck
   shows a large number of false positives due to the highly optimised
   strlen (etc) routines in ld.so.  This has caused some trouble in
   the past.  As of version 3.3.0, on some targets (ppc32-linux,
   ppc64-linux), Memcheck will simply stop at startup (and print an
   error message) if such symbols are not present, because it is
   infeasible to continue.

   It's not like this is going to cost you much space.  We only need
   the symbols for ld.so (a few K at most).  Not the debug info and
   not any debuginfo or extra symbols for any other libraries.
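
To check whether a given ld.so still carries the symbol itself, something along these lines might do. This is only a sketch: has_defined_symbol is a made-up helper name, it assumes binutils' readelf is installed and the x86-64 column layout of `readelf -Ws` output, and the ld.so path is the one that appears in the trace in comment 3.

```shell
# Hypothetical helper: succeed if the readelf-style symbol table on stdin
# contains a defined entry (Ndx column is not UND) for the named symbol.
has_defined_symbol() {
    awk -v sym="$1" '
        { name = $8; sub(/@.*/, "", name) }     # drop any @VERSION suffix
        name == sym && $7 != "UND" { found = 1 }
        END { exit !found }
    '
}

# Path taken from the trace output in this report; adjust for other systems.
ldso=/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
if command -v readelf >/dev/null 2>&1 && [ -r "$ldso" ]; then
    if readelf -Ws "$ldso" | has_defined_symbol strlen; then
        echo "strlen is present in the symbol tables of $ldso"
    else
        echo "strlen not found in $ldso; separate debuginfo would be needed"
    fi
fi
```

If the symbol is not found, the separate debuginfo package (libc6-dbg on Debian/Ubuntu, per the error message) is what supplies it.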
Comment 7 Filip Jorissen 2023-08-29 17:22:06 UTC
Bisection results:

This commit works:
6ce0979884a8f246c80a098333ceef1a7b7f694d
This is the first one that fails:
60f7e89ba32b54d73b9e36d49e28d0f559ade0b9
Comment 8 Paul Floyd 2023-08-29 17:33:34 UTC
(In reply to Filip Jorissen from comment #7)
> Bisection results:
> 
> This commit works:
> 6ce0979884a8f246c80a098333ceef1a7b7f694d
> This is the first one that fails:
> 60f7e89ba32b54d73b9e36d49e28d0f559ade0b9

Thanks!

That's 

commit 60f7e89ba32b54d73b9e36d49e28d0f559ade0b9
Author: Aaron Merey <amerey@redhat.com>
Date:   Fri Jun 30 18:31:42 2023 -0400

    Support lazy reading and downloading of DWARF debuginfo
Comment 9 Mark Wielaard 2023-09-01 17:45:56 UTC
Created attachment 161329 [details]
Explicitly load libc and any sonames that contain mandatory specs

We cannot really lazily load the debuginfo for glibc, and specifically for ld.so, which contains mandatory hardwired specs.
Comment 10 Mark Wielaard 2023-09-02 00:04:28 UTC
commit 8228fe7f696b30c7b6b6daf576fc189bf8d6f8c2
Author: Mark Wielaard <mark@klomp.org>
Date:   Fri Sep 1 19:10:17 2023 +0200

    Explicitly load libc and any sonames that contain mandatory specs
    
    We really need symtab for glibc and ld.so libraries early for redir.
    Some distros move these into separate debuginfo files, which means
    we need to fully load them early.
    
    https://bugs.kde.org/show_bug.cgi?id=473745
Comment 11 Filip Jorissen 2023-09-02 21:43:28 UTC
I can confirm that the current HEAD works again for me :) Thanks!