SUMMARY
Valgrind 3.22 crashes/stops on Docker/Ubuntu 22.04 with the error below.

STEPS TO REPRODUCE
Relevant parts of the Dockerfile that created the image that produces this error:

FROM ubuntu:22.04
RUN apt-get update && apt-get -y install g++ autoconf libtool libtool-bin
RUN git clone git://sourceware.org/git/valgrind.git -b VALGRIND_3_21_0 && cd valgrind && ./autogen.sh && ./configure --prefix=/usr/local && make -j8 && make install -j7
RUN apt-get update && apt-get install -y libc6-dbg

When replacing the clone command above with

RUN git clone git://sourceware.org/git/valgrind.git -b VALGRIND_3_21_0

the error is not produced and valgrind works correctly. So it seems that a bug was recently introduced?

OBSERVED RESULT
==2287== Memcheck, a memory error detector
==2287== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==2287== Using Valgrind-3.22.0.GIT and LibVEX; rerun with -h for copyright info
==2287== Command: ./standalone -m ZoneWithAlgebraicLoopPostProcessing -s 0.000000 -e 1.000000 -t 5.000000e-04 -i20 -w
==2287==

valgrind: Fatal error at startup: a function redirection
valgrind: which is mandatory for this platform-tool combination
valgrind: cannot be set up. Details of the redirection are:
valgrind:
valgrind: A must-be-redirected function
valgrind: whose name matches the pattern: strlen
valgrind: in an object with soname matching: ld-linux-x86-64.so.2
valgrind: was not found whilst processing
valgrind: symbols from the object with soname: ld-linux-x86-64.so.2
valgrind:
valgrind: Possible fixes: (1, short term): install glibc's debuginfo
valgrind: package on this machine. (2, longer term): ask the packagers
valgrind: for your Linux distribution to please in future ship a non-
valgrind: stripped ld.so (or whatever the dynamic linker .so is called)
valgrind: that exports the above-named function using the standard
valgrind: calling conventions for this platform. The package you need
valgrind: to install for fix (1) is called
valgrind:
valgrind: On Debian, Ubuntu: libc6-dbg
valgrind: On SuSE, openSuSE, Fedora, RHEL: glibc-debuginfo
valgrind:
valgrind: Note that if you are debugging a 32 bit process on a
valgrind: 64 bit system, you will need a corresponding 32 bit debuginfo
valgrind: package (e.g. libc6-dbg:i386).
valgrind:
valgrind: Cannot continue -- exiting now. Sorry.

EXPECTED RESULT
No error

SOFTWARE/OS VERSIONS
Docker image ubuntu:22.04
There was an error in my original post. The clone command that does not work is "git clone git://sourceware.org/git/valgrind.git", not "git clone git://sourceware.org/git/valgrind.git -b VALGRIND_3_21_0". At the time of writing, that checks out commit dc6669cee7b557945fd41417bf531c7f5c9f1093.
The only thing that I can think of that could affect that is the delayed loading of debuginfo.

What output do you get with --trace-redir=yes for the link loader?

For instance, on RHEL7.9 just running pwd gives me this bit:

==24139== Command: pwd
==24139==
--24139-- <<
--24139-- ------ REDIR STATE after VG_(redir_initialise) ------
--24139-- TOPSPECS of soname (hardwired)
--24139-- ld-linux-x86-64.so.2 strlen RL-> (0000.0) 0x580b6302
--24139-- ld-linux-x86-64.so.2 index RL-> (0000.0) 0x580b631c
--24139-- ------ ACTIVE ------
--24139-- 0xffffffffff600000 (??? ) R-> (0000.0) 0x580b62e4 ???
--24139-- 0xffffffffff600400 (??? ) R-> (0000.0) 0x580b62ee ???
--24139-- 0xffffffffff600800 (??? ) R-> (0000.0) 0x580b62f8 ???
--24139-- >>
I'm getting:

==15== Memcheck, a memory error detector
==15== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==15== Using Valgrind-3.22.0.GIT and LibVEX; rerun with -h for copyright info
==15== Command: pwd
==15==
--15-- <<
--15-- ------ REDIR STATE after VG_(redir_initialise) ------
--15-- TOPSPECS of soname (hardwired)
--15-- ld-linux-x86-64.so.2 strlen RL-> (0000.0) 0x580bfe72
--15-- ld-linux-x86-64.so.2 index RL-> (0000.0) 0x580bfe8c
--15-- ------ ACTIVE ------
--15-- 0xffffffffff600000 (??? ) R-> (0000.0) 0x580bfe54 ???
--15-- 0xffffffffff600400 (??? ) R-> (0000.0) 0x580bfe5e ???
--15-- 0xffffffffff600800 (??? ) R-> (0000.0) 0x580bfe68 ???
--15-- >>
--15-- Reading syms from /usr/bin/pwd
--15-- svma 0x0000002710, avma 0x000010a710
--15-- <<
--15-- ------ REDIR STATE after VG_(redir_notify_new_DebugInfo) ------
--15-- TOPSPECS of soname NONE filename /usr/bin/pwd
--15-- TOPSPECS of soname (hardwired)
--15-- ld-linux-x86-64.so.2 strlen RL-> (0000.0) 0x580bfe72
--15-- ld-linux-x86-64.so.2 index RL-> (0000.0) 0x580bfe8c
--15-- ------ ACTIVE ------
--15-- 0xffffffffff600000 (??? ) R-> (0000.0) 0x580bfe54 ???
--15-- 0xffffffffff600400 (??? ) R-> (0000.0) 0x580bfe5e ???
--15-- 0xffffffffff600800 (??? ) R-> (0000.0) 0x580bfe68 ???
--15-- >>
--15-- Reading syms from /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
--15-- svma 0x0000002090, avma 0x0004002090

MWE for future reference:

echo FROM ubuntu:22.04 > Dockerfile
echo "RUN apt-get update && apt-get -y install g++ autoconf libtool libtool-bin git subversion gfortran cmake ant" >> Dockerfile
echo "RUN git clone git://sourceware.org/git/valgrind.git && cd valgrind && ./autogen.sh && ./configure --prefix=/usr/local && make -j8 && make install -j7" >> Dockerfile
echo "RUN apt-get update && apt-get install -y libc6-dbg" >> Dockerfile
docker image build -t testvalgrind .
docker run -t testvalgrind valgrind --trace-redir=yes pwd
Hmm, I need to think about this a bit more. I thought that

--15-- ld-linux-x86-64.so.2 strlen RL-> (0000.0) 0x580bfe72

would mean it is OK.
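For anyone comparing a working and a failing build, the trace can be cut down to the lines under discussion. This is only a rough sketch: redir_summary is a hypothetical helper name, and it assumes nothing about the trace format beyond what the output in this report shows (lines containing "Reading syms from", and redirection lines containing "->").

```shell
#!/bin/sh
# redir_summary SYMBOL (hypothetical helper): read `valgrind --trace-redir=yes`
# output on stdin and print
#   - each object whose symbols were read ("Reading syms from ..."), and
#   - each redirection line ("...->...") that has SYMBOL as one of its fields.
redir_summary() {
    awk -v sym="$1" '
        /Reading syms from/ { print "object:", $NF; next }
        /->/ {
            for (i = 1; i <= NF; i++)
                if ($i == sym) { print "redir:", $0; break }
        }
    '
}

# Usage (Valgrind writes its messages to stderr by default, hence 2>&1):
#   valgrind --trace-redir=yes pwd 2>&1 | redir_summary strlen
```

Running this against both builds gives two short lists that can be diffed, which is easier than reading the full REDIR STATE dumps.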
Would you be able to bisect this? It would really help to know which commit between VALGRIND_3_21_0 and dc6669cee7b557945fd41417bf531c7f5c9f1093 caused this.
(In reply to Paul Floyd from comment #2)
> The only thing that I can think of that could affect that is the delayed
> loading of debuginfo.

That might be it. If I remember correctly, Debian/Ubuntu doesn't keep symtab in glibc, so you always need the glibc dbg package to be installed.

If so, this might also be resolved if you follow the packaging recommendation for ld.so from README_PACKAGERS:

-- Do not ship your Linux distro with a completely stripped /lib/ld.so. At least leave the debugging symbol names on -- line number info isn't necessary. If you don't want to leave symbols on ld.so, alternatively you can have your distro install ld.so's debuginfo package by default, or make ld.so.debuginfo be a requirement of your Valgrind RPM/DEB/whatever.

   Reason for this is that Valgrind's Memcheck tool needs to intercept calls to, and provide replacements for, some symbols in ld.so at startup (most importantly strlen). If it cannot do that, Memcheck shows a large number of false positives due to the highly optimised strlen (etc) routines in ld.so. This has caused some trouble in the past. As of version 3.3.0, on some targets (ppc32-linux, ppc64-linux), Memcheck will simply stop at startup (and print an error message) if such symbols are not present, because it is infeasible to continue.

   It's not like this is going to cost you much space. We only need the symbols for ld.so (a few K at most). Not the debug info and not any debuginfo or extra symbols for any other libraries.
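To see which situation a given system is in, one can check whether ld.so itself still lists strlen in its symbol tables. A rough sketch, assuming binutils nm is installed; has_symbol is a hypothetical helper name, and the ld.so path is the one printed in the trace earlier in this report.

```shell
#!/bin/sh
# has_symbol FILE NAME (hypothetical helper): succeed if `nm` lists NAME in
# FILE's regular symbol table or in its dynamic symbol table (nm -D).
# Any "@version" suffix that nm appends to a name is ignored.
has_symbol() {
    { nm "$1"; nm -D "$1"; } 2>/dev/null |
        awk -v s="$2" '
            { n = $NF; sub(/@.*/, "", n); if (n == s) found = 1 }
            END { exit !found }
        '
}

# Path taken from the "Reading syms from" line in the trace; adjust as needed.
LDSO=/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
if has_symbol "$LDSO" strlen; then
    echo "strlen is listed in $LDSO"
else
    echo "strlen is not listed in $LDSO (it would have to come from separate debuginfo)"
fi
```

If the second message is printed, the strlen symbol can only be found through the separate debuginfo file, which is the case the README text warns packagers about.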
Bisection results:

This commit works:
6ce0979884a8f246c80a098333ceef1a7b7f694d
This is the first one that fails:
60f7e89ba32b54d73b9e36d49e28d0f559ade0b9
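For reference, a bisection like this can be automated with git bisect run. The sketch below is not necessarily how the result above was obtained. bisect_step, BUILD_CMD, CHECK_CMD and the inst/ install prefix are hypothetical names chosen for illustration, and the whole thing would have to run inside the ubuntu:22.04 image with libc6-dbg installed to reproduce the failure.

```shell
#!/bin/sh
# Default build and check commands; both can be overridden through the
# hypothetical BUILD_CMD / CHECK_CMD environment variables.
DEFAULT_BUILD='./autogen.sh && ./configure --prefix="$PWD/inst" && make -j8 && make install'
DEFAULT_CHECK='./inst/bin/valgrind pwd'

# bisect_step: classify the currently checked-out commit for `git bisect run`.
#   0   = good (valgrind started and ran the command)
#   1   = bad  (the check command failed)
#   125 = skip (the commit could not be built, so it cannot be tested)
bisect_step() {
    sh -c "${BUILD_CMD:-$DEFAULT_BUILD}" >/dev/null 2>&1 || return 125
    sh -c "${CHECK_CMD:-$DEFAULT_CHECK}" >/dev/null 2>&1 || return 1
    return 0
}

# Usage, from inside the valgrind checkout (bad commit first, then good):
#   git bisect start dc6669cee7b557945fd41417bf531c7f5c9f1093 VALGRIND_3_21_0
#   git bisect run sh -c '. /path/to/this-file.sh; bisect_step'
#   git bisect reset
```

Returning 125 for build failures keeps unbuildable commits from being counted as bad. Stale build output can confuse later steps, so cleaning the tree between steps may be needed.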
(In reply to Filip Jorissen from comment #7)
> Bisection results:
>
> This commit works:
> 6ce0979884a8f246c80a098333ceef1a7b7f694d
> This is the first one that fails:
> 60f7e89ba32b54d73b9e36d49e28d0f559ade0b9

Thanks! That's

commit 60f7e89ba32b54d73b9e36d49e28d0f559ade0b9
Author: Aaron Merey <amerey@redhat.com>
Date:   Fri Jun 30 18:31:42 2023 -0400

    Support lazy reading and downloading of DWARF debuginfo
Created attachment 161329
Explicitly load libc and any sonames that contain mandatory specs

We cannot really be lazy loading glibc debuginfo and specifically ld.so, which contains mandatory hardwires.
commit 8228fe7f696b30c7b6b6daf576fc189bf8d6f8c2
Author: Mark Wielaard <mark@klomp.org>
Date:   Fri Sep 1 19:10:17 2023 +0200

    Explicitly load libc and any sonames that contain mandatory specs

    We really need symtab for glibc and ld.so libraries early for redir.
    Some distros move these into separate debuginfo files, which means we
    need to fully load them early.

    https://bugs.kde.org/show_bug.cgi?id=473745
I can confirm that the current HEAD works again for me :) Thanks!