Hi, I have investigated a valgrind false positive I had after upgrading my toolchain. I have a gnu toolchain built from scratch, using glibc 2.37 (current master branch) and a gcc 13, configured with --with-arch-64=x86-64-v2. It looks like the resulting glibc also uses x86-64-v2 by default, and apparently uses some vectorized instruction by default, especially in strcmp. When running memcheck on a dynamically linked program, valgrind is actually reporting two "cond" errors during the glibc startup, in stack traces like this: ==16638== Conditional jump or move depends on uninitialised value(s) ==16638== at 0x40217E1: strcmp (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.4/lib/ld-linux-x86-64.so.2) ==16638== by 0x400AD64: _dl_name_match_p (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.4/lib/ld-linux-x86-64.so.2) ==16638== by 0x4008353: _dl_map_object (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.4/lib/ld-linux-x86-64.so.2) ==16638== by 0x4019F94: map_doit (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.4/lib/ld-linux-x86-64.so.2) ==16638== by 0x4001488: _dl_catch_exception (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.4/lib/ld-linux-x86-64.so.2) ==16638== by 0x40015AE: _dl_catch_error (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.4/lib/ld-linux-x86-64.so.2) ==16638== by 0x401A47F: do_preload (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.4/lib/ld-linux-x86-64.so.2) ==16638== by 0x401B206: handle_preload_list (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.4/lib/ld-linux-x86-64.so.2) ==16638== by 0x401E022: dl_main (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.4/lib/ld-linux-x86-64.so.2) ==16638== by 0x401989E: _dl_sysdep_start (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.4/lib/ld-linux-x86-64.so.2) ==16638== by 0x401AFAB: _dl_start (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.4/lib/ld-linux-x86-64.so.2) ==16638== by 0x4019E77: ??? (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.4/lib/ld-linux-x86-64.so.2) ==16638== Uninitialised value was created by a stack allocation ==16638== at 0x401B1AA: handle_preload_list (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.4/lib/ld-linux-x86-64.so.2) So while trying to LD_PRELOAD the valgrind strcmp symbol replacements, we call the glibc strcmp, which causes some expected errors (most likely generated by the vectorization code). Checking the default suppression file, I can see things like: { dl-hack3-cond-1 Memcheck:Cond obj:*/lib*/ld-2.36*.so* obj:*/lib*/ld-2.36*.so* obj:*/lib*/ld-2.36*.so* } However since glibc 2.34, these files ld-2.36*.so* no long exist. According to glibc NEWS file: * Previously, glibc installed its various shared objects under versioned file names such as libc-2.33.so. The ABI sonames (e.g., libc.so.6) were provided as symbolic links. Starting with glibc 2.34, the shared objects are installed under their ABI sonames directly, without symbolic links. This increases compatibility with distribution package managers that delete removed files late during the package upgrade or downgrade process. Indeed comparing a glibc 2.23 vs 2.37, here is what we see for ld.so: rgeissler@ncerndobedev6097:~> ls -la /remote/tools/Linux/2.6/1A/toolchain/x86_64-2.6.32-v4/lib*|grep ld- -rwxr-xr-x 1 otfdelde otfdelde 185520 Dec 19 14:36 ld-2.23.so lrwxrwxrwx 1 otfdelde otfdelde 10 Dec 19 14:36 ld-linux-x86-64.so.2 -> ld-2.23.so rgeissler@ncerndobedev6097:~> ls -la /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23/lib*|grep ld- -rwxr-xr-x 1 otfdelde otfdelde 227256 Jan 16 22:15 ld-linux-x86-64.so.2 So the file glibc-2.X.supp.in in valgrind source code https://sourceware.org/git/?p=valgrind.git;a=blob;f=glibc-2.X.supp.in;h=eeefa393519476120f6f0ce1320163b216339354;hb=HEAD seems to be wrong for recent glibc. Would it be possible to edit it to take into account the new glibc file naming convention ? I tried changing all */lib*/ld-2.36*.so* patterns by */lib*/ld-linux*.so* locally, and it worked (ie valgrind didn't report any false positive anymore). Cheers, Romain
Seems related to https://bugs.kde.org/show_bug.cgi?id=439590 which resulted in this commit: https://sourceware.org/git/?p=valgrind.git;a=commitdiff;h=a1364805fc74b5690f763033c0c9b43f27613572#patch3 In particular in glibc-2.X-drd.supp.in there was this diff: @@ -6,7 +8,7 @@ { drd-ld drd:ConflictingAccess - obj:*/lib*/ld-*.so + obj:*/lib*/ld*.so* } Applying similar changes to glibc-2.X.supp.in would fix the problem.
I need to upgrade to Fedora 37 so that I can test this. I'll do that soon.
Hi, On my side I will test this, which isn't good in general, but shall be ok for people like me using glibc >= 2.34. I have replaced libdl by libc as in glibc 2.34 libdl is now an empty library, all its symbols have been moved into libc. Cheers, Romain
Created attachment 155802 [details] Pach (doesn't work for glibc < 2.34)
I'd rather leave this to Mark as I only have 1 Fedora machine to test on (with glibc 2.36)
I'd offer but I don't have access to older glibcs easily either. I'd love for this to get into the next release though. I can probably try with Docker if required.
Sorry I didn't look at this earlier. The original commit that adapted things to the glibc 2.34 naming change did say: The same could be done for the glibc-2.X.supp.in file, but hasn't yet because it looks like most suppressions in that file are obsolete. And some of those suppressions are horribly broad. Is this just for strcmp in ld.so? In that case the intercept in shared/vg_replace_strmem.c should in theory fix that. But it looks like it is too early. Which means we might want to create a hardwire for it in coregrind/m_redir.c
(In reply to Mark Wielaard from comment #7) > Is this just for strcmp in ld.so? > In that case the intercept in shared/vg_replace_strmem.c should in theory > fix that. Hi, Yes in my case it happens only for strcmp in ld.so, I tried again just now with an unpatched toolchain on our side, using a unittest binary that doesn't depend on many libraries, and here is the error showed at the end of the report: ==87482== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0) ==87482== ==87482== 1 errors in context 1 of 2: ==87482== Conditional jump or move depends on uninitialised value(s) ==87482== at 0x4021881: strcmp (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x400ADAB: _dl_name_match_p (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x4008383: _dl_map_object (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x401A034: map_doit (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x4001488: _dl_catch_exception (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x40015AE: _dl_catch_error (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x401A51F: do_preload (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x401B2A6: handle_preload_list (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x401E0D2: dl_main (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x401993E: _dl_sysdep_start (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x401B04B: _dl_start (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x4019F17: ??? (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== Uninitialised value was created by a stack allocation ==87482== at 0x401B24A: handle_preload_list (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== { <insert_a_suppression_name_here> Memcheck:Cond fun:strcmp fun:_dl_name_match_p fun:_dl_map_object fun:map_doit fun:_dl_catch_exception fun:_dl_catch_error fun:do_preload fun:handle_preload_list fun:dl_main fun:_dl_sysdep_start fun:_dl_start obj:/remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2 } ==87482== ==87482== 1 errors in context 2 of 2: ==87482== Conditional jump or move depends on uninitialised value(s) ==87482== at 0x4021881: strcmp (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x400AD84: _dl_name_match_p (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x4008383: _dl_map_object (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x401A034: map_doit (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x4001488: _dl_catch_exception (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x40015AE: _dl_catch_error (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x401A51F: do_preload (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x401B2A6: handle_preload_list (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x401E0D2: dl_main (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x401993E: _dl_sysdep_start (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x401B04B: _dl_start (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== by 0x4019F17: ??? (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== Uninitialised value was created by a stack allocation ==87482== at 0x401B24A: handle_preload_list (in /remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2) ==87482== { <insert_a_suppression_name_here> Memcheck:Cond fun:strcmp fun:_dl_name_match_p fun:_dl_map_object fun:map_doit fun:_dl_catch_exception fun:_dl_catch_error fun:do_preload fun:handle_preload_list fun:dl_main fun:_dl_sysdep_start fun:_dl_start obj:/remote/tools/Linux/2.6/1A/toolchain/x86_64-v23.0.8/lib/ld-linux-x86-64.so.2 } ==87482== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0) > But it looks like it is too early. Which means we might want to create a > hardwire for it in coregrind/m_redir.c Yes this happens very early in the dynamic library loading process, so even the .so shared library provided by valgrind aren't loaded just yet (actually from what I recall from my initial investigation months ago is that the compared string came from the LD_PRELOAD variable set by valgrind to preload the valgrind .so replacement containing strcmp and the likes. We might want to start from scratch and use a new suppression file containing just this specific case. For now in our toolchain we keep using my patch until an official solution is pushed in the valgrind project. Cheers, Romain
Time to revisit this.