Bug 286864 - strlen function redirection error
Summary: strlen function redirection error
Status: CONFIRMED
Alias: None
Product: valgrind
Classification: Developer tools
Component: memcheck
Version: 3.8.0
Platform: Compiled Sources Linux
Importance: NOR crash
Target Milestone: ---
Assignee: Julian Seward
 
Reported: 2011-11-17 17:30 UTC by xpucmoc
Modified: 2017-03-16 06:26 UTC

Attachments
A temporary bandage fix to ignore strlen() redirection for amd86 linux architecture (959 bytes, patch)
2013-08-30 21:33 UTC, zephyrus00jp

Description xpucmoc 2011-11-17 17:30:28 UTC
Version:           3.7.0 (using KDE 4.4.3) 
OS:                Linux

valgrind has been compiled from source. The system is Linux with glibc 2.14.1. /lib/ld-2.14.1.so is NOT stripped:
/usr/src/valgrind-3.7.0> file /lib/ld-2.14.1.so  
/lib/ld-2.14.1.so: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, not stripped
However, valgrind still bombs out:
/usr/src/valgrind-3.7.0> valgrind ls
==11831== Memcheck, a memory error detector
==11831== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==11831== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==11831== Command: ls
==11831== 

valgrind:  Fatal error at startup: a function redirection
valgrind:  which is mandatory for this platform-tool combination
valgrind:  cannot be set up.  Details of the redirection are:
valgrind:  
valgrind:  A must-be-redirected function
valgrind:  whose name matches the pattern:      strlen
valgrind:  in an object with soname matching:   ld-linux.so.2
valgrind:  was not found whilst processing
valgrind:  symbols from the object with soname: ld-linux.so.2
valgrind:  
valgrind:  Possible fixes: (1, short term): install glibc's debuginfo
valgrind:  package on this machine.  (2, longer term): ask the packagers
valgrind:  for your Linux distribution to please in future ship a non-
valgrind:  stripped ld.so (or whatever the dynamic linker .so is called)
valgrind:  that exports the above-named function using the standard
valgrind:  calling conventions for this platform.  The package you need
valgrind:  to install for fix (1) is called
valgrind:  
valgrind:    On Debian, Ubuntu:                 libc6-dbg
valgrind:    On SuSE, openSuSE, Fedora, RHEL:   glibc-debuginfo
valgrind:  
valgrind:  Cannot continue -- exiting now.  Sorry.


Reproducible: Always

Steps to Reproduce:
Run the command above (valgrind ls) again.

Actual Results:  
The same fatal error at startup.

Expected Results:  
valgrind starts up and runs ls under Memcheck.
Comment 1 Hubert Kowalski 2011-11-23 22:37:00 UTC
I can confirm this error, as I wanted to report it myself.

I use Gentoo GNU/Linux with glibc 2.14.1 compiled with splitdebug enabled, and even with only basic optimizations strlen gets stripped out:

$ nm /usr/lib/debug/lib/ld-2.14.1.so.debug | grep "\bstr"
00016eb0 t strchr
00017020 t strcmp
00017048 t strcpy
00017070 t strnlen
00014e97 t strsep

After a lot of investigation I came to the conclusion that strlen shouldn't be looked at by valgrind at all, as this causes more trouble than it's worth, as is evident not only from this bug but also from false positives like https://bugzilla.redhat.com/show_bug.cgi?id=678518


Also: because my system benefits heavily from SSE enhancements, forcing no builtin strlen causes a huge penalty on the overall performance of my apps (dunno about others, perhaps the whole system benefits), so currently I am left with an unusable valgrind. This bug should have higher priority, as it is an absolute blocker for me.
Comment 2 Julian Seward 2012-06-30 15:42:31 UTC
> After a lot of investigation I came to the conclusion that strlen
> shouldn't be looked at by valgrind at all, as this causes more trouble than
> it's worth, as is evident not only from this bug but also from false
> positives like https://bugzilla.redhat.com/show_bug.cgi?id=678518

That bug is completely unrelated to this one.  Red herring.

The requirement for intercepting strlen in ld.so was to avoid lots of
false positives from Memcheck at startup.  I believe it is still
necessary, and you haven't produced any evidence to the contrary:
you don't show any results from your investigations.
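
The comment does not say where those false positives come from. One commonly cited cause, stated here as an assumption and not as a finding about the glibc builds in this report, is that optimised strlen implementations load several bytes per step and therefore touch bytes beyond the terminating NUL, which Memcheck may then report. A minimal Python model of that effect (the function names, the 4-byte word size and the toy memory layout are all made up for illustration):

```python
def bytewise_reads(memory, start):
    """Offsets read by a byte-at-a-time strlen: stops at the first NUL."""
    touched = []
    i = start
    while True:
        touched.append(i)
        if memory[i] == 0:
            return touched
        i += 1


def wordwise_reads(memory, start, word=4):
    """Offsets read by a scan that loads `word` bytes per step."""
    touched = []
    i = start
    while True:
        chunk = range(i, i + word)
        touched.extend(chunk)
        # The whole chunk is loaded before it is searched for a NUL.
        if any(memory[j] == 0 for j in chunk):
            return touched
        i += word


# "hi" plus its NUL occupies offsets 0..2; offset 3 is outside the string.
memory = b"hi\x00?"
valid = {0, 1, 2}

print([i for i in bytewise_reads(memory, 0) if i not in valid])  # []
print([i for i in wordwise_reads(memory, 0) if i not in valid])  # [3]
```

In this model the byte-wise scan never reads past the NUL, while the word-wise scan does; that is presumably the kind of access a simple replacement strlen avoids.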
Comment 3 ppaalanen 2012-07-01 08:30:28 UTC
Hi, apparently I suffer from the same problem on Gentoo, x86_64, with
dev-util/valgrind-3.6.1-r3
sys-libs/glibc-2.14.1-r3
gcc (Gentoo 4.5.3-r2 p1.1, pie-0.4.7) 4.5.3
and debug info is installed:

$ valgrind -v ./solve tek-2012-4.sudoku
==10575== Memcheck, a memory error detector
==10575== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==10575== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==10575== Command: ./solve tek-2012-4.sudoku
==10575== 
--10575-- Valgrind options:
--10575--    -v
--10575-- Contents of /proc/version:
--10575--   Linux version 3.1.6-gentoo (root@farn) (gcc version 4.3.4 (Gentoo 4.3.4 p1.3, pie-10.1.5) ) #1 SMP PREEMPT Sat Jan 21 21:50:59 EET 2012
--10575-- Arch and hwcaps: AMD64, amd64-sse3-cx16
--10575-- Page sizes: currently 4096, max supported 4096
--10575-- Valgrind library directory: /usr/lib64/valgrind
--10575-- Reading syms from /home/pq/c/sudoku/solve (0x400000)
--10575-- Reading syms from /lib64/ld-2.14.1.so (0x4000000)
--10575--   Considering /usr/lib/debug/lib64/ld-2.14.1.so.debug ..
--10575--   .. CRC is valid

valgrind:  Fatal error at startup: a function redirection
valgrind:  which is mandatory for this platform-tool combination
valgrind:  cannot be set up.  Details of the redirection are:
valgrind:  
valgrind:  A must-be-redirected function
valgrind:  whose name matches the pattern:      strlen
valgrind:  in an object with soname matching:   ld-linux-x86-64.so.2
valgrind:  was not found whilst processing
valgrind:  symbols from the object with soname: ld-linux-x86-64.so.2


Indeed:
$ nm /usr/lib/debug/lib64/ld-2.14.1.so.debug | grep '\bstr'
00000000000160a0 t strchr
0000000000016120 t strcmp
0000000000016150 t strcpy
0000000000016230 t strnlen
00000000000145f3 t strsep
and no strlen to be found.
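
The same check can be scripted. A minimal Python sketch, assuming the usual three-column nm output (address, type letter, name) seen in the listing above; defined_symbols and has_symbol are hypothetical helper names, not part of valgrind or binutils:

```python
def defined_symbols(nm_output):
    """Return the set of symbol names defined in `nm` output text.

    Assumes lines of the form "<address> <type> <name>". Undefined
    symbols are printed without an address, so they have only two
    fields and are skipped.
    """
    names = set()
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) == 3:
            names.add(fields[2])
    return names


def has_symbol(nm_output, name):
    """True if `name` appears as a defined symbol in the nm output."""
    return name in defined_symbols(nm_output)


# The str* symbols reported for ld-2.14.1.so.debug in this comment.
SAMPLE = """\
00000000000160a0 t strchr
0000000000016120 t strcmp
0000000000016150 t strcpy
0000000000016230 t strnlen
00000000000145f3 t strsep
"""

print(has_symbol(SAMPLE, "strlen"))   # False
print(has_symbol(SAMPLE, "strnlen"))  # True
```

To run it against a real file, the text could be obtained with subprocess.run(["nm", path], capture_output=True, text=True).stdout, where path is whichever ld.so or .debug file is being inspected.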

I have read
http://forums.gentoo.org/viewtopic-t-814674.html
https://bugs.gentoo.org/show_bug.cgi?id=214065
https://bugs.gentoo.org/show_bug.cgi?id=390323
https://bugs.kde.org/show_bug.cgi?id=190429
and this bug.

What solution do the Valgrind developers suggest for getting Valgrind working again?
Should I compile my whole glibc with -fno-builtin-strlen?
Do you even think there is a problem?
Comment 4 xpucmoc 2012-09-11 18:25:22 UTC
This strlen redirection bug still exists in valgrind 3.8.0. Judging from their comments, the valgrind developers appear to be in defensive mode about it. I have glibc 2.16.0 and gcc 4.7.1, and compiled valgrind 3.8.0 from source. /lib/ld.so is not stripped; however, it does not contain a strlen symbol. It does contain a strnlen symbol.
Comment 5 Thomas Fischer 2012-11-09 19:58:18 UTC
I was facing the same or a very similar problem: valgrind was complaining about a missing strlen in glibc despite debug symbols being installed (splitdebug in Gentoo).
The solution/workaround I found is due to Christian Kruse [1]. It seems that during glibc compilation, strlen gets removed/rewritten by code optimization and thus no longer exists in the form valgrind expects. The workaround is to disable that optimization of strlen.
For documentation, in case Christian's blog entry becomes unavailable, I am citing from his posting:

"I already was familiar with this problem and I knew using FEATURES="nostrip" and a re-emerge of sys-libs/glibc should help. But after doing that the problem was not gone. After some research I found out that with -O2 the GCC inlines strlen(). It is a code optimization, but in this case it causes valgrind to stop working. The solution is somewhat messy: you have to create a portage overlay of sys-libs/glibc and patch files/eblits/common.eblit: find the line containing append-flags -O2 -fno-strict-aliasing and append -fno-builtin-strlen. Then re-emerge glibc (do not forget to run ebuild glibc-ver.ebuild digest) and everything should work." (cited from [1]).

I can confirm that after a recompile of glibc, valgrind works again.

[1] http://ck.kennt-wayne.de/2012/jan/valgrind-again%3A-strlen-redirection
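
The edit described in the quoted post (append -fno-builtin-strlen to the append-flags line) could be scripted roughly as follows. This is only a sketch: the marker string is taken from the quote, but the real contents of common.eblit are an assumption, and add_strlen_flag is a hypothetical helper name.

```python
FLAG = "-fno-builtin-strlen"
MARKER = "append-flags -O2 -fno-strict-aliasing"


def add_strlen_flag(eblit_text):
    """Append FLAG to every line containing MARKER, unless already present."""
    patched = []
    for line in eblit_text.splitlines(keepends=True):
        if MARKER in line and FLAG not in line:
            body = line.rstrip("\n")
            newline = line[len(body):]  # keep the original line ending
            line = body.rstrip() + " " + FLAG + newline
        patched.append(line)
    return "".join(patched)


# Hypothetical one-line excerpt standing in for the real eblit file.
before = "\tappend-flags -O2 -fno-strict-aliasing\n"
print(add_strlen_flag(before), end="")
```

Applying the function twice leaves the text unchanged, so it is safe to re-run after an overlay update. The remaining steps from the quote (regenerating the digest and re-emerging glibc) still have to be done by hand.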
Comment 6 Cody P Schafer 2013-04-27 07:00:47 UTC
*** This bug has been confirmed by popular vote. ***
Comment 7 zephyrus00jp 2013-08-30 02:36:05 UTC
Hi,

Recently I switched from 32-bit Debian GNU/Linux to 64-bit Debian GNU/Linux.
Suddenly valgrind stopped working, with exactly the same symptom.
(Valgrind worked just fine on 32-bit Debian GNU/Linux.)

uname -a
Linux vm-debian-amd64 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1 x86_64 GNU/Linux
I am debugging the Mozilla Thunderbird client.
The binary is linked against the following dynamic libraries.

ishikawa@vm-debian-amd64:/REF-OBJ-DIR/objdir-tb3$ ldd mozilla/dist/bin/thunderbird-bin 
	linux-vdso.so.1 (0x00007fffa73ff000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f60a6449000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f60a6245000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f60a5f3d000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f60a5c3e000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f60a5a28000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f60a567c000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f60a6680000)
ishikawa@vm-debian-amd64:/REF-OBJ-DIR/objdir-tb3$ 

Initially I thought that the problem was caused by mixing
64-bit and 32-bit libraries. (My long-term goal was to create a 32-bit binary and
debug it, so I installed the 32-bit libraries.)

But right now, even a native 64-bit binary cannot be run under valgrind.

==15417== Memcheck, a memory error detector
==15417== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==15417== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==15417== Command: /REF-OBJ-DIR/objdir-tb3/mozilla/dist/bin/thunderbird-bin -profile /REF-OBJ-DIR/objdir-tb3/mozilla/_tests/mozmill/mozmillprofile -jsbridge 24242 -foreground
==15417==

valgrind:  Fatal error at startup: a function redirection
valgrind:  which is mandatory for this platform-tool combination
valgrind:  cannot be set up.  Details of the redirection are:
valgrind:
valgrind:  A must-be-redirected function
valgrind:  whose name matches the pattern:      strlen
valgrind:  in an object with soname matching:   ld-linux-x86-64.so.2
valgrind:  was not found whilst processing
valgrind:  symbols from the object with soname: ld-linux-x86-64.so.2
valgrind:
valgrind:  Possible fixes: (1, short term): install glibc's debuginfo
valgrind:  package on this machine.  (2, longer term): ask the packagers
valgrind:  for your Linux distribution to please in future ship a non-
valgrind:  stripped ld.so (or whatever the dynamic linker .so is called)
valgrind:  that exports the above-named function using the standard
valgrind:  calling conventions for this platform.  The package you need
valgrind:  to install for fix (1) is called
valgrind:
valgrind:    On Debian, Ubuntu:                 libc6-dbg
valgrind:    On SuSE, openSuSE, Fedora, RHEL:   glibc-debuginfo
valgrind:
valgrind:  Cannot continue -- exiting now.  Sorry.

I have already installed libc6-dbg.

From the posts above, do I have to recompile the sources with different
GCC compile options (the source files of every single program that I want to run under valgrind)? That is tough :-(

An easy way out would be preferred. Where can I tweak valgrind so that it won't
look for strlen in cases like this (for example, by adding something like -no-search-strlen as a command-line option)?

TIA
Comment 8 zephyrus00jp 2013-08-30 21:33:23 UTC
Created attachment 82043 [details]
A temporary bandage fix to ignore strlen() redirection for amd86 linux architecture

By applying the patch, I could run valgrind on Debian GNU/Linux amd64 (64-bit).
The patch basically disables the request to catch calls to the native strlen()
and monitor them with valgrind's internal version.

Julian, it seems that, at least on Debian GNU/Linux amd64, valgrind does not bomb out even if strlen is not traced by memcheck. After all, gcc-4.8 which I use to compile mozilla's thunderbird seems to replace all calls to strlen() with an inlined version and so there is no loss (and no gain) by not redirecting the call to strlen().

People's mileage may vary depending on
 - the OS version,
 - the versions of the library (especially /lib64/ld-2.17.so, /lib/ld-n.nn.so,
   and friends), and
 - the compiler used.

I patched the 3.9.0 SVN version (checked out from the repository with the svn command) and compiled it myself.

Now that valgrind can run on my Debian GNU/Linux amd64 without having to change libc (which I was not sure how to do easily), I still have to figure out why the following command crashes thunderbird:
BAD: valgrind --trace-children=yes --smc-check=all-non-file --gen-suppressions=all --track-origins=yes --read-var-info=yes --malloc-fill=0xA5 --free-fill=0xC3 --leak-check=full --num-callers=50 --suppressions=$HOME/TB-NEW/TB-3HG/new-src/mozilla/build/valgrind/cross-architecture.sup --suppressions=$HOME/TB-NEW/TB-3HG/new-src/mozilla/build/valgrind/i386-redhat-linux-gnu.sup --suppressions=$HOME/Dropbox/myown.sup --show-possibly-lost=no  /REF-OBJ-DIR/objdir-tb3/mozilla/dist/bin/thunderbird-bin 

while the following command runs thunderbird successfully:
GOOD:
 valgrind --read-var-info=yes --trace-children=yes --smc-check=all-non-file --malloc-fill=0xA5 --free-fill=0xC3 --leak-check=full --num-callers=50 --suppressions=$HOME/TB-NEW/TB-3HG/new-src/mozilla/build/valgrind/cross-architecture.sup --suppressions=$HOME/TB-NEW/TB-3HG/new-src/mozilla/build/valgrind/i386-redhat-linux-gnu.sup --suppressions=$HOME/Dropbox/myown.sup --show-possibly-lost=no  /REF-OBJ-DIR/objdir-tb3/mozilla/dist/bin/thunderbird-bin 

Note the addition of --gen-suppressions=all (and also --track-origins=yes) on the BAD command line. It seems that
valgrind itself is crashing, but the stack trace shown is that of thunderbird (though without a proper symbolic trace). I will file a separate entry after more investigation.
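
The two command lines are long enough that comparing them by eye is error-prone. A minimal Python sketch that diffs their option sets; the strings are copied from the BAD and GOOD lines above, with the part that is identical in both factored into SHARED, and valgrind_options is a hypothetical helper name:

```python
import shlex


def valgrind_options(command_line):
    """Return the set of '--option[=value]' words in a command line."""
    return {w for w in shlex.split(command_line) if w.startswith("--")}


# Tail that is identical in both command lines.
SHARED = (
    "--malloc-fill=0xA5 --free-fill=0xC3 --leak-check=full --num-callers=50 "
    "--suppressions=$HOME/TB-NEW/TB-3HG/new-src/mozilla/build/valgrind/cross-architecture.sup "
    "--suppressions=$HOME/TB-NEW/TB-3HG/new-src/mozilla/build/valgrind/i386-redhat-linux-gnu.sup "
    "--suppressions=$HOME/Dropbox/myown.sup --show-possibly-lost=no "
    "/REF-OBJ-DIR/objdir-tb3/mozilla/dist/bin/thunderbird-bin"
)

BAD = (
    "valgrind --trace-children=yes --smc-check=all-non-file "
    "--gen-suppressions=all --track-origins=yes --read-var-info=yes " + SHARED
)
GOOD = (
    "valgrind --read-var-info=yes --trace-children=yes "
    "--smc-check=all-non-file " + SHARED
)

# Options present only on the BAD command line.
print(sorted(valgrind_options(BAD) - valgrind_options(GOOD)))
# ['--gen-suppressions=all', '--track-origins=yes']

# Options present only on the GOOD command line.
print(sorted(valgrind_options(GOOD) - valgrind_options(BAD)))
# []
```

The diff shows two options unique to the BAD line, so either of them (or the combination) could be involved in the crash; retrying with each one removed in turn would narrow it down.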

TIA
Comment 9 Matt 2013-09-09 00:39:11 UTC
(In reply to comment #8)

This patch was a big help to me, thanks.  I can't develop without valgrind. 

It does expose an issue on Ubuntu/Debian 64-bit machines and maybe others. Is it reasonable to have no entry point for strlen(), which is an ANSI C library function?

It would certainly be 'easier' in some sense if valgrind moved to another means of detecting a library with proper debug symbols.