Bug 193413 - valgrind doesn't work when linked with gold instead of ld
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general
Version: unspecified
Platform: Unlisted Binaries Linux
Importance: NOR major
Target Milestone: wanted3.6.0
Assignee: Julian Seward
 
Reported: 2009-05-20 22:00 UTC by Dan Kegel
Modified: 2010-06-03 17:45 UTC

Attachments
Minimally disruptive script to build and install gold (1.59 KB, application/x-shellscript)
2009-05-20 22:01 UTC, Dan Kegel
proposed fix (21.46 KB, patch)
2010-05-30 22:06 UTC, Julian Seward

Description Dan Kegel 2009-05-20 22:00:07 UTC
Any version of valgrind (I tried 3.4.1, current svn, and some older svn revs
before I figured this out) fails with a message like
  valgrind: mmap(0x8048000, 90112) failed in UME with error 22 (Invalid argument).
  valgrind: this can be caused by executables with very large text, data or bss segments.
if valgrind was linked using gold instead of ld.

This happens very quickly, before any other output, and happens even with 
--tool=none.  No other output is shown even with -v.

The easiest way to reproduce this on recent Debian or Ubuntu Karmic Koala is
"sudo apt-get install binutils-gold".  On older systems, you have to build gold
yourself.  I'll attach an easy script that builds gold and installs it as the
system linker (and renames the old linker to ld.orig).
Comment 1 Dan Kegel 2009-05-20 22:01:21 UTC
Created attachment 33872
Minimally disruptive script to build and install gold
Comment 2 Dan Kegel 2009-05-20 22:09:36 UTC
This was with gold built from the binutils-2.19.1 source tarball.
Comment 3 Julian Seward 2009-05-21 01:04:09 UTC
Err, maybe our top level linker script screwed up somehow, and
so our executables are linked with the text segment being in the
wrong place.  It should be at 0x38000000.  What sayeth
readelf -l .../memcheck-amd64-linux ?

Anyway .. if gold can't link a program which older binutils links
without difficulty, isn't that a bug in gold?
Comment 4 Dan Kegel 2009-05-21 01:10:35 UTC
$ readelf -l valgrind-3.4.1*/memcheck/memcheck-x86-linux

File: valgrind-3.4.1.gold/memcheck/memcheck-x86-linux

Elf file type is EXEC (Executable file)
Entry point 0x80709d0
There are 3 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x08048000 0x08048000 0x1b9df8 0x1b9df8 R E 0x1000
  LOAD           0x1ba000 0x08202000 0x08202000 0x005a0 0x71e848 RW  0x1000
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0

 Section to Segment mapping:
  Segment Sections...
   00     .text .rodata .eh_frame 
   01     .data .bss 
   02     

File: valgrind-3.4.1/memcheck/memcheck-x86-linux

Elf file type is EXEC (Executable file)
Entry point 0x380289d0
There are 3 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x38000000 0x38000000 0x1b8238 0x1b8238 R E 0x1000
  LOAD           0x1b8240 0x381b9240 0x381b9240 0x005a0 0x71e860 RW  0x1000
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4

 Section to Segment mapping:
  Segment Sections...
   00     .text .rodata .eh_frame 
   01     .data .bss 
   02     

The program links just fine.  It just doesn't run :-)

I did file an FYI bug against gold,
http://sourceware.org/bugzilla/show_bug.cgi?id=10178
but if it's a gold bug, they will probably need some
analysis from a valgrind developer to show them where
they screwed up.
Comment 5 Julian Seward 2009-05-21 01:51:46 UTC
Yep, gold ignored our request to put the text segment at 0x38000000:
w/gold
LOAD           0x000000 0x08048000 0x08048000 0x1b9df8 0x1b9df8 R E 0x1000

w/std binutils
LOAD           0x000000 0x38000000 0x38000000 0x1b8238 0x1b8238 R E 0x1000

Result is, V is linked at the default address (0x8048000) and so when
it comes to load the executable to be run, which also wants to be at
0x8048000, it can't.  So it just gives up, which is what you saw.

Next step is to figure out why gold doesn't like our linker script,
valt_load_address_x86_linux.lds.

Funny thing is though, I'm sure I remember trying gold on V some time
last year, and it worked fine.
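
The clash described in this comment can be checked mechanically. The following is a minimal Python sketch, not part of Valgrind: the function names load_ranges and covers are made up for illustration, and it only handles the one-line (32-bit) LOAD format quoted in comment 4. It tests whether a tool executable's LOAD segments occupy the default x86 load address 0x8048000 that the client program also wants.

```python
import re

# Matches one-line (32-bit) LOAD entries from `readelf -l`, e.g.
#   LOAD  0x000000 0x08048000 0x08048000 0x1b9df8 0x1b9df8 R E 0x1000
# Captured groups: Offset, VirtAddr, PhysAddr, FileSiz, MemSiz.
LOAD_RE = re.compile(
    r"^\s*LOAD\s+(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)"
    r"\s+(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)"
)


def load_ranges(readelf_text):
    """Return [(start, end)] virtual address ranges of the LOAD segments."""
    ranges = []
    for line in readelf_text.splitlines():
        m = LOAD_RE.match(line)
        if m:
            vaddr = int(m.group(2), 16)
            memsz = int(m.group(5), 16)
            ranges.append((vaddr, vaddr + memsz))
    return ranges


def covers(ranges, addr):
    """True if any half-open range [start, end) contains addr."""
    return any(start <= addr < end for start, end in ranges)


# LOAD lines copied from the readelf output in comment 4.
GOLD = """
  LOAD           0x000000 0x08048000 0x08048000 0x1b9df8 0x1b9df8 R E 0x1000
  LOAD           0x1ba000 0x08202000 0x08202000 0x005a0 0x71e848 RW  0x1000
"""
BFD = """
  LOAD           0x000000 0x38000000 0x38000000 0x1b8238 0x1b8238 R E 0x1000
  LOAD           0x1b8240 0x381b9240 0x381b9240 0x005a0 0x71e860 RW  0x1000
"""
CLIENT_BASE = 0x8048000  # default x86 load address, as stated in this comment

if __name__ == "__main__":
    print("gold-linked tool occupies client base:",
          covers(load_ranges(GOLD), CLIENT_BASE))
    print("ld-linked tool occupies client base:",
          covers(load_ranges(BFD), CLIENT_BASE))
```

In practice the input would come from running readelf -l on the built tool executable rather than from pasted strings.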
Comment 6 Dan Kegel 2009-06-24 22:47:14 UTC
In http://sourceware.org/bugzilla/show_bug.cgi?id=10178#c2 
Ian (the author of both ld and gold) said
"The core issue is that valgrind runs the linker with --verbose to extract the
linker script.  It then seds the linker script to adjust the start address of
the text segment.  This works OK with the GNU linker, but it fails with gold,
since gold does not have a default linker script.
With gold it ought to work to simply use the -Ttext option.
In any case, gold will never have a default linker script, so this is not
something that can be fixed in gold."

So I guess the ball is in valgrind's court.
Comment 7 Ian Lance Taylor 2009-06-24 22:49:57 UTC
I took a look at building valgrind 3.4.1 with gold.  The valgrind build works
by running the linker with the --verbose option to extract the linker script. 
It then edits the linker script to set the address of the start of the text
segment.  This approach works with the GNU linker, but it will never work with
gold, as gold does not have a default linker script.

With gold it will almost certainly work to simply link with a -Ttext option
specifying where you want the text segment to start.  No linker script should
be necessary if that is all you need to do.
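
For readers unfamiliar with the approach described here, the following Python sketch mimics the idea of extracting a default linker script from "ld --verbose" output and retargeting its base address. It is an illustration only, not the sed command Valgrind's build actually used. The helper names are hypothetical, the SAMPLE text is a trimmed stand-in rather than real linker output, and the sketch assumes the script is printed between two lines of "=" characters.

```python
import re


def extract_default_script(verbose_output):
    """Return the text between the two '=====' delimiter lines, or None."""
    lines = verbose_output.splitlines()
    marks = [i for i, line in enumerate(lines)
             if re.fullmatch(r"={5,}", line.strip())]
    if len(marks) < 2:
        return None  # no built-in script was printed
    return "\n".join(lines[marks[0] + 1:marks[1]])


def retarget(script, new_base):
    """Replace the hex base address on the line that adds SIZEOF_HEADERS."""
    out = []
    for line in script.splitlines():
        if "SIZEOF_HEADERS" in line:
            line = re.sub(r"0x[0-9A-Fa-f]+", f"0x{new_base:08x}", line)
        out.append(line)
    return "\n".join(out)


# Hypothetical, heavily trimmed stand-in for `ld --verbose` output.
SAMPLE = """GNU ld (sample text, not real output)
==================================================
SECTIONS
{
  . = SEGMENT_START("text-segment", 0x08048000) + SIZEOF_HEADERS;
}
==================================================
"""

if __name__ == "__main__":
    script = extract_default_script(SAMPLE)
    print(retarget(script, 0x38000000))
```

When no script appears between delimiters, extract_default_script returns None and there is nothing to edit, which is the situation this comment describes for gold.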
Comment 8 Julian Seward 2010-05-26 13:42:05 UTC
I think I have a plan to fix this.  Can anyone tell me what an easy
way to try out gold is, without trashing/rebuilding my existing tool
chain?  That's what I need to know to get started.
Comment 9 Ian Lance Taylor 2010-05-26 15:02:15 UTC
Here is one approach.  After you download the GNU binutils, build gold using
    mkdir objdir
    cd objdir
    ../binutilssrc/configure --enable-gold --prefix=/my/private/installdir
    make all-gold
    make install-gold

That will give you a binary /my/private/installdir/bin/ld.  Temporarily put that bin directory first on your PATH.  Make sure that when valgrind is built it uses that ld rather than the default one.
Comment 10 Julian Seward 2010-05-28 20:21:21 UTC
(In reply to comment #9)
> That will give you a binary /my/private/installdir/bin/ld.  Temporarily put
> that bin directory first on your PATH.  Make sure that when valgrind is built
> it uses that ld rather than the default one.

This doesn't work, alas.  I can build gold OK:

$ ld --version
GNU gold (GNU Binutils 2.20.1.20100303) 1.9

but the distro gcc still links using 
/usr/lib64/gcc/x86_64-suse-linux/4.3/collect2

Is there a way to tell gcc to use a different linker? -Bsomething ?
Comment 11 Julian Seward 2010-05-28 20:24:00 UTC
But there's something broader I don't understand.  It appears
that (at least according to ld --help) both gold and the old
linker understand "-Ttext".  So if the old linker understands
-Ttext, why are we arsing around with linker scripts and sed?
-Ttext should work for both old and new linkers.
Comment 12 Julian Seward 2010-05-28 20:46:38 UTC
FWIW, linking using the old linker, no linker script, and simply
-Wl,-Ttext=0x38000000 gives a tool executable that works just fine.
Urr.  And passing the same command line to gold as is passed to
collect2 also produces a working executable.

So -- unless there's a good reason not to do so -- I propose to
simply nuke the linker scripts, and use -Ttext, and that should
Just Work (tm).
Comment 13 Julian Seward 2010-05-30 22:06:33 UTC
Created attachment 47498
proposed fix

Here's a proposed fix.  I think it should work with gold
but as per comments above I so far have been unable to get
gcc to run gold.  Can someone with a gold-enabled system
please try it?
Comment 14 Ian Lance Taylor 2010-05-31 06:07:04 UTC
gcc always uses the collect2 driver program to link.  That is true whether you are using GNU ld or gold.  To see which linker the collect2 driver runs, use -Wl,-debug when you link.
Comment 15 Julian Seward 2010-06-02 02:34:20 UTC
Fixed, r11141.  Although as per comments above, have been unable to
test this with gold.  Closing.  If it still doesn't work with gold,
please re-open.
Comment 16 Alexander Potapenko 2010-06-02 16:08:27 UTC
Looks like this is still reproducible in r11144 for me. I can submit the build log, if necessary.
Comment 17 Julian Seward 2010-06-02 16:48:59 UTC
Is that with a clean (eg, from distclean) build?  This fix involves
messing with the build system, so from-scratch build is really
necessary.
Comment 18 Julian Seward 2010-06-02 23:09:35 UTC
(In reply to comment #16)

Yes, confirming that linking on 64-bit Ubuntu 10.04 also produces a
non-working Valgrind.  This despite 10.04 using traditional ld.

This happens because ld-2.20.1 (on Ubuntu 10.04) inserts an extra
loadable r-- segment which is mapped at the default load address
(0x400000) despite -Ttext=0x38000000 being specified (and honoured)
for text:


Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align

  ** THIS ONE **
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x00000000000001b4 0x00000000000001b4  R      200000

  LOAD           0x0000000000200000 0x0000000038000000 0x0000000038000000
                 0x00000000001cad88 0x00000000001cad88  R E    200000
  LOAD           0x00000000003cafd0 0x00000000383cafd0 0x00000000383cafd0
                 0x0000000000001bf0 0x00000000009812b0  RW     200000
  NOTE           0x0000000000000190 0x0000000000400190 0x0000000000400190
                 0x0000000000000024 0x0000000000000024  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     8
  GNU_RELRO      0x00000000003cafd0 0x00000000383cafd0 0x00000000383cafd0
                 0x0000000000000030 0x0000000000000030  R      1


The older ld-2.18.50.20080409-11.1 (openSUSE 11.0) on which I
developed this patch did not do that, and so it works fine:

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000200000 0x0000000038000000 0x0000000038000000
                 0x00000000001d6928 0x00000000001d6928  R E    200000
  LOAD           0x00000000003d6fd0 0x00000000383d6fd0 0x00000000383d6fd0
                 0x0000000000001bb0 0x0000000000981270  RW     200000
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     8
  GNU_RELRO      0x00000000003d6fd0 0x00000000383d6fd0 0x00000000383d6fd0
                 0x0000000000000030 0x0000000000000030  R      1


Ian, is there any surefire way to get ld to map *all* loadable
sections at an address at or above that specified by -Ttext ?

FWIW we never had this problem when using linker scripts to
specify an alternative load address.
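
The stray low segment shown above can be detected automatically after a build. Below is a minimal Python sketch, not part of Valgrind: the name segments_below is made up for illustration, and it assumes the two-line (64-bit) program header layout printed by readelf -l as quoted in this comment. It lists every LOAD segment whose VirtAddr lies below the address passed to -Ttext.

```python
import re

HEX = r"0x[0-9a-fA-F]+"
# First line of a two-line (64-bit) program header entry:
#   Type  Offset  VirtAddr  PhysAddr
HEAD_RE = re.compile(rf"^\s*([A-Z_]+)\s+({HEX})\s+({HEX})\s+({HEX})\s*$")


def segments_below(readelf_text, floor):
    """Return the VirtAddr of every LOAD segment mapped below `floor`."""
    low = []
    for line in readelf_text.splitlines():
        m = HEAD_RE.match(line)
        if m and m.group(1) == "LOAD":
            vaddr = int(m.group(3), 16)
            if vaddr < floor:
                low.append(vaddr)
    return low


# Program headers copied from this comment (Ubuntu 10.04, ld-2.20.1).
UBUNTU = """
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x00000000000001b4 0x00000000000001b4  R      200000
  LOAD           0x0000000000200000 0x0000000038000000 0x0000000038000000
                 0x00000000001cad88 0x00000000001cad88  R E    200000
  LOAD           0x00000000003cafd0 0x00000000383cafd0 0x00000000383cafd0
                 0x0000000000001bf0 0x00000000009812b0  RW     200000
  NOTE           0x0000000000000190 0x0000000000400190 0x0000000000400190
                 0x0000000000000024 0x0000000000000024  R      4
"""
# Program headers copied from this comment (openSUSE 11.0, ld-2.18.50).
OPENSUSE = """
  LOAD           0x0000000000200000 0x0000000038000000 0x0000000038000000
                 0x00000000001d6928 0x00000000001d6928  R E    200000
  LOAD           0x00000000003d6fd0 0x00000000383d6fd0 0x00000000383d6fd0
                 0x0000000000001bb0 0x0000000000981270  RW     200000
"""
TEXT_BASE = 0x38000000

if __name__ == "__main__":
    print([hex(a) for a in segments_below(UBUNTU, TEXT_BASE)])
    print([hex(a) for a in segments_below(OPENSUSE, TEXT_BASE)])
```

A check like this could run as a post-link step and fail the build if the returned list is non-empty.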
Comment 19 Ian Lance Taylor 2010-06-03 06:37:44 UTC
You could try using -Ttext-segment instead of -Ttext with GNU ld.

-Ttext should work reliably with gold.  Unfortunately, gold does not currently have a -Ttext-segment option.
Comment 20 Julian Seward 2010-06-03 10:22:16 UTC
> Unfortunately, gold does not currently have a -Ttext-segment option.

Neither do older versions of ld (2.18), it seems.  --build-id=none
seems a better bet.
Comment 21 Julian Seward 2010-06-03 10:24:11 UTC
(In reply to comment #16)
> Looks like this is still reproducible in r11144 for me.

Alexander, svn up (r11146 or later) and try again.
Comment 22 Alexander Potapenko 2010-06-03 17:45:58 UTC
The problem is gone for me in r11146. I'm using gold 1.9 (binutils 2.20).