Summary: | MacOSX: 64-bit valgrind segfaults on launch when built with Xcode 4.0.1 | ||
---|---|---|---|
Product: | [Developer tools] valgrind | Reporter: | Sascha Kratky <skratky> |
Component: | general | Assignee: | Julian Seward <jseward> |
Status: | RESOLVED FIXED | ||
Severity: | crash | CC: | bart.vanassche+kde, clattner, fons.rademakers, gparker, grassi, james.abley, jpeter, mike, mldgodard, nireon, Panajev, snc, sylvain.faychatelard |
Priority: | NOR | ||
Version: | 3.6.0 | ||
Target Milestone: | --- | ||
Platform: | Unlisted Binaries | ||
OS: | macOS | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: | |||
Attachments: |
program that does the transformation listed in comment #8
proposed patch |
Description
Sascha Kratky
2011-03-08 20:11:01 UTC
Also occurs on Mac OS X Server 10.6.7 (10J869)

I was able to work around the issue by building 32-bit valgrind, instead of the default 64-bit for my system.

According to raim: building valgrind 64-bit version runs fine on 32-bit kernel. Perhaps the issue is only when building 64-bit on 64-bits.

Looks like a dupe of http://bugs.kde.org/show_bug.cgi?id=267997, can you reproduce the same output with the same flags?

*** Bug 269641 has been marked as a duplicate of this bug. ***

Some initial results:

* I can reproduce this with Xcode 4.0.1.

* AFAICS it only affects the valgrinding of 64-bit processes; 32-bits is OK

* The tool executables (big files of the form memcheck-amd64-darwin, etc) segfault within a few instructions of gaining control from the kernel.

* My initial impression is that this is due to a bug in the linker (/usr/bin/ld), which is perhaps a new implementation in 4.0.x ?

  $ /usr/bin/ld -v
  @(#)PROGRAM:ld PROJECT:ld64-123.2
  llvm version 2.9svn, from Apple Clang 2.0 (build 138)

  Comparing the MachO load commands vs a (working) tool executable that was created by Xcode 3.2.x, it appears that the new linker has partially ignored the build system's request to place the tool executable's stack at a non standard location. The build system tells the linker "-stack_addr 0x134000000 -stack_size 0x800000". With the Xcode 3.2 linker those flags produce two results:

  (1) A load command to allocate the stack at the said location:

      Load command 3
      cmd LC_SEGMENT_64
      cmdsize 72
      segname __UNIXSTACK
      vmaddr 0x0000000133800000
      vmsize 0x0000000000800000
      fileoff 2285568
      filesize 0
      maxprot 0x00000007
      initprot 0x00000003
      nsects 0
      flags 0x0

  (2) A request (in LC_UNIXTHREAD) to set %rsp to the correct value at process startup, 0x134000000.

  With Xcode 4.0.1, (1) is missing but (2) is still present. The tool executable therefore starts up with %rsp pointing to unmapped memory and faults almost instantly.
* Xcode 4.0.1 linking a 32 bit tool executable does not omit (1), and so works correctly.

I also see the same situation with it only affecting 64-bit binaries.

One really sick workaround is to observe that the executables contain a redundant MachO load command:

    Load command 2
    cmd LC_SEGMENT_64
    cmdsize 72
    segname __LINKEDIT
    vmaddr 0x0000000138dea000
    vmsize 0x00000000000ad000
    fileoff 2658304
    filesize 705632
    maxprot 0x00000007
    initprot 0x00000001
    nsects 0
    flags 0x0

The described section presumably contains information intended for the dynamic linker, but is irrelevant because this is a statically linked executable. Hence it might be possible to postprocess the executables after linking, to overwrite this entry with the information that would have been in the missing __UNIXSTACK entry. I tried this by hand (with a binary editor) earlier and got something that worked.

Created attachment 58477 [details]
program that does the transformation listed in comment #8

Here's a program that does the transformation listed in comment 8. Using it I can transform my segfaulting tool executables (eg, memcheck-amd64-darwin) into ones that work properly. I would be interested to hear whether it works for other people.

WARNING: this program will silently and irreversibly modify 64-bit Mach-O executables, in a way that will cause (ordinary ones) to no longer work. Do not use it unless you understand the discussion above.

Program is pretty rough, magic values are hardwired, error checking is inadequate, etc, but it seems to work. It will refuse to modify a 32 bit executable on the basis that the xcode 4.0.1 linker doesn't have problems with them (so it's not necessary).
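The load-command comparison described above can also be checked mechanically. The following is a minimal sketch, not the attached fixup_macho_loadcmds.c: it assumes a thin (non-fat), little-endian 64-bit Mach-O file with the standard mach_header_64 and segment_command_64 layouts, and the names `segments` and `stack_is_mapped` are made up for illustration. Treating "the byte just below the initial %rsp is covered by some segment" as the test for a usable stack is likewise an assumption.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF    # magic number of a 64-bit little-endian Mach-O
LC_SEGMENT_64 = 0x19        # load command kind 25
HEADER = struct.Struct("<IiiIIIII")         # mach_header_64, 32 bytes
SEGMENT = struct.Struct("<II16sQQQQiiII")   # segment_command_64, 72 bytes


def segments(image):
    """Return [(segname, vmaddr, vmsize)] for every LC_SEGMENT_64 in image."""
    magic, _, _, _, ncmds, _, _, _ = HEADER.unpack_from(image, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin 64-bit little-endian Mach-O image")
    found = []
    offset = HEADER.size
    for _ in range(ncmds):
        # Every load command starts with its kind and its total size.
        cmd, cmdsize = struct.unpack_from("<II", image, offset)
        if cmdsize < 8:
            raise ValueError("corrupt load command at offset %d" % offset)
        if cmd == LC_SEGMENT_64:
            fields = SEGMENT.unpack_from(image, offset)
            name = fields[2].rstrip(b"\0").decode("ascii")
            found.append((name, fields[3], fields[4]))
        offset += cmdsize
    return found


def stack_is_mapped(image, stack_top):
    """True if some segment covers the byte just below stack_top."""
    probe = stack_top - 1
    return any(vmaddr <= probe < vmaddr + vmsize
               for _, vmaddr, vmsize in segments(image))
```

With a tool executable read into memory, e.g. `stack_is_mapped(open(path, "rb").read(), 0x134000000)`, a result of False would correspond to the missing __UNIXSTACK segment described above.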
How to use (eg):

$ ./vg-in-place date
./vg-in-place: line 31: 82073 Segmentation fault VALGRIND_LIB="$vgbasedir/.in_place" VALGRIND_LIB_INNER="$vgbasedir/.in_place" "$vgbasedir/coregrind/valgrind" "$@"

$ gcc -m64 -Wall -g -O -o fixup_macho_loadcmds fixup_macho_loadcmds.c

$ ./fixup_macho_loadcmds ./memcheck/memcheck-amd64-darwin
size 3580824 fd 3
load cmd: offset 32 size 392 kind 25 = LC_SEGMENT_64
load cmd: offset 424 size 472 kind 25 = LC_SEGMENT_64
load cmd: offset 896 size 72 kind 25 = LC_SEGMENT_64
modification begins
modification done
load cmd: offset 968 size 24 kind 2 = LC_SYMTAB
load cmd: offset 992 size 24 kind 27 = LC_UUID
load cmd: offset 1016 size 184 kind 5 = LC_UNIXTHREAD
UnixThread: flavor 4 = x86_THREAD_STATE64
rsp = 0x134000000

$ ./vg-in-place date
==82086== Memcheck, a memory error detector
[... it at least starts up without dying ...]

Note that this example is for modifying the tool executables in the build tree. Normally you'd want to modify them in the installation tree, eg, $prefix/lib/valgrind/memcheck-amd64-darwin, etc.

I can't reproduce this with a test app. What exactly is the compile or link line for an executable that gets the bad stack?

(In reply to comment #10)

Standard co & build of valgrind-trunk on 10.6.x w/ xcode-4.0.1

Then:

cd none

/usr/bin/ld -static -arch x86_64 -macosx_version_min 10.5 -o none-amd64-darwin -u __start -e __start -image_base 0x138000000 -stack_addr 0x134000000 -stack_size 0x800000 none_amd64_darwin-nl_main.o ../coregrind/libcoregrind-amd64-darwin.a ../VEX/libvex-amd64-darwin.a

gdb ./none-amd64-darwin
(gdb) run
Starting program: /Users/macuser/VgTRUNK/trunk/none/none-amd64-darwin
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x0000000134000000
_start_in_C_darwin (pArgc=0x134000000) at m_main.c:3107
3107 Int argc = *(Int *)pArgc; // not pArgc[0] on LP64
(gdb) quit

../fixup_macho_loadcmds ./none-amd64-darwin

gdb ./none-amd64-darwin
(gdb) run
Starting program: /Users/macuser/VgTRUNK/trunk/none/none-amd64-darwin
valgrind: You cannot run '/Users/macuser/VgTRUNK/trunk/none/none-amd64-darwin' directly.
valgrind: You should use $prefix/bin/valgrind.

Reproduced. Looks like it fails in the presence of -static.

A cleaner workaround would be to use -sectcreate to insert the stack data, rather than hijacking the __LINKEDIT.

(In reply to comment #12)
> Reproduced. Looks like it fails in the presence of -static.

Thanks for the confirmation. Can this get fixed for xcode 4.0.2 ? If so, is there some kind of bug number or tag that we can track it via? As per discussion above I can work around it in the build system for the time being, but it's not a good permanent solution.

I filed rdar://9216420. I don't know what the fix schedule will be.

Created attachment 58577 [details]
proposed patch
Here's a complete proposed patch. It's against the SVN trunk but will
probably apply and work for 3.6.x as well.
After applying it, you will need to rebuild Valgrind from distclean
(iow, make distclean ; ./autogen.sh ; then configure and build as
normal.) The build system then automatically post-processes the tool
executables as discussed above, so they should Just Work (tm).
This works for me for OSX 10.6.x using Xcode 4.0.1. I would
appreciate people testing the following two combinations

   OSX 10.6.x, Xcode 3.2.x (to check it doesn't break w/ the old Xcode)

   OSX 10.5.x, Xcode 3.2.x (to check it doesn't break Leopard)

since I don't want to check in something that breaks older setups, but
I can't check either of those easily myself.
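
For reference, the segment that the post-processing step has to recreate follows directly from the two linker flags: the stack segment ends at -stack_addr and extends -stack_size bytes below it. The following is a minimal sketch in Python, not the patch's actual code. `unixstack_command` is a hypothetical name, the protection values are copied from the working Xcode 3.2 load command quoted earlier, and the fileoff default of 0 is an assumption (the segment has filesize 0, so nothing is read from the file for it).

```python
import struct

LC_SEGMENT_64 = 0x19                        # load command kind 25
SEGMENT = struct.Struct("<II16sQQQQiiII")   # segment_command_64, 72 bytes


def unixstack_command(stack_addr, stack_size, fileoff=0):
    """Pack a __UNIXSTACK segment command for the given linker stack flags."""
    if stack_size <= 0 or stack_size > stack_addr:
        raise ValueError("stack_size must be positive and no larger than stack_addr")
    return SEGMENT.pack(
        LC_SEGMENT_64,
        SEGMENT.size,             # cmdsize: 72, no sections follow
        b"__UNIXSTACK",           # segname, NUL-padded to 16 bytes by struct
        stack_addr - stack_size,  # vmaddr: lowest address of the stack
        stack_size,               # vmsize
        fileoff,                  # fileoff (assumed; no file data is mapped)
        0,                        # filesize
        7,                        # maxprot: read, write, execute
        3,                        # initprot: read, write
        0,                        # nsects
        0,                        # flags
    )


# Values from "-stack_addr 0x134000000 -stack_size 0x800000".
fields = SEGMENT.unpack(unixstack_command(0x134000000, 0x800000))
print(hex(fields[3]), hex(fields[4]))   # 0x133800000 0x800000
```

The resulting 72 bytes are the same size as the __LINKEDIT command they would replace, which is why an in-place overwrite is possible at all.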
Committed, r11686. I am assuming it does not cause breakage for the untested combinations listed in comment #15. If it does, please re-open.

I'll assume the same and add it to Homebrew. If it makes things break, I'll be sure to push the issues upstream.

Fails to build against 3.6.1 with:

ranlib: file: libcoregrind-amd64-darwin.a(libcoregrind_amd64_darwin_a-elf.o) has no symbols
"my" variable $r masks earlier declaration in same scope at ../coregrind/link_tool_exe_darwin line 181.
Can't exec "../coregrind/fixup_macho_loadcmds": No such file or directory at ../coregrind/link_tool_exe_darwin line 181.
make[3]: *** [memcheck-amd64-darwin] Error 1
make[2]: *** [install-recursive] Error 1
make[1]: *** [install-recursive] Error 1
make: *** [install] Error 2

And non-parallel version:

link_tool_exe_darwin: /usr/bin/ld -static -arch x86_64 -macosx_version_min 10.5 -o memcheck-amd64-darwin -u __start -e __start -image_base 0x138000000 -stack_addr 0x134000000 -stack_size 0x800000 memcheck_amd64_darwin-mc_leakcheck.o memcheck_amd64_darwin-mc_malloc_wrappers.o memcheck_amd64_darwin-mc_main.o memcheck_amd64_darwin-mc_translate.o memcheck_amd64_darwin-mc_machine.o memcheck_amd64_darwin-mc_errors.o ../coregrind/libcoregrind-amd64-darwin.a ../VEX/libvex-amd64-darwin.a
link_tool_exe_darwin: ../coregrind/fixup_macho_loadcmds 0x134000000 0x800000 memcheck-amd64-darwin
Can't exec "../coregrind/fixup_macho_loadcmds": No such file or directory at ../coregrind/link_tool_exe_darwin line 181.

Comment #18 and #19: are these from-distclean builds? You need to make distclean, since the change updates the Makefile.am's.

These are from clean builds from applying that revision's patch to the 3.6.1 tarball and doing configure;make; make install. There's no autogen.sh in the release tarballs, should I run autoreconf or something instead? Sorry for failing :(

*** Bug 270309 has been marked as a duplicate of this bug. ***

*** Bug 271337 has been marked as a duplicate of this bug. ***

*** Bug 267342 has been marked as a duplicate of this bug. ***

Nice! Thanks a lot!

On 22 Apr 2011 at 09:46, Julian Seward <jseward@acm.org> wrote:
> https://bugs.kde.org/show_bug.cgi?id=267997
>
> Julian Seward <jseward@acm.org> changed:
>
> What |Removed |Added
> ----------------------------------------------------------------------------
> CC| |mldgodard@gmail.com
>
> --- Comment #24 from Julian Seward <jseward acm org> 2011-04-22 09:37:53 ---
> *** Bug 267342 has been marked as a duplicate of this bug. ***

*** Bug 274784 has been marked as a duplicate of this bug. ***

*** Bug 267769 has been marked as a duplicate of this bug. ***

*** Bug 268792 has been marked as a duplicate of this bug. ***

*** Bug 270311 has been marked as a duplicate of this bug. ***

*** Bug 283325 has been marked as a duplicate of this bug. ***

*** Bug 276637 has been marked as a duplicate of this bug. ***