Bug 339045 - Getting valgrind to compile and run on OS X Yosemite (10.10)
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general
Version: unspecified
Platform: Other macOS
Importance: NOR crash
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Duplicates: 340232 340252
Depends on:
Blocks:
 
Reported: 2014-09-13 09:51 UTC by FX
Modified: 2014-11-11 12:55 UTC

See Also:
Latest Commit:
Version Fixed In:


Attachments
Tentative patch (8.06 KB, patch)
2014-09-13 09:51 UTC, FX
Second version of a patch (13.34 KB, patch)
2014-09-13 12:42 UTC, FX
Amend and extend 10.10 syscalls with stubs and increase MAXSYSCALL to 490 (4.87 KB, patch)
2014-10-14 12:26 UTC, Rhys Kidd
xnu-source-diff-10.9.5-to-10.10 (6.98 KB, text/plain)
2014-10-31 08:30 UTC, Rhys Kidd
Proof of concept: Mixup PAGEZERO vmaddr finishing script (1.51 KB, text/x-python-script)
2014-11-05 11:50 UTC, Rhys Kidd
A fake voucher_mach_msg_set for OSX 10.10 64-bit (1.16 KB, patch)
2014-11-05 17:34 UTC, Julian Seward
Extend fixup_macho_loadcmds.c to do __PAGEZERO.vmaddr mashing (1.74 KB, patch)
2014-11-05 17:35 UTC, Julian Seward
Amend and extend 10.10 syscalls with stubs and increase MAXSYSCALL to 490 (v2) (4.86 KB, patch)
2014-11-06 10:14 UTC, Rhys Kidd
A fake voucher_mach_msg_set for OSX 10.10 (v2) (1.13 KB, patch)
2014-11-06 10:42 UTC, Rhys Kidd

Description FX 2014-09-13 09:51:01 UTC
Created attachment 88690
Tentative patch

With the help of the attached patch, I've tried building valgrind for Mac OS X "10.10" Yosemite. The patch adjusts the configure script, adds a darwin14.supp file that is a copy of darwin13.supp (for now), and extends version checking somewhat in syswrap-amd64-darwin.c.

I can now build valgrind, but the resulting executable is killed by a SIGKILL at launch. I get this message in the Console:

kernel[0]: Cannot enforce a hard page-zero for memcheck-amd64-darwin

I'm interested in getting valgrind to run on Yosemite, but I don't know how to proceed further than this. If given instructions to debug, I am willing to perform tests.
Comment 1 FX 2014-09-13 12:42:09 UTC
Created attachment 88694
Second version of a patch

Second version of my patch, including some cases I hadn't caught before. Still fails with same symptoms, though.
Comment 2 Rhys Kidd 2014-10-14 11:34:35 UTC
FYI: Listing of pre-release additions and removals in the Kernel.framework API.

https://developer.apple.com/librarY/prerelease/mac/documentation/General/Reference/APIDiffsMacOSX10_10SeedDiff/frameworks/Kernel.html
Comment 3 Rhys Kidd 2014-10-14 12:26:38 UTC
Created attachment 89127
Amend and extend 10.10 syscalls with stubs and increase MAXSYSCALL to 490
Comment 4 Rhys Kidd 2014-10-14 12:30:13 UTC
I've applied FX's second patch but hit a link failure related to a new kernel feature: the stubs produced by MIG (the stub generator for Mach IPC) reference the voucher symbol 'voucher_mach_msg_set', which the linker cannot resolve.

===
link_tool_exe_darwin: /usr/bin/ld -static -arch x86_64 -macosx_version_min 10.5 -o memcheck-amd64-darwin -u __start -e __start -image_base 0x138000000 -stack_addr 0x134000000 -stack_size 0x800000 memcheck_amd64_darwin-mc_leakcheck.o memcheck_amd64_darwin-mc_malloc_wrappers.o memcheck_amd64_darwin-mc_main.o memcheck_amd64_darwin-mc_translate.o memcheck_amd64_darwin-mc_machine.o memcheck_amd64_darwin-mc_errors.o ../coregrind/libcoregrind-amd64-darwin.a ../VEX/libvex-amd64-darwin.a
Undefined symbols for architecture x86_64:
  "_voucher_mach_msg_set", referenced from:
      __kernelrpc_mach_vm_allocate in libcoregrind-amd64-darwin.a(libcoregrind_amd64_darwin_a-mach_vmUser.o)
      __kernelrpc_mach_vm_deallocate in libcoregrind-amd64-darwin.a(libcoregrind_amd64_darwin_a-mach_vmUser.o)
      __kernelrpc_mach_vm_protect in libcoregrind-amd64-darwin.a(libcoregrind_amd64_darwin_a-mach_vmUser.o)
      _mach_vm_inherit in libcoregrind-amd64-darwin.a(libcoregrind_amd64_darwin_a-mach_vmUser.o)
      __kernelrpc_mach_vm_read in libcoregrind-amd64-darwin.a(libcoregrind_amd64_darwin_a-mach_vmUser.o)
      _mach_vm_read_list in libcoregrind-amd64-darwin.a(libcoregrind_amd64_darwin_a-mach_vmUser.o)
      _mach_vm_write in libcoregrind-amd64-darwin.a(libcoregrind_amd64_darwin_a-mach_vmUser.o)
      ...
ld: symbol(s) not found for architecture x86_64
make[3]: *** [memcheck-amd64-darwin] Error 1
====

In the interim I've added stubs for the additional 10.10 syscalls and bumped MAXSYSCALL.
The patch should be applied after FX's second one, with any merge conflicts cleaned up.
Comment 5 Rhys Kidd 2014-10-16 11:39:41 UTC
The MIG linking problem above can be avoided in the interim by temporarily renaming the following system header: /usr/include/mach/mig_voucher_support.h

It is a dummy header file that MIG checks for to decide whether to include voucher code.

This voucher code, guarded by #ifdef USING_VOUCHERS in previous releases of OS X, is now exposed in OS X 10.10 Yosemite.

With this amendment made, I've been able to build successfully:

$ ./autogen.sh
$ ./configure --disable-tls   # This is currently required on OS X 10.9 as well
$ make
$ sudo make install

Now to address the 'Cannot enforce a hard page-zero for ... /memcheck-amd64-darwin' error when running simple programs like 'valgrind ls'
Comment 6 Rhys Kidd 2014-10-16 13:26:52 UTC
Some small success to report. Valgrind (compiled for 32 bit only) runs simple 32bit programs on OS X 10.10 Yosemite.

# Instructions to compile 32-bit-only Valgrind
#
# May need to initially run 'sudo make uninstall', or manually clear the /usr/local/lib/valgrind folder of any stray 64-bit compiled valgrind components first
#
$ patch -p0 -i <13 Sep and 14 Oct patches on this report>
$ sudo mv /usr/include/mach/mig_voucher_support.h /usr/include/mach/mig_voucher_support.h.BACKUP
$ make clean
$ ./autogen.sh
$ ./configure --disable-tls --enable-only32bit
$ make
$ sudo make install
$ echo "int main() { free(1); return 0; }" | gcc -xc - -g -o a.out32 -m32
$ valgrind ./a.out32
==63512== Memcheck, a memory error detector
==63512== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==63512== Using Valgrind-3.11.0.SVN and LibVEX; rerun with -h for copyright info
==63512== Command: ./a.out32
==63512== 
==63512== Invalid free() / delete / delete[] / realloc()
==63512==    at 0x7D25: free (vg_replace_malloc.c:477)
==63512==    by 0x1F88: main (<stdin>:1)
==63512==  Address 0x1 is not stack'd, malloc'd or (recently) free'd
==63512== 
==63512== 
==63512== HEAP SUMMARY:
==63512==     in use at exit: 2,967 bytes in 74 blocks
==63512==   total heap usage: 75 allocs, 2 frees, 2,971 bytes allocated
==63512== 
==63512== LEAK SUMMARY:
==63512==    definitely lost: 0 bytes in 0 blocks
==63512==    indirectly lost: 0 bytes in 0 blocks
==63512==      possibly lost: 0 bytes in 0 blocks
==63512==    still reachable: 0 bytes in 0 blocks
==63512==         suppressed: 2,967 bytes in 74 blocks
==63512== 
==63512== For counts of detected and suppressed errors, rerun with: -v
==63512== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Comment 7 Tom Hughes 2014-10-22 17:13:03 UTC
*** Bug 340232 has been marked as a duplicate of this bug. ***
Comment 8 Tom Hughes 2014-10-23 11:13:50 UTC
*** Bug 340252 has been marked as a duplicate of this bug. ***
Comment 9 Dirkjan Ochtman 2014-10-24 16:27:50 UTC
I tried to compile valgrind with both of the patches applied to current trunk, and it seems to fail. First I had to fix up the patches a bit; they seem to have bit-rotted a little, or perhaps the patches overlap. I can attach my current changes if that would help.

However, while trying to compile valgrind, I now get this error:

Undefined symbols for architecture x86_64:
  "_voucher_mach_msg_set", referenced from:
      __kernelrpc_mach_vm_allocate in libcoregrind-amd64-darwin.a(libcoregrind_amd64_darwin_a-mach_vmUser.o)
      __kernelrpc_mach_vm_deallocate in libcoregrind-amd64-darwin.a(libcoregrind_amd64_darwin_a-mach_vmUser.o)
      __kernelrpc_mach_vm_protect in libcoregrind-amd64-darwin.a(libcoregrind_amd64_darwin_a-mach_vmUser.o)
      _mach_vm_inherit in libcoregrind-amd64-darwin.a(libcoregrind_amd64_darwin_a-mach_vmUser.o)
      __kernelrpc_mach_vm_read in libcoregrind-amd64-darwin.a(libcoregrind_amd64_darwin_a-mach_vmUser.o)
      _mach_vm_read_list in libcoregrind-amd64-darwin.a(libcoregrind_amd64_darwin_a-mach_vmUser.o)
      _mach_vm_write in libcoregrind-amd64-darwin.a(libcoregrind_amd64_darwin_a-mach_vmUser.o)
      ...
ld: symbol(s) not found for architecture x86_64
Comment 10 Rhys Kidd 2014-10-24 23:48:12 UTC
Hi Dirkjan,

Given that link error, please ensure you have moved the /usr/include/mach/mig_voucher_support.h file to a temporary location, then re-run ./autogen.sh and ./configure --disable-tls --enable-only32bit.

For now Valgrind only works in 32-bit mode on OS X 10.10, and that last configure option restricts the build to 32-bit.
Comment 11 Rhys Kidd 2014-10-31 08:30:47 UTC
Created attachment 89391
xnu-source-diff-10.9.5-to-10.10

Unified diff between the OS X 10.9.5 and 10.10 xnu kernel source code for bsd/kern/mach_loader.c
Comment 12 Rhys Kidd 2014-11-02 05:39:52 UTC
I've written up a summary of why valgrind is not running on OS X 10.10 in 64 bit mode on StackOverflow (re a related issue). https://stackoverflow.com/questions/26351831/os-x-app-that-runs-on-10-6-to-10-9-doesnt-run-on-10-10-yosemite-why/26696361#26696361

In short:
    - OS X 10.10 newly enforces a hard page zero of at least 0x1000 bytes, on 64-bit only.
    - For this to be enforced, the __PAGEZERO segment in Mach-O executables must have a vmaddr of 0x0, a vmsize of >= 0x1000, a non-zero segment size, and minimal initprot and maxprot memory protections.
    - Valgrind falls afoul of the first of these requirements, as we manually set the vmaddr of the __PAGEZERO segment to 0x138000000.

Note we can't simply change the setting in configure.ac from 0x138000000 to 0x0 as there is certain stack location math which takes that setting and then subtracts from it. A negative answer is not valid.
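
To make the arithmetic concrete, here is a small sketch using the two addresses from the ld command line in comment 4. It assumes the stack address is derived as the image base minus a fixed gap; that relationship is inferred from those two values only, and `stack_addr_for` is a hypothetical helper, not something taken from the Valgrind build scripts.

```python
# Values taken from the ld command line quoted in comment 4.
IMAGE_BASE = 0x138000000
STACK_ADDR = 0x134000000

# Assumption: the stack address is "image base minus a fixed gap".
GAP = IMAGE_BASE - STACK_ADDR


def stack_addr_for(image_base, gap=GAP):
    """Hypothetical helper mirroring the assumed derivation."""
    return image_base - gap


print(hex(GAP))                          # 0x4000000 (64 MiB)
print(hex(stack_addr_for(IMAGE_BASE)))   # 0x134000000
print(stack_addr_for(0x0) < 0)           # True: a base of 0x0 goes negative
```

Under that assumption, a base of 0x0 yields a negative stack address, which matches the objection above.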

Two alternative solutions are to use macholib or similar to:
* add a new finishing step to manually reset the vmaddr of the __PAGEZERO segment on the Valgrind binaries (memcheck-amd64-darwin, ... etc), or
* create a new dummy segment (eg "__VG_PAGEZERO") with the right settings. 

Note the segment name is not checked by the xnu kernel so we can call it whatever we want.
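
As a rough illustration of the rules summarised above, the following sketch checks a segment's fields against them. It is not the xnu loader logic: `pagezero_problems` is a hypothetical helper, "minimal" protections are assumed to mean initprot and maxprot both equal to 0, and the sample values are illustrative only.

```python
PAGEZERO_MIN_SIZE = 0x1000  # minimum hard page-zero size quoted above


def pagezero_problems(vmaddr, vmsize, initprot, maxprot):
    """Return the rules from the summary above that a segment violates.

    Hypothetical helper. Assumption: "minimal" protections means
    initprot == maxprot == 0 (no access at all).
    """
    problems = []
    if vmaddr != 0:
        problems.append("vmaddr is %#x, expected 0x0" % vmaddr)
    if vmsize < PAGEZERO_MIN_SIZE:
        problems.append("vmsize is %#x, expected >= %#x"
                        % (vmsize, PAGEZERO_MIN_SIZE))
    if initprot != 0 or maxprot != 0:
        problems.append("initprot/maxprot are not zero")
    return problems


# Illustrative values only: a page-zero segment at address 0 versus one
# placed at the 0x138000000 address mentioned above.
print(pagezero_problems(0x0, 0x1000, 0, 0))
print(pagezero_problems(0x138000000, 0x1000, 0, 0))
```

The first call reports no problems; the second reports only the vmaddr rule, which is the one the Valgrind tool executables trip over.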
Comment 13 Rhys Kidd 2014-11-05 11:50:05 UTC
Created attachment 89453
Proof of concept: Mixup PAGEZERO vmaddr finishing script

# Commandline utility to set the PAGEZERO vmaddr of a set of files to 0x0.
# This may partially alleviate the OS X 10.10 Yosemite Valgrind issues with 64bit.
#
# Supply the folder location as first, and only, argument. For instance:
# $ sudo ./fix_pagezero_vmaddr.py /usr/local/lib/valgrind/
Comment 14 Rhys Kidd 2014-11-05 11:53:58 UTC
I've added a proof of concept Python script to address the first of the two identified potential fixes "add a new finishing step to manually reset the vmaddr of the __PAGEZERO segment on the Valgrind binaries (memcheck-amd64-darwin, ... etc)".

The script should run fine on vanilla OS X, as Python and the macholib library come pre-installed.

This is not a complete fix, as I now get "unhandled instruction" errors from the amd64-to-IR translation, but the 64-bit executables do now start.

-------
# Copy out that vouchers related system header.
# ./autogen.sh && ./configure --disable-tls
# make && sudo make install
$ valgrind
Killed: 9
$ sudo ./fix_pagezero_vmaddr.py /usr/local/lib/valgrind/
Password:
PRE  vmaddr = 0x138000000L   /usr/local/lib/valgrind/cachegrind-amd64-darwin
POST vmaddr = 0x0L   /usr/local/lib/valgrind/cachegrind-amd64-darwin
PRE  vmaddr = 0x138000000L   /usr/local/lib/valgrind/callgrind-amd64-darwin
POST vmaddr = 0x0L   /usr/local/lib/valgrind/callgrind-amd64-darwin
PRE  vmaddr = 0x138000000L   /usr/local/lib/valgrind/drd-amd64-darwin
POST vmaddr = 0x0L   /usr/local/lib/valgrind/drd-amd64-darwin
PRE  vmaddr = 0x138000000L   /usr/local/lib/valgrind/exp-bbv-amd64-darwin
POST vmaddr = 0x0L   /usr/local/lib/valgrind/exp-bbv-amd64-darwin
PRE  vmaddr = 0x138000000L   /usr/local/lib/valgrind/exp-dhat-amd64-darwin
POST vmaddr = 0x0L   /usr/local/lib/valgrind/exp-dhat-amd64-darwin
PRE  vmaddr = 0x138000000L   /usr/local/lib/valgrind/exp-sgcheck-amd64-darwin
POST vmaddr = 0x0L   /usr/local/lib/valgrind/exp-sgcheck-amd64-darwin
PRE  vmaddr = 0x138000000L   /usr/local/lib/valgrind/helgrind-amd64-darwin
POST vmaddr = 0x0L   /usr/local/lib/valgrind/helgrind-amd64-darwin
PRE  vmaddr = 0x138000000L   /usr/local/lib/valgrind/lackey-amd64-darwin
POST vmaddr = 0x0L   /usr/local/lib/valgrind/lackey-amd64-darwin
PRE  vmaddr = 0x138000000L   /usr/local/lib/valgrind/massif-amd64-darwin
POST vmaddr = 0x0L   /usr/local/lib/valgrind/massif-amd64-darwin
PRE  vmaddr = 0x138000000L   /usr/local/lib/valgrind/memcheck-amd64-darwin
POST vmaddr = 0x0L   /usr/local/lib/valgrind/memcheck-amd64-darwin
PRE  vmaddr = 0x138000000L   /usr/local/lib/valgrind/none-amd64-darwin
POST vmaddr = 0x0L   /usr/local/lib/valgrind/none-amd64-darwin
$ valgrind
valgrind: no program specified
valgrind: Use --help for more information.
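
For readers without the attachment, here is a minimal sketch of the same idea. It is not the attached script: that one uses macholib, whereas this sketch uses only the standard `struct` module, handles only thin little-endian 64-bit Mach-O images (no fat binaries), and works on an in-memory `bytearray`. The function name `zero_pagezero_vmaddr` and the synthetic demo image are assumptions made for illustration.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF          # thin 64-bit Mach-O, little-endian on disk
LC_SEGMENT_64 = 0x19
HEADER_FMT = "<IiiIIIII"          # mach_header_64 (32 bytes)
SEGMENT_FMT = "<II16sQQQQiiII"    # segment_command_64 (72 bytes)
VMADDR_OFFSET = 4 + 4 + 16        # cmd, cmdsize and segname precede vmaddr


def zero_pagezero_vmaddr(image):
    """Set vmaddr to 0 for every __PAGEZERO segment in `image` (a bytearray).

    Returns the list of previous vmaddr values. Hypothetical helper.
    """
    magic, _, _, _, ncmds, _, _, _ = struct.unpack_from(HEADER_FMT, image, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a thin little-endian 64-bit Mach-O image")
    previous = []
    offset = struct.calcsize(HEADER_FMT)
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", image, offset)
        if cmdsize < 8:
            raise ValueError("corrupt load command at offset %d" % offset)
        if cmd == LC_SEGMENT_64:
            (raw_name,) = struct.unpack_from("<16s", image, offset + 8)
            if raw_name.rstrip(b"\0") == b"__PAGEZERO":
                (vmaddr,) = struct.unpack_from("<Q", image,
                                               offset + VMADDR_OFFSET)
                previous.append(vmaddr)
                struct.pack_into("<Q", image, offset + VMADDR_OFFSET, 0)
        offset += cmdsize
    return previous


# Demo on a synthetic image: one header plus one __PAGEZERO load command.
segment = struct.pack(SEGMENT_FMT, LC_SEGMENT_64, 72, b"__PAGEZERO",
                      0x138000000, 0x1000, 0, 0, 0, 0, 0, 0)
header = struct.pack(HEADER_FMT, MH_MAGIC_64, 0, 0, 2, 1, len(segment), 0, 0)
image = bytearray(header + segment)
print([hex(v) for v in zero_pagezero_vmaddr(image)])   # ['0x138000000']
```

To apply this to a real file one would read it into a `bytearray`, call the function, and write the bytes back, ideally on a copy first.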
Comment 15 Rhys Kidd 2014-11-05 11:56:32 UTC
With the above finishing Python script, the error I get running simple 64bit programs is as below. There are no issues running 32bit programs.

----------
$ valgrind /bin/ls
==2134== Memcheck, a memory error detector
==2134== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==2134== Using Valgrind-3.11.0.SVN and LibVEX; rerun with -h for copyright info
==2134== Command: /bin/ls
==2134== 
vex amd64->IR: unhandled instruction bytes: 0x6F 0x66 0x66 0x0 0x65 0x6C 0x69 0x64
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
==2134== Invalid read of size 4
==2134==    at 0x10053B037: _h (in /usr/lib/system/libsystem_platform.dylib)
==2134==    by 0x100534996: __pfz_setup (in /usr/lib/system/libsystem_platform.dylib)
==2134==    by 0x100534974: __libplatform_init (in /usr/lib/system/libsystem_platform.dylib)
==2134==    by 0x100060A5E: libSystem_initializer (in /usr/lib/libSystem.B.dylib)
==2134==    by 0x7FFF5FC12CEA: ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==2134==    by 0x7FFF5FC12E77: ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==2134==    by 0x7FFF5FC0F870: ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==2134==    by 0x7FFF5FC0F805: ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==2134==    by 0x7FFF5FC0F6F7: ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==2134==    by 0x7FFF5FC0F968: ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) (in /usr/lib/dyld)
==2134==    by 0x7FFF5FC02229: dyld::initializeMainExecutable() (in /usr/lib/dyld)
==2134==    by 0x7FFF5FC05BE0: dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) (in /usr/lib/dyld)
==2134==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==2134== 
==2134== 
==2134== Process terminating with default action of signal 11 (SIGSEGV)
==2134==  Access not within mapped region at address 0x0
==2134==    at 0x10053B037: _h (in /usr/lib/system/libsystem_platform.dylib)
==2134==    by 0x100534996: __pfz_setup (in /usr/lib/system/libsystem_platform.dylib)
==2134==    by 0x100534974: __libplatform_init (in /usr/lib/system/libsystem_platform.dylib)
==2134==    by 0x100060A5E: libSystem_initializer (in /usr/lib/libSystem.B.dylib)
==2134==    by 0x7FFF5FC12CEA: ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==2134==    by 0x7FFF5FC12E77: ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==2134==    by 0x7FFF5FC0F870: ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==2134==    by 0x7FFF5FC0F805: ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==2134==    by 0x7FFF5FC0F6F7: ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==2134==    by 0x7FFF5FC0F968: ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) (in /usr/lib/dyld)
==2134==    by 0x7FFF5FC02229: dyld::initializeMainExecutable() (in /usr/lib/dyld)
==2134==    by 0x7FFF5FC05BE0: dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) (in /usr/lib/dyld)
==2134==  If you believe this happened as a result of a stack
==2134==  overflow in your program's main thread (unlikely but
==2134==  possible), you can try to increase the size of the
==2134==  main thread stack using the --main-stacksize= flag.
==2134==  The main thread stack size used in this run was 8388608.
==2134== 
==2134== HEAP SUMMARY:
==2134==     in use at exit: 0 bytes in 0 blocks
==2134==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==2134== 
==2134== All heap blocks were freed -- no leaks are possible
==2134== 
==2134== For counts of detected and suppressed errors, rerun with: -v
==2134== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault: 11
Comment 16 Tom Hughes 2014-11-05 12:27:55 UTC
Well, 0x6F is outs, but that seems somewhat unlikely...
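
One quick way to sanity-check bytes like these is to look at them as text. The sketch below does that for the eight bytes reported in comment 15. It only shows how the bytes read as ASCII, and does not by itself establish what the guest was executing.

```python
# The eight "unhandled instruction bytes" reported in comment 15.
raw = bytes([0x6F, 0x66, 0x66, 0x00, 0x65, 0x6C, 0x69, 0x64])

# Count the bytes that fall in the printable ASCII range.
printable = sum(1 for b in raw if 0x20 <= b < 0x7F)

print(raw)                                   # b'off\x00elid'
print(printable, "of", len(raw), "bytes are printable ASCII")
```

Seven of the eight bytes are printable characters, which may hint that execution landed in string data rather than in code.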
Comment 17 Julian Seward 2014-11-05 16:06:17 UTC
(In reply to rhyskidd from comment #5)
> The MIG linking problem above can be avoided in the interim by temporarily
> renaming the following system header: /usr/include/mach/mig_voucher_support.h

That (hacking around one's installation) is a bit unsustainable.  I did this instead:

Index: coregrind/m_main.c
===================================================================
--- coregrind/m_main.c	(revision 14691)
+++ coregrind/m_main.c	(working copy)
@@ -3751,6 +3751,22 @@
 #endif
 
 
+/*====================================================================*/
+/*=== Dummy _voucher_mach_msg_set for OSX 10.10                    ===*/
+/*====================================================================*/
+
+#if defined(VGP_amd64_darwin) && DARWIN_VERS == DARWIN_10_10
+
+/* 64-bit builds on MacOSX 10.10 seem to need this for some reason. */
+void voucher_mach_msg_set ( void );
+void voucher_mach_msg_set ( void )
+{
+   I_die_here;
+}
+
+#endif
+
+
 /*--------------------------------------------------------------------*/
 /*--- end                                                          ---*/
 /*--------------------------------------------------------------------*/
Comment 18 Julian Seward 2014-11-05 16:19:30 UTC
(In reply to rhyskidd from comment #13)
> Created attachment 89453
> Proof of concept: Mixup PAGEZERO vmaddr finishing script

By a perhaps huge stroke of luck, we already have a C program that
mashes the Darwin tool executables post linking:
coregrind/fixup_macho_loadcmds.c.  It solves an entirely different
problem to this one, but I suspect it wouldn't be difficult to modify
it to fix this too.  Doing it in this program would also mean we
sidestep any build system complexity due to acquiring new build-time
dependencies.
Comment 19 Julian Seward 2014-11-05 17:34:06 UTC
Created attachment 89462
A fake voucher_mach_msg_set for OSX 10.10 64-bit
Comment 20 Julian Seward 2014-11-05 17:35:41 UTC
Created attachment 89463
Extend fixup_macho_loadcmds.c to do __PAGEZERO.vmaddr mashing
Comment 21 Julian Seward 2014-11-05 18:00:03 UTC
(In reply to rhyskidd from comment #15)
> vex amd64->IR: unhandled instruction bytes: 0x6F 0x66 0x66 0x0 0x65 0x6C
> 0x69 0x64
> vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
> vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
> vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0

That's "xsave", which isn't implemented and looks a bit hairy.  Its
predecessor, fxsave, was a bit of a swamp.  Fortunately we can kludge
around it for the time being by faking our CPUID to claim to be a more
lowly processor that doesn't support xsave, and
/usr/lib/system/libdyld.dylib won't try to use it.

Index: priv/guest_amd64_toIR.c
===================================================================
--- priv/guest_amd64_toIR.c	(revision 2987)
+++ priv/guest_amd64_toIR.c	(working copy)
@@ -21440,7 +21440,7 @@
       if (haveF2orF3(pfx)) goto decode_failure;
       /* This isn't entirely correct, CPUID should depend on the VEX
          capabilities, not on the underlying CPU. See bug #324882. */
-      if ((archinfo->hwcaps & VEX_HWCAPS_AMD64_SSE3) &&
+      if (0&&(archinfo->hwcaps & VEX_HWCAPS_AMD64_SSE3) &&
           (archinfo->hwcaps & VEX_HWCAPS_AMD64_CX16) &&
           (archinfo->hwcaps & VEX_HWCAPS_AMD64_AVX)) {
          fName = "amd64g_dirtyhelper_CPUID_avx_and_cx16";
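
The idea of "claiming to be a more lowly processor" can be sketched in terms of the CPUID leaf 1 ECX feature bits. This is a conceptual illustration only, not the VEX dirty helper: the bit positions follow the Intel SDM, while `feature_names`, `hide_features` and the example host value are hypothetical.

```python
# CPUID leaf 1 ECX feature bits, per the Intel SDM.
ECX_BITS = {"SSE3": 0, "CX16": 13, "XSAVE": 26, "OSXSAVE": 27, "AVX": 28}


def feature_names(ecx):
    """Names of the known features whose bit is set in `ecx`."""
    return sorted(name for name, bit in ECX_BITS.items() if ecx & (1 << bit))


def hide_features(ecx, *names):
    """Return `ecx` with the named feature bits cleared."""
    for name in names:
        ecx &= ~(1 << ECX_BITS[name])
    return ecx


# Hypothetical host that advertises all five features.
host_ecx = sum(1 << bit for bit in ECX_BITS.values())
guest_ecx = hide_features(host_ecx, "XSAVE", "OSXSAVE", "AVX")

print(feature_names(host_ecx))    # ['AVX', 'CX16', 'OSXSAVE', 'SSE3', 'XSAVE']
print(feature_names(guest_ecx))   # ['CX16', 'SSE3']
```

Code that tests these bits before choosing an XSAVE-based path would then take its fallback path on the guest.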
Comment 22 Rhys Kidd 2014-11-06 10:14:47 UTC
Created attachment 89474
Amend and extend 10.10 syscalls with stubs and increase MAXSYSCALL to 490 (v2)
Comment 23 Rhys Kidd 2014-11-06 10:41:36 UTC
(In reply to Julian Seward from comment #17)
> That (hacking around one's installation) is a bit unsustainable.  I did this
> instead:

[snip]

I found that the dummy function needs to be exposed to both 32-bit and 64-bit builds on OS X 10.10. My slightly amended patch is therefore:

Index: coregrind/m_main.c
===================================================================
--- coregrind/m_main.c	(revision 14694)
+++ coregrind/m_main.c	(working copy)
@@ -3751,6 +3751,28 @@
 #endif
 
 
+/*====================================================================*/
+/*=== Dummy _voucher_mach_msg_set for OSX 10.10                    ===*/
+/*====================================================================*/
+
+#if DARWIN_VERS == DARWIN_10_10
+
+/* Builds on MacOSX 10.10 seem to need this for some reason. */
+/* extern boolean_t voucher_mach_msg_set(mach_msg_header_t *msg) 
+                    __attribute__((weak_import));
+   I haven't a clue what the return value means, so just return 0.
+   Looks like none of the generated uses in the tree look at the 
+   return value anyway.
+*/
+UWord voucher_mach_msg_set ( UWord arg1 );
+UWord voucher_mach_msg_set ( UWord arg1 )
+{
+   return 0;
+}
+
+#endif
+
+
 /*--------------------------------------------------------------------*/
 /*--- end                                                          ---*/
 /*--------------------------------------------------------------------*/
Comment 24 Rhys Kidd 2014-11-06 10:42:49 UTC
Created attachment 89475
A fake voucher_mach_msg_set for OSX 10.10 (v2)
Comment 25 Rhys Kidd 2014-11-06 11:14:02 UTC
So with all the patches below and the one to VEX/priv/guest_amd64_toIR.c I cleanly progress through compiling and 'make install', but still see the same issue with "xsave". Getting close!

My CPU info here is:

$ sysctl -n machdep.cpu.brand_string
Intel(R) Core(TM)2 Duo CPU     P8800  @ 2.66GHz
Comment 26 Julian Seward 2014-11-06 20:37:31 UTC
FX, Rhys, thank you for the various patches + investigation.  I
committed versions of them as r14695, r14696, r14697 and r14698.

With those in place, and using the hack-patch in comment 21 and
also the kludge shown below, I can start and run
/Applications/Calculator.app/Contents/MacOS/Calculator and
/Applications/TextEdit.app/Contents/MacOS/TextEdit
with --tool=none.  I haven't looked at --tool=memcheck yet.

Index: coregrind/m_syswrap/syswrap-darwin.c
===================================================================
--- coregrind/m_syswrap/syswrap-darwin.c	(revision 14698)
+++ coregrind/m_syswrap/syswrap-darwin.c	(working copy)
@@ -7750,7 +7750,8 @@
 
       // GrP fixme handle sender-specified message trailer
       // (but is this only for too-secure processes?)
-      vg_assert(! (mh->msgh_bits & MACH_SEND_TRAILER));
+// FIXME!!!!!!!!!!!!!
+//    vg_assert(! (mh->msgh_bits & MACH_SEND_TRAILER));
 
       MACH_REMOTE = mh->msgh_remote_port;
       MACH_MSGH_ID = mh->msgh_id;
Comment 27 FX 2014-11-06 21:33:55 UTC
(In reply to rhyskidd from comment #24)
> A fake voucher_mach_msg_set for OSX 10.10 (v2)

voucher_mach_msg_set() is in /usr/lib/system/libsystem_kernel.dylib. But I don't understand why it's not getting pulled in while some of the other mach_* function calls used in m_mach/vm_mapUser.c, like mach_msg(), from the same library don't give error messages.
Comment 28 Rhys Kidd 2014-11-07 00:40:17 UTC
(In reply to FX from comment #27)
> voucher_mach_msg_set() is in /usr/lib/system/libsystem_kernel.dylib. But I
> don't understand why it's not getting pulled in while some of the other
> mach_* function calls used in m_mach/vm_mapUser.c, like mach_msg(), from the
> same library don't give error messages.

I've seen this before with 'weak external' symbols when an old version, say OS X 10.5, is used as the minimum supported OS X version. As part of resolving the linking issue, I tried adjusting the linker and compiler options related to the minimum OS X version, but with no success.

So there is probably a way to do it properly on OS X 10.10. Could be an interesting exercise to try.

Overall though, now that valgrind is working in a basic way on OS X 10.10, I'm focusing on ironing out the bugs and improving coverage.

A separate but related task is to address the functional test failures, although there are still many that fail on OS X 10.8/10.9 as well as OS X 10.10.
Comment 29 191919 2014-11-07 14:02:23 UTC
I just built svn trunk (r14703) from my shell, but had no luck running anything.

My hardware:

```
$ sysctl -n machdep.cpu.brand_string
Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz
$ sysctl machdep.cpu.features
machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTES64 MON DSCPL VMX SMX EST TM2 SSSE3 FMA CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC MOVBE POPCNT AES PCID XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C
```

My test program:

```
#include <stdio.h>
int main() { printf("hello, world!\n"); }
```

and I compiled it with `clang -o t t.c` then ran it with valgrind:

```
$ valgrind ./t
==49420== Memcheck, a memory error detector
==49420== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==49420== Using Valgrind-3.11.0.SVN and LibVEX; rerun with -h for copyright info
==49420== Command: ./t
==49420== 
--49420-- ./t:
--49420-- dSYM directory is missing; consider using --dsymutil=yes
vex amd64->IR: unhandled instruction bytes: 0xF 0xAE 0x24 0x24 0x48 0x8B 0x7D 0x8
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F
vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
==49420== valgrind: Unrecognised instruction at address 0x10015b3a9.
==49420==    at 0x10015B3A9: dyld_stub_binder (in /usr/lib/system/libdyld.dylib)
==49420==    by 0x100012007: ??? (in /usr/lib/libSystem.B.dylib)
==49420==    by 0x7FFF5FC12CEA: ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==49420==    by 0x7FFF5FC12E77: ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==49420==    by 0x7FFF5FC0F870: ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==49420==    by 0x7FFF5FC0F805: ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==49420==    by 0x7FFF5FC0F6F7: ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==49420==    by 0x7FFF5FC0F968: ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) (in /usr/lib/dyld)
==49420==    by 0x7FFF5FC02229: dyld::initializeMainExecutable() (in /usr/lib/dyld)
==49420==    by 0x7FFF5FC05BE0: dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) (in /usr/lib/dyld)
==49420==    by 0x7FFF5FC01275: dyldbootstrap::start(macho_header const*, int, char const**, long, macho_header const*, unsigned long*) (in /usr/lib/dyld)
==49420==    by 0x7FFF5FC01035: _dyld_start (in /usr/lib/dyld)
==49420== Your program just tried to execute an instruction that Valgrind
==49420== did not recognise.  There are two possible reasons for this.
==49420== 1. Your program has a bug and erroneously jumped to a non-code
==49420==    location.  If you are running Memcheck and you just saw a
==49420==    warning about a bad jump, it's probably your program's fault.
==49420== 2. The instruction is legitimate but Valgrind doesn't handle it,
==49420==    i.e. it's Valgrind's fault.  If you think this is the case or
==49420==    you are not sure, please let us know and we'll try to fix it.
==49420== Either way, Valgrind will now raise a SIGILL signal which will
==49420== probably kill your program.
==49420== 
==49420== Process terminating with default action of signal 4 (SIGILL)
==49420==  Illegal opcode at address 0x10015B3A9
==49420==    at 0x10015B3A9: dyld_stub_binder (in /usr/lib/system/libdyld.dylib)
==49420==    by 0x100012007: ??? (in /usr/lib/libSystem.B.dylib)
==49420==    by 0x7FFF5FC12CEA: ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==49420==    by 0x7FFF5FC12E77: ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==49420==    by 0x7FFF5FC0F870: ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==49420==    by 0x7FFF5FC0F805: ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==49420==    by 0x7FFF5FC0F6F7: ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==49420==    by 0x7FFF5FC0F968: ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) (in /usr/lib/dyld)
==49420==    by 0x7FFF5FC02229: dyld::initializeMainExecutable() (in /usr/lib/dyld)
==49420==    by 0x7FFF5FC05BE0: dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) (in /usr/lib/dyld)
==49420==    by 0x7FFF5FC01275: dyldbootstrap::start(macho_header const*, int, char const**, long, macho_header const*, unsigned long*) (in /usr/lib/dyld)
==49420==    by 0x7FFF5FC01035: _dyld_start (in /usr/lib/dyld)
==49420== 
==49420== HEAP SUMMARY:
==49420==     in use at exit: 0 bytes in 0 blocks
==49420==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==49420== 
==49420== All heap blocks were freed -- no leaks are possible
==49420== 
==49420== For counts of detected and suppressed errors, rerun with: -v
==49420== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
[1]    49420 illegal hardware instruction  valgrind ./t
```

The illegal instructions were:

```
0F AE 24 24    xsave [rsp]
48 8B 7D 08    mov rdi,[rbp+8]
```

Is this another xsave problem?
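
A small sketch of how the first four bytes can be decoded by hand: for the 0F AE opcode group, the `reg` field of the ModRM byte selects the instruction when the operand is in memory. The table follows the Intel SDM for the unprefixed memory forms; `decode_0f_ae` is a hypothetical helper that ignores prefixes and the register (mod == 3) encodings.

```python
# Unprefixed memory-operand forms of the 0F AE group, indexed by ModRM.reg.
GROUP_0F_AE_MEM = {
    0: "fxsave", 1: "fxrstor", 2: "ldmxcsr", 3: "stmxcsr",
    4: "xsave", 5: "xrstor", 6: "xsaveopt", 7: "clflush",
}


def decode_0f_ae(code):
    """Name the 0F AE instruction at the start of `code`, or None.

    Hypothetical helper: only unprefixed memory forms are covered.
    """
    if bytes(code[:2]) != b"\x0f\xae":
        raise ValueError("not a 0F AE opcode")
    modrm = code[2]
    mod = modrm >> 6          # addressing mode
    reg = (modrm >> 3) & 7    # opcode extension within the group
    if mod == 3:
        return None           # register forms (fences etc.) not handled here
    return GROUP_0F_AE_MEM[reg]


print(decode_0f_ae(bytes([0x0F, 0xAE, 0x24, 0x24])))   # xsave
```

ModRM 0x24 has mod = 0 and reg = 4, so under these assumptions the bytes do decode as xsave with a memory operand.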

Please let me know what information I can provide.

Thank you.
Comment 30 191919 2014-11-07 14:12:03 UTC
I applied Julian Seward's suggestion to disable `amd64g_dirtyhelper_CPUID_avx_and_cx16`, and now I can run valgrind on my Mac.
Comment 31 Julian Seward 2014-11-11 12:55:15 UTC
Closing.  As of vex 2989 and valgrind 14712, it should be possible to
build and run at least simple apps on 64-bit Yosemite without any of
the above-mentioned patches.  There are still rough edges compared to
Mavericks, but those are being worked on.  If there are
Yosemite-specific problems (quite likely), please file them in new bug reports.