Bug 205241

Summary:	Snow Leopard 10.6 support
Product:	[Developer tools] valgrind	Reporter:	playmobil
Component:	general	Assignee:	Julian Seward <jseward>
Status:	RESOLVED UNMAINTAINED
Severity:	normal	CC:	annacegu, astrange, charley.wang, cpigat242, cwatson, daniel, devchan1, filcab, fxcoudert, gkodinov, glider, gparker, gzjjgod, hub, ian, johannes.simon, jp-keyword-valbug.6b5314, justus.c79, kenny, klaus, konstantin.s.serebryany, len.sassaman, malaperle, mark, martin, njn, pjfloyd, rhys.hill, sean, siegel, tom, trnsca
Priority:	NOR
Version:	3.5.0
Target Milestone:	blocking3.6.0
Platform:	Compiled Sources
OS:	Other
Attachments:	valgrind -d -v echo "hi"
	10.6 support for 3.5.0.
	Updated Greg's patch to apply to version 11026
	Disabled arcrandom interception in coregrind/vg_preloaded.c
	Remove stdio depency from arc4random
	don't touch SIGKILL and SIGSTOP
	for comment #85

Description playmobil 2009-08-26 21:58:50 UTC

Version:           3.5.0 (using Devel)
OS:                OS X
Installed from:    Compiled sources

Valgrind doesn't support Snow Leopard (10.6)

This is according to:
http://www.sealiesoftware.com/blog/archive/2009/06/03/Valgrind_for_Mac_OS_X_goes_mainline.html

Comment 1 Tom Hughes 2009-08-26 23:25:15 UTC

Do you actually have personal knowledge that it doesn't work, or is this "bug report" entirely based on a three month old blog posting by a third party?

Certainly Nicholas Nethercote, who has done most of the work of integrating the Darwin port, implied in a valgrind-developers post yesterday that it should work although it might be a bit slow as it will be using the 64 bit code which has some performance issues.

Comment 2 Nicholas Nethercote 2009-08-27 00:28:08 UTC

Closing, because it's possibly true but is based on hearsay rather than direct experience.

Please reopen if you have actually run Valgrind on 10.6 and found it doesn't work;  if that happens please include the output of "valgrind -d -v <program>".  Thanks.

Comment 3 playmobil 2009-08-29 00:00:20 UTC

Here are some tests on Snow Leopard 10.6 10A432

To summarize:
* configure fails
* Hacking the configure files and building anyway - running valgrind on "echo hello" fails.

Configure output:

$ svn co svn://svn.valgrind.org/valgrind/trunk valgrind
$ cd valgrind/
$ ./autogen.sh
$ ./configure --prefix=/usr/local/valgrind

Here's the output from configure:
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... ./install-sh -c -d
checking for gawk... no
checking for mawk... no
checking for nawk... no
checking for awk... awk
checking whether make sets $(MAKE)... yes
checking whether to enable maintainer-specific portions of Makefiles... no
checking whether ln -s works... yes
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables... 
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking for style of include used by make... GNU
checking dependency style of gcc... gcc3
checking whether gcc and cc understand -c and -o together... yes
checking how to run the C preprocessor... gcc -E
checking for g++... g++
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking dependency style of g++... gcc3
checking for ranlib... ranlib
checking for ar... /usr/bin/ar
checking for perl... /usr/bin/perl
checking for gdb... /usr/bin/gdb
checking dependency style of gcc... gcc3
checking for diff -u... yes
checking for a supported version of gcc... ok (686)
checking build system type... i386-apple-darwin10.0.0
checking host system type... i386-apple-darwin10.0.0
checking for a supported CPU... ok (i386)
checking for a 64-bit only build... no
checking for a 32-bit only build... no
checking for a supported OS... ok (darwin10.0.0)
checking for the kernel version... unsupported (10.0.0)
configure: error: Valgrind works on Darwin 9.x (Mac OS X 10.5)

After fixing the configure file and compiling successfully:
$ echo "hi"
hi
$ /usr/local/valgrind/bin/valgrind echo "hi"
==24152== Memcheck, a memory error detector
==24152== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==24152== Using Valgrind-3.6.0.SVN and LibVEX; rerun with -h for copyright info
==24152== Command: echo hi
==24152== 
==24152== Invalid read of size 1
==24152==    at 0x33079: localeconv_l (in /usr/lib/libSystem.B.dylib)
==24152==    by 0x2DA8B: __vfprintf (in /usr/lib/libSystem.B.dylib)
==24152==    by 0x4B9E2: vfprintf_l (in /usr/lib/libSystem.B.dylib)
==24152==    by 0x969CF: printf (in /usr/lib/libSystem.B.dylib)
==24152==    by 0x22DD5: pthread_init (in /usr/lib/libSystem.B.dylib)
==24152==    by 0x21C66: libSystem_initializer (in /usr/lib/libSystem.B.dylib)
==24152==    by 0x8FE0ED6C: ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==24152==    by 0x8FE0D31D: ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int) (in /usr/lib/dyld)
==24152==    by 0x8FE0D2C1: ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int) (in /usr/lib/dyld)
==24152==    by 0x8FE0D3D0: ImageLoader::runInitializers(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==24152==    by 0x8FE0248E: dyld::initializeMainExecutable() (in /usr/lib/dyld)
==24152==    by 0x8FE0794F: dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**) (in /usr/lib/dyld)
==24152==  Address 0x568 is not stack'd, malloc'd or (recently) free'd
==24152== 
==24152== 
==24152== Process terminating with default action of signal 10 (SIGBUS)
==24152==  Non-existent physical address at address 0x568
==24152==    at 0x33079: localeconv_l (in /usr/lib/libSystem.B.dylib)
==24152==    by 0x2DA8B: __vfprintf (in /usr/lib/libSystem.B.dylib)
==24152==    by 0x4B9E2: vfprintf_l (in /usr/lib/libSystem.B.dylib)
==24152==    by 0x969CF: printf (in /usr/lib/libSystem.B.dylib)
==24152==    by 0x22DD5: pthread_init (in /usr/lib/libSystem.B.dylib)
==24152==    by 0x21C66: libSystem_initializer (in /usr/lib/libSystem.B.dylib)
==24152==    by 0x8FE0ED6C: ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==24152==    by 0x8FE0D31D: ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int) (in /usr/lib/dyld)
==24152==    by 0x8FE0D2C1: ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int) (in /usr/lib/dyld)
==24152==    by 0x8FE0D3D0: ImageLoader::runInitializers(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==24152==    by 0x8FE0248E: dyld::initializeMainExecutable() (in /usr/lib/dyld)
==24152==    by 0x8FE0794F: dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**) (in /usr/lib/dyld)
==24152== 
==24152== HEAP SUMMARY:
==24152==     in use at exit: 0 bytes in 0 blocks
==24152==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==24152== 
==24152== All heap blocks were freed -- no leaks are possible
==24152== 
==24152== For counts of detected and suppressed errors, rerun with: -v
==24152== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Bus error

Comment 4 playmobil 2009-08-29 00:06:00 UTC

Created attachment 36550
valgrind -d -v echo "hi"

Comment 5 Rich Siegel 2009-08-29 03:39:38 UTC

I can confirm the behavior of "configure" as reported; I haven't yet taken the time to hack the config file to see what happens, though.

Comment 6 Greg Parker 2009-08-29 07:54:09 UTC

I have a stack of changes that I should be able to sync next week and post once the xnu open-source release is out. (Most of the rest of open-source 10.6 is available, but not xnu.) This would cover most of the new syscalls.

Comment 7 Alexander Potapenko 2009-09-11 16:58:16 UTC

By the way, xnu is out, see http://opensource.apple.com/source/xnu/xnu-1456.1.26/

Comment 8 Greg Parker 2009-09-16 21:06:19 UTC

Created attachment 36999
10.6 support for 3.5.0.

Comment 9 Greg Parker 2009-09-16 21:13:07 UTC

snowleopard.patch is the 80% solution. Unfortunately, I'm out of town for the rest of the week so I can't polish it off.

Patch applies to tags/VALGRIND_3_5_0.
This merged code has been only trivially tested on 10.6 and not at all on 10.5. It runs `ls` and `defaults` and `TextEdit`, at least.

Bug: mpi doesn't build.
Bug: darwin10.supp and darwin10-drd.supp are empty and need to be repopulated.
Bug: priv_syswrap-darwin.h is probably not fully up-to-date.
Possible bugs: The latest comparison of vki-scnums-darwin.h to xnu/bsd/kern/syscalls.master was well before 10.6 GM.
Possible bugs: I added $(srcdir) to the Makefiles in a few places to be compatible with Apple's internal build system (which uses separate source and build directories). My automake-fu is weak.
Bad policy decision: The code uses #ifdefs for Leopard vs SnowLeopard, building for whatever the build machine used. These choices should be made at runtime instead so a single binary can work everywhere.
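
A minimal sketch of the runtime check suggested in the last item; this is not code from the patch. It assumes the kernel release string reported by uname(3) is enough to tell Leopard (Darwin 9.x) from Snow Leopard (Darwin 10.x), as the configure output in comment 3 suggests. The helper names darwin_major_from_release and running_darwin_major are hypothetical.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/utsname.h>

/* Hypothetical helper: parse the major number out of a kernel release
   string such as "9.8.0" (Mac OS X 10.5) or "10.0.0" (Mac OS X 10.6).
   Returns -1 if the string does not start with a positive number. */
int darwin_major_from_release(const char *release)
{
    char *end = NULL;
    long major = strtol(release, &end, 10);
    if (end == release || major <= 0 || major > 1000)
        return -1;
    return (int)major;
}

/* Hypothetical helper: ask the running kernel instead of relying on a
   build-time #ifdef.  Returns -1 when the kernel is not Darwin. */
int running_darwin_major(void)
{
    struct utsname u;
    if (uname(&u) != 0 || strcmp(u.sysname, "Darwin") != 0)
        return -1;
    return darwin_major_from_release(u.release);
}
```

Code that currently sits under a Leopard/SnowLeopard #ifdef could then branch on running_darwin_major() == 9 versus == 10, so that one binary covers both releases.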

Comment 10 Greg Parker 2009-09-16 21:25:06 UTC

Bug: valgrind doesn't handle the new malloc purgeable zone correctly. This will require real support for malloc zones, presumably using valgrind's mempool machinery.

Comment 11 Nicholas Nethercote 2009-09-17 00:22:35 UTC

Thanks for this, Greg.  I've marked this as blocking 3.5.1.  You could argue that this is more than just bug fixing and so shouldn't go in 3.5.x, but 3.6.0 won't come out for quite some time and we should add 10.6 support soon.

Greg, do you have any suggestions for testing 10.5 and 10.6?  I have two MacBook Pros, a newish one that I use 95% of the time and an older one that is much slower.  I currently have 10.5 on both, and have been holding off installing 10.6 because Valgrind wouldn't work.  One obvious thing to do is install 10.6 on the new one and keep the old one on 10.5, then I can test both versions.  It would be nice to be able to have both OSes on my newer machine and switch between them easily though... I guess VMWare is the only solution for that.  A dual-boot system wouldn't help me much, having to reboot would be a bigger pain than having to use the older machine.

Comment 12 Greg Parker 2009-09-17 00:28:52 UTC

I multi-boot all of my machines. I don't know how well virtualization of Mac OS X works; it might be too slow or memory-intensive to be practical for valgrind development.

Comment 13 Rich Siegel 2009-09-17 03:06:09 UTC

It is possible to run Mac OS X 10.5 as a guest OS in VMware Fusion on 10.6; I will be happy to advise if you contact me out-of-band since it's not really germane to this issue. :-) On a first-gen 2.66GHz quad-core Mac Pro with 9GB of physical RAM and 3GB allocated to the virtual machine, the virtualized OS performance is reasonable and should be OK for building and testing valgrind; but I wouldn't advocate it as a daily-use setup. :-) It's a little bit more of a chug on my 2.5GHz dual-core MacBook Pro with 6GB of RAM, but with the usual "YMMV" caveat it's probably worth trying.

Comment 14 Julian Seward 2009-09-17 03:27:17 UTC

(In reply to comment #13)
> On a first-gen 2.66GHz quad-core Mac Pro with 9GB of
> physical RAM and 3GB allocated to the virtual machine,

I haven't tried to run MacOS on VMware, but I have used VMware a lot
for testing Valgrind (w/ Linux guests).  It's perfectly viable to
build and test V using a guest with 1GB, or even less, on an older
Core 2 (E6600).  I don't expect a MacOS guest's resource requirements
to be wildly different from a Linux guest's requirements.

Comment 15 Nicholas Nethercote 2009-09-30 07:25:31 UTC

I installed 10.6 and (lightly) tested the patch.  It seems to work ok, but the first real problem I've hit is that configure still wants to categorise my machine as x86-darwin, ie. 32-bit.  This is a problem as GCC's default is to produce 64-bit binaries.

This x86 categorisation is determined by config.guess, which uses the output of uname, which looks like this:

  Darwin wave 10.0.0 Darwin Kernel Version 10.0.0: Fri Jul 31 22:47:34 PDT 2009; root:xnu-1456.1.25~1/RELEASE_I386 i386

uname is claiming that I only have a 32-bit machine, sigh.

Does 10.6 run on any 32-bit only machines?  If not, we could do a hack like converting x86-darwin to amd64-darwin automatically if 10.6 is installed.  Otherwise I'm not sure what to do, we may just have to tell people to configure with --build=amd64-darwin.

Comment 16 Rich Siegel 2009-09-30 15:39:40 UTC

10.6 does run on any Mac with an Intel CPU, which includes the first-gen Core Duo machines. (I believe those are 32-bit only.) This article applies to Snow Leopard Server and doesn't mention any of the laptops, but at least suggests that there are some 32-bit-kernel-only configurations: <http://support.apple.com/kb/HT3770>.

The fact that `uname -m` reports "i386" on a 64-bit capable machine means only that it's booted with the 32-bit kernel. It should be able to run a 64-bit valgrind (and in fact machines booted into the 32-bit kernel run 64-bit apps - you'll find that some are already running on your machine).

Comment 17 Alexander Strange 2009-09-30 18:49:02 UTC

If you update config.guess (you may have to get it from its own SVN, I'm not sure), it checks the compiler target for darwin10 properly. I don't remember how it does it.

Comment 18 Nicholas Nethercote 2009-09-30 23:15:34 UTC

Alexander, config.guess is generated by automake.  I have automake 1.10 on my Mac, it's the one you get with Xcode.  Are you saying that if I update to automake 1.11 (which appears to be the latest one) the generated config.guess will work?  Even if so, I don't want to require 1.11 for it to work.

Maybe a compilation test within the configure script can be used to work out if 64-bit is supported and then we manually override the i386 setting.
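
A rough sketch of what such a probe could look like; it is not taken from the Valgrind tree. It assumes the compiler predefines __x86_64__ or __i386__ for its default target (GCC does), and the function name default_target_arch is hypothetical.

```c
/* Hypothetical probe: reports which architecture the compiler targets
   when invoked with no -m32/-m64 flag.  uname only describes the booted
   kernel, whereas this reflects what the toolchain actually produces. */
const char *default_target_arch(void)
{
#if defined(__x86_64__)
    return "amd64";
#elif defined(__i386__)
    return "x86";
#else
    return "unknown";
#endif
}
```

A configure test could compile and run a small program around this and override the i386 guess from config.guess when the answer is "amd64".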

Comment 19 Nicholas Nethercote 2009-10-01 02:42:27 UTC

I get this a lot (eg. many of the regtests) with 32-bit programs:

+Process terminating with default action of signal 10 (SIGBUS)
+ Non-existent physical address at address 0x........
+   at 0x........: malloc (in /...libc...)
+   by 0x........: __smakebuf (in /...libc...)
+   by 0x........: __srefill0 (in /...libc...)
+   by 0x........: fread (in /...libc...)
+   by 0x........: arc4random (vg_preloaded.c:137)
+   by 0x........: create_scalable_zone (in /...libc...)
+   by 0x........: _malloc_initialize (in /...libc...)
+   by 0x........: malloc (in /...libc...)
+   by 0x........: get_or_create_key_element (in /...libc...)
+   by 0x........: _keymgr_get_and_lock_processwide_ptr_2 (in /...libc...)
+   by 0x........: __keymgr_initializer (in /...libc...)
+   by 0x........: libSystem_initializer (in /...libc...)

For example with none/tests/x86/insn_basic.c.  But I only get it with --tool=none, --tool=memcheck works ok, so it must be something to do with malloc replacement.

Comment 20 jp-keyword-valbug.6b5314 2009-10-02 20:27:05 UTC

FYI - I ran into this trying to run the patched valgrind on Snow Leopard.  I can try generating a smaller test case if this isn't a known problem.  Removing a call to a floating point "ceil" function allowed the program to run (at least until I ran out of memory - which is a different issue).

MacBook-Pro:CudaCutsAtomic jpbonn$ valgrind --leak-check=yes ./emudebug/cudaCuts data/sponge.txt 
==20103== Memcheck, a memory error detector
==20103== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==20103== Using Valgrind-3.6.0.SVN and LibVEX; rerun with -h for copyright info
==20103== Command: ./emudebug/cudaCuts data/sponge.txt
==20103== 
--20103-- ./emudebug/cudaCuts:
--20103-- dSYM directory is missing; consider using --dsymutil=yes
enterd
No. of devices 2
vex x86->IR: unhandled instruction bytes: 0x66 0xF 0x3A 0xA
==20103== valgrind: Unrecognised instruction at address 0x233ae2.
==20103== Your program just tried to execute an instruction that Valgrind
==20103== did not recognise.  There are two possible reasons for this.
==20103== 1. Your program has a bug and erroneously jumped to a non-code
==20103==    location.  If you are running Memcheck and you just saw a
==20103==    warning about a bad jump, it's probably your program's fault.
==20103== 2. The instruction is legitimate but Valgrind doesn't handle it,
==20103==    i.e. it's Valgrind's fault.  If you think this is the case or
==20103==    you are not sure, please let us know and we'll try to fix it.
==20103== Either way, Valgrind will now raise a SIGILL signal which will
==20103== probably kill your program.

Comment 21 Nicholas Nethercote 2009-10-02 22:28:57 UTC

jp, that's a separate bug.

Comment 22 jp-keyword-valbug.6b5314 2009-10-02 22:47:44 UTC

(In reply to comment #21)
> jp, that's a separate bug.
OK, I didn't know where to report it since this patch, AFAIK, isn't in the trunk.  Although this may be a problem in 10.5 too, I only have 10.6 so I can't test it.   I didn't think I'd get much response if I reported against a potential future version of the software. ;-)

Comment 23 Alexander Strange 2009-11-21 22:19:17 UTC

I get a lot of these messages with the patch:
--40886:0:aspacem  segment mismatch: V's seg 1st, kernel's 2nd:
--40886:0:aspacem  438: anon 0109956000-0109958fff   12288 rw--- SmFixed d=0x000 i=0       o=0       (-1) m=0 (none)
--40886:0:aspacem  ...: .... 0109956000-0109956fff    4096 ---.. ....... d=0x000 i=0       o=0       (.) m=. (none)
--40886:0:aspacem  sync check at wqthread_hijack:0 (after): FAILED
--40886:0:aspacem  


and:
--40886-- WARNING: unhandled syscall: unix:336
--40886-- You may be able to write your own handler.
--40886-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
--40886-- Nevertheless we consider this a bug.  Please report
--40886-- it at http://valgrind.org/support/bug_reports.html.

which is proc_info (<sys/proc_info.h>).

but it seems to work more or less anyway. There are many more uninitialized value warnings in system frameworks, but they might be real.

Comment 24 FX 2009-12-10 23:58:52 UTC

Could the first draft patch be committed, to get things rolling?

Comment 25 Alexander Potapenko 2010-01-19 16:19:44 UTC

*** This bug has been confirmed by popular vote. ***

Comment 26 Alexander Potapenko 2010-01-19 16:55:48 UTC

I've bumped into the SIGBUS issue, too, trying to build ThreadSanitizer on top of Valgrind for Snow Leopard.

The crash is induced by the preloaded arc4random() function (vg_preloaded.c:164), which is a workaround for some "Darwin arc4random (rdar://6166275)" issue.
Could someone who knows the details of that issue write them up at openradar.appspot.com?

It looks like this can be easily fixed by removing this function from coregrind/vg_preloaded.c.
Maybe the problem is fixed in Snow Leopard and we no longer need the workaround?

Comment 27 Alexander Potapenko 2010-01-21 11:38:01 UTC

Created attachment 40091
Updated Greg's patch to apply to version 11026

Comment 28 Alexander Potapenko 2010-01-21 11:42:34 UTC

Created attachment 40092
Disabled arcrandom interception in coregrind/vg_preloaded.c

This patch may help to run Nullgrind and some other tools on Snow Leopard.
OTOH it breaks Valgrind tools on Mac OS 10.5 and thus should not be used on Leopard.

Comment 29 Alexander Potapenko 2010-02-01 18:30:48 UTC

A small test that makes V (even Nullgrind) with the current patch complain about unhandled instruction:
=====================================
#include <pthread.h>
typedef void *(*worker_t)(void*);
int     GLOB = 0;
void Worker1() {
  GLOB = 1;
}
void Worker2() {
  GLOB = 2;
}
int main() {
  pthread_t w_1;
  pthread_t w_2;
  pthread_create(&w_1, NULL, worker_t(Worker1), NULL);
  pthread_create(&w_2, NULL, worker_t(Worker2), NULL);
  pthread_join(w_1, NULL);
  pthread_join(w_2, NULL);
  return 0;
}
=====================================

Comment 30 Filipe Cabecinhas 2010-02-18 19:30:12 UTC

Created attachment 40900
Remove stdio depency from arc4random

Fixes arc4random so nullgrind can run.

Uses system calls instead of c library functions.
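
A minimal sketch of the idea, not the contents of the attachment. The trace in comment 19 shows fread() calling malloc() while malloc is still initializing; going straight to open/read/close avoids stdio and therefore that malloc call. The device path /dev/urandom and the function name random_u32_via_syscalls are assumptions made for illustration.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical stand-in for the preloaded arc4random(): fills a 32-bit
   value from the random device using only open/read/close, so nothing
   in stdio (and hence no malloc) runs.  Returns 0 on success, -1 on
   failure. */
int random_u32_via_syscalls(uint32_t *out)
{
    unsigned char *dst = (unsigned char *)out;
    size_t got = 0;
    int fd = open("/dev/urandom", O_RDONLY);   /* assumed device path */

    if (fd < 0)
        return -1;
    while (got < sizeof *out) {
        ssize_t n = read(fd, dst + got, sizeof *out - got);
        if (n <= 0) {            /* EOF or error; no EINTR retry here */
            close(fd);
            return -1;
        }
        got += (size_t)n;
    }
    close(fd);
    return 0;
}
```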

Comment 31 Filipe Cabecinhas 2010-02-18 19:39:08 UTC

Alexander Potapenko:

arc4random MUST be there, but can't be as it was (memcheck runs fine but other tools seem to just blow up, like nullgrind).

I changed it to use system calls instead of the C library functions (as per Greg Parker's suggestion) and it works.

Your test-case is calling a typedef... I assume that should be a cast (worker_t, in pthread_create).

Try with my patch to arc4random... It should work.

Comment 33 Alexander Potapenko 2010-02-24 14:54:55 UTC

Filipe,
thanks for the response about arc4random, I'll invalidate my patch then.

However, your patch doesn't work with my test case. Have you tried building and running it?

Comment 34 Sean Farley 2010-02-26 03:11:51 UTC

Alexander,

I am unable to get your patch for revision 11026 to build:

$ svn up -r11026

...
Updated to revision 11026.

$ ./autogen.sh
$ ./configure

...

$ make
make: *** No rule to make target `darwin10-drd.supp', needed by `default.supp'.  Stop.

Anything that I am missing?

Comment 35 Sean Farley 2010-02-26 04:22:18 UTC

(In reply to comment #34)

> Anything that I am missing?

I finally found the problem. The patch already assumes that you have darwin10.supp and darwin10-drd.supp, which can be fixed by copying the darwin9* files, respectively.

> However, your patch doesn't work with my test case. Have you tried building and running it?

Alex, I was able to run your code example in valgrind by changing:

worker_t(Worker1)

to

(worker_t)(Worker1)

and similar for Worker2. What command did you use to run it?

Comment 36 Alexander Potapenko 2010-02-26 11:16:09 UTC

(In reply to comment #35)
> Alex, I was able to run your code example in valgrind by changing:
> 
> worker_t(Worker1)
> 
> to
> 
> (worker_t)(Worker1)
> 
> and similar for Worker2. What command did you use to run it?

I've compiled the source with:
 $ g++ unhandled.cc -m32 -o unhandled
(Valgrind wants a 32-bit binary, though gcc produces a 64-bit by default)
And ran:
 $ valgrind --tool=none  ./unhandled
==66280== Nulgrind, the minimal Valgrind tool
==66280== Copyright (C) 2002-2009, and GNU GPL'd, by Nicholas Nethercote.
==66280== Using Valgrind-3.6.0.SVN and LibVEX; rerun with -h for copyright info
==66280== Command: /Users/glider/src/valgrind-bug/unhandled
==66280== 
--66280-- /Users/glider/src/valgrind-bug/unhandled:
--66280-- dSYM directory is missing; consider using --dsymutil=yes
vex x86->IR: unhandled instruction bytes: 0xF 0x1 0xC 0x24
==66280== valgrind: Unrecognised instruction at address 0xffff01e6.
==66280== Your program just tried to execute an instruction that Valgrind
==66280== did not recognise.  There are two possible reasons for this.
==66280== 1. Your program has a bug and erroneously jumped to a non-code
==66280==    location.  If you are running Memcheck and you just saw a
==66280==    warning about a bad jump, it's probably your program's fault.
==66280== 2. The instruction is legitimate but Valgrind doesn't handle it,
==66280==    i.e. it's Valgrind's fault.  If you think this is the case or
==66280==    you are not sure, please let us know and we'll try to fix it.
==66280== Either way, Valgrind will now raise a SIGILL signal which will
==66280== probably kill your program.
==66280== 
==66280== Process terminating with default action of signal 4 (SIGILL)
==66280==  Illegal opcode at address 0xFFFF01E6
==66280==    at 0xFFFF01E6: ???
==66280==    by 0xD6A51: szone_malloc_should_clear (in /usr/lib/libSystem.B.dylib)
==66280==    by 0xD6987: malloc_zone_malloc (in /usr/lib/libSystem.B.dylib)
==66280==    by 0xDB6E3: realloc (in /usr/lib/libSystem.B.dylib)
==66280==    by 0x10076C: new_sem_from_pool (in /usr/lib/libSystem.B.dylib)
==66280==    by 0x1088C7: _pthread_exit (in /usr/lib/libSystem.B.dylib)
==66280==    by 0xFFE41: thread_start (in /usr/lib/libSystem.B.dylib)
--66280:0:schedule VG_(sema_down): read returned -4
==66280== 
Killed

BTW, I haven't had any problems with the compilation (i.e. "worker_t(Worker1)" was ok for me)

Comment 37 Sean Farley 2010-02-26 18:05:32 UTC

(In reply to comment #36)
> I've compiled the source with:
>  $ g++ unhandled.cc -m32 -o unhandled
> (Valgrind wants a 32-bit binary, though gcc produces a 64-bit by default)

Using svn revision 11055 I was able to use both 32- and 64-bit binaries in valgrind
:-)

> And ran:
>  $ valgrind --tool=none  ./unhandled
> ...

Ah, I wasn't using --tool=none. I now get the same output you have.

> BTW, I haven't had any problems with the compilation (i.e "worker_t(Worker1)"
> was ok for me)

I see the problem now, I was using 'gcc' as opposed to 'g++'. Your test still
seems to be a problem with nulgrind, indeed.

Comment 38 Alexander Potapenko 2010-03-09 11:32:27 UTC

For the record, 0f 01 0c 24 decodes to "sidt (%rsp)" (store interrupt descriptor table register; in this 32-bit binary the operand is (%esp)). Does anyone know how to handle this?
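
For anyone checking the decode by hand, here is a small sketch of the test involved; it is not VEX code. The two-byte opcode 0F 01 selects a group, the reg field of the ModRM byte picks the instruction within it (/1 is SIDT), and the memory form requires mod != 3. The function name is_sidt_mem is hypothetical.

```c
#include <stddef.h>

/* Returns 1 if the bytes start with the memory form of SIDT
   (0F 01 /1), otherwise 0. */
int is_sidt_mem(const unsigned char *insn, size_t len)
{
    unsigned mod, reg;

    if (len < 3 || insn[0] != 0x0F || insn[1] != 0x01)
        return 0;
    mod = insn[2] >> 6;          /* top two bits of ModRM */
    reg = (insn[2] >> 3) & 7;    /* opcode extension in ModRM.reg */
    return mod != 3 && reg == 1;
}
```

For the bytes above, ModRM is 0x0C (mod 0, reg 1, rm 4, so a SIB byte follows) and SIB is 0x24 (no index, base is the stack pointer), which gives the stack-pointer operand.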

Comment 39 Alexander Potapenko 2010-04-06 16:08:20 UTC

I've implemented the SIDT handler for x86 (see bug 93498, http://bugsfiles.kde.org/attachment.cgi?id=42530). The patched Valgrind version works correctly with the test from https://bugs.kde.org/show_bug.cgi?id=205241#c29

Comment 40 Anna C 2010-04-13 13:37:44 UTC

Hi,
I just tried both of the patches, and they don't work with revision 11104. Here is the first part of the output:
patching file Makefile.all.am
Reversed (or previously applied) patch detected!  Assume -R? [n] n

Regards,
Anna.

Comment 41 Alexander Potapenko 2010-04-15 12:12:53 UTC

I've put instructions for using the patches at http://code.google.com/p/data-race-test/wiki/ValgrindOnSnowLeopard

Hope that helps to speed up the testing of the patches.

Comment 42 Charley 2010-04-15 21:14:00 UTC

Just wanted to say that Alexander's instructions worked for me. 

OS: OS X 10.6.2 (Snow Leopard) Kernel: Darwin 10.2.0. x86_64

I was able to run valgrind via the eclipse-valgrind tool and get working results for a binary compiled with -m32 on my machine.

Thanks! :)

Comment 43 Klaus Kuehnhammer 2010-04-19 13:46:55 UTC

Hi!

I was able to compile and run valgrind under Snow Leopard with these instructions. Thanks!

There seems to be an issue w/signals though. Debug output reports:
'Max kernel-supported signal is 8'

This can't be right, and causes the program to abort when installing signal handlers higher than that, e.g. sigaction( SIGTERM, &sa, NULL ).

regards,
Klaus

Comment 44 Klaus Kuehnhammer 2010-04-19 14:11:01 UTC

Created attachment 42892
don't touch SIGKILL and SIGSTOP

Was missing an #ifdef for amd64

Comment 45 Georgi Kodinov 2010-04-27 17:49:57 UTC

I've tried the procedure described by Alexander and it worked on my Snow Leopard machine. However, when trying to run a mysql test under valgrind I got the following:
valgrind: m_sigframe/sigframe-amd64-darwin.c:59 (vgPlain_sigframe_create): Assertion 'Unimplemented functionality' failed.
valgrind: valgrind

Here's how to repeat this:
lp:~mysql/mysql-server/mysql-5.1-bugteam
cd mysql-5.1-bugteam
./BUILD/compile-pentium-valgrind-max
cd mysql-test
./mysql-test-run.pl --valgrind t/alias.test

Comment 46 Georgi Kodinov 2010-04-27 17:50:56 UTC

A typo: it's 'bzr clone lp:~mysql/mysql-server/mysql-5.1-bugteam', not 'lp:~mysql/mysql-server/mysql-5.1-bugteam'.

Comment 47 Martin Storsjö 2010-05-02 19:30:18 UTC

As noted above, signal handling doesn't work well on x86_64 (on i386 it works just fine, though).

This can be demonstrated with a simple test application:

#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>

int fib(int n) {
        if (n <= 1)
                return n;
        return fib(n-1) + fib(n-2);
}

void sighandler(int sig) {
        printf("signal %d\n", sig);
}

int main(int argc, char* argv[]) {
        if (argc < 2) {
                printf("%s n\n", argv[0]);
                return 1;
        }
        signal(SIGINT, sighandler);
        printf("%d\n", fib(atoi(argv[1])));
        return 0;
}

Compile this, and run with e.g. valgrind ./test 100, abort the calculation with a ctrl+c and you'll get the
valgrind: m_sigframe/sigframe-amd64-darwin.c:59 (vgPlain_sigframe_create): Assertion 'Unimplemented functionality' failed.
error as reported above. As the error message says, coregrind/m_sigframe/sigframe-amd64-darwin.c is totally unimplemented.

If no signal handler is installed, the application terminates cleanly.

In x86_64 mode, valgrind seems to drop signals while doing a system call. If the calculation/printf above is replaced with sleep(10), the SIGINT from ctrl+c is never received at all.

Comment 48 Julian Seward 2010-05-10 23:47:15 UTC

Am just starting to look at this now.

Given the scale of the changes required, I'm inclined to make a branch
to experiment on, so as not to disturb the trunk until such time as
it becomes clear what the consequences for the trunk will be.

One thing I'd like to ask is:

> Comment #10
> [...]
> Bug: valgrind doesn't handle the new malloc purgeable zone
> correctly.  This will require real support for malloc zones,
> presumably using valgrind's mempool machinery.

This concerns me somewhat.  I don't know anything about these
new malloc purgeable zones.  Does anybody have an explanation
of what they are, and ideally a test program that shows 
Memcheck mis-handling them?

Comment 49 Julian Seward 2010-06-02 14:10:54 UTC

As a first step, I have made a new branch in the repo, to fold in and
test 10.6 support.  Right now it contains a copy of trunk r11142
plus a change to the way the tool executables are linked on Darwin,
which improves 64-bit support and gets rid of the 4 second startup
delay for 64-bit programs.  It does not yet contain the main 10.6
patch in this report, although that is the next thing to merge.

If you want to try out this branch, here's what to do:

  svn co svn://svn.valgrind.org/valgrind/branches/MACOSX106 mx106
  cd mx106
  ./autogen.sh
  ./configure --prefix=`pwd`/Inst --build=amd64-darwin
  make -j2 all
  make install

That should do a biarch build (meaning, both 32- and 64-bit
Valgrinds).  At least it does on 10.5.8.  If you want only a 32-bit
version, omit the --build=amd64-darwin

I will post a status update when the main patch is committed.

Comment 50 Julian Seward 2010-06-03 23:26:16 UTC

Comment on attachment 42892
don't touch SIGKILL and SIGSTOP

Committed (r11150) on
branches/MACOSX106.

Comment 51 Len Sassaman 2010-06-06 16:46:54 UTC

I've applied this patch (mostly by hand) to the mx106 branch 11152. I've run into a few problems that make me think the code in mx106 is older than the code this patch was generated against. For example, I see nowhere to apply this patch:
@@ -1471,10 +1480,22 @@
 
 #----------------------------------------------------------------------------
 # Check for /proc filesystem
+# This doesn't work when cross-compiling, including Darwin universal builds.
 #----------------------------------------------------------------------------
+if test "x$cross_compiling" = "xno"; then
 AC_CHECK_FILES(/proc/self/fd /proc/self/exe /proc/self/maps, 
     [ AC_DEFINE([HAVE_PROC], 1, [can use /proc filesystem]) ], 
     [])
+else
+     # assume /proc/self/* is available everywhere except Darwin
+     case "$VGCONF_OS" in
+        darwin)
+             ;;
+        *)
+           AC_DEFINE([HAVE_PROC], 1, [can use /proc filesystem])
+           ;;
+     esac
+fi
 
Nor this one:
Index: coregrind/m_main.c
===================================================================
--- coregrind/m_main.c  (revision 10888)
+++ coregrind/m_main.c  (working copy)
@@ -1510,7 +1510,7 @@
    VG_(do_syscall2)(__NR_munmap, 0x00000000, 0xf0000000);
 # else
    // open up client space
-   VG_(do_syscall2)(__NR_munmap, 0x100000000, 0x700000000000-0x100000000);
+   VG_(do_syscall2)(__NR_munmap, 0x100000000, 0x7fff50000000-0x100000000);
    // open up client stack and dyld
    VG_(do_syscall2)(__NR_munmap, 0x7fff5c000000, 0x4000000);
 # endif

Also, the part about the /proc file system is missing, so that patch was left off. And in the patch to Makefile.vex.am, the file in TRUNK is longer: 

--- Makefile.vex.am     (revision 10888)
+++ Makefile.vex.am     (working copy)
@@ -51,14 +51,11 @@
 # This is very uggerly.  Need to sed out both "xyzzyN" and
 # "xyzzy$N" since gcc on different targets emits the constants
 # differently -- with a leading $ on x86/amd64 but none on ppc32/64.
-pub/libvex_guest_offsets.h:
-       rm -f auxprogs/genoffsets.s
-       $(CC) $(LIBVEX_CFLAGS) -O -S -o auxprogs/genoffsets.s \
-                                       auxprogs/genoffsets.c
-       grep xyzzy auxprogs/genoffsets.s | grep define \
+pub/libvex_guest_offsets.h: $(srcdir)/auxprogs/genoffsets.c
+       $(CC) $(LIBVEX_CFLAGS) -O -S -o - $(srcdir)/auxprogs/genoffsets.c \
+          | grep xyzzy | grep define \
           | sed "s/xyzzy\\$$//g" | sed "s/xyzzy//g" \
-          > pub/libvex_guest_offsets.h
-       rm -f auxprogs/genoffsets.s
+          > $(srcdir)/pub/libvex_guest_offsets.h
 


Would it be helpful for me to submit my cleaned-up patch, with these bugs still blocking it? I'll work on trying to get the TRUNK features being relied upon backported if no one else gets to it in the meantime, but if the patch helps, all the better.
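
Note: for readers comparing the two munmap ranges in the coregrind/m_main.c hunk quoted above, the arithmetic can be checked with shell arithmetic. This is only a sketch of the address calculation; the variable names are made up, and the constants are copied from the hunk.

```shell
# Constants copied from the coregrind/m_main.c hunk (variable names are illustrative).
client_start=0x100000000
old_client_end=0x700000000000
new_client_end=0x7fff50000000
stack_start=0x7fff5c000000
stack_len=0x4000000

# munmap(addr, len) covers [addr, addr + len), so each range ends at addr + len.
printf 'old client range length: 0x%x\n' $(($old_client_end - $client_start))
printf 'new client range length: 0x%x\n' $(($new_client_end - $client_start))
printf 'stack/dyld range ends at: 0x%x\n' $(($stack_start + $stack_len))
printf 'bytes between the two ranges: 0x%x\n' $(($stack_start - $new_client_end))
```

With the patched constant, the client range ends at 0x7fff50000000, which leaves 0xc000000 bytes below the stack/dyld range that neither call covers.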

Comment 52 Alexander Potapenko 2010-06-06 17:01:09 UTC

// fixed the CC list. Please do not delete anyone.

Comment 53 Julian Seward 2010-06-06 21:23:58 UTC

(In reply to comment #51)
> I've applied this patch (mostly by hand) to the mx106 branch 11152. I've run
> into a few problems that make me think the code in mx106 is older than the code
> this patch was generated against.

No .. the mx106 code is very new -- it was branched from trunk less
than a week ago.

> For example, I see nowhere to apply this
> patch:
> @@ -1471,10 +1480,22 @@
> 
>  #----------------------------------------------------------------------------
>  # Check for /proc filesystem
> +# This doesn't work when cross-compiling, including Darwin universal builds.
>  #----------------------------------------------------------------------------
> +if test "x$cross_compiling" = "xno"; then

I don't see any such stuff in the version of the patch at 
https://bugs.kde.org/show_bug.cgi?id=205241#c27.  Is that
the version you are using?

Overall I saw no significant merge problems for the #c27 patch.  Am
trying it out now.

Comment 54 Len Sassaman 2010-06-07 00:55:49 UTC

I was using the patch in c50, against the mx106 tree. I got a different (less problematic) set of problems applying patch c50 against branches/MACOSX106 after trying to do a straight-up compile from that branch, thinking it was pre-patched. It's possible my earlier report included cruft that somehow got in there from patching/restoring mistakes, but the fact remains I had no success from the start, using --dry-run. 

I just tried c27 against the mx106 tree, and that was the best result yet -- only four failures in the patching. 

Let me ask this, please, since I'm joining the discussion late and might have missed something that makes it more obvious: if one is looking to use Valgrind on OS 10.6.3, which of the two OS X 10.6 branches should we be using, and which patches do we need to apply, if any, and in what order?

Thanks a ton for looking into this problem.

Comment 55 Julian Seward 2010-06-07 01:10:15 UTC

(In reply to comment #54)
> Let me ask this, please, since I'm joining the discussion late and might have
> missed something that makes it more obvious: if one is looking to use Valgrind
> on OS 10.6.3, which of the two OS X 10.6 branches

Two OS X 10.6 branches?  There is only one.  There is no mx106 branch.
"mx106" is merely a directory name I made up as part of the
how-to-check-out examples in Comment #49.  The only OS X 10.6 branch
is svn://svn.valgrind.org/valgrind/branches/MACOSX106 as stated in
Comment #49.

Anyway, the MACOSX106 branch + the comment 27 patch produces two patch
failures, both of which are insignificant, and produces a valgrind
which can run 32- and 64-bit hello-world programs, at least.  I will
tidy the patch up and commit it tomorrow.

Comment 56 Len Sassaman 2010-06-07 01:51:18 UTC

(In reply to comment #55)
> (In reply to comment #54)
> > Let me ask this, please, since I'm joining the discussion late and might have
> > missed something that makes it more obvious: if one is looking to use Valgrind
> > on OS 10.6.3, which of the two OS X 10.6 branches
> 
> Two OS X 10.6 branches?  There is only one.  There is no mx106 branch.

Oh, right. That's just the destination directory. I clearly need more caffeine. 

> Anyway, the MACOSX106 branch + the comment 27 patch produces two patch
> failures, both of which are insignificant, and produces a valgrind
> which can run 32- and 64-bit hello-world programs, at least.  I will
> tidy the patch up and commit it tomorrow.

Great. I'm not sure what I did wrong, but I'm at the point now where Hello World tests certainly work. Thanks for being so responsive.

I'm sure this has been asked elsewhere, but is there an ETA on the official release that will include Snow Leopard? I saw discussion about 3.5.1 vs. 3.6, but that didn't really reveal much.

Also, are these instructions still useful? 

http://code.google.com/p/data-race-test/wiki/ValgrindOnSnowLeopard

Are any of those patches still relevant, or are they all subsumed by MACOSX106+c27 (plus your commit tomorrow)? If so, maybe that page should be updated?

Again, thanks for your help!

Comment 57 Julian Seward 2010-06-07 15:09:57 UTC

Does anyone know what the purpose of the arc4random intercept is?
I'd like to get rid of it if possible, but I don't know why we have it.

Comment 58 Alexander Potapenko 2010-06-07 16:54:29 UTC

(In reply to comment #56)
Len, the instructions at http://code.google.com/p/data-race-test/wiki/ValgrindOnSnowLeopard are still useful. At the moment they patch Valgrind trunk, but I'm planning to update that page to use the MACOSX106 branch.

Comment 60 Julian Seward 2010-06-07 18:24:59 UTC

(In reply to comment #39)
> I've implemented the SIDT handler for x86

Committed, vex r1982.  Thanks.

Comment 61 Julian Seward 2010-06-07 18:26:29 UTC

(In reply to comment #30)
> Created an attachment (id=40900) [details]
> Remove stdio depency from arc4random

Committed, r11158.  Thanks.

Comment 62 Julian Seward 2010-06-07 18:27:24 UTC

(In reply to comment #44)
> Created an attachment (id=42892) [details]
> don't touch SIGKILL and SIGSTOP

Committed, r11150.  Thanks.

Comment 63 Julian Seward 2010-06-07 18:28:23 UTC

(In reply to comment #27)
> Created an attachment (id=40091) [details]
> Updated Greg's patch to apply to version 11026

Committed in pieces, r11153 through r11157.  Thanks.

Comment 64 Julian Seward 2010-06-07 18:36:24 UTC

The initial set of patches associated with this bug have now been
committed.  This should give at least initial Snow Leopard support,
both for 32- and 64-bit processes.  If you want to try it out, follow
the instructions as given in comment #49 above.

Other stuff that still needs fixing:

* write a sigframe builder, as per comment #47, and generally make
  signal handling work for 64-bit processes.  (Should be OK for 32-bit
  processes).

* https://bugs.kde.org/show_bug.cgi?id=216837, maybe.

* building the regression tests now fails.

Comment 65 Greg Parker 2010-06-07 20:17:56 UTC

"Does anyone know what the purpose of the arc4random intercept is?"
Noise suppression. libc's arc4random() copies uninitialized bytes from the stack into the entropy pool. That taints every arc4random() result after that. I couldn't find a cleaner way to suppress it.

Comment 66 Julian Seward 2010-06-07 23:09:03 UTC

(In reply to comment #47)
> As noted above, signal handling doesn't work well on x86_64 (on i386 it works
> just fine, though).
> [...]
> valgrind: m_sigframe/sigframe-amd64-darwin.c:59 (vgPlain_sigframe_create):
> Assertion 'Unimplemented functionality' failed.

As of r11162 (on branches/MACOSX106) this assertion is fixed, so
signal delivery works for both 32- and 64-bit processes.  But only on
10.5.x.  For whatever reason it doesn't work at all on 10.6.x, for
either 32- or 64-bit.  I will look into it.

Comment 67 Julian Seward 2010-06-08 12:28:41 UTC

(In reply to comment #66)

> As of r11162 (on branches/MACOSX106) this assertion is fixed, so
> signal delivery works for both 32- and 64-bit processes.  But only on
> 10.5.x.  For whatever reason it doesn't work at all on 10.6.x, for
> either 32- or 64-bit.  I will look into it.

At least for simple signal delivery (eg the program in comment #47)
I have the following results w/ r11162:

  10.5.8, 32-bit   works
  10.5.8, 64-bit   works
  10.6.3, 32-bit   doesn't work (hangs)
  10.6.3, 64-bit   works

Having dug around in the signal handling mechanism for an hour I
have no idea why 10.6.3 32-bit hangs when it doesn't on 10.5.8.
It's the syscall sigsuspend_nocancel in VG_(sigtimedwait_zero)
in m_libcsignal.c that is hanging.  VG_(sigtimedwait_zero) is used
to periodically poll the host for signals that might need to be
delivered to the guest.  I am completely mystified.

Comment 68 Martin Storsjö 2010-06-09 13:01:57 UTC

I tested the latest from the MACOSX106 branch (rev 11164) now, but I'm unable to launch even minimal test apps with this version, I get this error:

valgrind: mmap(0x100000000, 4096) failed in UME (load_segment1).

Comment 69 Julian Seward 2010-06-10 10:23:31 UTC

(In reply to comment #68)
> I tested the latest from the MACOSX106 branch (rev 11164) now, but I'm unable
> to launch even minimal test apps with this version, I get this error:
> valgrind: mmap(0x100000000, 4096) failed in UME (load_segment1).

This is with a completely clean build of the branch?  It works ok
for me and also for Nick Nethercote, who tried it.

Comment 70 Alexander Potapenko 2010-06-10 15:29:38 UTC

Valgrind built from the branch works ok for me as well.

Comment 72 Alexander Potapenko 2010-06-10 16:07:52 UTC

BTW, I constantly get the following error message when I run autogen.sh, for every revision on the trunk and on the MACOSX106 branch:

$ ./autogen.sh 
running: aclocal
running: autoheader
aclocal.m4:14: error: this file was generated for autoconf 2.61.
You have another version of autoconf.  If you want to use that,
you should regenerate the build system entirely.
aclocal.m4:14: the top level
autom4te: /opt/local/bin/gm4 failed with exit status: 63
autoheader: '/opt/local/bin/autom4te' failed with exit status: 63
error: while running 'autoheader'

My autoconf and autoheader utilities claim they have the 2.61 version.

To overcome this I have to run:
$ ./autogen.sh || autoreconf -fvi

Am I doing something wrong?
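
Note: one thing worth checking here (an assumption, not something confirmed in this thread) is whether all the autotools found on PATH come from the same installation, since the error output mentions tools under /opt/local/bin. A small sketch that lists which binaries are picked up and what version each reports; show_autotools is a made-up helper name:

```shell
# Hypothetical helper: show where each autotools binary resolves on PATH
# and the first line of its --version output.
show_autotools() {
  for tool in aclocal autoconf autoheader autom4te automake; do
    tool_path=$(command -v "$tool" 2>/dev/null) || { echo "$tool: not found"; continue; }
    version=$("$tool" --version 2>/dev/null | head -n 1)
    echo "$tool: $tool_path ($version)"
  done
}
```

If the listed paths or versions disagree with each other, regenerating everything with "autoreconf -fvi", as in the workaround above, is a reasonable next step.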

Comment 73 Martin Storsjö 2010-06-14 12:11:35 UTC

(In reply to comment #69)
> (In reply to comment #68)
> > I tested the latest from the MACOSX106 branch (rev 11164) now, but I'm unable
> > to launch even minimal test apps with this version, I get this error:
> > valgrind: mmap(0x100000000, 4096) failed in UME (load_segment1).
> 
> This is with a completely clean build of the branch?  It works ok
> for me and also for Nick Nethercote, who tried it.

Sorry for the noise; it does work after a full rebuild, including rerunning autogen.sh.

Comment 74 Alexander Potapenko 2010-06-22 14:41:24 UTC

A new problem observed under 10.6: sync check fails when a stack is reused.

=================================
$ cat stack-reuse.c 
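#include <pthread.h>
#include <unistd.h>   /* sleep() */

/* Innermost thread body: touch a local variable on the thread's stack. */
void *RealWorker(void *arg) {
  int stack_var = 0;
  stack_var++;
  return NULL;
}

/* Spawn and join one short-lived inner thread. */
void *Worker(void *arg) {
  pthread_t thread;
  pthread_create(&thread, NULL, RealWorker, NULL);
  pthread_join(thread, NULL);
  return NULL;
}

/* Stagger the two outer workers by one second. */
void *Worker0(void *arg) { sleep(0); return Worker(arg); }
void *Worker1(void *arg) { sleep(1); return Worker(arg); }

int main(void) {
  pthread_t th0, th1;
  pthread_create(&th0, NULL, Worker0, NULL);
  pthread_create(&th1, NULL, Worker1, NULL);
  pthread_join(th0, NULL);
  pthread_join(th1, NULL);
  return 0;
}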
=================================
$ gcc stack-reuse.c -m32 -o stack-reuse32  # some warnings may be reported
$ gcc stack-reuse.c -m64 -o stack-reuse64

$ inst/bin/valgrind --tool=none   ~/src/stack-reuse/stack-reuse64
...

--7171:0:aspacem  segment mismatch: V's seg 1st, kernel's 2nd:
--7171:0:aspacem   23:      0100400000-0100482fff  536576 ----- SmFixed d=0x000 i=0       o=0       (-1) m=0 (none)
--7171:0:aspacem  ...: .... 0100400000-0100400fff    4096 ---.. ....... d=0x000 i=0       o=0       (.) m=. (none)
--7171:0:aspacem  segment mismatch: V's seg 1st, kernel's 2nd:
--7171:0:aspacem   23:      0100400000-0100482fff  536576 ----- SmFixed d=0x000 i=0       o=0       (-1) m=0 (none)
--7171:0:aspacem  ...: .... 0100401000-0100480fff  524288 rw-.. ....... d=0x000 i=0       o=0       (.) m=. (none)
--7171:0:aspacem  segment mismatch: V's seg 1st, kernel's 2nd:
--7171:0:aspacem   27:      0100506000-01005fffff 1024000 ----- SmFixed d=0x000 i=0       o=0       (-1) m=0 (none)
--7171:0:aspacem  ...: .... 0100506000-0100506fff    4096 ---.. ....... d=0x000 i=0       o=0       (.) m=. (none)
--7171:0:aspacem  segment mismatch: V's seg 1st, kernel's 2nd:
--7171:0:aspacem   27:      0100506000-01005fffff 1024000 ----- SmFixed d=0x000 i=0       o=0       (-1) m=0 (none)
--7171:0:aspacem  ...: .... 0100507000-0100586fff  524288 rw-.. ....... d=0x000 i=0       o=0       (.) m=. (none)
--7171:0:aspacem  sync check at pthread_hijack:0 (after): FAILED
--7171:0:aspacem  
...

No errors are reported for stack-reuse32.

Comment 75 Nicholas Nethercote 2010-07-01 02:54:09 UTC

I just merged the branch to the trunk (r11194, with a few follow-up fixes).  The branch is now closed.

10.6 support is fairly good, but not perfect.  Perhaps we should close this bug and open new bugs for more specific problems as they occur?

Comment 76 Sean Farley 2010-07-01 11:57:55 UTC

> I just merged the branch to the trunk (r11194, with a few follow-up fixes). 
> The branch is now closed.

This is great!

> 10.6 support is fairly good, but not perfect.  Perhaps we should close this bug
> and open new bugs for more specific problems as they occur?

I still have problems building since it picks up my MPI compilers. Is this still broken? The error is:

mpicc -g -O -fno-omit-frame-pointer -Wall -dynamic -arch x86_64  -dynamic -dynamiclib -all_load  -o libmpiwrap-amd64-darwin.so libmpiwrap_amd64_darwin_so-libmpiwrap.o  
mpicc    -I../include  -g -O -fno-omit-frame-pointer -Wall -dynamic -arch i386  -MT libmpiwrap_x86_darwin_so-libmpiwrap.o -MD -MP -MF .deps/libmpiwrap_x86_darwin_so-libmpiwrap.Tpo -c -o libmpiwrap_x86_darwin_so-libmpiwrap.o `test -f 'libmpiwrap.c' || echo './'`libmpiwrap.c
gcc-4.2: -E, -S, -save-temps and -M options are not allowed with multiple -arch flags

Comment 77 Julian Seward 2010-07-01 12:22:31 UTC

(In reply to comment #76)

> I still have problems building since it picks up my MPI compilers. Is this
> still broken? The error is:
> 
> mpicc -g -O -fno-omit-frame-pointer -Wall -dynamic -arch x86_64  -dynamic
> -dynamiclib -all_load  -o libmpiwrap-amd64-darwin.so
> libmpiwrap_amd64_darwin_so-libmpiwrap.o  
> mpicc    -I../include  -g -O -fno-omit-frame-pointer -Wall -dynamic -arch i386 
> -MT libmpiwrap_x86_darwin_so-libmpiwrap.o -MD -MP -MF
> .deps/libmpiwrap_x86_darwin_so-libmpiwrap.Tpo -c -o
> libmpiwrap_x86_darwin_so-libmpiwrap.o `test -f 'libmpiwrap.c' || echo
> './'`libmpiwrap.c
> gcc-4.2: -E, -S, -save-temps and -M options are not allowed with multiple -arch
> flags

I don't know if this is Darwin-specific or not.  An easy workaround is
to configure V with --with-mpicc=/some/path/that/doesnt/exist so as to
throw it off the trail; then it can't find any such compiler and so
skips building the MPI stuff.  (At least, that's the theory.  It would
be good if you could confirm this kludge still works.)

Comment 78 Sean Farley 2010-07-01 12:31:00 UTC

> I don't know if this is Darwin-specific or not.  An easy workaround is
> to configure V with --with-mpicc=/some/path/that/doesnt/exist so as to
> throw it off the trail; then it can't find any such compiler and so
> skips building the MPI stuff.  (At least, that's the theory.  It would
> be good if you could confirm this kludge still works.)

That does indeed work; I meant to say in my previous response that I had already tried it. It also works if you specify --enable-only[32|64]bit.
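
Note: the --with-mpicc workaround from comment #77 can be sketched as below. The helper name mpi_workaround_flag is made up for illustration; the flag value is the placeholder path given in comment #77, and the sketch only prints the configure line rather than running it.

```shell
# Hypothetical helper: emit the --with-mpicc workaround from comment #77
# only when an mpicc wrapper is actually visible on PATH.
mpi_workaround_flag() {
  if command -v mpicc >/dev/null 2>&1; then
    echo "--with-mpicc=/some/path/that/doesnt/exist"
  fi
}

# Print (rather than run) the resulting configure line.
echo "./configure --prefix=$(pwd)/Inst $(mpi_workaround_flag)"
```

On a machine without mpicc the flag is simply omitted, so the printed line matches the plain build instructions.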

Comment 79 Julian Seward 2010-07-01 12:45:06 UTC

(In reply to comment #75)
> 10.6 support is fairly good, but not perfect.  Perhaps we should
> close this bug and open new bugs for more specific problems as they
> occur?

I think we should do that, but first record here all currently known
problems.  I know of these:

* re comment #67, signal delivery is broken on 32-bit 10.6 but not
  on any other combination.  I tracked this down some more.  It 
  seems that the kernel is delivering signals to Valgrind 
  (which we will eventually forward to the application), but V's
  signal handler (darwin_signal_demux) segfaults on its first
  instruction, hence leading to an infinite signal delivery loop.
  Despite some digging around I am absolutely mystified.  If any
  Darwin kernel gurus want to help out with this, I'd be happy to
  hear from you.

* "WARNING: unhandled syscall: unix:277"

* a whole bunch of sync check failures, as per comment #74

* need to investigate
  https://bugs.kde.org/show_bug.cgi?id=216837
  and commit it if necessary.  My reluctance to do so partly
  comes from the fact that it adds Darwin-specific calls
  (track_pre_wqthread_ll_create) to the core-tool interface.

If people know of any other obvious breakage for 10.6, or bugs/
patches that need to be fixed/committed to make it work better, please
summarise them here.

Comment 80 Sean 2010-07-11 18:59:22 UTC

I built trunk r11212, and tried it out.  I can successfully run "ls" under valgrind on 10.6.4.  But not much luck with GUI apps. I created a plain Cocoa app from a new Xcode project and it launched, but crashes as soon as I touch the menu bar. This is running the app as 64 bit.

--50425-- ./PlainApp.app//Contents/MacOS/PlainApp:
--50425-- dSYM directory is missing; consider using --dsymutil=yes
--50425:0:aspacem  segment mismatch: V's seg 1st, kernel's 2nd:
--50425:0:aspacem  316: anon 010699d000-010699ffff   12288 rw--- SmFixed d=0x000 i=0       o=0       (-1) m=0 (none)
--50425:0:aspacem  ...: .... 010699d000-010699dfff    4096 ---.. ....... d=0x000 i=0       o=0       (.) m=. (none)
--50425:0:aspacem  sync check at wqthread_hijack:0 (after): FAILED
--50425:0:aspacem  
==50425== Jump to the invalid address stated on the next line
==50425==    at 0x1000272FA: ???
==50425==    by 0x1030CD3A1: ???
==50425==    by 0x1030CCE28: ???
==50425==    by 0x101E7C6EB: ???
==50425==    by 0x101E7745E: ???
==50425==    by 0x101E76DA3: ???
==50425==    by 0x101E76BC2: ???
==50425==    by 0x101E6304A: ???
==50425==    by 0x101E62F18: ???
==50425==    by 0x101E62996: ???
==50425==    by 0x101E61EE5: ???
==50425==    by 0x101E61D56: ???
==50425==  Address 0x1000272fa is not stack'd, malloc'd or (recently) free'd

Comment 81 Julian Seward 2010-07-12 15:40:51 UTC

(In reply to comment #80)
> I built trunk r11212, and tried it out.  I can successfully run "ls" under
> valgrind on 10.6.4.  But not much luck with GUI apps. I created a plain Cocoa
> app from a new Xcode project and it launched, but crashes as soon as I touch
> the menu bar. This is running the app as 64 bit.

Please send a simple test case program + instructions on how to
compile it, so I can look at this.

Comment 82 Rich Siegel 2010-07-12 17:13:12 UTC

What should our expectations be for using trunk r11212 "out of the box"? I updated and built clean, but I am unable to valgrind a 32-bit app. When I do, here's what I get:


RoadHawg:valgrind siegel$ valgrind ~/svn/builds/Debug/RecentItems.app
==50002== Memcheck, a memory error detector
==50002== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==50002== Using Valgrind-3.6.0.SVN and LibVEX; rerun with -h for copyright info
==50002== Command: /Users/siegel/svn/builds/Debug/RecentItems.app/Contents/MacOS/RecentItems
==50002== 
--50002-- /Users/siegel/svn/builds/Debug/RecentItems.app/Contents/MacOS/RecentItems:
--50002-- dSYM directory is missing; consider using --dsymutil=yes
--50002-- WARNING: unhandled syscall: unix:336
--50002-- You may be able to write your own handler.
--50002-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
--50002-- Nevertheless we consider this a bug.  Please report
--50002-- it at http://valgrind.org/support/bug_reports.html.
vex x86->IR: unhandled instruction bytes: 0x66 0xF 0x3A 0xA
==50002== valgrind: Unrecognised instruction at address 0x83272.
==50002== Your program just tried to execute an instruction that Valgrind
==50002== did not recognise.  There are two possible reasons for this.
==50002== 1. Your program has a bug and erroneously jumped to a non-code
==50002==    location.  If you are running Memcheck and you just saw a
==50002==    warning about a bad jump, it's probably your program's fault.
==50002== 2. The instruction is legitimate but Valgrind doesn't handle it,
==50002==    i.e. it's Valgrind's fault.  If you think this is the case or
==50002==    you are not sure, please let us know and we'll try to fix it.
==50002== Either way, Valgrind will now raise a SIGILL signal which will
==50002== probably kill your program.
==50002== 
==50002== Process terminating with default action of signal 4 (SIGILL)
==50002==  Illegal opcode at address 0x83272
==50002==    at 0x83272: floorf$fenv_access_off (in /usr/lib/libSystem.B.dylib)
==50002==    by 0x6BC324: IBCarbonContainer::initWithDecoder(void const*) (in /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox)
==50002==    by 0x57F4A6: IBXObjectElement::instantiate(void const*) (in /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox)
==50002==    by 0x57F7DE: IBXObjectElement::instantiateChild(void const*, __CFString const*) (in /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox)
==50002==    by 0x57F798: IBXDecoderDecodeObject (in /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox)
==50002==    by 0x6C5CCC: IBCarbonWindow::initWithDecoder(void const*) (in /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox)
==50002==    by 0x57F4A6: IBXObjectElement::instantiate(void const*) (in /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox)
==50002==    by 0x57F828: IBXArrayElement::instantiate(void const*) (in /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox)
==50002==    by 0x57F7DE: IBXObjectElement::instantiateChild(void const*, __CFString const*) (in /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox)
==50002==    by 0x57F798: IBXDecoderDecodeObject (in /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox)
==50002==    by 0x57F688: IBCarbonNib::initWithDecoder(void const*) (in /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox)
==50002==    by 0x57F4A6: IBXObjectElement::instantiate(void const*) (in /System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/HIToolbox.framework/Versions/A/HIToolbox)
==50002== 
==50002== HEAP SUMMARY:
==50002==     in use at exit: 294,429 bytes in 4,387 blocks
==50002==   total heap usage: 18,565 allocs, 14,178 frees, 1,373,169 bytes allocated
==50002== 
==50002== LEAK SUMMARY:
==50002==    definitely lost: 0 bytes in 0 blocks
==50002==    indirectly lost: 0 bytes in 0 blocks
==50002==      possibly lost: 28,256 bytes in 2 blocks
==50002==    still reachable: 262,193 bytes in 4,286 blocks
==50002==         suppressed: 3,980 bytes in 99 blocks
==50002== Rerun with --leak-check=full to see details of leaked memory
==50002== 
==50002== For counts of detected and suppressed errors, rerun with: -v
==50002== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Killed

I will email a zip of the app's source code (it's one of the sample apps from developer.apple.com) and the built app under separate cover.

Comment 83 Alexander Potapenko 2010-07-12 18:29:55 UTC

(In reply to comment #82)
Rich, please take a look at https://bugs.kde.org/show_bug.cgi?id=241377#c6
SSE4 support is 64-bit only now.

Comment 84 Rich Siegel 2010-07-12 18:59:55 UTC

(In reply to comment #83)

Aha, thank you - I have added a comment to that bug.

Comment 85 Sean 2010-07-13 02:50:52 UTC

(In reply to comment #81)
> Please send a simple test case program + instructions on how to
> compile it, so I can look at this.

You can repro like so:
- on 10.6.4, launch Xcode 3.2.2
- File > New Project
- choose Mac OS X > Application > Cocoa Application
- press Choose button
- save to Desktop or wherever
- build in Debug
- run with valgrind

In case you're not familiar with Xcode, I'll attach the project and binary.

Comment 86 Sean 2010-07-13 02:55:49 UTC

Created attachment 49086 [details]
for comment #85

Comment 87 Georgi Kodinov 2010-07-13 05:52:04 UTC

I've tried running mysql's test suites under valgrind using the latest (11212) valgrind trunk on my MacOSX 10.6.
I'm still getting "unimplemented functionality" : 
--83142-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting

valgrind: m_signals.c:513 (VG_UCONTEXT_STACK_PTR): Assertion 'Unimplemented functionality' failed.
valgrind: valgrind
==83142==    at 0x13802F257: ???
==83142==    by 0x13802F494: ???
==83142==    by 0x1380436A9: ???
==83142==    by 0x13803365F: ???

sched status:
  running_tid=2

Thread 1: status = VgTs_WaitSys
==83142==    at 0x100DAAEB6: __semwait_signal (in /usr/lib/libSystem.B.dylib)
==83142==    by 0x1004CFDA6: safe_cond_wait (in /Users/kgeorge/mysql/work/test-vg-5.1-bugteam/sql/mysqld)
==83142==    by 0x100110003: start_signal_handler() (in /Users/kgeorge/mysql/work/test-vg-5.1-bugteam/sql/mysqld)
==83142==    by 0x1001156E0: main (in /Users/kgeorge/mysql/work/test-vg-5.1-bugteam/sql/mysqld)

Thread 2: status = VgTs_Runnable
==83142==    at 0x100E4B9D2: __pthread_kill (in /usr/lib/libSystem.B.dylib)
==83142==    by 0x1004C3C12: my_write_core (in /Users/kgeorge/mysql/work/test-vg-5.1-bugteam/sql/mysqld)
==83142==    by 0x10011091A: handle_segfault (in /Users/kgeorge/mysql/work/test-vg-5.1-bugteam/sql/mysqld)
==83142==    by 0x138046963: ???
==83142==    by 0x100333DF1: Events::dump_internal_status() (in /Users/kgeorge/mysql/work/test-vg-5.1-bugteam/sql/mysqld)
==83142==    by 0x1001E55D4: mysql_print_status() (in /Users/kgeorge/mysql/work/test-vg-5.1-bugteam/sql/mysqld)
==83142==    by 0x10010FDCF: signal_hand (in /Users/kgeorge/mysql/work/test-vg-5.1-bugteam/sql/mysqld)
==83142==    by 0x100DA9455: _pthread_start (in /usr/lib/libSystem.B.dylib)
==83142==    by 0x100DA9308: thread_start (in /usr/lib/libSystem.B.dylib)

The way to reproduce is as follows : 
1. compile valgrind with " ./configure --prefix=`pwd`/install --enable-only64bit --build=amd64-darwin "
2. checkout the mysql source : bzr clone https://code.launchpad.net/~mysql/mysql-server/mysql-5.1 
3. compile the mysql source using : BUILD/compile-pentium-valgrind-max
4. "cd mysql-test" and run "./mysql-test-run.pl --valgrind t/alias.test"

Comment 88 Georgi Kodinov 2010-07-13 05:54:21 UTC

Hmm, there's a typo in step 2: it should be: bzr branch lp:mysql-server/5.1. You can also get a source package from the MySQL download site.

Comment 89 Julian Seward 2010-07-21 19:52:25 UTC

(In reply to comment #85)
> You can repro like so:
> - on 10.6.4, launch Xcode 3.2.2
> - File > New Project
> - choose Mac OS X > Application > Cocoa Application
> - press Choose button
> - save to Desktop or wherever
> - build in Debug
> - run with valgrind

Sean: I succeeded in building it using these instructions, and I can get
the sync check messages, but not the "Jump to invalid address" that 
follows.  This is with current valgrind svn trunk (r11221) on 10.6.4.
Are you still able to reproduce this problem?  I also tried with the
exe you attached, same result.

Comment 90 Julian Seward 2010-07-22 10:50:02 UTC

(In reply to comment #74)
> A new problem observed under 10.6: sync check fails when a stack is reused.

"Fixed" (for some definition of "fixed", possibly including "merely
hides the problem") by r11223.

Comment 91 trnsca 2010-12-07 19:52:42 UTC

(In reply to comment #68)
> I tested the latest from the MACOSX106 branch (rev 11164) now, but I'm unable
> to launch even minimal test apps with this version, I get this error:
> 
> valgrind: mmap(0x100000000, 4096) failed in UME (load_segment1).

I just tried building from the 3.6.0 release tarball on a MacBook Pro running 10.6.5 (Xcode 3.2.5) and get the same error, even with a hello_world program.

Comment 92 Sean 2010-12-12 06:27:19 UTC

valgrind 3.6 is working well for me on 10.6.5 with Xcode 3.2.5.  The problem I reported in comment #85 no longer occurs. I'm able to run even medium-sized Cocoa apps.

Now if only garbage collection was supported... bug #208215

Comment 93 Paul Floyd 2023-02-18 21:09:21 UTC

Snow Leopard is too old.