Version: 2.0.0 (using KDE KDE 3.1.3)
Installed from: RedHat RPMs
Compiler: sun java j2sdk1.4.2
OS: Linux

valgrind is apparently incompatible with Sun Java 1.4.2. Even the most trivial use of valgrind and java fails;

valgrind -v --error-limit=no --trace-children=yes java -version
Fatal: Stack size too small. Use 'java -Xss' to increase default stack size.

Doing so, even up to ridiculous levels (35 Megabytes in java, 40 megs in valgrind), has no effect.

=============================================================================
Detailed output follows;

==8884== Using valgrind-2.0.0, a program supervision framework for x86-linux.
==8884== Copyright (C) 2000-2003, and GNU GPL'd, by Julian Seward.
==8884== Command line:
==8884== java
==8884== -version
==8884== 1
==8884== Startup, with flags:
==8884== --suppressions=/usr/local/lib/valgrind/default.supp
==8884== -v
==8884== --error-limit=no
==8884== --trace-children=yes
==8884== Reading syms from /usr/java/j2sdk1.4.2/bin/java
==8884== object doesn't have a symbol table
==8884== object doesn't have any debug info
==8884== Reading syms from /lib/ld-2.2.5.so
==8884== object doesn't have any debug info
==8884== Reading syms from /usr/local/lib/valgrind/vgskin_memcheck.so
==8884== Reading syms from /usr/local/lib/valgrind/valgrind.so
==8884== Reading syms from /usr/local/lib/valgrind/libpthread.so
==8884== Reading syms from /lib/libdl-2.2.5.so
==8884== object doesn't have any debug info
==8884== Reading syms from /lib/libc-2.2.5.so
==8884== object doesn't have any debug info
==8884== Reading suppressions file: /usr/local/lib/valgrind/default.supp
==8884== Estimated CPU clock rate is 1989 MHz
==8884==
==8884== Memcheck, a.k.a. Valgrind, a memory error detector for x86-linux.
==8884== Copyright (C) 2002-2003, and GNU GPL'd, by Julian Seward.
==8884== Using valgrind-2.0.0, a program supervision framework for x86-linux.
==8884== Copyright (C) 2000-2003, and GNU GPL'd, by Julian Seward.
==8884== Command line:
==8884== java
==8884== -version
==8884== 1
==8884== Startup, with flags:
==8884== --suppressions=/usr/local/lib/valgrind/default.supp
==8884== -v
==8884== --error-limit=no
==8884== --trace-children=yes
==8884== Reading syms from /usr/java/j2sdk1.4.2/bin/java
==8884== object doesn't have a symbol table
==8884== object doesn't have any debug info
==8884== Reading syms from /lib/ld-2.2.5.so
==8884== object doesn't have any debug info
==8884== Reading syms from /usr/local/lib/valgrind/vgskin_memcheck.so
==8884== Reading syms from /usr/local/lib/valgrind/valgrind.so
==8884== Reading syms from /usr/local/lib/valgrind/libpthread.so
==8884== Reading syms from /lib/libdl-2.2.5.so
==8884== object doesn't have any debug info
==8884== Reading syms from /lib/libc-2.2.5.so
==8884== object doesn't have any debug info
==8884== Reading suppressions file: /usr/local/lib/valgrind/default.supp
==8884== Estimated CPU clock rate is 1997 MHz
==8884==
==8884== Reading syms from /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so
==8884== Reading syms from /lib/libnsl-2.2.5.so
==8884== object doesn't have any debug info
==8884== Reading syms from /lib/libm-2.2.5.so
==8884== object doesn't have any debug info
==8884== Syscall param sigaction(act) contains uninitialised or unaddressable byte(s)
==8884== at 0x40275FA0: __libc_sigaction (in /lib/libc-2.2.5.so)
==8884== by 0x4023014C: __sigaction (vg_libpthread.c:1891)
==8884== by 0x414538CD: SR_initialize(void) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)
==8884== by 0x41453E59: os::init_2(void) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)
==8884== Address 0xBFFFC93C is on thread 1's stack
==8884== Reading syms from /usr/java/j2sdk1.4.2/jre/lib/i386/native_threads/libhpi.so
==8884== object doesn't have a symbol table
==8884== object doesn't have any debug info
==8884== Reading syms from /lib/libnss_files-2.2.5.so
==8884== object doesn't have any debug info
==8884== Reading syms from /usr/java/j2sdk1.4.2/jre/lib/i386/libverify.so
==8884== object doesn't have a symbol table
==8884== object doesn't have any debug info
==8884== Reading syms from /usr/java/j2sdk1.4.2/jre/lib/i386/libjava.so
==8884== object doesn't have a symbol table
==8884== object doesn't have any debug info
==8884== Reading syms from /usr/java/j2sdk1.4.2/jre/lib/i386/libzip.so
==8884== object doesn't have a symbol table
==8884== object doesn't have any debug info
==8884== Warning: set address range perms: large range 134217728, a 0, v 0
==8884== Warning: set address range perms: large range 134217728, a 1, v 1
==8884== Warning: set address range perms: large range 134283264, a 0, v 0
==8884== valgrind's libpthread.so: IGNORED call to: pthread_attr_destroy
==8884== valgrind's libpthread.so: KLUDGED call to: pthread_getattr_np
==8884== valgrind's libpthread.so: KLUDGED call to: pthread_attr_getstackaddr
==8884== valgrind's libpthread.so: KLUDGED call to: pthread_getattr_np
==8884== valgrind's libpthread.so: KLUDGED call to: pthread_attr_getstackaddr
==8884== valgrind's libpthread.so: IGNORED call to: pthread_setschedparam
==8884==
==8884== Invalid write of size 4
==8884== at 0x4633DE6A: ???
==8884== by 0x46336103: ???
==8884== by 0x413A0A53: JavaCalls::call_helper(JavaValue *, methodHandle *, JavaCallArguments *, Thread *) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)
==8884== by 0x414549EC: os::os_exception_wrapper(void (*)(JavaValue *, methodHandle *, JavaCallArguments *, Thread *), JavaValue *, methodHandle *, JavaCallArguments *, Thread *) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)
[...]
==8884== Invalid write of size 4
==8884== at 0x4633F134: ???
==8884== by 0x46338DDA: ???
==8884== by 0x46336103: ???
==8884== by 0x413A0A53: JavaCalls::call_helper(JavaValue *, methodHandle *, JavaCallArguments *, Thread *) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)
==8884== Address 0xBFFF95F4 is not stack'd, malloc'd or free'd
==8884== valgrind's libpthread.so: IGNORED call to: pthread_attr_destroy
==8884== valgrind's libpthread.so: KLUDGED call to: pthread_getattr_np
==8884== valgrind's libpthread.so: KLUDGED call to: pthread_attr_getstackaddr
==8884==
==8884== ERROR SUMMARY: 1248 errors from 68 contexts (suppressed: 56 from 4)
[...]
--8884-- supp: 44 _dl_relocate_object*/dl_open_worker/_dl_catch_error*(Cond)
--8884-- supp: 8 __pthread_mutex_unlock/_IO_funlockfile
--8884-- supp: 3 pthread_error/__pthread_mutex_destroy/_IO_default_finish
--8884-- supp: 1 pthread_error/__pthread_mutex_destroy/__closedir
==8884==
==8884== IN SUMMARY: 1248 errors from 68 contexts (suppressed: 56 from 4)
==8884==
==8884== malloc/free: in use at exit: 683340 bytes in 536 blocks.
==8884== malloc/free: 1063 allocs, 527 frees, 709786 bytes allocated.
==8884==
--8884-- TT/TC: 0 tc sectors discarded.
--8884-- 15755 chainings, 0 unchainings.
--8884-- translate: new 21495 (423560 -> 6076379; ratio 143:10)
--8884-- discard 0 (0 -> 0; ratio 0:10).
--8884-- dispatch: 5000000 jumps (bb entries), of which 877975 (17%) were unchained.
--8884-- 107/97722 major/minor sched events. 26437 tt_fast misses.
--8884-- reg-alloc: 5525 t-req-spill, 1154133+31714 orig+spill uis, 119735 total-reg-r.
--8884-- sanity: 108 cheap, 5 expensive checks.
--8884-- ccalls: 144946 C calls, 51% saves+restores avoided (441332 bytes)
--8884-- 195201 args, avg 0.89 setup instrs each (39048 bytes)
--8884-- 0% clear the stack (434838 bytes)
--8884-- 44881 retvals, 35% of reg-reg movs avoided (30870 bytes)
~
The abort following the message about the stack size is occurring at:

==9328== Process terminating with default action of signal 6 (SIGABRT): dumping core
==9328== at 0x40166B11: __GI___kill (in /lib/i686/libc-2.3.2.so)
==9328== by 0x40167F07: __GI_abort (in /lib/i686/libc-2.3.2.so)
==9328== by 0x40CD01D6: os::abort(int) (in /usr/java/j2sdk1.4.1_05/jre/lib/i386/client/libjvm.so)
==9328== by 0x40CCE465: os::Linux::install_alternate_signal_stack(void) (in /usr/java/j2sdk1.4.1_05/jre/lib/i386/client/libjvm.so)

I found an interesting comment in the RedHat Bugzilla (bug 26096) about a problem in os::Linux::install_alternate_signal_stack, where it makes assumptions about the size and alignment of a thread stack.
Created attachment 3599 [details]
Patch to allow fetching of stack details for a thread

This patch makes pthread_getattr_np save the stack address and size for the requested thread in the attribute structure, and fixes pthread_attr_getstackaddr and pthread_attr_getstacksize to return that information. This is enough to make recent JVMs work.
I installed the patch on the current CVS version. I get a little further and then - the impossible happens ...

==22135== Invalid write of size 4
==22135== at 0x4647B230: ???
==22135== by 0x46475C82: ???
==22135== by 0x46475D03: ???
==22135== by 0x46475DDA: ???
==22135== Address 0xBFFF9FBC is not stack'd, malloc'd or free'd
==22135==
==22135== Invalid write of size 4
==22135== at 0x4647B237: ???
==22135== by 0x46475C82: ???
==22135== by 0x46475D03: ???
==22135== by 0x46475DDA: ???
==22135== Address 0xBFFF8FBC is not stack'd, malloc'd or free'd
==22135== warning: Valgrind's pthread_getattr_np is incomplete
==22135== your program may misbehave as a result
==22135== warning: Valgrind's pthread_getattr_np is incomplete
==22135== your program may misbehave as a result
==22135== warning: Valgrind's pthread_getattr_np is incomplete
==22135== your program may misbehave as a result
==22135== warning: Valgrind's pthread_getattr_np is incomplete
==22135== your program may misbehave as a result
==22135== warning: Valgrind's pthread_getattr_np is incomplete
==22135== your program may misbehave as a result
==22135== warning: Valgrind's pthread_getattr_np is incomplete
==22135== your program may misbehave as a result
==22135== warning: Valgrind's pthread_getattr_np is incomplete
==22135== your program may misbehave as a result
==22135== warning: Valgrind's pthread_getattr_np is incomplete
==22135== your program may misbehave as a result
==22135== warning: Valgrind's pthread_getattr_np is incomplete
==22135== your program may misbehave as a result
==22135== warning: Valgrind's pthread_getattr_np is incomplete
==22135== your program may misbehave as a result
==22135== warning: Valgrind's pthread_getattr_np is incomplete
==22135== your program may misbehave as a result
java version "1.4.2"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2-b28)
Java HotSpot(TM) Client VM (build 1.4.2-b28, mixed mode)
==22135==
==22135== Thread 6:
==22135== Syscall param mmap(args) contains uninitialised or unaddressable byte(s)
==22135== at 0x4033412D: __mmap (in /lib/libc-2.2.5.so)
==22135== by 0x416D794E: JavaThread::remove_stack_guard_pages(void) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)
==22135== by 0x416D3CBF: JavaThread::exit(int) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)
==22135== by 0x416D7C4C: JavaThread::thread_main_inner(void) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)
==22135== Address 0x67A8EE50 is on thread 6's stack
==22135== warning: Valgrind's pthread_cond_destroy is incomplete
==22135== (it doesn't check if the cond is waited on)
==22135== your program may misbehave as a result
==22135== warning: Valgrind's pthread_cond_destroy is incomplete
==22135== (it doesn't check if the cond is waited on)
==22135== your program may misbehave as a result
==22135== warning: Valgrind's pthread_cond_destroy is incomplete
==22135== (it doesn't check if the cond is waited on)
==22135== your program may misbehave as a result
--22135-- INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting
--22135-- si_code=2 Fault EIP: 0x40020A4F; Faulting address: 0x67B42000

valgrind: the `impossible' happened:
Killed by fatal signal
Basic block ctr is approximately 14050000
==22135== at 0x40170A73: (within /usr/local/lib/valgrind/valgrind.so)
==22135== by 0x40170A72: panic (vg_mylibc.c:1117)
==22135== by 0x40170A99: vgPlain_core_panic (vg_mylibc.c:1122)
==22135== by 0x40177AC7: vg_sync_signalhandler (vg_signals.c:1674)

sched status:

Thread 1: status = Runnable, associated_mx = 0x0, associated_cv = 0x0
==22135== at 0x40230378: (within /usr/local/lib/valgrind/libpthread.so)
==22135== by 0x40232C21: __pthread_getspecific (vg_libpthread.c:1446)
==22135== by 0x4168E6B8: ThreadLocalStorage::thread(void) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)
==22135== by 0x41702888: Handle::Handle(oopDesc *) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)

Thread 2: status = WaitCV, associated_mx = 0x412E487C, associated_cv = 0x412E4894
==22135== at 0x402321D9: pthread_cond_timedwait (vg_libpthread.c:1122)
==22135== by 0x4168D499: os::Linux::safe_cond_timedwait(pthread_cond_t *, pthread_mutex_t *, timespec const *) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)
==22135== by 0x4167E1CE: Monitor::wait(int, long) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)
==22135== by 0x416F5979: VMThread::loop(void) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)

Thread 3: status = WaitCV, associated_mx = 0x41398CB8, associated_cv = 0x41398CD0
==22135== at 0x40232061: pthread_cond_wait (vg_libpthread.c:1088)
==22135== by 0x4168D32C: os::Linux::safe_cond_wait(pthread_cond_t *, pthread_mutex_t *) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)
==22135== by 0x4168663D: ObjectMonitor::wait(long long, int, Thread *) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)
==22135== by 0x416BBCB6: ObjectSynchronizer::wait(Handle, long long, Thread *) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)

Thread 4: status = WaitCV, associated_mx = 0x4139AF20, associated_cv = 0x4139AF38
==22135== at 0x40232061: pthread_cond_wait (vg_libpthread.c:1088)
==22135== by 0x4168D32C: os::Linux::safe_cond_wait(pthread_cond_t *, pthread_mutex_t *) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)
==22135== by 0x4168663D: ObjectMonitor::wait(long long, int, Thread *) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)
==22135== by 0x416BBCB6: ObjectSynchronizer::wait(Handle, long long, Thread *) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)

Thread 5: status = WaitCV, associated_mx = 0x412E38DC, associated_cv = 0x412E38F4
==22135== at 0x40232061: pthread_cond_wait (vg_libpthread.c:1088)
==22135== by 0x4168D28B: os::Linux::safe_cond_wait(pthread_cond_t *, pthread_mutex_t *) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)
==22135== by 0x4167E197: Monitor::wait(int, long) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)
==22135== by 0x416D2FE0: SuspendCheckerThread::run(void) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)

Thread 7: status = WaitCV, associated_mx = 0x412E531C, associated_cv = 0x412E5334
==22135== at 0x40232061: pthread_cond_wait (vg_libpthread.c:1088)
==22135== by 0x4168D32C: os::Linux::safe_cond_wait(pthread_cond_t *, pthread_mutex_t *) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)
==22135== by 0x4167E2C5: Monitor::wait(int, long) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)
==22135== by 0x4158E31E: CompileBroker::compiler_thread_loop(void) (in /usr/java/j2sdk1.4.2/jre/lib/i386/client/libjvm.so)

Note: see also the FAQ.txt in the source distribution.
It contains workarounds to several common problems.

If that doesn't help, please report this bug to: valgrind.kde.org

In the bug report, send all the above text, the valgrind
version, and what Linux distro you are using. Thanks.
I've managed to reproduce this latest issue, which looks like a completely separate problem. It also looks like a bug in the JVM to me.

The crash in valgrind actually happens when it tries to mark the stack of a thread that has exited as inaccessible. The problem is that part of one of valgrind's data structures has mysteriously become inaccessible, because of the following sequence of calls made by the JVM:

SYSCALL[4481,7](192):mmap2 ( 0x661CD000, 12288, 7, 50, -1, 0 )
SYSCALL[4481,7](125):mprotect ( 0x661CD000, 12288, 0 )

The first of those is an mmap with MAP_FIXED, but the address given is within an area of memory which has already been allocated by valgrind for its own use, so I'm not sure why the JVM should think it can fiddle with it. The second call marks that memory as inaccessible, which is what causes the crash later on.
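For anyone reading the raw SYSCALL lines above, the numeric arguments 7 and 50 can be decoded by hand. The sketch below does that in Python, assuming the usual Linux x86 bit values from <sys/mman.h>; the table and function names are made up for illustration and are not part of valgrind or the JVM.

```python
# Linux x86 bit values for mmap's prot and flags arguments.
# Assumption: these match <sys/mman.h> on the system that produced the log.
PROT_BITS = {0x1: "PROT_READ", 0x2: "PROT_WRITE", 0x4: "PROT_EXEC"}
MAP_BITS = {
    0x01: "MAP_SHARED",
    0x02: "MAP_PRIVATE",
    0x10: "MAP_FIXED",
    0x20: "MAP_ANONYMOUS",
}


def decode(value, names, zero_name):
    """Return the symbolic names of the bits set in value, joined by '|'."""
    if value == 0:
        return zero_name
    parts = [name for bit, name in sorted(names.items()) if value & bit]
    known = sum(bit for bit in names if value & bit)
    leftover = value & ~known
    if leftover:
        # Keep bits we have no name for, so nothing is silently dropped.
        parts.append(hex(leftover))
    return "|".join(parts)


if __name__ == "__main__":
    # Arguments taken from the two SYSCALL lines in the comment above.
    print("mmap2 prot :", decode(7, PROT_BITS, "PROT_NONE"))
    print("mmap2 flags:", decode(50, MAP_BITS, "0"))
    print("mprotect   :", decode(0, PROT_BITS, "PROT_NONE"))
```

So the mmap2 call asks for a private, anonymous, read/write/execute mapping at a fixed address, and the mprotect call then sets it to PROT_NONE.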
Created attachment 3742 [details] Updated patch for current CVS head
I wonder if the JVM is looking at /proc/self/maps and doing something with that info. Anyway, I tried your most recent patch, and it didn't seem to help.
Created attachment 4001 [details]
Updated patch to implement more pthread stack attributes properly

This patch extends the previous version of the patch to more fully implement various stack related pthread attributes.
What's the status on this one -- Tom, is the JVM working for you? Jeremy? Should this patch be committed?
Actually the JVM doesn't seem to be working for me at the moment, even with this patch, but it was working up to a point when I first submitted the patch. The current breakage is very odd and I haven't managed to track down the cause yet. At the end of the day though, all the patch does is to improve valgrind's handling of various stack related attributes in the pthread simulation, which is worthwhile even if it isn't enough to make the JVM work - it isn't entirely clear what use it is to use valgrind on the JVM anyway unless you're working for Sun... I have in fact found other code which needs this patch, namely current versions of wine when using the pthread driver rather than the kthread driver, something which makes valgrinding wine programs much easier. That is actually what led to the third version of the patch because I had to extend it a bit to get wine going.
valgrind could help find memory errors in JNI code called by the VM -- that's the value to a developer of mixed Java/JNI code.
*** Bug 75505 has been marked as a duplicate of this bug. ***
I have now committed the patch that is attached to this bug, even though it isn't sufficient to get current JVMs working, as it is a sensible extension to valgrind's pthread support anyway.

Current obstacles to getting the JVM running are several...

Firstly, it doesn't like the fact that valgrind adds itself to the front of LD_LIBRARY_PATH and keeps re-execing itself. The fix in vg_main.c to make valgrind not add itself if already present doesn't actually seem to stop this, for some reason that I can't figure out. The only thing that seems to fix it is changing valgrind to add itself to the end of LD_LIBRARY_PATH, but that is not a good idea in general.

The second problem is that 1.4.0 versions (at least 1.4.0_03 and 1.4.0_04) of the JVM try to allocate an alternate signal stack at a fixed location, as shown in this strace output:

mmap2(0xfee0e000, 12288, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xfee0e000
mprotect(0xfee0e000, 12288, PROT_NONE) = 0
mmap2(0xfee04000, 40960, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xfee04000
sigaltstack({ss_sp=0xfee04000, ss_flags=0, ss_size=40960}, NULL) = 0

The problem is that the address used is in valgrind's address space, so valgrind faults the first mmap call and the JVM then gives up with an error about failing to allocate the stack guard page.

The 1.5.0 beta 2 version of the JVM doesn't do this, but fails with an abort in the hotspot compiler, apparently in pthread_cond_wait. Turning off hotspot doesn't seem to work as it still seems to be used...
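To illustrate the kind of check that makes the first mmap2 call fail, here is a rough Python sketch of testing a MAP_FIXED request against a set of reserved address ranges. The reserved range below is purely hypothetical (the thread does not say where valgrind's own address space starts or ends), and the function names are invented for the example; only the two request addresses and lengths come from the strace output.

```python
def overlaps(start, length, region_start, region_end):
    """True if [start, start + length) intersects [region_start, region_end)."""
    return start < region_end and region_start < start + length


def fixed_mapping_conflicts(addr, length, reserved):
    """Return the reserved regions that a MAP_FIXED request would land on."""
    return [(lo, hi) for lo, hi in reserved if overlaps(addr, length, lo, hi)]


# Hypothetical range standing in for memory the tool keeps for itself.
# This is an assumption for illustration, not valgrind's real layout.
RESERVED = [(0xFE000000, 0xFF000000)]

# The two MAP_FIXED requests from the strace output (address, length).
REQUESTS = [(0xFEE0E000, 12288), (0xFEE04000, 40960)]

if __name__ == "__main__":
    for addr, length in REQUESTS:
        hits = fixed_mapping_conflicts(addr, length, RESERVED)
        verdict = "refuse" if hits else "allow"
        print(f"mmap2({addr:#x}, {length}) -> {verdict}")
```

With MAP_FIXED the kernel would otherwise replace whatever is already mapped at that address, which is why a request inside the reserved area has to be refused rather than honoured.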
This specific bug is fixed in CVS head. There still seem to be problems with using --trace-children=yes; the java command just keeps re-execing itself. I think it's getting confused by Valgrind's environment changes.
The re-execing is because Java insists on having its own directory at the front of LD_LIBRARY_PATH, and if it isn't there then it adds it and re-execs. So valgrind puts itself at the front and starts Java, which puts itself at the front and restarts valgrind, which puts itself at the front, and so on ad infinitum.
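The loop can be modelled in a few lines of Python. This is only a toy model of the behaviour described in this comment, under the assumption that each side looks at nothing but the first LD_LIBRARY_PATH entry; the function names are invented and the directory names are placeholders rather than the exact directories either program uses.

```python
def valgrind_start(path, vg_dir, prepend=True):
    """Model valgrind adding its own directory to LD_LIBRARY_PATH."""
    return [vg_dir] + path if prepend else path + [vg_dir]


def java_start(path, java_dir):
    """Model the java launcher: return (new_path, reexec_needed)."""
    if path and path[0] == java_dir:
        return path, False          # own directory is already first: run
    return [java_dir] + path, True  # put it first and re-exec


def count_reexecs(vg_dir, java_dir, prepend=True, limit=10):
    """Return how many re-execs happen before java runs, or None if the
    limit is reached first (i.e. the loop does not terminate)."""
    path = []
    for reexecs in range(limit):
        # With --trace-children=yes every exec goes through valgrind again.
        path = valgrind_start(path, vg_dir, prepend)
        path, reexec = java_start(path, java_dir)
        if not reexec:
            return reexecs
    return None


if __name__ == "__main__":
    vg = "/usr/local/lib/valgrind"   # placeholder directory names
    jre = "/usr/java/jre/lib/i386"
    print("prepend:", count_reexecs(vg, jre, prepend=True))
    print("append :", count_reexecs(vg, jre, prepend=False))
```

In this model, prepending never settles, while appending lets Java keep the front slot after a single re-exec, which matches the observation that adding valgrind to the end of LD_LIBRARY_PATH seems to avoid the problem.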