I knew it is still experimental, but it seemed the only option for me, so I gave Valgrind's experimental ARM support a go. I used the SVN sources from this afternoon, compiled natively on this platform:

Processor       : ARMv7 Processor rev 3 (v7l)
BogoMIPS        : *****
Features        : swp half thumb fastmult vfp edsp
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x1
CPU part        : 0xc08
CPU revision    : 3
L1 I cache      : VIPT

It is supposed to be a TI OMAP3 chip, meaning the Cortex-A8 core design. The compiler reports this version information:

gcc (GCC) 4.3.1
Copyright (C) 2008 Free Software Foundation, Inc.

When run the normal way, Valgrind simply segfaults. When run under gdb, it reports the lines appended below. (For other reasons I am just updating to the latest stable gdb.)

As I really want to use your tooling, I would be keen to provide more information. Please tell me what I should do in this case to make my test reports more useful to you.

Regards,
Alex.

PS: I have long-term programming experience; I am only stuck right now because the segfault is in the middle of nowhere.

user@machine# gdb valgrind
GNU gdb 6.8
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-angstrom-linux-gnueabi"...
(gdb) run ls -l
Starting program: /usr/local/bin/valgrind ls -l
Executing new program: /usr/local/lib/valgrind/memcheck-arm-linux
==6437== Memcheck, a memory error detector
==6437== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==6437== Using Valgrind-3.6.0.SVN and LibVEX; rerun with -h for copyright info
==6437== Command: ls -l
==6437==

Program received signal SIGSEGV, Segmentation fault.
0x625ce9f4 in ?? ()
(gdb) bt
#0  0x625ce9f4 in ?? ()
Cannot access memory at address 0x0
(gdb)
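For anyone who wants to double-check the "Cortex-A8" reading of the /proc/cpuinfo fields quoted above, here is a minimal Python sketch. The helper names (parse_cpuinfo, describe_core) and the two small lookup tables are made up for this illustration and are far from complete; the only facts relied on are that implementer 0x41 is ARM Ltd. and that part numbers 0xc08 and 0xc09 are the Cortex-A8 and Cortex-A9.

```python
# Hypothetical helper, not part of Valgrind or any distribution tool.
# The lookup tables are intentionally tiny; extend them as needed.
IMPLEMENTERS = {0x41: "ARM Ltd."}
ARM_PARTS = {0xC08: "Cortex-A8", 0xC09: "Cortex-A9"}


def parse_cpuinfo(text):
    """Turn 'key : value' lines from /proc/cpuinfo into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon at all
            info[key.strip()] = value.strip()
    return info


def describe_core(info):
    """Map the implementer and part fields to a human-readable core name."""
    implementer = int(info["CPU implementer"], 16)
    part = int(info["CPU part"], 16)
    vendor = IMPLEMENTERS.get(implementer, "unknown implementer")
    # Part numbers are only meaningful relative to their implementer.
    core = ARM_PARTS.get(part, "unknown part") if implementer == 0x41 else "unknown part"
    return f"{vendor} {core}"


# The fields as posted in this report (BogoMIPS and cache lines left out).
SAMPLE = """\
Processor : ARMv7 Processor rev 3 (v7l)
Features : swp half thumb fastmult vfp edsp
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x1
CPU part : 0xc08
CPU revision : 3
"""

if __name__ == "__main__":
    print(describe_core(parse_cpuinfo(SAMPLE)))  # prints "ARM Ltd. Cortex-A8"
```

On the target itself, the same functions could be fed the contents of /proc/cpuinfo instead of SAMPLE.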
What is the result of the command

/usr/local/bin/valgrind -d -d -v -v --trace-flags=10000000 ls -l

(There may be quite a lot of output. Post or attach it all.)
Created attachment 51462
console log from proposed test run

See the attachment. The path to the executable was adjusted, since it is now integrated into the cross-build environment. The SVN snapshot is still from the same date/time.

~> valgrind --version
valgrind-3.6.0.SVN
(In reply to comment #2)
> Created an attachment (id=51462)
> console log from proposed test run

Startup was almost completely successful, but the place where the TLS pointer is obtained from is wrong, and this causes the program to segfault when starting up libpthread. ARM Linux has two different ways to tell the current thread what its TLS pointer is, but Valgrind only supports one of them, and (I guess) your Linux setup uses the other. I will try to find more information.

==== SB 1158 [tid 1] (0xffff0fe0) SBs exec'd 36436 ====
==== SB 1159 [tid 1] __pthread_initialize_minimal+8(0x498be84) SBs exec'd 36437 ====
==== SB 1160 [tid 1] __pthread_initialize_minimal+40(0x498bea4) SBs exec'd 36438 ====
==11969== Invalid write of size 4
==11969==    at 0x498BEAC: __pthread_initialize_minimal (in /lib/libpthread-2.9.so)
==11969==  Address 0x4002204c is not stack'd, malloc'd or (recently) free'd
==11969== Process terminating with default action of signal 11 (SIGSEGV)
==11969==  Access not within mapped region at address 0x4002204C
==11969==    at 0x498BEAC: __pthread_initialize_minimal (in /lib/libpthread-2.9.so)
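To make the trace lines above easier to read, here is a rough Python sketch that extracts the superblock entries from such a log and labels addresses that fall on the ARM Linux kernel user helpers. The helper addresses are the fixed ones given in the Linux kernel's ARM user-helper documentation (0xffff0fe0 is __kuser_get_tls). The regular expression is only an assumption derived from the three "==== SB" lines quoted here, and the function name superblocks is invented for this example; none of this is Valgrind code.

```python
import re

# Fixed addresses of some ARM Linux kernel user helpers ("kuser" page).
KUSER_HELPERS = {
    0xFFFF0FE0: "__kuser_get_tls",
    0xFFFF0FC0: "__kuser_cmpxchg",
    0xFFFF0FA0: "__kuser_memory_barrier",
}

# Assumed line shape: ==== SB <n> [tid <t>] <symbol+off>(0x<addr>) ...
# The symbol part may be empty, as in the first quoted line.
SB_LINE = re.compile(
    r"==== SB (?P<sb>\d+) \[tid (?P<tid>\d+)\] "
    r"(?P<sym>[^\s(]*)\(0x(?P<addr>[0-9a-fA-F]+)\)"
)


def superblocks(log_text):
    """Yield (superblock number, address, name) for each SB trace line."""
    for m in SB_LINE.finditer(log_text):
        addr = int(m.group("addr"), 16)
        # Use the symbol from the log if present, else try the helper table.
        name = m.group("sym") or KUSER_HELPERS.get(addr, "?")
        yield int(m.group("sb")), addr, name


SAMPLE_LOG = """\
==== SB 1158 [tid 1] (0xffff0fe0) SBs exec'd 36436 ====
==== SB 1159 [tid 1] __pthread_initialize_minimal+8(0x498be84) SBs exec'd 36437 ====
==== SB 1160 [tid 1] __pthread_initialize_minimal+40(0x498bea4) SBs exec'd 36438 ====
"""

if __name__ == "__main__":
    for sb, addr, name in superblocks(SAMPLE_LOG):
        print(f"SB {sb}: {addr:#010x} {name}")
```

Run on the quoted lines, this labels superblock 1158 as __kuser_get_tls, i.e. the last block translated before __pthread_initialize_minimal faults is the kernel's TLS helper, which fits the suspicion that the TLS pointer is where things go wrong.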
I've found this FAQ (it talks about Android; I'm on OE-OAOS-Angstroem-custom):

http://elinux.org/Android_on_OMAP#TLS_issue

The linked chapter and the two that follow it might help in understanding this case.

As far as I can tell from Wikipedia (http://en.wikipedia.org/wiki/Armv7), the term "armv7" refers to the ARM Cortex-* models; mine is supposed to be a Cortex-A8 in the form of the TI OMAP3. The FAQ linked above lists "ARMv6K (MPCORE) and ARMv7 (Cortex). Regarding OMAP, this is OMAP3 (Cortex)." (my case). It says these models have a "TLS issue", or, better said, they have an ASIC add-on that provides TLS in hardware. Older models do not have this unit ("OMAP1 (ARM9) and OMAP2 (ARM11) don't have this issue.").

As I understand it, on old devices you run the "old" method because you have no choice. On new devices you might still be able to run the old method if your platform code is consistent, but you would rather use the "new" method. (Sorry that I can't give them better names; I'm typing on the fly.) The Android solution seems to be a trap operation: all code stays the same, and a fault handler provides the intended behavior.

I have to dig through my packages and check the critical components (e.g. pthreads and mono) to make sure they are configured consistently. As Valgrind is definitely very system dependent, I can see that this might be the root cause. Depending on the TLS design, the needed changes might range from "none" to "runtime-detectable" to "compile-time defined". Anything that replaces that lone segfault with something more informative would indeed be desirable. (As said, this is written on the fly; I might be quite wrong about this.)

My main focus is still simply this: Valgrind should do its job on that system.
Try applying the inverse of r10973 to your tree, with a command like this (not sure if this is right):

svn merge -r10973:10972 svn://svn.valgrind.org/valgrind/branches/ARM .

Does that help?
I found the change in question linked here:

http://old.nabble.com/ARM-set_tls-syscall-handling-td27407796.html

I applied it in reverse in the target machine's native build environment:

# patch -R -p2 <../r10973.patch
patching file coregrind/m_scheduler/scheduler.c
Hunk #1 succeeded at 1070 (offset 10 lines).
patching file coregrind/m_syswrap/syswrap-arm-linux.c
Hunk #1 succeeded at 279 (offset 14 lines).
patching file coregrind/pub_core_threadstate.h

Compiling, installing, testing...

old: # /usr/______bin/valgrind -d -d -v -v --trace-flags=10000000 ls -l
new: # /usr/local/bin/valgrind -d -d -v -v --trace-flags=10000000 ls -l

[...]
==12865==
==12865== HEAP SUMMARY:
==12865==     in use at exit: 14,326 bytes in 38 blocks
==12865==   total heap usage: 106 allocs, 68 frees, 26,143 bytes allocated
==12865==
==12865== Searching for pointers to 38 not-freed blocks
--12865--   Scanning root segment: 0x28000..0x28fff (4096)
--12865--   Scanning root segment: 0x401d000..0x401dfff (4096)
--12865--   Scanning root segment: 0x4022000..0x4023fff (8192)
--12865--   Scanning root segment: 0x4025000..0x4025fff (4096)
--12865--   Scanning root segment: 0x4026000..0x4026fff (4096)
--12865--   Scanning root segment: 0x482e000..0x482efff (4096)
--12865--   Scanning root segment: 0x483e000..0x483efff (4096)
--12865--   Scanning root segment: 0x484d000..0x484dfff (4096)
--12865--   Scanning root segment: 0x485f000..0x485ffff (4096)
--12865--   Scanning root segment: 0x4983000..0x4983fff (4096)
--12865--   Scanning root segment: 0x4984000..0x4986fff (12288)
--12865--   Scanning root segment: 0x49a3000..0x49a3fff (4096)
--12865--   Scanning root segment: 0x49a4000..0x49a5fff (8192)
--12865--   Scanning root segment: 0xbd7fd000..0xbd800fff (16384)
==12865== Checked 72,056 bytes
==12865==
==12865== LEAK SUMMARY:
==12865==    definitely lost: 72 bytes in 2 blocks
==12865==    indirectly lost: 240 bytes in 20 blocks
==12865==      possibly lost: 0 bytes in 0 blocks
==12865==    still reachable: 14,014 bytes in 16 blocks
==12865==         suppressed: 0 bytes in 0 blocks
==12865== Rerun with --leak-check=full to see details of leaked memory
==12865==
==12865== Use --track-origins=yes to see where uninitialised values come from
==12865== ERROR SUMMARY: 2 errors from 1 contexts (suppressed: 25 from 5)
==12865==
==12865== 2 errors in context 1 of 1:
==12865== Conditional jump or move depends on uninitialised value(s)
==12865==    at 0x4016554: index (in /lib/ld-2.9.so)
==12865==
--12865--
--12865-- used_suppression:     25 U1004-ARM-_dl_relocate_object
==12865==
==12865== ERROR SUMMARY: 2 errors from 1 contexts (suppressed: 25 from 5)
--12865:1:core_os VG_(terminate_NORETURN)(tid=1)

It looks like it reached the end of the test run without hitting a segfault. Thank you for that hint. Whatever this tells me about the platform in use...
There is also discussion of this issue in bug 254556.