SUMMARY Exactly what is says in the title. AArch64 ARMv8.3 LDAPR/LDAPRH/LDAPRB instructions aren't supported by Valgrind 3.22.0. It's annoying since on my Pixel 6, the linker64 binary itself has these instructions. STEPS TO REPRODUCE On Termux via an ARMv8.3 device running Android 14: echo 'int main(void) {}' | clang-17 -g -o test -xc - && valgrind ./test Alternatively, compile and run something like this quick dirty test (just returns the first byte of itself): .text .globl _start _start: adrp x8, _start add x8, x8, :lo12:_start ldaprb w0, [x8] mov x8, #93 // exit svc #0 OBSERVED RESULT ==9889== Memcheck, a memory error detector ==9889== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al. ==9889== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info ==9889== Command: ./test ==9889== ARM64 front end: load_store disInstr(arm64): unhandled instruction 0x38BFC109 disInstr(arm64): 0011'1000 1011'1111 1100'0001 0000'1001 ==9889== valgrind: Unrecognised instruction at address 0x411e4f0. ==9889== at 0x411E4F0: __dl__Z26__libc_safe_arc4random_bufPvm (in /apex/com.android.runtime/bin/linker64) ==9889== by 0x41235E7: __dl___libc_init_main_thread_late (in /apex/com.android.runtime/bin/linker64) ==9889== by 0x40D4F4F: __dl__ZL29__linker_init_post_relocationR19KernelArgumentBlockR6soinfo (in /apex/com.android.runtime/bin/linker64) ==9889== by 0x40D4ECF: __dl___linker_init (in /apex/com.android.runtime/bin/linker64) ==9889== by 0x40640B7: __dl__start (in /apex/com.android.runtime/bin/linker64) ==9889== Your program just tried to execute an instruction that Valgrind ==9889== did not recognise. There are two possible reasons for this. ==9889== 1. Your program has a bug and erroneously jumped to a non-code ==9889== location. If you are running Memcheck and you just saw a ==9889== warning about a bad jump, it's probably your program's fault. ==9889== 2. The instruction is legitimate but Valgrind doesn't handle it, ==9889== i.e. it's Valgrind's fault. If you think this is the case or ==9889== you are not sure, please let us know and we'll try to fix it. ==9889== Either way, Valgrind will now raise a SIGILL signal which will ==9889== probably kill your program. ==9889== ==9889== Process terminating with default action of signal 4 (SIGILL) ==9889== Illegal opcode at address 0x411E4F0 ==9889== at 0x411E4F0: __dl__Z26__libc_safe_arc4random_bufPvm (in /apex/com.android.runtime/bin/linker64) ==9889== by 0x41235E7: __dl___libc_init_main_thread_late (in /apex/com.android.runtime/bin/linker64) ==9889== by 0x40D4F4F: __dl__ZL29__linker_init_post_relocationR19KernelArgumentBlockR6soinfo (in /apex/com.android.runtime/bin/linker64) ==9889== by 0x40D4ECF: __dl___linker_init (in /apex/com.android.runtime/bin/linker64) ==9889== by 0x40640B7: __dl__start (in /apex/com.android.runtime/bin/linker64) ==9889== ==9889== HEAP SUMMARY: ==9889== in use at exit: 0 bytes in 0 blocks ==9889== total heap usage: 0 allocs, 0 frees, 0 bytes allocated ==9889== ==9889== All heap blocks were freed -- no leaks are possible ==9889== ==9889== For lists of detected and suppressed errors, rerun with: -s ==9889== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) Illegal instruction Note: 0x38BFC109 is ldaprb w9, [x8] EXPECTED RESULT No failure.
i should have set the severity higher since it renders it unusable for dynamically linked binaries on my device
*** Bug 476895 has been marked as a duplicate of this bug. ***
Created attachment 163612 [details] patch adding LDAPR instruction handling
after upgrading to GCC 13 we're seeing that the compiler generates LDAPR instructions for simple code when "RCpc" extension is enabled, as described here: https://community.arm.com/arm-community-blogs/b/tools-software-ides-blog/posts/enabling-rcpc-in-gcc-and-llvm I've attached a minimal patch that handles the simple variants of LDAPR together with LDAR, following doc here: https://developer.arm.com/documentation/ddi0602/2023-06/Base-Instructions/LDAPR--Load-Acquire-RCpc-Register- There's various other variants which I have not attempted handling of (including Post-index variant, also described in the doc above).
on RHEL8 (or almalinux 8.9 in my case) this is easily triggered by installing gcc-toolset-13 and compiling this C++ program: #include <stdlib.h> const char * my_getenv_once() { static const char * value = getenv("MYVALUE"); return value; } int main(int argc, char **argv) { return (my_getenv_once() ? 0 : 1); } $ rpm -q gcc-toolset-13-gcc-c++ gcc-toolset-13-gcc-c++-13.1.1-4.3.el8.aarch64 $ /opt/rh/gcc-toolset-13/root/usr/bin/c++ -g -O2 -march=armv8.2-a+rcpc mytest.cpp -o mytest $ /usr/bin/valgrind ./mytest ==210903== Memcheck, a memory error detector ==210903== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al. ==210903== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info ==210903== Command: ./mytest ==210903== ARM64 front end: load_store disInstr(arm64): unhandled instruction 0xF8BFC260 disInstr(arm64): 1111'1000 1011'1111 1100'0010 0110'0000 ==210903== valgrind: Unrecognised instruction at address 0x4101e4. ==210903== at 0x4101E4: my_getenv_once() (myfun.cpp:3) ==210903== by 0x4100AB: main (myfun.cpp:7)
Thanks for the patch. Still looking for a machine to test it on. Although with your patch I can now run the testcase under valgrind while my test machine doesn't actually support LDAPR. I am not really familiar with the arm64 code, but the patch does look reasonable. I find the arm documentation hard to navigate, which variants exactly are we missing and which are required to be supported?
From the ARM docs all the still-missing instructions are only available with extra feature flags FEAT_LRCPC2 or FEAT_LRCPC3, so they are probably not as important. Someone familiar with how the "normal" load/store family of instructions are implemented in valgrind could probably add support for the unscaled immediate / post-index / pre-index variants, but that's out of my reach. FEAT_LRCPC2: LDAPR (unscaled immediate) and STLR (unscaled immediate) FEAT_LRCPC3 additions: post-index LDAPR, LDIAPP, STILP, and pre-index STLR instructions, and if advanced SIMD and floating-point is implemented, also the LDAPUR (SIMD&FP), LDAP1 (SIMD&FP), STLUR (SIMD&FP), and STL1 (SIMD&FP) instructions
Hi, On this patch implementing only LDAPR, is it going to be pushed in the repo or we are waiting for a patch implementing all 3 variants ? Cheers, Romain
(In reply to Mark Wielaard from comment #6) > Thanks for the patch. Still looking for a machine to test it on. Although > with your patch I can now run the testcase under valgrind while my test > machine doesn't actually support LDAPR. > You can use a Hetzner Cloud CAX11 Ampere VM for practically no cost. https://www.hetzner.com/cloud/
I have some credits on AWS, so I could probably test it there.
I was able to verify that the proposed patch does result in a successful valgrind run of Arne's production case on Graviton. That's only verifying that it applies, compiles, doesn't hit the unrecognized instruction error in valgrind, and the program behaves as it does outside of valgrind. I haven't however, dove into the IR and surrounding logic to ensure that it is semantically correct.
I believe I ran into this issue. I applied the patch and my application gets past it, although I don't know 100% if it is correct. Would be nice if this could get reviewed and merged into the valgrind release. I am using ARM Neoverse N2 CPU core: https://www.marvell.com/content/dam/marvell/en/public-collateral/embedded-processors/marvell-octeon-10-dpu-platform-product-brief.pdf According to SIGILL diagnostic the unhandled instruction is: 0xB8BFC000 When I decode this instruction using this link: https://armconverter.com/?disasm&code=00C0BFB8 I see it is also ldapr instruction: ldapr w0, [x0] My backtrace looks like this: ARM64 front end: load_store disInstr(arm64): unhandled instruction 0xB8BFC000 disInstr(arm64): 1011'1000 1011'1111 1100'0000 0000'0000 ==3466== valgrind: Unrecognised instruction at address 0xc5a7330. ==3466== at 0xC5A7330: load (core_arch_ops_gcc_aarch64.hpp:857) ==3466== by 0xC5A7330: load (core_ops_gcc_atomic.hpp:86) ==3466== by 0xC5A7330: load (atomic_impl.hpp:473) ==3466== by 0xC5A7330: boost::thread_detail::enter_once_region(boost::once_flag&) (once_atomic.cpp:39) ==3466== by 0xC5A1187: call_once<void (&)()> (once_atomic.hpp:123) ==3466== by 0xC5A1187: boost::detail::set_current_thread_data(boost::detail::thread_data_base*) (thread.cpp:160) ==3466== by 0xC5A44F3: thread_proxy (thread.cpp:174) ==3466== by 0xC837B97: start_thread (pthread_create.c:481) ==3466== by 0xD00199B: thread_start (clone.S:79) ==3466== Your program just tried to execute an instruction that Valgrind ==3466== did not recognise. There are two possible reasons for this. ==3466== 1. Your program has a bug and erroneously jumped to a non-code ==3466== location. If you are running Memcheck and you just saw a ==3466== warning about a bad jump, it's probably your program's fault. ==3466== 2. The instruction is legitimate but Valgrind doesn't handle it, ==3466== i.e. it's Valgrind's fault. If you think this is the case or ==3466== you are not sure, please let us know and we'll try to fix it. ==3466== Either way, Valgrind will now raise a SIGILL signal which will ==3466== probably kill your program. ==3466== ==3466== Process terminating with default action of signal 4 (SIGILL): dumping core
To add to my previous comment, the unhandled instruction occurs at: line 857 of boost-1.82.0/boost-1.82.0/libs/atomic/include/boost/atomic/detail/core_arch_ops_gcc_aarch64.hpp which you can see below is ldapr: 854 #if defined(BOOST_ATOMIC_DETAIL_AARCH64_HAS_RCPC) 855 if (order == memory_order_consume || order == memory_order_acquire) 856 { 857 __asm__ __volatile__ 858 ( 859 "ldapr %w[value], %[storage]\n\t" 860 : [value] "=r" (v) 861 : [storage] "Q" (storage) 862 : "memory" 863 ); 864 } 865 else 866 #endif 867 { 868 __asm__ __volatile__ 869 ( 870 "ldar %w[value], %[storage]\n\t" 871 : [value] "=r" (v) 872 : [storage] "Q" (storage) 873 : "memory" 874 ); 875 }
(In reply to Rob Bresalier from comment #12) > I believe I ran into this issue. I applied the patch and my application gets > past it, although I don't know 100% if it is correct. Would be nice if this > could get reviewed and merged into the valgrind release. There is also the problem of access to hardware that supports these features.
The patch by Arne Juul fixes the issue for me. Please considering merging the patch.