SUMMARY

I tried to run valgrind on the following Rust program on AArch64:

```rust
fn main() {
    let _n = std::time::Instant::now();
}
```

I ran `valgrind` with no flags, just `/usr/local/bin/valgrind ./target/debug/instant`, and got the error:

```
/usr/local/bin/valgrind ./target/debug/instant
==16560== Memcheck, a memory error detector
==16560== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==16560== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==16560== Command: ./target/debug/instant
==16560==
ARM64 front end: load_store
disInstr(arm64): unhandled instruction 0xC87F2D89
disInstr(arm64): 1100'1000 0111'1111 0010'1101 1000'1001
==16560== valgrind: Unrecognised instruction at address 0x11ffa8.
==16560==    at 0x11FFA8: std::time::Instant::now (atomic.rs:2574)
==16560==    by 0x10EB7B: instant::main (main.rs:2)
==16560==    by 0x10ECA3: core::ops::function::FnOnce::call_once (function.rs:227)
==16560==    by 0x10EC23: std::sys_common::backtrace::__rust_begin_short_backtrace (backtrace.rs:125)
==16560==    by 0x10ED83: std::rt::lang_start::{{closure}} (rt.rs:63)
==16560==    by 0x12219F: std::rt::lang_start_internal (function.rs:259)
==16560==    by 0x10ED4B: std::rt::lang_start (rt.rs:62)
==16560==    by 0x10EBB7: main (in /local/home/pnkfelix/instant/target/debug/instant)
==16560== Your program just tried to execute an instruction that Valgrind
==16560== did not recognise.  There are two possible reasons for this.
==16560== 1. Your program has a bug and erroneously jumped to a non-code
==16560==    location.  If you are running Memcheck and you just saw a
==16560==    warning about a bad jump, it's probably your program's fault.
==16560== 2. The instruction is legitimate but Valgrind doesn't handle it,
==16560==    i.e. it's Valgrind's fault.  If you think this is the case or
==16560==    you are not sure, please let us know and we'll try to fix it.
==16560== Either way, Valgrind will now raise a SIGILL signal which will
==16560== probably kill your program.
```

OBSERVED RESULT

A disInstr failure

EXPECTED RESULT

Program runs with no instructions unhandled.
Oh, this is probably a duplicate of https://bugs.kde.org/show_bug.cgi?id=434283 ?
I can't reproduce this, testing on Fedora 33 on Parallels Workstation running on an M1 Mac Mini, with either rustc-1.55 or rustc-1.56.

I suspect this is some kind of hardware capabilities problem, in that rustc-generated code is using instructions that V doesn't claim to support. From irc I see that the hardware you used here was a "Graviton 2", which has Neoverse N1 cores, and they support AArch64-v8.2. The M1 is AArch64-v8.6 I think, although which of those extensions are available within Parallels I don't know.

That said, I'm still surprised it fails for you, given that it doesn't fail here, *and* given that V doesn't even fully support v8.2 of the instruction set.

The output of /usr/bin/lscpu for F33-on-Parallels-on-M1 is below. Can you show the output on the failing target?

Architecture:                    aarch64
CPU op-mode(s):                  64-bit
Byte Order:                      Little Endian
CPU(s):                          8
On-line CPU(s) list:             0-7
Thread(s) per core:              1
Core(s) per socket:              8
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       ARM
Model:                           0
Stepping:                        r0p0
BogoMIPS:                        48.00
NUMA node0 CPU(s):               0-7
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:        Mitigation; __user pointer sanitization
Vulnerability Spectre v2:        Not affected
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 asimddp sha512 asimdfhm dit uscat ilrcpc flagm ssbs sb paca pacg dcpodp flagm2 frint
This seems to be a bug in valgrind. Support for some ld*p instructions (which are ARMv8.0) is not implemented.

Simple reproducer in C++ for ldxp:

--- snip ---
#include <atomic>

int main()
{
  std::atomic<__int128_t> x;
  x.store(23, std::memory_order_relaxed);
}
--- snip ---

$ clang++-12 main.cxx -o main
$ valgrind ./main
==25164== Memcheck, a memory error detector
==25164== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==25164== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info
==25164== Command: ./main
==25164==
ARM64 front end: load_store
disInstr(arm64): unhandled instruction 0xC87F3168
disInstr(arm64): 1100'1000 0111'1111 0011'0001 0110'1000
==25164== valgrind: Unrecognised instruction at address 0x400690.
==25164==    at 0x400690: std::atomic<__int128>::store(__int128, std::memory_order) (in /data/dev/clangtest/main)
==25164==    by 0x4005F7: main (in /data/dev/clangtest/main)
==25164== Your program just tried to execute an instruction that Valgrind
==25164== did not recognise.  There are two possible reasons for this.
==25164== 1. Your program has a bug and erroneously jumped to a non-code
==25164==    location.  If you are running Memcheck and you just saw a
==25164==    warning about a bad jump, it's probably your program's fault.
==25164== 2. The instruction is legitimate but Valgrind doesn't handle it,
==25164==    i.e. it's Valgrind's fault.  If you think this is the case or
==25164==    you are not sure, please let us know and we'll try to fix it.
==25164== Either way, Valgrind will now raise a SIGILL signal which will
==25164== probably kill your program.
==25164==
==25164== Process terminating with default action of signal 4 (SIGILL)
==25164==  Illegal opcode at address 0x400690
==25164==    at 0x400690: std::atomic<__int128>::store(__int128, std::memory_order) (in /data/dev/clangtest/main)
==25164==    by 0x4005F7: main (in /data/dev/clangtest/main)
==25164==
==25164== HEAP SUMMARY:
==25164==     in use at exit: 0 bytes in 0 blocks
==25164==   total heap usage: 1 allocs, 1 frees, 72,704 bytes allocated
==25164==
==25164== All heap blocks were freed -- no leaks are possible
==25164==
==25164== For lists of detected and suppressed errors, rerun with: -s
==25164== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Illegal instruction (core dumped)
(In reply to Julian Seward from comment #2)
> The output of /usr/bin/lscpu for F33-on-Parallels-on-M1 is below.
> Can you show the output on the failing target?

$ /usr/bin/lscpu
Architecture:                    aarch64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
CPU(s):                          32
On-line CPU(s) list:             0-31
Thread(s) per core:              1
Core(s) per socket:              32
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       ARM
Model:                           1
Model name:                      Neoverse-N1
Stepping:                        r3p1
BogoMIPS:                        243.75
L1d cache:                       2 MiB
L1i cache:                       2 MiB
L2 cache:                        32 MiB
L3 cache:                        32 MiB
NUMA node0 CPU(s):               0-31
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:        Mitigation; __user pointer sanitization
Vulnerability Spectre v2:        Not affected
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
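As a quick cross-check of the hardware-capabilities theory, the two `Flags:` lines posted in this thread can be compared mechanically. This is only a minimal Rust sketch and not part of either report: the constant and function names are made up for illustration, and the flag strings are copied from the two lscpu outputs above.

```rust
use std::collections::BTreeSet;

// Flags line from the Fedora 33 guest on the M1 (where the test passes).
// `asimdfhm` is a single flag.
const M1_GUEST_FLAGS: &str = "fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics \
    fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 asimddp sha512 asimdfhm \
    dit uscat ilrcpc flagm ssbs sb paca pacg dcpodp flagm2 frint";

// Flags line from the Neoverse-N1 machine (where the test fails).
const NEOVERSE_N1_FLAGS: &str = "fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics \
    fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs";

/// Return the flags that appear in `ours` but not in `theirs`,
/// in the order they appear in `ours`.
fn only_in<'a>(ours: &'a str, theirs: &str) -> Vec<&'a str> {
    let other: BTreeSet<&str> = theirs.split_whitespace().collect();
    ours.split_whitespace()
        .filter(|flag| !other.contains(*flag))
        .collect()
}
```

`only_in(NEOVERSE_N1_FLAGS, M1_GUEST_FLAGS)` comes back empty: every flag reported on the failing machine is also reported in the M1 guest. So the flag lists alone do not point at an instruction that only the failing machine could execute.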
I looked into this a bit. It does indeed appear that LD{A}XP and ST{L}XP exist in AArch64 8.0 but are not implemented in V. I am somewhat surprised by this since I distinctly remember carefully making a list of all instructions that needed to be implemented, when doing the initial AArch64 port, so I'm not sure how these got forgotten.

I will fix it, but it may not be an immediate fix. VEX's intermediate representation has a way to represent doubleword CAS, but can only represent single word load-exclusive / store-check, so it will need to be extended accordingly, and that may have some minor knock-on effect on other architectures.

I would guess that the immediate cause of the failure is that LLVM 12 has started generating these instructions. That would explain why rustc shows the problem in comment 0 -- presumably that is rustc nightly -- and also why clang++ 12 shows the problem in comment 3.
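For reference, the two words that disInstr rejected can be decoded by hand. The sketch below assumes the A64 "load/store exclusive pair" layout (sz in bit 30, L in bit 22, o0 in bit 15, Rs/Rt2/Rn/Rt in bits 20:16, 14:10, 9:5 and 4:0). The function name is hypothetical and the code has nothing to do with how VEX itself decodes instructions.

```rust
/// Decode an A64 "load/store exclusive pair" word into assembly text.
/// Returns None for any word outside that encoding class.
/// (Hypothetical helper for illustration; unrelated to VEX's decoder.)
fn disassemble_exclusive_pair(word: u32) -> Option<String> {
    // Fixed bits of the class: bit 31 = 1, bits 29:24 = 001000,
    // bit 23 (o2) = 0, bit 21 (o1) = 1.
    const FIXED_MASK: u32 = 0xBFA0_0000;
    const FIXED_BITS: u32 = 0x8820_0000;
    if (word & FIXED_MASK) != FIXED_BITS {
        return None;
    }

    let is_64bit = ((word >> 30) & 1) == 1; // sz: 0 = W registers, 1 = X registers
    let is_load = ((word >> 22) & 1) == 1; // L
    let ordered = ((word >> 15) & 1) == 1; // o0: acquire (loads) / release (stores)
    let rs = (word >> 16) & 0x1f; // status register; only meaningful for stores
    let rt2 = (word >> 10) & 0x1f;
    let rn = (word >> 5) & 0x1f;
    let rt = word & 0x1f;

    // Register 31 is the zero register for data operands and SP for the base.
    let data = |r: u32| match (r, is_64bit) {
        (31, true) => "xzr".to_string(),
        (31, false) => "wzr".to_string(),
        (_, true) => format!("x{}", r),
        (_, false) => format!("w{}", r),
    };
    let base = if rn == 31 { "sp".to_string() } else { format!("x{}", rn) };

    Some(if is_load {
        let mnemonic = if ordered { "ldaxp" } else { "ldxp" };
        format!("{} {}, {}, [{}]", mnemonic, data(rt), data(rt2), base)
    } else {
        let mnemonic = if ordered { "stlxp" } else { "stxp" };
        // The status result is always written to a W register.
        let status = if rs == 31 { "wzr".to_string() } else { format!("w{}", rs) };
        format!("{} {}, {}, {}, [{}]", mnemonic, status, data(rt), data(rt2), base)
    })
}
```

Under these assumptions, 0xC87F2D89 from comment 0 decodes as `ldxp x9, x11, [x12]` and 0xC87F3168 from comment 3 as `ldxp x8, x12, [x11]`. In both cases the load-exclusive pair is the first instruction to trap.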
*** Bug 434283 has been marked as a duplicate of this bug. ***
Created attachment 143328
WIP patch that will possibly get you back on the road. DO NOT LAND.

Fixing this is a whole trip because the various IR and arm64 frameworks were not really designed to accommodate it. Anyways, here is a WIP patch. It seems to work for simple tests (in the patch) but is not fully tested.

It will not work if you run with `--sim-hints=fallback-llsc` or if the fallback LL/SC implementation is auto-selected, based on your processor, at startup.

It applies against the head and also against a vanilla 3.18.1 tarball, although I haven't tested it in the latter case. If anyone wants to test it, and let me know if it works, that would be appreciated.

I will try to finish it up properly this coming week.
Created attachment 143474
Final proposed patch

Final proposed patch. This includes the fix for blocking bug 445354, which is small and which I will land separately and first.
Landed, 530df882b8f60ecacaf2b9b8a719f7ea1c1d1650. I think it's OK, and the patch contains test cases, but .. please test.