Created attachment 122885 [details] patch to avoid SIGILL; doesn't implement cachegrind information SUMMARY Cache flush opcodes dc cvac, cvap, cvadp, civac cause valgrind to SIGILL, despite cvac and civac being in the arch baseline (the other two, cvap and cvadp, were added in 8.2 and 8.5 ISAs respectively). The only flush instruction that works is cvau. STEPS TO REPRODUCE 1. compile a program that includes "dc cvac" 2. test that it works on real hardware 3. run it under valgrind OBSERVED RESULT SIGILL EXPECTED RESULT On any arm64 machine, instructions other than dc cvap, cvadp should work. Valgrind doesn't need to ban opcodes from newer processors by itself, thus there's little point in banning cvap cvadp either. ADDITIONAL INFORMATION The function of those opcodes: * dc cvau makes icache same as dcache (ie, flushes L1 to L2) * dc cvac flushes all cache levels to real memory (L1-L3 to DRAM) * dc cvap flushes to system-defined "point of persistence" -- which might be memory controller or even no operation at all * dc cvadp flushes to actual medium the persistent memory is backed with (3DXpoint on Intel DCPMM, flash on HPE/IBM NVDIMM-N) * dc civac does cvac then evicts the cacheline from L1-L3 Here's a working but incomplete patch that has been applied in Debian (https://bugs.debian.org/930708); I see that valgrind doesn't use Phabricator thus I'm not sure what's the appropriate place for patch submissions. This patch stops the SIGILL, allowing use of tools like memcheck, drd or helgrind. It does not pass appropriate information to cachegrind -- I don't know its representation of cache levels well enough.
In c82d35f6d67ed34cf20d79f90a7400bd7f83ebad just now I added support for 'dc civac', which is the only one of the variants I have a way to test right now. Is that a good enough fix for your use case(s) ?
Alas, nope -- I specifically need the variant without invalidation. On ARMv8.0 and 8.1 that's cvac; what I really want is cvap but that's unsupported on any hardware I have access to. Mainline valgrind (ie, without pmemcheck) doesn't appear to care about flush depth, thus considering them to be the same is ok. Other than civac, which should notify cachegrind the cacheline is gone.
My problem is that it is safer if we don't have untested code in the tree. That's why I didn't take your original patch. If you can post the minimum- change patch that provides |cvac|, relative to the current trunk, and verify it works, then I'm happy to land it.
Created attachment 134289 [details] patch v2, only known instructions
You've added civac which differs from cvac only by trashing the cache entry after committing it to memory ("i" stands for "invalidate"). cvac is the instruction I need, and I'm using it successfully. It works on bare metal, and, with the patch applied, also in valgrind. The codebase I'm using this in is https://github.com/pmem/pmdk/; Or rather, what I really want is cvap/cvacp, but that's available only in ARMv8.2/ARMv8.5 respectively. While I have no access to such hardware, I just confirmed with ARM folks that these instructions work the way I understand. As the difference between flush targets doesn't matter for now (there's pmemcheck which would need this information, but that's not submitted to you yet), as long as the opcodes are accepted without a crash, that's pretty much enough. Unless you have persistent memory hardware, cache-vs-memory doesn't matter much.
I’ve started working on VGP_arm64_freebsd and I’m seeing errors like this. I’ve been in touch with Louis Brunner (working on Darwin arm64) and he is having similar or the same issues. I’ll try to look at the asm and/or try this patch to see if it helps.
In the end it was me being stupid. I missed out the conditional block in VG_(invalidate_icache)
commit d39ba3bade49483d7c1067405a881b214e895cf9 (HEAD -> master, origin/master, origin/HEAD) Author: Paul Floyd <pjfloyd@wanadoo.fr> Date: Sat May 11 15:21:47 2024 +0200 Bug 412377 - SIGILL on cache flushes on arm64 Patch contributed by Adam Borowski <kilobyte@angband.pl>