Add support for the NNPA facillity introduced with arch14. It includes the instructions nnpa, vclfnh, vclfnl, vcrnf , vcfn, and vcnf.
Created attachment 151228 [details] Preliminary patch for adding NNPA support This introduces NNPA support, but the memory effects of the NNPA instruction are not represented correctly. In particular this means that memcheck will have false positives as well as false negatives. The current approach uses a dirty helper to implement NNPA. Currently, only a single memory effect can be specified for a dirty helper, and the size is fixed. However, the NNPA instruction behaves more like a subroutine that runs on a separate CPU. It reads and writes various memory regions at dynamic addresses with dynamic sizes. In that regard it might be more comparable with a system call. I mainly attach the patch for illustration purposes. Ideas how to do represent the memory effects correctly are very welcome.
Small update: I'm now considering a different approach that's more similar to the way system calls are handled. Instead of calling a dirty helper, we would have the dispatcher call a guest-specific extension and describe memory effects with the PRE/POST_MEM_* macros. I'm still investigating if this is feasible and how to do that exactly.
Extension idea was posted to the mailinglist: https://sourceforge.net/p/valgrind/mailman/message/58753610/
(In reply to Mark Wielaard from comment #3) > Extension idea was posted to the mailinglist: > https://sourceforge.net/p/valgrind/mailman/message/58753610/ Right. I'd like to go ahead with this, if that's OK. There shouldn't be any impact on other platforms at this point, AFAIK, and any potential improvements can likely be performed on top without too much hassle.
I haven't reviewed the fully proposal, but in general I like the idea of using the same mechanism as the syscalls for instrumenting these more complex instructions. It is at least familiar. You might want to rename guest_IP_AT_SYSCALL (it looks like it has the right semantics, but looks funny) and you might want to look at what defines VG_(client_extension). Currently it is only defined for s390x, so building scheduler.c will break on any other arch. It should probably be its own function inside scheduler.c with something like #if defined(VGA_s390x) x390x_... #else VG_(core_panic)("arch doesn't implement...") #endif. It would be nice to get this in this week so it makes valgrind -3.23.0-RC1 and can get a bit wider testing.
Created attachment 168623 [details] Updated version of the extension proposal This updated version of the extension proposal addresses the following: * Add a comment that explains the use of guest_IP_AT_SYSCALL for extension handlers. * Ensure that compilation doesn't break on other platforms. * Fix some issues with the PRNO implementation.
(In reply to Andreas Arnez from comment #6) > Created attachment 168623 [details] > Updated version of the extension proposal I pushed this now, as a preparation to adding NNPA support. If there are still any issues with this, please let me know.
(In reply to Andreas Arnez from comment #7) > (In reply to Andreas Arnez from comment #6) > > Created attachment 168623 [details] > > Updated version of the extension proposal > I pushed this now, as a preparation to adding NNPA support. If there are > still any issues with this, please let me know. Looks good on both x86_64 and s390x (both fedora 39). On s390x the ppno.vgtest succeeds. x86_64 had zero failures. s390x had one failure (also before you patch): drd/tests/tc04_free_lock (stderr) I did push two small fixups: commit 02bca4fa3a62dafbc8f051a0f766f9d5763f6e1c Author: Mark Wielaard <mark@klomp.org> Date: Thu Apr 18 15:58:37 2024 +0200 Add pub_core_extension.h to coregrind/Makefile.am noinst_HEADERS commit 381e2a25fbf8e7d16d66851a5d0e1f919cf69505 (HEAD -> master) Author: Mark Wielaard <mark@klomp.org> Date: Thu Apr 18 17:45:16 2024 +0200 Don't include pub_tool_tooliface.h in priv_types_n_macros.h post-regtest-checks complains: *** File coregrind/m_extension/priv_types_n_macros.h must not include pub_tool_tooliface.h Which isn't really necessary anyway since priv_types_n_macros.h already includes pub_core_tooliface.h.
Support for the NNPA instruction has also been pushed, so NNPA support should be complete now.