436411 – Addition of arm64 v8.2 scalar FABD, FACGE, FACGT and FADD instructions

Bug 436411 - Addition of arm64 v8.2 scalar FABD, FACGE, FACGT and FADD instructions

Summary: Addition of arm64 v8.2 scalar FABD, FACGE, FACGT and FADD instructions

Status:	CLOSED FIXED

Alias:	None

Product:	valgrind
Classification:	Developer tools
Component:	vex (other bugs)
Version First Reported In:	unspecified
Platform:	Other Linux

Importance:	NOR normal
Target Milestone:	---
Assignee:	ahashmi

URL:
Keywords:

Depends on:
Blocks:

Reported:	2021-04-30 15:30 UTC by ahashmi
Modified:	2021-06-08 10:42 UTC (History)
CC List:	2 users (show)

See Also:
Latest Commit:
Version Fixed/Implemented In:
Sentry Crash Report:

Attachments
Adds arm64 v8.2 scalar FABD, FACGE, FACGT and FADD instructions (198.16 KB, text/plain) 2021-05-07 16:44 UTC, ahashmi	Details
Adds arm64 v8.2 scalar FABD, FACGE, FACGT and FADD instructions (198.22 KB, text/plain) 2021-05-11 13:10 UTC, ahashmi	Details
Adds arm64 v8.2 scalar FABD, FACGE, FACGT and FADD instructions (201.10 KB, text/plain) 2021-06-02 13:56 UTC, ahashmi	Details
Addition of arm64 v8.2 scalar FABD, FACGE, FACGT and FADD instructions (202.05 KB, text/plain) 2021-06-07 12:38 UTC, ahashmi	Details
Show Obsolete (3) View All Add an attachment

Note You need to log in before you can comment on or make changes to this bug.

Description ahashmi 2021-04-30 15:30:24 UTC

This patch is part of work adding arm64 v8.2 support as summarised in container bug https://bugs.kde.org/show_bug.cgi?id=428016

Comment 1 ahashmi 2021-05-07 16:44:45 UTC

Created attachment 138216 [details]
Adds arm64 v8.2 scalar FABD, FACGE, FACGT and FADD instructions

This patch is for the scalar instruction variant, the vector patch will follow next.

I have left the 16 bit immediate handling in iselIntExpr_RIL_wrk() as a TODO while I get to grips with v8.2 immediate encodings.

Comment 2 ahashmi 2021-05-11 13:10:03 UTC

Created attachment 138343 [details]
Adds arm64 v8.2 scalar FABD, FACGE, FACGT and FADD instructions

This is an updated patch which fixes a bug I found while working on the vector variants of these instructions.

Comment 3 Julian Seward 2021-05-25 11:45:47 UTC

This is for the most part good, and apart from the following I have no
concerns.

I see two issues of concern, both pertaining to CPU feature-gating.  The first
is in the front end:

+   if (bitU == 1 && size == X11 && opcode == BITS5(0,0,0,1,0)) {
+      /* -------- 1,11,00010 FABD h_h_h -------- */
..

and the same for the other added instructions: these need to be feature-gated,
so that they are not accepted on non-8.2 capable hosts.  Consider what would
happen if, with these changes, we tried to run a binary containing these insns
on an 8.0 host.  Then the front end will accept the insns, generate (eg
Iop_Add16F), which the back end will translate back into an 8.2 insn, and we
will take a SIGILL in the generated code.  But that's not the correct
behaviour: what is required is that the front end declines to decode the insn,
and that in turn will cause the existing front end machinery to eventually
deliver a SIGILL to the guest.

The second area of concern is similar but in the back end, regarding the
Iex_Const case for iselIntExpr_RIL_wrk.  To be honest I'm not sure that this
should have been changed at all -- unless the RIL field, with its division into
N/R/S fields, has acquired a new interpretation, but not a new layout, in 8.2
instructions.  Can you clarify: does the existing immediate-for-bitfield
instruction-fragment (here called RIL, and the the ARM/ARM described using the
N/R/S triple) get a new interpretation in 8.2 instructions?

If it doesn't, and is instead a new kind of immediate field for these new
insns, then it seems to me that we'd need a new struct altogether, in similar
style to, but different from, ARM64RI6/ARM64RIL/ARM64RIA/ARM64AMode.

Independent of all of that, there's still the issues that iselIntExpr_RIL_wrk
can't be allowed to have different behaviour than it does now, without first
testing that we're on an 8.2 host rather than an 8.0 host.

In the backend, the top level iselSB_ARM64 passes around a `hwcaps_host`, and
that should contain whatever gating info is needed.  In the front end,
disInstr_ARM64 similarly passes around an `archinfo` and so the
`archinfo->hwcaps` field will contain whatever's needed for gating in the front
end.

+         /* This is a temporary alert while work is done on adding v8.2
+          * support. It will be removed and fixed when Ity_I16 immediate
+          * encodings are understood better as v8.2 work progresses.
+          */
+         vpanic("iselIntExpr_RIL_wrk: TODO: unhandled Ity_I16 for Iex_Const");

Regardless of all of the above, that can't land .. it would potentially
destabilise the back end.  If you don't want to handle the case specially, then
jump to the "default case" bit at the end of the function, which can handle
anything.  (The same scheme applies to many other isel-functions too).

Comment 4 ahashmi 2021-06-02 13:56:51 UTC

Created attachment 138947 [details]
Adds arm64 v8.2 scalar FABD, FACGE, FACGT and FADD instructions

Thanks for reviewing, apologies for the delay in responding.

The first issue pertaining to CPU feature-gating of the front-end was something I did implement in the first v8.2 instruction, FADDP (https://bugs.kde.org/show_bug.cgi?id=413547) but for some reason ommitted in later v8.2 work! I've added the VEX_HWCAPS_ARM64_FP16 check for all half-precision FP instructions.

The second issue to do with the Iex_Const case for iselIntExpr_RIL_wrk() turns out to be some sort of confusion on my part! I can't remember exactly what happened but I think an assert to catch Ity_I16 must've fired partway through development and I just assumed it was the the Iex_Const in iselIntExpr_RIL_wrk(). Anyway, I've removed the unnecessary changes and tested on v8.x where x<3.

I'd prefer to get this patch past review and merged before addressing the same hwcaps-gating issue in https://bugs.kde.org/show_bug.cgi?id=436873.

Comment 5 Julian Seward 2021-06-03 15:53:10 UTC

(In reply to ahashmi from comment #4)
> Created attachment 138947 [details]
> Adds arm64 v8.2 scalar FABD, FACGE, FACGT and FADD instructions

That all looks fine; please land.  Just before you do -- if you haven't
already -- please though run the test cases once by hand on Memcheck,
with --track-origins=yes, and check nothing breaks.  This has been
known to happen in the past, at least to me :-)

Comment 6 ahashmi 2021-06-07 12:38:43 UTC

Created attachment 139074 [details]
Addition of arm64 v8.2 scalar FABD, FACGE, FACGT and FADD instructions

> That all looks fine; please land.  Just before you do -- if you haven't
> already -- please though run the test cases once by hand on Memcheck,
> with --track-origins=yes, and check nothing breaks.  This has been
> known to happen in the past, at least to me :-)

Good point! And now I remember why I made the RIL change <facepalm>. When run with --tool=memcheck:
vex: priv/host_arm64_isel.c:1292 (iselIntExpr_RIL_wrk): Assertion `ty
== Ity_I64 || ty == Ity_I32' failed.

Thanks for the out-of-band email explanation about IR traversal for isel and the RIL functions. The latest patch includes the fix and tests pass with --tool=memcheck and --track-origins=yes.

Comment 7 ahashmi 2021-06-08 10:41:34 UTC

Landed, 3ac86151404e6d018473dc3f6cd8b933be1ee038.