Bug 432161 - Addition of arm64 v8.2 FABS, FNEG and FSQRT instructions
Summary: Addition of arm64 v8.2 FABS, FNEG and FSQRT instructions
Status: CLOSED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: vex (show other bugs)
Version: unspecified
Platform: Other Linux
: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-01-26 14:05 UTC by ahashmi
Modified: 2021-01-29 13:52 UTC (History)
0 users

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
Adds arm64 v8.2 FABS, FNEG and FSQRT instructions (105.05 KB, text/plain)
2021-01-26 15:14 UTC, ahashmi
Details
Adds arm64 v8.2 FABS, FNEG and FSQRT instructions (105.07 KB, text/plain)
2021-01-27 16:22 UTC, ahashmi
Details

Note You need to log in before you can comment on or make changes to this bug.
Description ahashmi 2021-01-26 14:05:02 UTC
Addition of arm64 half-precision v8.2 FABS, FNEG and FSQRT instructions.

This is part of arm64 v8.2 container bug https://bugs.kde.org/show_bug.cgi?id=428016
Comment 1 ahashmi 2021-01-26 15:14:23 UTC
Created attachment 135211 [details]
Adds arm64 v8.2 FABS, FNEG and FSQRT instructions

Patch ready for review.
Comment 2 ahashmi 2021-01-26 15:16:27 UTC
When ready for checkin, please paste the following text as the Git commit message:

- - - snip
432161 Addition of arm64 v8.2 FADDP, FNEG and FSQRT

This patch adds FP half-precision support for the following:
FABS <Hd>, <Hn>
FABS <Vd>.<T>, <Vn>.<T>
FNEG <Hd>, <Hn>
FNEG <Vd>.<T>, <Vn>.<T>
FSQRT <Hd>, <Hn>
FSQRT <Vd>.<T>, <Vn>.<T>

Fixes https://bugs.kde.org/show_bug.cgi?id=432161
- - - snip

Unless there are any suggestions for improvements, I'll try and keep this Git commit message format for all future arm64 v8.2 instruction patches.
Comment 3 Julian Seward 2021-01-27 08:48:05 UTC
Nicely done!  I did spot one bug though:

@@ -3752,6 +3775,8 @@ IRAtom* expr2vbits_Binop ( MCEnv* mce,
 
       case Iop_I32StoF32x4:
       case Iop_F32toI32Sx4:
+      case Iop_Sqrt16Fx8:
+         return unary16Fx8_w_rm(mce, vatom1, vatom2);
       case Iop_Sqrt32Fx4:
          return unary32Fx4_w_rm(mce, vatom1, vatom2);
       case Iop_Sqrt64Fx2:

This isn't quite right, in that the two added lines change the behaviour for
the fallthrough cases immediately above.  I mean: consider what happens for
Iop_I32StoF32x4 and Iop_F32toI32Sx4 with and without the added lines.

Otherwise it's fine.

One final question: did you check that the relevant test programs run OK
(without any assertion failures) on memcheck, by running them by hand, with
--tool=memcheck and also --tool=memcheck --track-origins=yes ?  If not, could
you please do that and make sure nothing breaks.
Comment 4 ahashmi 2021-01-27 16:22:48 UTC
Created attachment 135237 [details]
Adds arm64 v8.2 FABS, FNEG and FSQRT instructions

Fixes bug raised in review (case Iop_Sqrt16Fx8: position).
Comment 5 ahashmi 2021-01-27 16:27:27 UTC
> This isn't quite right, in that the two added lines change the behaviour for the fallthrough cases immediately above.
Gah, of course! Careless of me. Fixed.

> One final question: did you check that the relevant test programs run OK (without any assertion failures) on memcheck, by running them by hand, with --tool=memcheck and also --tool=memcheck --track-origins=yes ?
Yes, there are no assertions and nothing breaks. Alse checked debug output displays new mnemonics and operands (using --trace-flags=10001111 --tool=none).
Comment 6 Julian Seward 2021-01-29 12:41:00 UTC
(In reply to ahashmi from comment #5)

Looks good to me.  Land!
Comment 7 ahashmi 2021-01-29 13:50:52 UTC
Landed, 7593a4773997807695d6514e9d01a60cb489851c