Created attachment 133458 [details] Functional support for ISA 3.1, New Iops Power PC missing ISA 3.1 support. Additional patches, part 7
Created attachment 133459 [details] Test cases for new IOps The test cases
This is the 7th bugzilla for the ISA 3.1 Power PC support. The bugzilla for part 6 is 427404.
Part 8 patches are in bugzilla 429354
(In reply to Carl Love from comment #0) > Created attachment 133458 [details] > Functional support for ISA 3.1, New Iops > > Power PC missing ISA 3.1 support. Additional patches, part 7 It seems to me that the mc_translate.c sections of the patch aren't right, which could cause Memcheck to assert when running programs containing the new instructions. ------ diff --git a/memcheck/mc_translate.c b/memcheck/mc_translate.c @@ -3410,8 +3412,7 @@ IRAtom* expr2vbits_Triop ( MCEnv* mce, unary64Fx2_w_rm(mce, vatom1, vatom2), unary64Fx2_w_rm(mce, vatom1, vatom3))); - - default: + default: Nit: return the `default:` to its correct indentation ------ + // CARLL not sure about this yet. Remove this pls ------ - do_And_Or: + /* Int 128-bit Integer two arg */ + // CARLL not sure about this yet. + case Iop_DivU128: + case Iop_DivS128: + case Iop_DivU128E: + case Iop_DivS128E: + case Iop_ModU128: + case Iop_ModS128: + uifu = mkUifUV128; difd = mkDifDV128; + and_or_ty = Ity_V128; improve = mkImproveORV128; goto do_And_Or; + This strikes me as wrong. `do_And_Or` is only intended to instrument AND and OR operations. But here, you're using it to instrument Div/Mod operations, which isn't correct. ------ @@ -5001,15 +5024,18 @@ IRExpr* expr2vbits_Unop ( MCEnv* mce, IROp op, IRAtom* atom ) return assignNew('V', mce, Ity_I64, unop(Iop_128HIto64, vatom)); case Iop_F128LOtoF64: /* F128 -> low half of F128 */ case Iop_D128LOtoD64: /* D128 -> low half of D128 */ + case Iop_TruncF128toI128S: /* F128 -> I128S */ + case Iop_TruncF128toI128U: /* F128 -> I128U */ return assignNew('V', mce, Ity_I64, unop(Iop_128to64, vatom)); This also doesn't seem correct to me, in the sense that it simply ignores the definedness of the most significant 64 bits of the input, and, I am guessing, will create general incorrectly-typed IR. Has it been tested? These cases TruncF128toI128S and TruncF128toI128U do not belong with F128LOtoF64 and D128LOtoD64. I suggest you add them instead to the group that includes NegF128, AbsF128 and RndF128. ------ But even that (present code) doesn't seem correct: case Iop_NegF128: case Iop_AbsF128: case Iop_RndF128: case Iop_TruncF128toI64S: /* F128 -> I64S */ case Iop_TruncF128toI32S: /* F128 -> I32S (result stored in 64-bits) */ case Iop_TruncF128toI64U: /* F128 -> I64U */ case Iop_TruncF128toI32U: /* F128 -> I32U (result stored in 64-bits) */ return mkPCastTo(mce, Ity_I128, vatom); NegF128/AbsF128/RndF128 are correct, because their shadow value will be an I128 type and hence it is correct to do ` mkPCastTo(mce, Ity_I128, vatom)`. But TruncF128toI64S and Iop_TruncF128toI64U need to be PCast-ed to an I64 value -- so they belong somewhere else in this file, not here, and TruncF128toI32S and TruncF128toI32U need to be PCast-ed to an I32 value, again, somewhere else. Similarly these 6 case Iop_ReinterpF64asI64: case Iop_ReinterpI64asF64: case Iop_ReinterpI32asF32: case Iop_ReinterpF32asI32: case Iop_ReinterpI64asD64: case Iop_ReinterpD64asI64: are probably not correct; they need to be PCasted to I64/I32 appropriately. ------ Some smaller things, in other places in the patch: diff --git a/VEX/priv/guest_ppc_toIR.c b/VEX/priv/guest_ppc_toIR.c +static void generate_store_DFP_FPRF_value( ULong irType, IRExpr *src, + const VexAbiInfo* vbi ) ... + } else { + vex_printf("ERROR generate_store_DFP_FPRF_value, unknown value for irType 0x%lx\n", (unsigned long ) irType); + } Uhh .. this isn't good. Please remove it, and either assert/panic here if reaching here indicates a bug in this front end, or cause this to generate a SIGILL (insn-not-decoded) in the normal way. ------ + } + break; } Indent the `break` correctly, so it's easier to see what scope it's relevant to. ------ diff --git a/VEX/priv/host_ppc_defs.c b/VEX/priv/host_ppc_defs.c +PPCInstr* PPCInstr_AvTrinaryInt128 ( PPCAvOp op, HReg dst, + HReg src1, HReg src2, HReg src3 ) { I believe the normal terminology for a 3-way thing is "ternary", not "trinary" (I've never heard the latter word before, AFAIR) ------ @@ -4902,13 +5075,27 @@ Int emit_PPCInstr ( /*MB_MOD*/Bool* is_profInc, case Pfp_IDUTOQ: // xscvudqp p = mkFormVXR0( p, 63, fr_dst, 2, fr_src, 836, 0, endness_host ); break; + case Pfp_IQSTOQ: // xscvsqqp +// vex_printf("CARLL, issue xscvsqqp instruction. If not on P10, I will crash now on illegal inst.\n"); + p = mkFormVXR0( p, 63, fr_dst, 11, fr_src, 836, 0, endness_host ); + break; + case Pfp_IQUTOQ: // xscvuqqp +// vex_printf("CARLL, issue xscvuqqp instruction. If not on P10, I will crash now on illegal inst.\n"); + p = mkFormVXR0( p, 63, fr_dst, 3, fr_src, 836, 0, endness_host ); + break; Remove the commented-out debug printing, pls ------ +// vex_printf("CARLL dctfixqq, if not running on P10, going to crash on illegal inst.\n"); +// FAKE INST CARLL +// p = mkFormVX( p, 4, dstVSR, 10, 10, 1156, endness_host ); //vor +//REAL CALL And here. ------ case Iop_I64StoF128: vex_printf("I64StoF128"); return; + case Iop_I128StoF128: vex_printf("1284StoF128"); return; Typo, "1284"
(In reply to Carl Love from comment #1) > Created attachment 133459 [details] > Test cases for new IOps > > The test cases This seems OK to be. But per comments on https://bugs.kde.org/attachment.cgi?id=133458, you need to ensure that absolutely all of the Power instruction tests run successfully with --tool=memcheck and --tool=memcheck --track-origins=yes.
Created attachment 135082 [details] Functional support for ISA 3.1 New Iops Updated patch per Julian's review comments. Fixed pcast issues in mc_translate.c. Cleaned up/removed debug code. Removed unnecessary comments from development.
Created attachment 135083 [details] Fix pcast for existing code Fix for the pasting of the existing Iops: Iop_TruncF128toI64S Iop_TruncF128toI32S Iop_TruncF128toI64U Iop_TruncF128toI32U Iop_ReinterpF64asI64 Iop_ReinterpI64asF64 Iop_ReinterpI32asF32 Iop_ReinterpF32asI32 Iop_ReinterpI64asD64 Iop_ReinterpD64asI64
The regression tests with the above reworked functional test and patch to fix the pcasting for the existing Iops pass with no new errors. Note, the test valgrind --tool=memcheck --track-origins=yes ./test_isa_3_1_AT still hits an assertion error. All of the other ISA 3.1 tests pass with "--tool=memcheck" and "--tool=memcheck --track-origins=yes". This issue exists before the patches in this bugzilla are applied. Still investigating this issue.
Created attachment 135176 [details] Add ACC support into routine get_otrack_shadow_offset_wrk() Looking up the ACC register offset support is missing in get_otrack_shadow_offset_wrk (). This is the cause of the assertion error when running: valgrind --tool=memcheck --track-origins=yes ./test_isa_3_1_AT This patch fixes the issue. I have run the regression tests with this patch on Power 10 and Power 8 BE with no issues. Please review and if it looks OK, I will commit. Thanks.
================== https://bugs.kde.org/attachment.cgi?id=133459 Test cases for new IOps OK to land ================== https://bugs.kde.org/attachment.cgi?id=135083 Fix pcast for existing code OK to land ================== https://bugs.kde.org/attachment.cgi?id=135176 Add ACC support into routine get_otrack_shadow_offset_wrk() OK to land ================== https://bugs.kde.org/attachment.cgi?id=135082 Functional support for ISA 3.1 New Iops OK to land -- I don't see any significant problems with it -- but first please fix the following comments. +UInt generate_DFP_FPRF_value_helper( UInt gfield, If this is intended to be called from VEX-generated code, add a comment to that effect, and add an extern qualifier. If it isn't, add a static qualifier. (== follow the existing conventions) +PPCInstr* PPCInstr_AvTernaryInt128 ( PPCAvOp op, HReg dst, + HReg src1, HReg src2, HReg src3 ) { + PPCInstr* i = LibVEX_Alloc_inline(sizeof(PPCInstr)); + i->tag = Pin_AvTrinaryInt128; + i->Pin.AvTrinaryInt128.op = op; There's still an inconsistency in the naming: Ternary vs Trinary. I'd prefer you use Ternary consistently, everywhere. @@ -2321,6 +2428,21 @@ void ppPPCInstr ( const PPCInstr* i, Bool mode64 ) ppHRegPPC(i->Pin.Dfp128Cmp.dst); vex_printf(",8,28,31"); return; + + case Pin_XFormUnary994: + if (Px_DFPTOIQS) { This isn't right. It needs to be if (i->Pin.XFormUnary994.op == Px_DFPTOIQS) or something like that. + case Pin_XFormUnary994: { + int inst_sel; Regarding "int", please use the project-defined types, always: Int, Char, UChar etc, not "int", "char" Also, here, "inst_sel" is local to each of the two case-clauses, so move it into each. + case Iop_D128toI128S: { + HReg srcHi = newVRegF(env); + HReg srcLo = newVRegF(env); .. + iselDfp128Expr( &srcHi, &srcLo, env, e->Iex.Binop.arg2, IEndianess ); Isn't it the case that iselDfp128Expr always writes its first two arguments (meaning, it selects the output virtual registers itself) ? If so, then it's pointless to initialise srcHi/srcLo by calling newVRegF, because those vregs will be ignored. Better to initialise them with HReg_INVALID (or is it INVALID_HREG). Similar comment in some other places further down the file. + // store the two 64-bit pars pairs + HReg rHi, rLo; + HReg dst = newVRegV(env); + + iselInt128Expr(&rHi,&rLo, env, e->Iex.Unop.arg, IEndianess); For example here: now there's no initialisation at all. Initialise rHi/rLo to HREG_INVALID here. @@ -314,8 +316,10 @@ void ppIROp ( IROp op ) case Iop_F128LOtoF64: vex_printf("F128LOtoF64"); return; case Iop_I32StoF128: vex_printf("I32StoF128"); return; case Iop_I64StoF128: vex_printf("I64StoF128"); return; + case Iop_I128StoF128: vex_printf("128StoF128"); return; In the output text, the initial "I" is missing @@ -4753,15 +4769,19 @@ IRAtom* expr2vbits_Binop ( MCEnv* mce, uifu = mkUifU1; difd = mkDifD1; and_or_ty = Ity_I1; improve = mkImproveOR1; goto do_And_Or; - do_And_Or: + do_And_Or: Please restore the original indentation for this label. ==================
Created attachment 136158 [details] Functional support for ISA 3.1 New Iops Updated patch based on comments
Created attachment 136159 [details] Change names from trinary to ternary in existing code Per the comment from Julian to use ternary not trinary, this patch was created to change existing uses of trinary to ternary. Adding it to the bugzilla so all patches are documented in the bugzilla.
Patches committed commit 6daaeb0ff45af6b39f14003e392992a5ac7d51c8 (HEAD -> master, origin/master, origin/HEAD) Author: Carl Love <cel@us.ibm.com> Date: Wed Nov 4 12:23:53 2020 -0600 PPC64: 128-bit Binary Integer Operations, part tests. commit affeb57ccd48a77cd7b18e56191e1eb72152c13e Author: Carl Love <cel@us.ibm.com> Date: Fri Jan 22 14:13:04 2021 -0600 PPC64: ISA 3.1 (new Iops) 128-bit Binary Integer Operations Add support for: dcffixqq DFP Convert From Fixed Quadword Quad dctfixqq DFP Convert To Fixed Quadword Quad vdivesq Vector Divide Extended Signed Quadword vdiveuq Vector Divide Extended Unsigned Quadword vdivsq Vector Divide Signed Quadword vdivuq Vector Divide Unsigned Quadword vmodsq Vector Modulo Signed Quadword vmoduq Vector Modulo Unsigned Quadword vmulesd Vector Multiply Even Signed Doubleword vmuleud Vector Multiply Even Unsigned Doubleword vmulosd Vector Multiply Odd Signed Doubleword vmuloud Vector Multiply Odd Unsigned Doubleword vmsumcud Vector Multiply-Sum & write Carry-out Unsigned Doubleword xscvqpsqz VSX Scalar Convert with round to zero Quad-Precision to Signed Quadword xscvqpuqz VSX Scalar Convert with round to zero Quad-Precision toUnsigned Quadword xscvsqqp VSX Scalar Convert Signed Quadword to Quad-Precision xscvuqqp VSX Scalar Convert Unsigned Quadword to Quad-Precision commit 5defeb017f94686aa4c746729ff5eca35aa79fb1 Author: Carl Love <cel@us.ibm.com> Date: Mon Feb 22 12:11:05 2021 -0600 PPC64: Fix naming trinary to ternary commit 953f54085dfb06e53b47d1d08e4014bac2202282 Author: Carl Love <cel@us.ibm.com> Date: Mon Jan 25 11:44:12 2021 -0600 PPC64: Add ACC register file registers to get_otrack_shadow_offset_wrkget_otrack_shadow_offset_wrk() commit 4ebdf8653859ac31f4f9e6c0f4ec4e0114498d7c Author: Carl Love <cel@us.ibm.com> Date: Fri Jan 22 13:15:20 2021 -0600 PPC64: Fix V-bit casting for existing Iops. Iop_TruncF128toI64S, Iop_TruncF128toI32S, Iop_TruncF128toI64U, Iop_TruncF128toI32U, Iop_ReinterpI32asF32, Iop_ReinterpF32asI32, Iop_ReinterpF64asI64, Iop_ReinterpI64asF64, Iop_ReinterpI64asD64, Iop_ReinterpD64asI64
Closing bugzilla
FYI, it has been brought to my attention that in the name changing patch I have a typo Fp128Ternnary not Fp128Ternary. However, the typo is then fixed in the patch to add new IOps. Looks like I merged the spelling fix into the wrong patch during development. The mainline code looks fine.