429352 – PPC ISA 3.1 support is missing, part 7

Bug 429352 - PPC ISA 3.1 support is missing, part 7

Summary: PPC ISA 3.1 support is missing, part 7

Status:	CLOSED FIXED

Alias:	None

Product:	valgrind
Classification:	Developer tools
Component:	vex (other bugs)
Version First Reported In:	unspecified
Platform:	Other Linux

Importance:	NOR normal
Target Milestone:	---
Assignee:	Julian Seward

URL:
Keywords:

Depends on:
Blocks:

Reported:	2020-11-19 15:51 UTC by Carl Love
Modified:	2021-02-25 20:55 UTC (History)
CC List:	1 user (show)

See Also:
Latest Commit:
Version Fixed/Implemented In:
Sentry Crash Report:

Attachments
Functional support for ISA 3.1, New Iops (72.98 KB, patch) 2020-11-19 15:51 UTC, Carl Love	Details
Test cases for new IOps (66.75 KB, patch) 2020-11-19 15:52 UTC, Carl Love	Details
Functional support for ISA 3.1 New Iops (72.38 KB, patch) 2021-01-23 04:50 UTC, Carl Love	Details
Fix pcast for existing code (3.64 KB, patch) 2021-01-23 04:54 UTC, Carl Love	Details
Add ACC support into routine get_otrack_shadow_offset_wrk() (4.43 KB, patch) 2021-01-25 19:35 UTC, Carl Love	Details
Functional support for ISA 3.1 New Iops (75.94 KB, patch) 2021-02-25 17:18 UTC, Carl Love	Details
Change names from trinary to ternary in existing code (5.57 KB, patch) 2021-02-25 17:21 UTC, Carl Love	Details
Show Obsolete (2) View All Add an attachment

Note You need to log in before you can comment on or make changes to this bug.

Description Carl Love 2020-11-19 15:51:44 UTC

Created attachment 133458 [details]
Functional support for ISA 3.1, New Iops

Power PC missing ISA 3.1 support.  Additional patches, part 7

Comment 1 Carl Love 2020-11-19 15:52:39 UTC

Created attachment 133459 [details]
Test cases for new IOps

The test cases

Comment 2 Carl Love 2020-11-19 15:53:41 UTC

This is the 7th bugzilla for the ISA 3.1 Power PC support.  The bugzilla for part 6 is 427404.

Comment 3 Carl Love 2020-11-19 15:58:41 UTC

Part 8 patches are in bugzilla 429354

Comment 4 Julian Seward 2021-01-22 14:44:39 UTC

(In reply to Carl Love from comment #0)
> Created attachment 133458 [details]
> Functional support for ISA 3.1, New Iops
> 
> Power PC missing ISA 3.1 support.  Additional patches, part 7

It seems to me that the mc_translate.c sections of the patch aren't right,
which could cause Memcheck to assert when running programs containing the new
instructions.

------

diff --git a/memcheck/mc_translate.c b/memcheck/mc_translate.c

@@ -3410,8 +3412,7 @@ IRAtom* expr2vbits_Triop ( MCEnv* mce,
                                 unary64Fx2_w_rm(mce, vatom1, vatom2),
                                 unary64Fx2_w_rm(mce, vatom1, vatom3)));
 
-
-      default:
+   default:

Nit: return the `default:` to its correct indentation

------

+         // CARLL not sure about this yet.

Remove this pls

------

-      do_And_Or:
+      /* Int 128-bit Integer two arg  */
+         // CARLL not sure about this yet.
+      case Iop_DivU128:
+      case Iop_DivS128:
+      case Iop_DivU128E:
+      case Iop_DivS128E:
+      case Iop_ModU128:
+      case Iop_ModS128:
+         uifu = mkUifUV128; difd = mkDifDV128;
+         and_or_ty = Ity_V128; improve = mkImproveORV128; goto do_And_Or;
+

This strikes me as wrong.  `do_And_Or` is only intended to instrument AND and
OR operations.  But here, you're using it to instrument Div/Mod operations,
which isn't correct.

------

@@ -5001,15 +5024,18 @@ IRExpr* expr2vbits_Unop ( MCEnv* mce, IROp op, IRAtom* atom )
          return assignNew('V', mce, Ity_I64, unop(Iop_128HIto64, vatom));
       case Iop_F128LOtoF64:  /* F128 -> low  half of F128 */
       case Iop_D128LOtoD64:  /* D128 -> low  half of D128 */
+      case Iop_TruncF128toI128S: /* F128 -> I128S */
+      case Iop_TruncF128toI128U: /* F128 -> I128U */
          return assignNew('V', mce, Ity_I64, unop(Iop_128to64, vatom));

This also doesn't seem correct to me, in the sense that it simply ignores the
definedness of the most significant 64 bits of the input, and, I am guessing,
will create general incorrectly-typed IR.  Has it been tested?  These cases
TruncF128toI128S and TruncF128toI128U do not belong with F128LOtoF64 and
D128LOtoD64.  I suggest you add them instead to the group that includes
NegF128, AbsF128 and RndF128.

------

But even that (present code) doesn't seem correct:

      case Iop_NegF128:
      case Iop_AbsF128:
      case Iop_RndF128:
      case Iop_TruncF128toI64S: /* F128 -> I64S */
      case Iop_TruncF128toI32S: /* F128 -> I32S (result stored in 64-bits) */
      case Iop_TruncF128toI64U: /* F128 -> I64U */
      case Iop_TruncF128toI32U: /* F128 -> I32U (result stored in 64-bits) */
         return mkPCastTo(mce, Ity_I128, vatom);

NegF128/AbsF128/RndF128 are correct, because their shadow value will be an
I128 type and hence it is correct to do ` mkPCastTo(mce, Ity_I128, vatom)`.
But TruncF128toI64S and Iop_TruncF128toI64U need to be PCast-ed to an I64
value -- so they belong somewhere else in this file, not here, and
TruncF128toI32S and TruncF128toI32U need to be PCast-ed to an I32 value,
again, somewhere else.

Similarly these 6

      case Iop_ReinterpF64asI64:
      case Iop_ReinterpI64asF64:
      case Iop_ReinterpI32asF32:
      case Iop_ReinterpF32asI32:
      case Iop_ReinterpI64asD64:
      case Iop_ReinterpD64asI64:

are probably not correct; they need to be PCasted to I64/I32 appropriately.

------

Some smaller things, in other places in the patch:

diff --git a/VEX/priv/guest_ppc_toIR.c b/VEX/priv/guest_ppc_toIR.c

+static void generate_store_DFP_FPRF_value( ULong irType, IRExpr *src,
+                                           const VexAbiInfo* vbi )
...
+   } else {
+      vex_printf("ERROR generate_store_DFP_FPRF_value, unknown value for irType 0x%lx\n", (unsigned long ) irType);
+   }

Uhh .. this isn't good.  Please remove it, and either assert/panic here if
reaching here indicates a bug in this front end, or cause this to generate a
SIGILL (insn-not-decoded) in the normal way.

------

+     }
+   break;
    }

Indent the `break` correctly, so it's easier to see what scope it's relevant to.

------

diff --git a/VEX/priv/host_ppc_defs.c b/VEX/priv/host_ppc_defs.c

+PPCInstr* PPCInstr_AvTrinaryInt128 ( PPCAvOp op, HReg dst,
+                                     HReg src1, HReg src2, HReg src3 ) {

I believe the normal terminology for a 3-way thing is "ternary", not "trinary"
(I've never heard the latter word before, AFAIR)

------

@@ -4902,13 +5075,27 @@ Int emit_PPCInstr ( /*MB_MOD*/Bool* is_profInc,
       case Pfp_IDUTOQ:         // xscvudqp
          p = mkFormVXR0( p, 63, fr_dst, 2, fr_src, 836, 0, endness_host );
          break;
+      case Pfp_IQSTOQ:         // xscvsqqp
+//         vex_printf("CARLL, issue xscvsqqp instruction. If not on P10, I will crash now on illegal inst.\n");
+         p = mkFormVXR0( p, 63, fr_dst, 11, fr_src, 836, 0, endness_host );
+         break;
+      case Pfp_IQUTOQ:         // xscvuqqp
+//         vex_printf("CARLL, issue xscvuqqp instruction. If not on P10, I will crash now on illegal inst.\n");
+         p = mkFormVXR0( p, 63, fr_dst, 3, fr_src, 836, 0, endness_host );
+         break;

Remove the commented-out debug printing, pls

------

+//          vex_printf("CARLL dctfixqq, if not running on P10, going to crash on illegal inst.\n");
+// FAKE INST CARLL
+//          p = mkFormVX( p, 4, dstVSR, 10, 10, 1156, endness_host );  //vor
+//REAL CALL

And here.

------

       case Iop_I64StoF128: vex_printf("I64StoF128"); return;
+      case Iop_I128StoF128: vex_printf("1284StoF128"); return;

Typo, "1284"

Comment 5 Julian Seward 2021-01-22 14:47:25 UTC

(In reply to Carl Love from comment #1)
> Created attachment 133459 [details]
> Test cases for new IOps
> 
> The test cases

This seems OK to be.  But per comments on 
https://bugs.kde.org/attachment.cgi?id=133458, you need to ensure
that absolutely all of the Power instruction tests run successfully
with --tool=memcheck and --tool=memcheck --track-origins=yes.

Comment 6 Carl Love 2021-01-23 04:50:44 UTC

Created attachment 135082 [details]
Functional support for ISA 3.1 New Iops

Updated patch per Julian's review comments.  

Fixed pcast issues in mc_translate.c.
Cleaned up/removed debug code.
Removed unnecessary comments from development.

Comment 7 Carl Love 2021-01-23 04:54:16 UTC

Created attachment 135083 [details]
Fix pcast for existing code

Fix for the pasting of the existing Iops:

Iop_TruncF128toI64S
Iop_TruncF128toI32S
Iop_TruncF128toI64U
Iop_TruncF128toI32U
Iop_ReinterpF64asI64
Iop_ReinterpI64asF64
Iop_ReinterpI32asF32
Iop_ReinterpF32asI32
Iop_ReinterpI64asD64
Iop_ReinterpD64asI64

Comment 8 Carl Love 2021-01-23 04:59:34 UTC

The regression tests with the above reworked functional test and patch to fix the pcasting for the existing Iops pass with no new errors.

Note, the test  valgrind  --tool=memcheck --track-origins=yes ./test_isa_3_1_AT still hits an assertion error.  All of the other ISA 3.1 tests pass with  "--tool=memcheck" and  "--tool=memcheck --track-origins=yes".  This issue exists before the patches in this bugzilla are applied.  Still investigating this issue.

Comment 9 Carl Love 2021-01-25 19:35:08 UTC

Created attachment 135176 [details]
Add ACC support into routine  get_otrack_shadow_offset_wrk()

Looking up the ACC register offset support is missing in  get_otrack_shadow_offset_wrk ().  This is the cause of the assertion error when running:

   valgrind --tool=memcheck --track-origins=yes ./test_isa_3_1_AT

This patch fixes the issue.  I have run the regression tests with this patch on Power 10 and Power 8 BE with no issues.

Please review and if it looks OK, I will commit.  Thanks.

Comment 10 Julian Seward 2021-02-13 09:07:38 UTC

==================

https://bugs.kde.org/attachment.cgi?id=133459
Test cases for new IOps

OK to land

==================

https://bugs.kde.org/attachment.cgi?id=135083
Fix pcast for existing code

OK to land

==================

https://bugs.kde.org/attachment.cgi?id=135176
Add ACC support into routine get_otrack_shadow_offset_wrk()

OK to land

==================

https://bugs.kde.org/attachment.cgi?id=135082
Functional support for ISA 3.1 New Iops

OK to land -- I don't see any significant problems with it -- but first please
fix the following comments.

+UInt generate_DFP_FPRF_value_helper( UInt gfield,

If this is intended to be called from VEX-generated code, add a comment to
that effect, and add an extern qualifier.  If it isn't, add a static
qualifier.  (== follow the existing conventions)

+PPCInstr* PPCInstr_AvTernaryInt128 ( PPCAvOp op, HReg dst,
+                                     HReg src1, HReg src2, HReg src3 ) {
+   PPCInstr* i = LibVEX_Alloc_inline(sizeof(PPCInstr));
+   i->tag      = Pin_AvTrinaryInt128;
+   i->Pin.AvTrinaryInt128.op   = op;

There's still an inconsistency in the naming: Ternary vs Trinary.  I'd prefer
you use Ternary consistently, everywhere.

@@ -2321,6 +2428,21 @@ void ppPPCInstr ( const PPCInstr* i, Bool mode64 )
       ppHRegPPC(i->Pin.Dfp128Cmp.dst);
       vex_printf(",8,28,31");
       return;
+
+   case Pin_XFormUnary994:
+      if (Px_DFPTOIQS) {

This isn't right.  It needs to be if (i->Pin.XFormUnary994.op == Px_DFPTOIQS)
or something like that.

+   case Pin_XFormUnary994: {
+       int inst_sel;

Regarding "int", please use the project-defined types, always: Int, Char,
UChar etc, not "int", "char"

Also, here, "inst_sel" is local to each of the two case-clauses, so move it
into each.

+      case Iop_D128toI128S: {
+         HReg srcHi = newVRegF(env);
+         HReg srcLo = newVRegF(env);
..
+         iselDfp128Expr( &srcHi, &srcLo, env, e->Iex.Binop.arg2, IEndianess );

Isn't it the case that iselDfp128Expr always writes its first two arguments
(meaning, it selects the output virtual registers itself) ?  If so, then it's
pointless to initialise srcHi/srcLo by calling newVRegF, because those vregs
will be ignored.  Better to initialise them with HReg_INVALID (or is it
INVALID_HREG).  Similar comment in some other places further down the file.

+         // store the two 64-bit pars

pairs

+          HReg rHi, rLo;
+          HReg dst  = newVRegV(env);
+
+          iselInt128Expr(&rHi,&rLo, env, e->Iex.Unop.arg, IEndianess);

For example here: now there's no initialisation at all.  Initialise rHi/rLo to
HREG_INVALID here.

@@ -314,8 +316,10 @@ void ppIROp ( IROp op )
       case Iop_F128LOtoF64: vex_printf("F128LOtoF64"); return;
       case Iop_I32StoF128: vex_printf("I32StoF128"); return;
       case Iop_I64StoF128: vex_printf("I64StoF128"); return;
+      case Iop_I128StoF128: vex_printf("128StoF128"); return;

In the output text, the initial "I" is missing

@@ -4753,15 +4769,19 @@ IRAtom* expr2vbits_Binop ( MCEnv* mce,
          uifu = mkUifU1; difd = mkDifD1; 
          and_or_ty = Ity_I1; improve = mkImproveOR1; goto do_And_Or;
 
-      do_And_Or:
+   do_And_Or:

Please restore the original indentation for this label.

==================

Comment 11 Carl Love 2021-02-25 17:18:07 UTC

Created attachment 136158 [details]
Functional support for ISA 3.1 New Iops

Updated patch based on comments

Comment 12 Carl Love 2021-02-25 17:21:00 UTC

Created attachment 136159 [details]
Change names from trinary to ternary in existing code

Per the comment from Julian to use ternary not trinary, this patch was created to change existing uses of trinary to ternary.  Adding it to the bugzilla so all patches are documented in the bugzilla.

Comment 13 Carl Love 2021-02-25 17:59:27 UTC

Patches committed

commit 6daaeb0ff45af6b39f14003e392992a5ac7d51c8 (HEAD -> master, origin/master, origin/HEAD)
Author: Carl Love <cel@us.ibm.com>
Date:   Wed Nov 4 12:23:53 2020 -0600

    PPC64: 128-bit Binary Integer Operations, part tests.

commit affeb57ccd48a77cd7b18e56191e1eb72152c13e
Author: Carl Love <cel@us.ibm.com>
Date:   Fri Jan 22 14:13:04 2021 -0600

    PPC64: ISA 3.1 (new Iops) 128-bit Binary Integer Operations
    
    Add support for:
    
      dcffixqq DFP Convert From Fixed Quadword Quad
      dctfixqq DFP Convert To Fixed Quadword Quad
    
      vdivesq Vector Divide Extended Signed Quadword
      vdiveuq Vector Divide Extended Unsigned Quadword
      vdivsq Vector Divide Signed Quadword
      vdivuq Vector Divide Unsigned Quadword
    
      vmodsq Vector Modulo Signed Quadword
      vmoduq Vector Modulo Unsigned Quadword
      vmulesd Vector Multiply Even Signed Doubleword
      vmuleud Vector Multiply Even Unsigned Doubleword
      vmulosd Vector Multiply Odd Signed Doubleword
      vmuloud Vector Multiply Odd Unsigned Doubleword
      vmsumcud Vector Multiply-Sum & write Carry-out Unsigned Doubleword
    
      xscvqpsqz VSX Scalar Convert with round to zero Quad-Precision to Signed
        Quadword
      xscvqpuqz VSX Scalar Convert with round to zero Quad-Precision toUnsigned
        Quadword
      xscvsqqp VSX Scalar Convert Signed Quadword to Quad-Precision
      xscvuqqp VSX Scalar Convert Unsigned Quadword to Quad-Precision

commit 5defeb017f94686aa4c746729ff5eca35aa79fb1
Author: Carl Love <cel@us.ibm.com>
Date:   Mon Feb 22 12:11:05 2021 -0600

    PPC64: Fix naming trinary to ternary

commit 953f54085dfb06e53b47d1d08e4014bac2202282
Author: Carl Love <cel@us.ibm.com>
Date:   Mon Jan 25 11:44:12 2021 -0600

    PPC64: Add ACC register file registers to get_otrack_shadow_offset_wrkget_otrack_shadow_offset_wrk()

commit 4ebdf8653859ac31f4f9e6c0f4ec4e0114498d7c
Author: Carl Love <cel@us.ibm.com>
Date:   Fri Jan 22 13:15:20 2021 -0600

    PPC64: Fix V-bit casting for existing Iops.
    
    Iop_TruncF128toI64S, Iop_TruncF128toI32S, Iop_TruncF128toI64U,
    Iop_TruncF128toI32U, Iop_ReinterpI32asF32, Iop_ReinterpF32asI32,
    Iop_ReinterpF64asI64, Iop_ReinterpI64asF64, Iop_ReinterpI64asD64,
    Iop_ReinterpD64asI64

Comment 14 Carl Love 2021-02-25 18:00:40 UTC

Closing bugzilla

Comment 15 Carl Love 2021-02-25 20:55:14 UTC

FYI, it has been brought to my attention that in the name changing patch I have a typo  Fp128Ternnary not Fp128Ternary.  However, the typo is then fixed in the patch to add new IOps.  Looks like I merged the spelling fix into the wrong patch during development.  The mainline code looks fine.