Bug 364948 - Add IBM ISA 3.0 support, patch set 5
Summary: Add IBM ISA 3.0 support, patch set 5
Status: CLOSED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: vex (show other bugs)
Version: 3.12 SVN
Platform: Other Linux
: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-06-30 21:37 UTC by Carl Love
Modified: 2016-09-14 15:24 UTC (History)
2 users (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments
Patch 5 of 5 to add VEX support for Power ISA 3.0 instructions (180.14 KB, patch)
2016-06-30 21:40 UTC, Carl Love
Details
Patch5 of 5 to add testsuite support for Power ISA 3.0 instructions part 1 (1.55 MB, patch)
2016-06-30 21:58 UTC, Carl Love
Details
Patch 3 of 5 to add VEX support for the POWER ISA 3.0 instructions revised (192.58 KB, patch)
2016-07-25 18:54 UTC, Carl Love
Details
Patch 5 of 5 to add testsuite support for Power ISA3.0, revised (42.01 KB, patch)
2016-07-25 18:56 UTC, Carl Love
Details
Patch 5 of 5 to add VEX support for Power ISA 3.0 instructions, revised (192.58 KB, patch)
2016-07-25 18:58 UTC, Carl Love
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Carl Love 2016-06-30 21:37:14 UTC
The IBM ISA 3.0 has been released. This bugzilla is the fifth in a series of five bugzillas for adding the needed ISA 3.0 support to Valgrind. The series of bugzillas are: 359767, 361207, 362329, 363858.

Reproducible: Always
Comment 1 Carl Love 2016-06-30 21:40:12 UTC
Created attachment 99775 [details]
Patch 5 of 5 to add VEX support for Power ISA 3.0 instructions
Comment 2 Carl Love 2016-06-30 21:58:41 UTC
Created attachment 99776 [details]
Patch5 of 5 to add testsuite support for Power ISA 3.0 instructions part 1

Testsuite patch I have only included the first part of the patch which contains the testsuite code changes and some of the expect files.  The two expected output files are rather large, total size is 33MBytes.  I don't think anyone is going to read them all.  I can post them if wants to see all of the output.
Comment 3 Julian Seward 2016-07-07 12:31:04 UTC
(In reply to Carl Love from comment #2)
> expected output files are rather large, total size is 33MBytes. 

33 MB is pretty large.  That space will be in the distro tarballs for ever more
and also on the SVN server.  Is it possible to make this a bit smaller, without
losing test coverage?
Comment 4 Julian Seward 2016-07-07 12:34:22 UTC
(In reply to Carl Love from comment #1)
> Created attachment 99775 [details]
> Patch 5 of 5 to add VEX support for Power ISA 3.0 instructions

I have a number of concerns here, but nothing that can't be relatively
easily fixed.  They fall into 5 areas:

[1] naming and possible duplication of new IROps

[2] front end: duplication of IR trees (the IRExpr* vs IRTemp thing)

[3] front end: generation of excessively verbose IR for some vector insns

[4] misc other correctness comments/queries

[5] various typos in comments


To go through them in order:


------ [1] naming and possible duplication of new IROps ------

diff --git a/VEX/pub/libvex_ir.h b/VEX/pub/libvex_ir.h
+      Iop_MulAddF128,    // (A * B) + C
+      Iop_MulSubF128,    // (A * B) - C
+      Iop_NegMulAddF128, // -((A * B) + C)
+      Iop_NegMulSubF128, // -((A * B) + C)
Call these, respectively:  MAddF128, MSubF128, NegMAddF128,
NegMSubF128, so as to be consistent with the 32/64 bit variants


+      Iop_ConvI64UtoF128, /*              signed I64  -> F128 */
+      Iop_ConvI64StoF128, /*              signed I64  -> F128 */
+      Iop_ConvF64toF128,  /*                     F64  -> F128 */
+      Iop_ConvF128toF64,  /* IRRoundingMode(I32) x F128 -> F64        */
+      Iop_ConvF128toF32,  /* IRRoundingMode(I32) x F128 -> F32        */
Remove the leading Conv (eg Iop_I64UtoF128), so as to make them
consistent with other conversion operations.  


+      /* Conversion signed 32-bit value to signed BCD 128-bit */
+      Iop_I128StoBCD128,
+
+      /* Conversion signed BCD 128-bit to signed 32-bit value */
+      Iop_BCD128toI128S,
The comments don't seem to match the names.  Is the integer value
32 bits or 128 bits?


+      /* -- Vector to/from half conversion -- */
+      Iop_F16x4toF32x4, Iop_F32x4toF16x4, Iop_F64x2toF16x2, Iop_F16x2toF64x2,
You only need one lane specifier here, so as to be consistent with
other ops:
  F16toF32x4, F32toF16x4, F64toF16x2, F16toF64x2
and in fact the first two already seem to exist.  Are these different?


---- [2] front end: duplication of IR trees (the IRExpr* vs IRTemp thing) ----

+static void putC ( IRExpr* e )

+static IRExpr * create_FPCC( IRExpr *NaN,   IRExpr *inf,
+                             IRExpr *zero,  IRExpr *norm,
+                             IRExpr *dnorm, IRExpr *pos,
+                             IRExpr *neg ) {

+static IRExpr * create_C( IRExpr *NaN,   IRExpr *zero,
+                          IRExpr *dnorm, IRExpr *pos,
+                          IRExpr *neg ) {

These functions all use their input trees more than once and so will
duplicate them and the computations done by them.  Please change them so
that the passed arguments are IRTemps instead.

There may be more such functions -- I am not sure if this is all of
them.


---- [3] front end:
         generation of excessively verbose IR for some vector insns ----

+   case 28: // vctzb,  Vector Count Trailing Zeros Byte
Too complex; use a vector IRop

+   case 29: // vctzh,  Vector Count Trailing Zeros Halfword
Ditto

+   case 30: // vctzw,  Vector Count Trailing Zeros Word
Ditto

+   case 31: // vctzd,  Vector Count Trailing Zeros Halfword
Ditto

For the above cases (28/29/30/31), make yourself a set of vector IROps
to do this:
  Iop_Ctz{8x16, 16x8, 32x4}
See existing ops Iop_Clz8x16, Iop_Clz16x8, Iop_Clz32x4 as an example

+         case 0:  // bcdctsq.  (Decimal Convert to Signed Quadword VX-form)
+               /* Check each of the nibbles for a valid digit 0 to 9 */
+               inv_flag[0] = newTemp( Ity_I1 );
+               assign( inv_flag[0], mkU1( 0 ) );
Didn't you do a C helper function for this in the previous patch?
This generates terribly verbose code.


------ [4] misc other correctness comments/queries ------

+static
+Bool FPU_rounding_mode_isOdd (IRExpr* mode) {
+   /* If the rounding mode is set to odd, the the expr must be a constant U8
+    * value equal to 8.  Otherwise, it must be a bin op expressiong that
+    * calculates the value.
+    */
+
+   if (mode->tag != Iex_Const)
+      return False;
+
+   vassert(mode->Iex.Const.con->tag == Ico_U32);
+   if (mode->Iex.Const.con->Ico.U8 == 0x8)
+      return True;
+
+   vex_printf("ERROR: FPU_rounding_mode_isOdd(), constant not equal to expected value\n");
+   return False;
+}
Doesn't seem right to me.  What happens if mode isn't an Iex_Const?
Do you really want to just return False?  Shouldn't the system assert
if that happens?


+++ b/memcheck/mc_translate.c
+#if !(defined(VGA_ppc64be) || defined(VGA_ppc64le))
    tl_assert(ty != Ity_I128);
+#endif
Don't make this conditional


------ [5] various typos in comments ------

Some of these occur several times -- copy n pasted?  Can you search
to find all dups?

+      /* 128-bit multipy by 10 instruction, result is lower 128-bits */
+      Iop_MulI128by10,
Nit: please fix typos (multipy) (in various places)


+++ b/memcheck/mc_translate.c
+      /* Widen 1st arg to I64.  Since 1st arg is typically a rounding
That should say I128, not I64.


+++ b/VEX/priv/guest_ppc_toIR.c
@@ -1,4 +1,3 @@
-
Nit: please don't remove the initial blank line


 /*------------------------------------------------------------*/
+void setup_value_check_args( IRType size, IRTemp *exp_mask, IRTemp *frac_mask,
Nit: blank line between comment and start of code


+/* The following functions check the floating point value to see if it
+   is zero, infinit, NaN, Normalized, Denormalized.
Nit: infinity


+static void generate_store_FPRF( IRType size, IRTemp src ) {
Nit: opening brace on its own line


+   /* Calcuulate the floating point result field FPRF */
Nit: Calculate


+++ b/VEX/pub/libvex_ir.h
+      Iop_NegMulAddF128, // -((A * B) + C)
+      Iop_NegMulSubF128, // -((A * B) + C)
The comment on the second one isn't right.


+            /* Note: PPC only coverts the 16-bt value in the upper word
Nit: typo, bt -> bit


+    * instrution set.
instrution


+  AltiVec 128 bit integer multiyply by 10 Instructions
multiyply
Comment 5 Carl Love 2016-07-25 18:51:58 UTC
> --- Comment #4 from Julian Seward <jseward@acm.org> ---
> (In reply to Carl Love from comment #1)
> > Created attachment 99775 [details]
> > Patch 5 of 5 to add VEX support for Power ISA 3.0 instructions
> 
> I have a number of concerns here, but nothing that can't be relatively
> easily fixed.  They fall into 5 areas:
> 
> [1] naming and possible duplication of new IROps
> 
> [2] front end: duplication of IR trees (the IRExpr* vs IRTemp thing)
> 
> [3] front end: generation of excessively verbose IR for some vector insns
> 
> [4] misc other correctness comments/queries
> 
> [5] various typos in comments
> 
> 
> To go through them in order:
> 
> 
> ------ [1] naming and possible duplication of new IROps ------
> 
> diff --git a/VEX/pub/libvex_ir.h b/VEX/pub/libvex_ir.h
> +      Iop_MulAddF128,    // (A * B) + C
> +      Iop_MulSubF128,    // (A * B) - C
> +      Iop_NegMulAddF128, // -((A * B) + C)
> +      Iop_NegMulSubF128, // -((A * B) + C)
> Call these, respectively:  MAddF128, MSubF128, NegMAddF128,
> NegMSubF128, so as to be consistent with the 32/64 bit variants

Renamed the Iops as requested

> 
> 
> +      Iop_ConvI64UtoF128, /*              signed I64  -> F128 */
> +      Iop_ConvI64StoF128, /*              signed I64  -> F128 */
> +      Iop_ConvF64toF128,  /*                     F64  -> F128 */
> +      Iop_ConvF128toF64,  /* IRRoundingMode(I32) x F128 -> F64        */
> +      Iop_ConvF128toF32,  /* IRRoundingMode(I32) x F128 -> F32        */
> Remove the leading Conv (eg Iop_I64UtoF128), so as to make them
> consistent with other conversion operations.  
> 

Renamed the Iops as requested

> 
> +      /* Conversion signed 32-bit value to signed BCD 128-bit */
> +      Iop_I128StoBCD128,
> +
> +      /* Conversion signed BCD 128-bit to signed 32-bit value */
> +      Iop_BCD128toI128S,
> The comments don't seem to match the names.  Is the integer value
> 32 bits or 128 bits?
> 

fixed comments.  The integer source/result is stored in an I128 value.

> 
> +      /* -- Vector to/from half conversion -- */
> +      Iop_F16x4toF32x4, Iop_F32x4toF16x4, Iop_F64x2toF16x2, Iop_F16x2toF64x2,
> You only need one lane specifier here, so as to be consistent with
> other ops:
>   F16toF32x4, F32toF16x4, F64toF16x2, F16toF64x2
> and in fact the first two already seem to exist.  Are these different?
> 
> 

Renamed the Iops as requested, changed the code in guest_ppc_toIR.c was
then changed to match the Iop arg type with the required changes in
host_ppc_isel.c to handle setting up the src/dest registers for the
backend to issue the instructions.


> ---- [2] front end: duplication of IR trees (the IRExpr* vs IRTemp thing) ----
> 
> +static void putC ( IRExpr* e )
> 
> +static IRExpr * create_FPCC( IRExpr *NaN,   IRExpr *inf,
> +                             IRExpr *zero,  IRExpr *norm,
> +                             IRExpr *dnorm, IRExpr *pos,
> +                             IRExpr *neg ) {
> 
> +static IRExpr * create_C( IRExpr *NaN,   IRExpr *zero,
> +                          IRExpr *dnorm, IRExpr *pos,
> +                          IRExpr *neg ) {
> 
> These functions all use their input trees more than once and so will
> duplicate them and the computations done by them.  Please change them so
> that the passed arguments are IRTemps instead.
> 
> There may be more such functions -- I am not sure if this is all of
> them.

Fixed, changed the IRExpr to IRTemps in these two functions.
exponent_compare() and fractional_part_compare() use IRExpr but the
value is used once in the true and once in the false path of an if
statement.  I suspect there may be existing functions with this issue.
It is something that needs to be addressed in a follow on cleanup patch.
I have noticed that the formatting of comments through out the code is
not consistent and would like to clean that up as well in a follow
patch.

> 
> 
> ---- [3] front end:
>          generation of excessively verbose IR for some vector insns ----
> 
> +   case 28: // vctzb,  Vector Count Trailing Zeros Byte
> Too complex; use a vector IRop
> 
> +   case 29: // vctzh,  Vector Count Trailing Zeros Halfword
> Ditto
> 
> +   case 30: // vctzw,  Vector Count Trailing Zeros Word
> Ditto
> 
> +   case 31: // vctzd,  Vector Count Trailing Zeros Halfword
> Ditto
> 
> For the above cases (28/29/30/31), make yourself a set of vector IROps
> to do this:
>   Iop_Ctz{8x16, 16x8, 32x4}
> See existing ops Iop_Clz8x16, Iop_Clz16x8, Iop_Clz32x4 as an example

Created Iops  Iop_Ctz8x16, Iop_Ctz16x8, Iop_Ctz32x4, Iop_Ctz64x2 and
re-implemented the instructions using the new Iops rather then using the
existing Iops.


> 
> +         case 0:  // bcdctsq.  (Decimal Convert to Signed Quadword VX-form)
> +               /* Check each of the nibbles for a valid digit 0 to 9 */
> +               inv_flag[0] = newTemp( Ity_I1 );
> +               assign( inv_flag[0], mkU1( 0 ) );
> Didn't you do a C helper function for this in the previous patch?
> This generates terribly verbose code.

Yes, we did forgot to update this patch with that new function.  Fixed

> 
> 
> ------ [4] misc other correctness comments/queries ------
> 
> +static
> +Bool FPU_rounding_mode_isOdd (IRExpr* mode) {
> +   /* If the rounding mode is set to odd, the the expr must be a constant U8
> +    * value equal to 8.  Otherwise, it must be a bin op expressiong that
> +    * calculates the value.
> +    */
> +
> +   if (mode->tag != Iex_Const)
> +      return False;
> +
> +   vassert(mode->Iex.Const.con->tag == Ico_U32);
> +   if (mode->Iex.Const.con->Ico.U8 == 0x8)
> +      return True;
> +
> +   vex_printf("ERROR: FPU_rounding_mode_isOdd(), constant not equal to
> expected value\n");
> +   return False;
> +}
> Doesn't seem right to me.  What happens if mode isn't an Iex_Const?
> Do you really want to just return False?  Shouldn't the system assert
> if that happens?

Rewritten to fix the issues.

> 
> 
> +++ b/memcheck/mc_translate.c
> +#if !(defined(VGA_ppc64be) || defined(VGA_ppc64le))
>     tl_assert(ty != Ity_I128);
> +#endif
> Don't make this conditional

Removed #if

> 
> 
> ------ [5] various typos in comments ------
> 
> Some of these occur several times -- copy n pasted?  Can you search
> to find all dups?
> 
> +      /* 128-bit multipy by 10 instruction, result is lower 128-bits */
> +      Iop_MulI128by10,
> Nit: please fix typos (multipy) (in various places)

Fixed

> 
> 
> +++ b/memcheck/mc_translate.c
> +      /* Widen 1st arg to I64.  Since 1st arg is typically a rounding
> That should say I128, not I64.
> 

Fixed

> 
> +++ b/VEX/priv/guest_ppc_toIR.c
> @@ -1,4 +1,3 @@
> -
> Nit: please don't remove the initial blank line

Fixed

> 
> 
>  /*------------------------------------------------------------*/
> +void setup_value_check_args( IRType size, IRTemp *exp_mask, IRTemp *frac_mask,
> Nit: blank line between comment and start of code
> 
> 
> +/* The following functions check the floating point value to see if it
> +   is zero, infinit, NaN, Normalized, Denormalized.
> Nit: infinity

Fixed

> 
> 
> +static void generate_store_FPRF( IRType size, IRTemp src ) {
> Nit: opening brace on its own line
> 
> 
> +   /* Calcuulate the floating point result field FPRF */
> Nit: Calculate

Fixed
> 
> 
> +++ b/VEX/pub/libvex_ir.h
> +      Iop_NegMulAddF128, // -((A * B) + C)
> +      Iop_NegMulSubF128, // -((A * B) + C)
> The comment on the second one isn't right.

Fixed

> 
> 
> +            /* Note: PPC only coverts the 16-bt value in the upper word
> Nit: typo, bt -> bit

Fixed 

> 
> 
> +    * instrution set.
> instrution
> 
> 
> +  AltiVec 128 bit integer multiyply by 10 Instructions
> multiyply
> 

Fixed

Additionally, the test suite was rewritten to add a #define for Exhaustive testing.  The exhaustive testing generates all the test cases in the previous version of the test function.  When exhaustive is turned off, only a minimal set of test inputs is used reducing the size of the Altivec expect file from ~34MByes to ~8MBytes and the other expect file from ~8MBytes to ~3.1MBytes.
Comment 6 Carl Love 2016-07-25 18:54:03 UTC
Created attachment 100292 [details]
Patch 3 of 5 to add VEX support for the POWER ISA 3.0 instructions revised
Comment 7 Carl Love 2016-07-25 18:56:28 UTC
Created attachment 100293 [details]
Patch 5 of 5 to add testsuite support for Power ISA3.0, revised
Comment 8 Carl Love 2016-07-25 18:58:11 UTC
Created attachment 100294 [details]
Patch 5 of 5 to add VEX support for Power ISA 3.0 instructions, revised

Description of Vex patch was wrong, re-attaching the VEX patch to correct the description
Comment 9 Julian Seward 2016-08-12 16:23:19 UTC
Thanks for the changes.  Both patches look OK to me.   Land.
Comment 10 Julian Seward 2016-09-14 13:22:25 UTC
Carl, can this land now?  Is there anything else that needs to happen before that?
Comment 11 Mark Wielaard 2016-09-14 14:13:33 UTC
This was already committed as VEX svn r3244 and valgrind svn r15938
Comment 12 Carl Love 2016-09-14 15:24:38 UTC
No issues have been seen with the ISA 3.0 support to date.  Closing.