The IBM ISA 3.0 has been released. This bugzilla is the fifth in a series of five bugzillas for adding the needed ISA 3.0 support to Valgrind. The series of bugzillas are: 359767, 361207, 362329, 363858. Reproducible: Always
Created attachment 99775 [details] Patch 5 of 5 to add VEX support for Power ISA 3.0 instructions
Created attachment 99776 [details] Patch5 of 5 to add testsuite support for Power ISA 3.0 instructions part 1 Testsuite patch I have only included the first part of the patch which contains the testsuite code changes and some of the expect files. The two expected output files are rather large, total size is 33MBytes. I don't think anyone is going to read them all. I can post them if wants to see all of the output.
(In reply to Carl Love from comment #2) > expected output files are rather large, total size is 33MBytes. 33 MB is pretty large. That space will be in the distro tarballs for ever more and also on the SVN server. Is it possible to make this a bit smaller, without losing test coverage?
(In reply to Carl Love from comment #1) > Created attachment 99775 [details] > Patch 5 of 5 to add VEX support for Power ISA 3.0 instructions I have a number of concerns here, but nothing that can't be relatively easily fixed. They fall into 5 areas: [1] naming and possible duplication of new IROps [2] front end: duplication of IR trees (the IRExpr* vs IRTemp thing) [3] front end: generation of excessively verbose IR for some vector insns [4] misc other correctness comments/queries [5] various typos in comments To go through them in order: ------ [1] naming and possible duplication of new IROps ------ diff --git a/VEX/pub/libvex_ir.h b/VEX/pub/libvex_ir.h + Iop_MulAddF128, // (A * B) + C + Iop_MulSubF128, // (A * B) - C + Iop_NegMulAddF128, // -((A * B) + C) + Iop_NegMulSubF128, // -((A * B) + C) Call these, respectively: MAddF128, MSubF128, NegMAddF128, NegMSubF128, so as to be consistent with the 32/64 bit variants + Iop_ConvI64UtoF128, /* signed I64 -> F128 */ + Iop_ConvI64StoF128, /* signed I64 -> F128 */ + Iop_ConvF64toF128, /* F64 -> F128 */ + Iop_ConvF128toF64, /* IRRoundingMode(I32) x F128 -> F64 */ + Iop_ConvF128toF32, /* IRRoundingMode(I32) x F128 -> F32 */ Remove the leading Conv (eg Iop_I64UtoF128), so as to make them consistent with other conversion operations. + /* Conversion signed 32-bit value to signed BCD 128-bit */ + Iop_I128StoBCD128, + + /* Conversion signed BCD 128-bit to signed 32-bit value */ + Iop_BCD128toI128S, The comments don't seem to match the names. Is the integer value 32 bits or 128 bits? + /* -- Vector to/from half conversion -- */ + Iop_F16x4toF32x4, Iop_F32x4toF16x4, Iop_F64x2toF16x2, Iop_F16x2toF64x2, You only need one lane specifier here, so as to be consistent with other ops: F16toF32x4, F32toF16x4, F64toF16x2, F16toF64x2 and in fact the first two already seem to exist. Are these different? ---- [2] front end: duplication of IR trees (the IRExpr* vs IRTemp thing) ---- +static void putC ( IRExpr* e ) +static IRExpr * create_FPCC( IRExpr *NaN, IRExpr *inf, + IRExpr *zero, IRExpr *norm, + IRExpr *dnorm, IRExpr *pos, + IRExpr *neg ) { +static IRExpr * create_C( IRExpr *NaN, IRExpr *zero, + IRExpr *dnorm, IRExpr *pos, + IRExpr *neg ) { These functions all use their input trees more than once and so will duplicate them and the computations done by them. Please change them so that the passed arguments are IRTemps instead. There may be more such functions -- I am not sure if this is all of them. ---- [3] front end: generation of excessively verbose IR for some vector insns ---- + case 28: // vctzb, Vector Count Trailing Zeros Byte Too complex; use a vector IRop + case 29: // vctzh, Vector Count Trailing Zeros Halfword Ditto + case 30: // vctzw, Vector Count Trailing Zeros Word Ditto + case 31: // vctzd, Vector Count Trailing Zeros Halfword Ditto For the above cases (28/29/30/31), make yourself a set of vector IROps to do this: Iop_Ctz{8x16, 16x8, 32x4} See existing ops Iop_Clz8x16, Iop_Clz16x8, Iop_Clz32x4 as an example + case 0: // bcdctsq. (Decimal Convert to Signed Quadword VX-form) + /* Check each of the nibbles for a valid digit 0 to 9 */ + inv_flag[0] = newTemp( Ity_I1 ); + assign( inv_flag[0], mkU1( 0 ) ); Didn't you do a C helper function for this in the previous patch? This generates terribly verbose code. ------ [4] misc other correctness comments/queries ------ +static +Bool FPU_rounding_mode_isOdd (IRExpr* mode) { + /* If the rounding mode is set to odd, the the expr must be a constant U8 + * value equal to 8. Otherwise, it must be a bin op expressiong that + * calculates the value. + */ + + if (mode->tag != Iex_Const) + return False; + + vassert(mode->Iex.Const.con->tag == Ico_U32); + if (mode->Iex.Const.con->Ico.U8 == 0x8) + return True; + + vex_printf("ERROR: FPU_rounding_mode_isOdd(), constant not equal to expected value\n"); + return False; +} Doesn't seem right to me. What happens if mode isn't an Iex_Const? Do you really want to just return False? Shouldn't the system assert if that happens? +++ b/memcheck/mc_translate.c +#if !(defined(VGA_ppc64be) || defined(VGA_ppc64le)) tl_assert(ty != Ity_I128); +#endif Don't make this conditional ------ [5] various typos in comments ------ Some of these occur several times -- copy n pasted? Can you search to find all dups? + /* 128-bit multipy by 10 instruction, result is lower 128-bits */ + Iop_MulI128by10, Nit: please fix typos (multipy) (in various places) +++ b/memcheck/mc_translate.c + /* Widen 1st arg to I64. Since 1st arg is typically a rounding That should say I128, not I64. +++ b/VEX/priv/guest_ppc_toIR.c @@ -1,4 +1,3 @@ - Nit: please don't remove the initial blank line /*------------------------------------------------------------*/ +void setup_value_check_args( IRType size, IRTemp *exp_mask, IRTemp *frac_mask, Nit: blank line between comment and start of code +/* The following functions check the floating point value to see if it + is zero, infinit, NaN, Normalized, Denormalized. Nit: infinity +static void generate_store_FPRF( IRType size, IRTemp src ) { Nit: opening brace on its own line + /* Calcuulate the floating point result field FPRF */ Nit: Calculate +++ b/VEX/pub/libvex_ir.h + Iop_NegMulAddF128, // -((A * B) + C) + Iop_NegMulSubF128, // -((A * B) + C) The comment on the second one isn't right. + /* Note: PPC only coverts the 16-bt value in the upper word Nit: typo, bt -> bit + * instrution set. instrution + AltiVec 128 bit integer multiyply by 10 Instructions multiyply
> --- Comment #4 from Julian Seward <jseward@acm.org> --- > (In reply to Carl Love from comment #1) > > Created attachment 99775 [details] > > Patch 5 of 5 to add VEX support for Power ISA 3.0 instructions > > I have a number of concerns here, but nothing that can't be relatively > easily fixed. They fall into 5 areas: > > [1] naming and possible duplication of new IROps > > [2] front end: duplication of IR trees (the IRExpr* vs IRTemp thing) > > [3] front end: generation of excessively verbose IR for some vector insns > > [4] misc other correctness comments/queries > > [5] various typos in comments > > > To go through them in order: > > > ------ [1] naming and possible duplication of new IROps ------ > > diff --git a/VEX/pub/libvex_ir.h b/VEX/pub/libvex_ir.h > + Iop_MulAddF128, // (A * B) + C > + Iop_MulSubF128, // (A * B) - C > + Iop_NegMulAddF128, // -((A * B) + C) > + Iop_NegMulSubF128, // -((A * B) + C) > Call these, respectively: MAddF128, MSubF128, NegMAddF128, > NegMSubF128, so as to be consistent with the 32/64 bit variants Renamed the Iops as requested > > > + Iop_ConvI64UtoF128, /* signed I64 -> F128 */ > + Iop_ConvI64StoF128, /* signed I64 -> F128 */ > + Iop_ConvF64toF128, /* F64 -> F128 */ > + Iop_ConvF128toF64, /* IRRoundingMode(I32) x F128 -> F64 */ > + Iop_ConvF128toF32, /* IRRoundingMode(I32) x F128 -> F32 */ > Remove the leading Conv (eg Iop_I64UtoF128), so as to make them > consistent with other conversion operations. > Renamed the Iops as requested > > + /* Conversion signed 32-bit value to signed BCD 128-bit */ > + Iop_I128StoBCD128, > + > + /* Conversion signed BCD 128-bit to signed 32-bit value */ > + Iop_BCD128toI128S, > The comments don't seem to match the names. Is the integer value > 32 bits or 128 bits? > fixed comments. The integer source/result is stored in an I128 value. > > + /* -- Vector to/from half conversion -- */ > + Iop_F16x4toF32x4, Iop_F32x4toF16x4, Iop_F64x2toF16x2, Iop_F16x2toF64x2, > You only need one lane specifier here, so as to be consistent with > other ops: > F16toF32x4, F32toF16x4, F64toF16x2, F16toF64x2 > and in fact the first two already seem to exist. Are these different? > > Renamed the Iops as requested, changed the code in guest_ppc_toIR.c was then changed to match the Iop arg type with the required changes in host_ppc_isel.c to handle setting up the src/dest registers for the backend to issue the instructions. > ---- [2] front end: duplication of IR trees (the IRExpr* vs IRTemp thing) ---- > > +static void putC ( IRExpr* e ) > > +static IRExpr * create_FPCC( IRExpr *NaN, IRExpr *inf, > + IRExpr *zero, IRExpr *norm, > + IRExpr *dnorm, IRExpr *pos, > + IRExpr *neg ) { > > +static IRExpr * create_C( IRExpr *NaN, IRExpr *zero, > + IRExpr *dnorm, IRExpr *pos, > + IRExpr *neg ) { > > These functions all use their input trees more than once and so will > duplicate them and the computations done by them. Please change them so > that the passed arguments are IRTemps instead. > > There may be more such functions -- I am not sure if this is all of > them. Fixed, changed the IRExpr to IRTemps in these two functions. exponent_compare() and fractional_part_compare() use IRExpr but the value is used once in the true and once in the false path of an if statement. I suspect there may be existing functions with this issue. It is something that needs to be addressed in a follow on cleanup patch. I have noticed that the formatting of comments through out the code is not consistent and would like to clean that up as well in a follow patch. > > > ---- [3] front end: > generation of excessively verbose IR for some vector insns ---- > > + case 28: // vctzb, Vector Count Trailing Zeros Byte > Too complex; use a vector IRop > > + case 29: // vctzh, Vector Count Trailing Zeros Halfword > Ditto > > + case 30: // vctzw, Vector Count Trailing Zeros Word > Ditto > > + case 31: // vctzd, Vector Count Trailing Zeros Halfword > Ditto > > For the above cases (28/29/30/31), make yourself a set of vector IROps > to do this: > Iop_Ctz{8x16, 16x8, 32x4} > See existing ops Iop_Clz8x16, Iop_Clz16x8, Iop_Clz32x4 as an example Created Iops Iop_Ctz8x16, Iop_Ctz16x8, Iop_Ctz32x4, Iop_Ctz64x2 and re-implemented the instructions using the new Iops rather then using the existing Iops. > > + case 0: // bcdctsq. (Decimal Convert to Signed Quadword VX-form) > + /* Check each of the nibbles for a valid digit 0 to 9 */ > + inv_flag[0] = newTemp( Ity_I1 ); > + assign( inv_flag[0], mkU1( 0 ) ); > Didn't you do a C helper function for this in the previous patch? > This generates terribly verbose code. Yes, we did forgot to update this patch with that new function. Fixed > > > ------ [4] misc other correctness comments/queries ------ > > +static > +Bool FPU_rounding_mode_isOdd (IRExpr* mode) { > + /* If the rounding mode is set to odd, the the expr must be a constant U8 > + * value equal to 8. Otherwise, it must be a bin op expressiong that > + * calculates the value. > + */ > + > + if (mode->tag != Iex_Const) > + return False; > + > + vassert(mode->Iex.Const.con->tag == Ico_U32); > + if (mode->Iex.Const.con->Ico.U8 == 0x8) > + return True; > + > + vex_printf("ERROR: FPU_rounding_mode_isOdd(), constant not equal to > expected value\n"); > + return False; > +} > Doesn't seem right to me. What happens if mode isn't an Iex_Const? > Do you really want to just return False? Shouldn't the system assert > if that happens? Rewritten to fix the issues. > > > +++ b/memcheck/mc_translate.c > +#if !(defined(VGA_ppc64be) || defined(VGA_ppc64le)) > tl_assert(ty != Ity_I128); > +#endif > Don't make this conditional Removed #if > > > ------ [5] various typos in comments ------ > > Some of these occur several times -- copy n pasted? Can you search > to find all dups? > > + /* 128-bit multipy by 10 instruction, result is lower 128-bits */ > + Iop_MulI128by10, > Nit: please fix typos (multipy) (in various places) Fixed > > > +++ b/memcheck/mc_translate.c > + /* Widen 1st arg to I64. Since 1st arg is typically a rounding > That should say I128, not I64. > Fixed > > +++ b/VEX/priv/guest_ppc_toIR.c > @@ -1,4 +1,3 @@ > - > Nit: please don't remove the initial blank line Fixed > > > /*------------------------------------------------------------*/ > +void setup_value_check_args( IRType size, IRTemp *exp_mask, IRTemp *frac_mask, > Nit: blank line between comment and start of code > > > +/* The following functions check the floating point value to see if it > + is zero, infinit, NaN, Normalized, Denormalized. > Nit: infinity Fixed > > > +static void generate_store_FPRF( IRType size, IRTemp src ) { > Nit: opening brace on its own line > > > + /* Calcuulate the floating point result field FPRF */ > Nit: Calculate Fixed > > > +++ b/VEX/pub/libvex_ir.h > + Iop_NegMulAddF128, // -((A * B) + C) > + Iop_NegMulSubF128, // -((A * B) + C) > The comment on the second one isn't right. Fixed > > > + /* Note: PPC only coverts the 16-bt value in the upper word > Nit: typo, bt -> bit Fixed > > > + * instrution set. > instrution > > > + AltiVec 128 bit integer multiyply by 10 Instructions > multiyply > Fixed Additionally, the test suite was rewritten to add a #define for Exhaustive testing. The exhaustive testing generates all the test cases in the previous version of the test function. When exhaustive is turned off, only a minimal set of test inputs is used reducing the size of the Altivec expect file from ~34MByes to ~8MBytes and the other expect file from ~8MBytes to ~3.1MBytes.
Created attachment 100292 [details] Patch 3 of 5 to add VEX support for the POWER ISA 3.0 instructions revised
Created attachment 100293 [details] Patch 5 of 5 to add testsuite support for Power ISA3.0, revised
Created attachment 100294 [details] Patch 5 of 5 to add VEX support for Power ISA 3.0 instructions, revised Description of Vex patch was wrong, re-attaching the VEX patch to correct the description
Thanks for the changes. Both patches look OK to me. Land.
Carl, can this land now? Is there anything else that needs to happen before that?
This was already committed as VEX svn r3244 and valgrind svn r15938
No issues have been seen with the ISA 3.0 support to date. Closing.