Bug 385210 - vpermr instruction could exhaust temporary memory
Summary: vpermr instruction could exhaust temporary memory
Status: CLOSED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: vex (show other bugs)
Version: 3.14 SVN
Platform: Other Linux
: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-09-29 18:07 UTC by Carl Love
Modified: 2017-10-05 17:27 UTC (History)
0 users

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
re-implement vpermr instruction to generate fewer iops (7.91 KB, patch)
2017-09-29 18:26 UTC, Carl Love
Details
re-implement vpermr instruction to generate fewer iops (7.39 KB, patch)
2017-10-03 16:06 UTC, Carl Love
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Carl Love 2017-09-29 18:07:59 UTC
The implementation of the vpermr instruction will generate a lot of Iops which could exhaust temporary memory.  The instruction implementation is similar to the xxperm and xxpermr instructions which exhibited this issue.
Comment 1 Carl Love 2017-09-29 18:26:08 UTC
Created attachment 108095 [details]
re-implement vpermr instruction to generate fewer iops

Patch reworks the vpermr instruction to use the Iop_Perm8x16.  This reduces the Iop generation significantly.
Comment 2 Julian Seward 2017-10-03 10:07:40 UTC
diff --git a/VEX/priv/guest_ppc_toIR.c b/VEX/priv/guest_ppc_toIR.c
index 3897b9f2f..79d530d71 100644
--- a/VEX/priv/guest_ppc_toIR.c
+++ b/VEX/priv/guest_ppc_toIR.c
@@ -24297,12 +24297,12 @@ static Bool dis_av_permute ( UInt theInstr )
       IRTemp vC_andF = newTemp(Ity_V128);
       DIP("vperm v%d,v%d,v%d,v%d\n",
           vD_addr, vA_addr, vB_addr, vC_addr);
-      /* Limit the Perm8x16 steering values to 0 .. 15 as that is what
+      /* Limit the Perm8x16 steering values to 0 .. 31 as that is what
          IR specifies, and also to hide irrelevant bits from
          memcheck */
       assign( vC_andF,
               binop(Iop_AndV128, mkexpr(vC),
-                                 unop(Iop_Dup8x16, mkU8(0xF))) );
+                                 unop(Iop_Dup8x16, mkU8(0x1F))) );
       assign( a_perm,
               binop(Iop_Perm8x16, mkexpr(vA), mkexpr(vC_andF)) );
       assign( b_perm,


Is this hunk necessary?  It seems to be changing the semantics of an
existing instruction, rather than merely re-implementing one.

Otherwise looks OK to me.
Comment 3 Carl Love 2017-10-03 15:00:27 UTC
Yes, it is changing the semantics.  The code a implemented is wrong.  The field for the steering value is 5 bits according to the ISA, not four as implemented.  So values greater then 16 are not handled correctly.

The ISA says:

Let the source vector be the concatenation of the
contents of VR[VRA] followed by the contents of
VR[VRB].

For each integer value i from 0 to 15, do the following.
  Let index be the value specified by bits 3:7 of byte
  element i of VR[VRC].

So, the index value is 5-bits wide.

The code for the vperm instruction was reused with some small tweaks to implement the vpermr, xxperm and xxpermr instructions.  In testing the code for these other instructions this issue was found. 

Probably should have separated it out into its own bug fix.
Comment 4 Carl Love 2017-10-03 16:06:09 UTC
Created attachment 108148 [details]
re-implement vpermr instruction to generate fewer iops

Update patch, split the vpermr implementation and the vperm index mask into separate patches.  The vperm index mask was moved to bugzilla 385334.
Comment 5 Carl Love 2017-10-03 20:21:05 UTC
committed patch:   commit b0aef250a74804423341b3ce804355037211e330