Bug 339182 - POWER AvSplat ought to load destination vector register with 16/16 bytes stored prior.
Status: CLOSED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: vex
Version: unspecified
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-09-18 16:17 UTC by Anmol P. Paralkar
Modified: 2014-09-29 18:31 UTC
CC List: 5 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
Further description and analysis. (40.78 KB, text/plain)
2014-09-18 16:19 UTC, Anmol P. Paralkar
Patch to fix the problem. (501 bytes, patch)
2014-09-18 16:26 UTC, Anmol P. Paralkar

Description Anmol P. Paralkar 2014-09-18 16:17:14 UTC
The translation of the vspltb insn uses the vsr insn. Per the Power ISA v2.07, the shift amount that
 vsr reads from bits [125:127] of its shift-count vector register must also appear in bits [5:7] of
 each of that register's 16 bytes; otherwise the destination register is undefined. Currently, a
 preceding lvewx insn defines the vector register that contains the shift amount, but per the
 Power ISA v2.07, lvewx defines only 4/16 bytes of its destination vector register; the other
 12/16 bytes remain undefined. This causes a problem with the Freescale e6500's optimized
 memset(3), which uses the vspltb insn: the vsr that valgrind generates in the translation of
 vspltb results in .bss being initialized to undefined values at program startup. The preceding
 instruction ought to be an lvx, which defines all 16 bytes, for the vsr to operate correctly.

 [This is detailed further in an attachment: AvSplat.txt].
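
 A minimal sketch of the definedness argument above, written as a toy Python model rather than
 anything taken from VEX or the attached patch. The helper names (lvx, lvewx,
 vsr_result_is_defined) and the sample memory contents are assumptions made for illustration;
 the model only tracks which bytes of the shift-count register are defined and whether their
 low three bits agree.

```python
# Toy model (not VEX code): a vector register is a list of 16 byte slots.
# A slot holding UNDEF stands for a byte the ISA leaves undefined.
UNDEF = None


def lvx(mem, ea):
    """Model of lvx: loads all 16 bytes of the aligned quadword at ea."""
    base = ea & ~0xF
    return list(mem[base:base + 16])


def lvewx(mem, ea):
    """Model of lvewx: only the addressed aligned word (4 bytes) is defined."""
    base = ea & ~0xF
    word = (ea & 0xF) & ~0x3
    reg = [UNDEF] * 16
    reg[word:word + 4] = mem[base + word:base + word + 4]
    return reg


def vsr_result_is_defined(vrb):
    """vsr requires the low 3 bits of every byte of VRB to equal the
    low 3 bits of byte 15 (bits 125:127); otherwise VRT is undefined."""
    if any(b is UNDEF for b in vrb):
        return False
    sh = vrb[15] & 0x7
    return all((b & 0x7) == sh for b in vrb)


# Hypothetical setup: the shift amount (3) was stored into all 16 bytes.
mem = bytes([3] * 16)

print(vsr_result_is_defined(lvewx(mem, 0)))  # 12 bytes undefined
print(vsr_result_is_defined(lvx(mem, 0)))    # all 16 bytes defined and equal
```

 In this model, loading the shift count with lvewx leaves 12 slots undefined, so the vsr
 precondition cannot be established; loading the same 16 stored bytes with lvx satisfies it.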
 

Reproducible: Always



Expected Results:
Comment 1 Anmol P. Paralkar 2014-09-18 16:19:19 UTC
Created attachment 88742
Further description and analysis.


 Please see sections:

 VSPLTB_MISTRANSLATION
  WHY_IS_THIS_NOT_A_VG-3.9.0_BUG
  FIX
Comment 2 Anmol P. Paralkar 2014-09-18 16:26:09 UTC
Created attachment 88743
Patch to fix the problem.
Comment 3 Anmol P. Paralkar 2014-09-18 16:42:32 UTC
 Testing information:

 Regression tested on an IBM POWER7:

processor       :  0-31
cpu             : POWER7 (architected), altivec supported
clock           : 3000.000000MHz
revision        : 2.1 (pvr 003f 0201)

timebase        : 512000000
platform        : pSeries
model           : IBM,8202-E4B
machine         : CHRP IBM,8202-E4B

 running Linux 3.1.5-6.fc16.ppc64 with valgrind: 14550, VEX: 2953

 PS: none/tests/ppc64/jm-insns.c already includes unit tests for vsplt[bwh].

 If there is a way to grep for patterns after the internal instruction selection phase
 (as in GCC's DejaGNU tests), please let me know and I'll be happy to write a test
 that checks that an lvx is generated prior to the vsr in the code for the vsplt insns. Thanks.
Comment 4 Carl Love 2014-09-25 15:59:50 UTC
The issue was confirmed as a bug.  The patch was applied and committed.  The VEX commit id is 2960.

Thanks for finding this bug.
Comment 5 Philippe Waroquiers 2014-09-27 15:19:23 UTC
(In reply to Carl Love from comment #4)
> The issue was confirmed as a bug.  The patch was applied and committed.  The
> VEX commit id is 2960.
If the bug is completely fixed, then an entry should be added to NEWS.
Comment 6 Carl Love 2014-09-29 18:31:11 UTC
Updated the NEWS file with the fixed bug description.  vex commit 14589.  Closing the bug.