Bug 250038 - ppc64: Altivec lvsr and lvsl instructions fail their regtest (jm-vmx)
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: vex
Version: 3.6 SVN
Platform: Unlisted Binaries Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
Reported: 2010-09-03 16:40 UTC by Julian Seward
Modified: 2014-09-01 14:16 UTC

Description Julian Seward 2010-09-03 16:40:58 UTC
.. and I reckon they used to work ok, so some sort of regression?

Curiously enough on ppc32 the same test works just fine.

$ perl tests/vg_regtest none/tests/ppc64
-- Running  tests in none/tests/ppc64 ----------------------------------
jm-fp:           valgrind   ./jm-insns -f 
jm-int:          valgrind   ./jm-insns -i 
jm-vmx:          valgrind   ./jm-insns -a 
*** jm-vmx failed (stdout) ***
lsw:             valgrind   ./lsw 
round:           valgrind   ./round 
std_reg_imm:     valgrind   -q ./std_reg_imm 
tw_td:           valgrind   ./tw_td 
twi_tdi:         valgrind   ./twi_tdi 
-- Finished tests in none/tests/ppc64 ----------------------------------

== 8 tests, 0 stderr failures, 1 stdout failure, 0 post failures ==
none/tests/ppc64/jm-vmx                  (stdout)


$ cat none/tests/ppc64/jm-vmx.stdout.diff 
--- jm-vmx.stdout.exp   2009-09-11 23:58:53.000000000 +0200
+++ jm-vmx.stdout.out   2010-09-03 16:05:14.000000000 +0200
@@ -1407,43 +1407,43 @@
       vsldoi: f1f2f3f4f5f6f7f8f9fafbfcfefdfeff, f1f2f3f4f5f6f7f8f9fafbfcfefdfeff, 14
       vsldoi:  => fefff1f2 f3f4f5f6 f7f8f9fa fbfcfefd] (00000000)
 
-        lvsl  -1,   0 => 0f101112 13141516 1718191a 1b1c1d1e (00000000)
-        lvsl   0,   0 => 00010203 04050607 08090a0b 0c0d0e0f (00000000)
-        lvsl   1,   0 => 01020304 05060708 090a0b0c 0d0e0f10 (00000000)
-        lvsl   2,   0 => 02030405 06070809 0a0b0c0d 0e0f1011 (00000000)
-        lvsl   3,   0 => 03040506 0708090a 0b0c0d0e 0f101112 (00000000)
-        lvsl   4,   0 => 04050607 08090a0b 0c0d0e0f 10111213 (00000000)
-        lvsl   5,   0 => 05060708 090a0b0c 0d0e0f10 11121314 (00000000)
-        lvsl   6,   0 => 06070809 0a0b0c0d 0e0f1011 12131415 (00000000)
-        lvsl   7,   0 => 0708090a 0b0c0d0e 0f101112 13141516 (00000000)
-        lvsl   8,   0 => 08090a0b 0c0d0e0f 10111213 14151617 (00000000)
-        lvsl   9,   0 => 090a0b0c 0d0e0f10 11121314 15161718 (00000000)
-        lvsl  10,   0 => 0a0b0c0d 0e0f1011 12131415 16171819 (00000000)
-        lvsl  11,   0 => 0b0c0d0e 0f101112 13141516 1718191a (00000000)
-        lvsl  12,   0 => 0c0d0e0f 10111213 14151617 18191a1b (00000000)
-        lvsl  13,   0 => 0d0e0f10 11121314 15161718 191a1b1c (00000000)
-        lvsl  14,   0 => 0e0f1011 12131415 16171819 1a1b1c1d (00000000)
-        lvsl  15,   0 => 0f101112 13141516 1718191a 1b1c1d1e (00000000)
-        lvsl  16,   0 => 00010203 04050607 08090a0b 0c0d0e0f (00000000)
-
-        lvsr  -1,   0 => 01020304 05060708 090a0b0c 0d0e0f10 (00000000)
-        lvsr   0,   0 => 10111213 14151617 18191a1b 1c1d1e1f (00000000)
-        lvsr   1,   0 => 0f101112 13141516 1718191a 1b1c1d1e (00000000)
-        lvsr   2,   0 => 0e0f1011 12131415 16171819 1a1b1c1d (00000000)
-        lvsr   3,   0 => 0d0e0f10 11121314 15161718 191a1b1c (00000000)
-        lvsr   4,   0 => 0c0d0e0f 10111213 14151617 18191a1b (00000000)
-        lvsr   5,   0 => 0b0c0d0e 0f101112 13141516 1718191a (00000000)
-        lvsr   6,   0 => 0a0b0c0d 0e0f1011 12131415 16171819 (00000000)
-        lvsr   7,   0 => 090a0b0c 0d0e0f10 11121314 15161718 (00000000)
-        lvsr   8,   0 => 08090a0b 0c0d0e0f 10111213 14151617 (00000000)
-        lvsr   9,   0 => 0708090a 0b0c0d0e 0f101112 13141516 (00000000)
-        lvsr  10,   0 => 06070809 0a0b0c0d 0e0f1011 12131415 (00000000)
-        lvsr  11,   0 => 05060708 090a0b0c 0d0e0f10 11121314 (00000000)
-        lvsr  12,   0 => 04050607 08090a0b 0c0d0e0f 10111213 (00000000)
-        lvsr  13,   0 => 03040506 0708090a 0b0c0d0e 0f101112 (00000000)
-        lvsr  14,   0 => 02030405 06070809 0a0b0c0d 0e0f1011 (00000000)
-        lvsr  15,   0 => 01020304 05060708 090a0b0c 0d0e0f10 (00000000)
-        lvsr  16,   0 => 10111213 14151617 18191a1b 1c1d1e1f (00000000)
+        lvsl  -1,   0 => 74650000 00000000 002b636f 72650000 (00000000)
+        lvsl   0,   0 => 69676e6f 72650000 7465726d 696e6174 (00000000)
+        lvsl   1,   0 => 676e6f72 65000074 65726d69 6e617465 (00000000)
+        lvsl   2,   0 => 6e6f7265 00007465 726d696e 61746500 (00000000)
+        lvsl   3,   0 => 6f726500 00746572 6d696e61 74650000 (00000000)
+        lvsl   4,   0 => 72650000 7465726d 696e6174 65000000 (00000000)
+        lvsl   5,   0 => 65000074 65726d69 6e617465 00000000 (00000000)
+        lvsl   6,   0 => 00007465 726d696e 61746500 00000000 (00000000)
+        lvsl   7,   0 => 00746572 6d696e61 74650000 00000000 (00000000)
+        lvsl   8,   0 => 7465726d 696e6174 65000000 00000000 (00000000)
+        lvsl   9,   0 => 65726d69 6e617465 00000000 0000002b (00000000)
+        lvsl  10,   0 => 726d696e 61746500 00000000 00002b63 (00000000)
+        lvsl  11,   0 => 6d696e61 74650000 00000000 002b636f (00000000)
+        lvsl  12,   0 => 696e6174 65000000 00000000 2b636f72 (00000000)
+        lvsl  13,   0 => 6e617465 00000000 0000002b 636f7265 (00000000)
+        lvsl  14,   0 => 61746500 00000000 00002b63 6f726500 (00000000)
+        lvsl  15,   0 => 74650000 00000000 002b636f 72650000 (00000000)
+        lvsl  16,   0 => 69676e6f 72650000 7465726d 696e6174 (00000000)
+
+        lvsr  -1,   0 => 676e6f72 65000074 65726d69 6e617465 (00000000)
+        lvsr   0,   0 => 65000000 00000000 2b636f72 65000000 (00000000)
+        lvsr   1,   0 => 74650000 00000000 002b636f 72650000 (00000000)
+        lvsr   2,   0 => 61746500 00000000 00002b63 6f726500 (00000000)
+        lvsr   3,   0 => 6e617465 00000000 0000002b 636f7265 (00000000)
+        lvsr   4,   0 => 696e6174 65000000 00000000 2b636f72 (00000000)
+        lvsr   5,   0 => 6d696e61 74650000 00000000 002b636f (00000000)
+        lvsr   6,   0 => 726d696e 61746500 00000000 00002b63 (00000000)
+        lvsr   7,   0 => 65726d69 6e617465 00000000 0000002b (00000000)
+        lvsr   8,   0 => 7465726d 696e6174 65000000 00000000 (00000000)
+        lvsr   9,   0 => 00746572 6d696e61 74650000 00000000 (00000000)
+        lvsr  10,   0 => 00007465 726d696e 61746500 00000000 (00000000)
+        lvsr  11,   0 => 65000074 65726d69 6e617465 00000000 (00000000)
+        lvsr  12,   0 => 72650000 7465726d 696e6174 65000000 (00000000)
+        lvsr  13,   0 => 6f726500 00746572 6d696e61 74650000 (00000000)
+        lvsr  14,   0 => 6e6f7265 00007465 726d696e 61746500 (00000000)
+        lvsr  15,   0 => 676e6f72 65000074 65726d69 6e617465 (00000000)
+        lvsr  16,   0 => 65000000 00000000 2b636f72 65000000 (00000000)
 
 Altivec load insns with two register args:
        lvebx   0, 01020304 05060708 090a0b0c 0e0d0e0f => 01000000 00000000 00000000 00000000 (00000000)
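
For reference, the expected values in jm-vmx.stdout.exp follow from the architectural definition of the two instructions: with sh taken as the low four bits of the effective address, lvsl yields the bytes sh, sh+1, ..., sh+15 and lvsr yields 16-sh, 17-sh, ..., 31-sh. The sketch below is only an illustration of that rule. The helper names (lvsl, lvsr, fmt) are made up for this note, and it assumes the first operand printed by the test is the effective address.

```python
def lvsl(ea: int) -> bytes:
    """Permute control vector for a left shift: sh, sh+1, ..., sh+15."""
    sh = ea & 0xF  # only the low four bits of the address matter
    return bytes(sh + i for i in range(16))


def lvsr(ea: int) -> bytes:
    """Permute control vector for a right shift: 16-sh, ..., 31-sh."""
    sh = ea & 0xF
    return bytes(16 - sh + i for i in range(16))


def fmt(vec: bytes) -> str:
    """Format 16 bytes as four 32-bit hex words, like the test output."""
    return " ".join(vec[i:i + 4].hex() for i in range(0, 16, 4))


if __name__ == "__main__":
    for ea in range(-1, 17):
        print(f"lvsl {ea:3d} => {fmt(lvsl(ea))}")
    print()
    for ea in range(-1, 17):
        print(f"lvsr {ea:3d} => {fmt(lvsr(ea))}")
```

The values wrap every 16 addresses, which is why the -1 and 15 lines (and the 0 and 16 lines) are identical in the expected output.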
Comment 1 Maynard Johnson 2010-09-30 16:11:35 UTC
(In reply to comment #0)
Julian, in my testing on a POWER6/SLES 11 SP1 system, this testcase "almost" passes.  I found that printf in newer GLIBC versions supports the concept of negative NaNs (which is valid, according to the C99 standard), so I get several mismatches due to the fact that the current expected output has no negative NaNs.  (I'll have to create a secondary expected output file to allow for this printf difference).  What processor model/distro release are you seeing the above problem on?  Thanks.
Comment 2 Julian Seward 2010-09-30 16:20:36 UTC
Maynard, thanks for looking at this.  This is on a PowerPC 970
(dual processor, but not dual core) -- PPC970FX I guess.
Comment 3 Maynard Johnson 2010-10-12 16:10:46 UTC
(In reply to comment #2)
> Maynard, thanks for looking at this.  This is on a PowerPC 970
> (dual processor, but not dual core) -- PPC970FX I guess.
Finally getting back to this.  Testing on a 970FX now, I've found that the 64-bit jm-vmx testcase passed in valgrind 3.2.1.  Thereafter, the story gets a bit muddy. The valgrind release archive web page is missing the tar balls for all releases between 3.2.1 and 3.4.1.  I tested 3.4.1 and first found that the jm-vmx tests (both 32-bit and 64-bit) weren't being built correctly (HAVE_ALTIVEC_H and HAS_ALTIVEC needed to be defined, but neither was).  I hacked the testcases so they would build for running the VMX test code.  Then, running the tests, I found the 32-bit version still worked as it did in 3.2.1, but the 64-bit version now fails with the symptoms described in this bug.  Running the 64-bit jm-insns program natively (without valgrind) produces correct results, so it is apparently a bug in the valgrind core.  Still investigating . . .
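
A side observation on the failing output, offered as a hint rather than a diagnosis: the bytes valgrind produces are printable ASCII, and the lvsl 0 and lvsr 0 results together form one contiguous 32-byte window that the other lines slide across. The sketch below decodes the two lines copied from the diff in comment #0. The helper name decode is made up for this note.

```python
def decode(words: str) -> str:
    """Turn space-separated hex words into text, showing non-printables as '.'."""
    raw = bytes.fromhex(words.replace(" ", ""))
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


# Actual (wrong) results copied from jm-vmx.stdout.out in the diff.
LVSL_0 = "69676e6f 72650000 7465726d 696e6174"
LVSR_0 = "65000000 00000000 2b636f72 65000000"

# lvsl 0 covers bytes 0..15 and lvsr 0 covers bytes 16..31 of the window.
window = bytes.fromhex((LVSL_0 + LVSR_0).replace(" ", ""))

if __name__ == "__main__":
    print(decode(LVSL_0 + LVSR_0))  # ignore..terminate.......+core...
    # "lvsl 1" and "lvsr 1" in the diff match window[1:17] and window[15:31].
    print(window[1:17].hex())
    print(window[15:31].hex())
```

That pattern would be consistent with the instruction indexing correctly into a 32-byte table whose base address points at unrelated string data, though nothing in this bug confirms that as the root cause.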
Comment 4 Julian Seward 2010-10-12 16:38:29 UTC
(In reply to comment #3)
> The valgrind release archive web page is missing the tar balls for
> all releases between 3.2.1 and 3.4.1

They are all actually still there, in

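http://www.valgrind.org/downloads/valgrind-3.2.3.tar.bz2
http://www.valgrind.org/downloads/valgrind-3.3.0.tar.bz2
http://www.valgrind.org/downloads/valgrind-3.3.1.tar.bz2
http://www.valgrind.org/downloads/valgrind-3.4.0.tar.bz2
http://www.valgrind.org/downloads/valgrind-3.4.1.tar.bz2
http://www.valgrind.org/downloads/valgrind-3.5.0.tar.bz2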

They are not linked to from the web pages but are wget-able.
Comment 5 Maynard Johnson 2010-10-14 17:20:04 UTC
OK, the regression occurred in 3.4.0.  Both the 32-bit and 64-bit jm-vmx tests work OK on 3.3.1.  As I noted above, in order for the jm-vmx tests to build correctly with 3.4.0, I had to hack the testcases to define HAVE_ALTIVEC_H and HAS_ALTIVEC.  Then the 32-bit test passed, and the 64-bit test failed with the symptoms described in this bug.  I've been beating my head against the wall for a couple days trying to debug this.  I freely admit to not having a very good understanding of the valgrind code.  I don't know if this is the right tactic, but using the tips from README_DEVELOPERS, I've been setting debugging_translation at entry to VG_(translate) for ip addrs around the point of failure.  But the debugging output I get (register allocation code) has not been helpful -- or maybe I just don't know how to interpret it.  Julian, do you have any tips for debugging this?  Thanks.
Comment 6 Maynard Johnson 2010-11-16 16:17:41 UTC
Hi, Julian.  In an email you sent me on Oct 28, you provided a patch for this problem.  You indicated in an email the next day that you'd look at it a bit more and "add proper comments and commit it over the weekend".  I don't see any evidence it's been fixed yet upstream, so maybe it slipped through the cracks.  Thanks.
Comment 7 Maynard Johnson 2011-01-07 23:13:33 UTC
Hi, Julian. It's me again, nagging about this bug fix.  Any chance you could commit the patch soon?  Thanks.
-Maynard
Comment 8 Julian Seward 2011-01-10 18:46:07 UTC
Committed as r2073.  Sorry for the delay.
Comment 9 Florian Krohm 2014-09-01 14:16:33 UTC
(In reply to Julian Seward from comment #8)
> Committed as r2073.  Sorry for the delay.

Closing....