Bug 499183

Summary: FreeBSD: differences in avx-vmovq output
Product: [Developer tools] valgrind Reporter: Paul Floyd <pjfloyd>
Component: general Assignee: Paul Floyd <pjfloyd>
Status: RESOLVED FIXED    
Severity: normal    
Priority: NOR    
Version First Reported In: 3.24 GIT   
Target Milestone: ---   
Platform: FreeBSD Ports   
OS: FreeBSD   
Latest Commit: Version Fixed/Implemented In:
Sentry Crash Report:
Attachments: Linux gcc objdump

Description Paul Floyd 2025-01-27 07:07:39 UTC
Created attachment 177708
Linux gcc objdump

I don't think that this is a Valgrind issue. I get the same diff when not running under Valgrind.
Comment 1 Paul Floyd 2025-01-27 07:41:18 UTC
paulf@green:~/test/valgrind/none/tests/amd64 $ cat avx-vmovq.stdout.diff
--- avx-vmovq.stdout.exp        2025-01-26 16:09:27.689155000 +0100
+++ avx-vmovq.stdout.out        2025-01-26 22:39:32.558711000 +0100
@@ -87,7 +87,7 @@
     c1fbfd8f4d8698c2.cb9dfb4ea5d18713.6489eab2c96df363.d52c4330a7aae391
     9d8e66ea90352a18
   after
-    0000000000000000.0000000000000000.0000000000000000.2525252525252525
+    0000000000000000.0000000000000000.0000000000000000.0000003000000008
     22cf5e4cfad1bdf5.8de2b4a9d799ff5f.0c05cb6ebd128663.d7568e3e8a3ac80e
     4288ae612c0dad40.f0733f448390351b.80ddba7e53e42d12.3208cf9b04b0569c
     c1fbfd8f4d8698c2.cb9dfb4ea5d18713.6489eab2c96df363.d52c4330a7aae391

There are 5 other diffs. No diffs for VMOVQ_XMM_to_XMM_LOW_HIGH, just VMOVQ_XMM_to_XMM_LOW_LOW_HIGH.

I need to single step through test_VMOVQ_XMM_to_XMM_LOW_LOW_HIGH to see what is happening.
Comment 2 Paul Floyd 2025-01-28 15:16:03 UTC
I think that I see the problem.

GEN_test_RandM(VMOVQ_XMM_to_XMM_LOW_LOW_HIGH,
               "vmovq %%xmm0, %%xmm7; vmovq %%xmm8, %%xmm0",
               "vmovq %%xmm0, (%%rsi); vmovq %%xmm9, %%xmm0")

I think that the intent here is to use xmm0 as a temporary to copy xmm7 to xmm8 and *rsi to xmm9. But the order of the registers is wrong.

For the 'reg' part
xmm0 (containing junk) gets copied to xmm7
xmm8 gets copied to xmm0 (and then not used in the output). The result is that the top 48 hex digits of the first line of the block contain zeros (as expected) but the bottom 16 hex digits contain junk.

Same sort of thing for the 'mem' part, but this time, because xmm0 got filled with something from 'block', it's no longer random and the results are deterministic.
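
To make the operand-order problem concrete, here is a rough model in plain C. It is only a sketch, not the test harness and not real AVX code: ymm_t, vmovq(), old_reg_sequence(), new_reg_sequence() and check_model() are hypothetical names, and the two "junk" constants are just the two low quadwords from the diff above, used as stand-ins for whatever xmm0 happens to hold on entry. The model assumes the register form of vmovq copies the low 64 bits of the source and zeroes the rest of the destination, with AT&T operand order (source first).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 256-bit YMM register: four 64-bit lanes,
   lane[0] holding the low 64 bits. */
typedef struct { uint64_t lane[4]; } ymm_t;

/* Models register-to-register "vmovq src, dst" in AT&T operand order
   (source first): dst receives the low 64 bits of src and its upper
   lanes are zeroed. */
static void vmovq(const ymm_t *src, ymm_t *dst)
{
    uint64_t low = src->lane[0];
    memset(dst, 0, sizeof *dst);
    dst->lane[0] = low;
}

/* Original 'reg' sequence: "vmovq %xmm0, %xmm7; vmovq %xmm8, %xmm0". */
static void old_reg_sequence(ymm_t *xmm0, ymm_t *xmm7, ymm_t *xmm8)
{
    vmovq(xmm0, xmm7);  /* uninitialised scratch overwrites xmm7 */
    vmovq(xmm8, xmm0);  /* xmm8 lands in scratch, which is never reported */
}

/* Reordered 'reg' sequence: "vmovq %xmm7, %xmm0; vmovq %xmm0, %xmm8". */
static void new_reg_sequence(ymm_t *xmm0, ymm_t *xmm7, ymm_t *xmm8)
{
    vmovq(xmm7, xmm0);  /* scratch is written before it is read */
    vmovq(xmm0, xmm8);  /* low half of xmm7 ends up in xmm8 */
}

/* Runs both sequences with two different initial xmm0 values and checks
   which reported register depends on that initial value. */
void check_model(void)
{
    /* Illustrative stand-ins for the unknown initial contents of xmm0. */
    const uint64_t junk[2] = { 0x2525252525252525ULL, 0x0000003000000008ULL };
    uint64_t old_low[2], new_low[2];

    for (int i = 0; i < 2; i++) {
        ymm_t a0 = {{ junk[i], 0, 0, 0 }};
        ymm_t a7 = {{ 0x1111, 0x2222, 0x3333, 0x4444 }};
        ymm_t a8 = {{ 0x5555, 0x6666, 0x7777, 0x8888 }};
        ymm_t b0 = a0, b7 = a7, b8 = a8;

        old_reg_sequence(&a0, &a7, &a8);
        old_low[i] = a7.lane[0];      /* low quadword of reported xmm7 */

        new_reg_sequence(&b0, &b7, &b8);
        new_low[i] = b8.lane[0];      /* low quadword of reported xmm8 */
    }

    assert(old_low[0] != old_low[1]); /* old order: result follows the junk */
    assert(new_low[0] == new_low[1]); /* new order: independent of the junk */
    assert(new_low[0] == 0x1111);     /* and equal to the low half of xmm7 */
}
```

Calling check_model() from a main() passes all three assertions in this model: with the old order the reported low quadword tracks the initial xmm0, while with the reordered sequence it is always the low quadword of xmm7. The 'mem' variant would follow the same pattern with (%rsi) in place of xmm7.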

This should fix it:
diff --git a/none/tests/amd64/avx-vmovq.c b/none/tests/amd64/avx-vmovq.c
index da8a1959b..3512aa53b 100644
--- a/none/tests/amd64/avx-vmovq.c
+++ b/none/tests/amd64/avx-vmovq.c
@@ -6,8 +6,8 @@ GEN_test_RandM(VMOVQ_XMM_to_XMM_LOW_HIGH,
 
 // xmm0 is scratch
 GEN_test_RandM(VMOVQ_XMM_to_XMM_LOW_LOW_HIGH,
-               "vmovq %%xmm0, %%xmm7; vmovq %%xmm8, %%xmm0",
-               "vmovq %%xmm0, (%%rsi); vmovq %%xmm9, %%xmm0")
+               "vmovq %%xmm7, %%xmm0; vmovq %%xmm0, %%xmm8",
+               "vmovq (%%rsi), %%xmm0; vmovq %%xmm0, %%xmm9")
 
 int main ( void )
 {
Comment 3 Paul Floyd 2025-01-28 18:34:11 UTC
commit a005226db8d59dd5c3b35f1011967891c71d9b28 (HEAD -> master, origin/master, origin/HEAD)
Author: Paul Floyd <pjfloyd@wanadoo.fr>
Date:   Tue Jan 28 19:25:52 2025 +0100

    Bug499183 - FreeBSD: differences in avx-vmovq output