Bug 391148 - Unhandled AVX instruction vmovq %xmm9,%xmm1
Summary: Unhandled AVX instruction vmovq %xmm9,%xmm1
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: vex
Version: 3.13.0
Platform: Compiled Sources Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
 
Reported: 2018-02-27 07:29 UTC by takuyan
Modified: 2024-06-30 18:22 UTC


Attachments
Patch for the issue (1.05 KB, patch)
2018-08-28 13:10 UTC, Mario Salazar de Torres

Description takuyan 2018-02-27 07:29:24 UTC
I ran valgrind 3.13.0 with my AVX program and it died with this message.

vex amd64->IR: unhandled instruction bytes: 0xC5 0x79 0xD6 0xC9 0xC4 0xE3 0x7D 0x18 0xC1 0x1
vex amd64->IR:   REX=0 REX.W=0 REX.R=1 REX.X=0 REX.B=0
vex amd64->IR:   VEX=1 VEX.L=0 VEX.nVVVV=0x0 ESC=0F
vex amd64->IR:   PFX.66=1 PFX.F2=0 PFX.F3=0
==17131== valgrind: Unrecognised instruction at address 0xe888c95.

I found that the offending instruction is a vmovq between two XMM registers:

=> 0x000000000e888c95 <+213>:	vmovq  %xmm9,%xmm1


I can reproduce the problem with this code:

int main() {
    asm("vmovq %xmm9, %xmm1");
    return 0;
}

I also found that "%xmm8, %xmm0" and "%xmm15, %xmm7" kill valgrind with a similar message, but "%xmm15, %xmm8" does not.
I tried "%xmm0, %xmm8" and "%xmm8, %xmm9" as well, and valgrind handles both.

It seems valgrind cannot handle vmovq from xmm8-15 to xmm0-7.
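
For reference, the first four bytes of the dump (0xC5 0x79 0xD6 0xC9) are the failing instruction; the bytes after them belong to the next instruction. The sketch below decodes those four bytes by hand, assuming the standard 2-byte VEX layout described in the Intel SDM. The struct and function names are made up for illustration and are not Valgrind's.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a 2-byte-VEX (0xC5) register-to-register instruction.
   Illustrative names only; this is not Valgrind's decoder. */
typedef struct {
    unsigned rex_r;   /* REX.R equivalent (stored inverted in the VEX byte) */
    unsigned nvvvv;   /* VEX.vvvv after un-inverting; 0 when unused */
    unsigned vex_l;   /* 0 = 128-bit operation */
    unsigned pp;      /* implied prefix: 0 = none, 1 = 66, 2 = F3, 3 = F2 */
    unsigned opcode;  /* opcode byte in the 0F map */
    unsigned reg;     /* ModRM.reg with REX.R applied (0..15) */
    unsigned rm;      /* ModRM.rm; 2-byte VEX has no REX.B, so 0..7 */
} Vex2Insn;

static Vex2Insn decode_vex2(const uint8_t b[4])
{
    Vex2Insn d;

    assert(b[0] == 0xC5);        /* 2-byte VEX escape */
    assert((b[3] >> 6) == 3);    /* ModRM.mod == 11: register operand */

    d.rex_r  = ((b[1] >> 7) & 1) ^ 1;
    d.nvvvv  = (~(b[1] >> 3)) & 0xF;
    d.vex_l  = (b[1] >> 2) & 1;
    d.pp     = b[1] & 3;
    d.opcode = b[2];
    d.reg    = ((b[3] >> 3) & 7) | (d.rex_r << 3);
    d.rm     = b[3] & 7;
    return d;
}
```

For the bytes in this report the result is REX.R=1, VEX.L=0, nVVVV=0 and pp=1 (the 66 prefix), which agrees with the "vex amd64->IR" lines above, together with opcode 0xD6, reg=9 (xmm9) and rm=1 (xmm1).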
Comment 1 apostolos 2018-05-18 09:57:24 UTC
The same issue has existed since at least version 3.11.0 (on Ubuntu 16.04.4 LTS).
Comment 2 Mario Salazar de Torres 2018-08-28 12:59:32 UTC
Indeed, in Valgrind 3.13 a vmovq xmm[8-15], xmm[0-7] instruction triggers the error described above.
Comment 3 Mario Salazar de Torres 2018-08-28 13:10:34 UTC
Created attachment 114656
Patch for the issue

While tracking down the issue, I noticed that the hex disassembly of vmovq xmm[8-15], xmm[0-7] matches the following pattern: 0xC5 0x79 0xD6 0xC[0-7]

The last digit identifies the source register, namely xmm[value+8].

After analyzing the problem, I concluded that it arises because the VEX standard was extended to add new XMM registers (xmm8 to xmm15). Since the Intel standard defines only 3 bits each for the source and the destination register, a new VEX opcode had to be added in order to operate on the new registers.

Whenever you perform a vmovq xmm[0-7], xmm[0-7] instruction the opcode is 0x7E, but if you perform vmovq xmm[8-15], xmm[0-7] the opcode is 0xD6.

The thing is, the D6 VEX opcode was apparently already in use, but not with an XMM register as the source, so that case was left unhandled in the code because there was no test case for it at the time.

So the solution is as simple as implementing this specific case, adding 8 to the source register.

I attach a patch for the issue. It would be great if you could tell me whether it is consistent. So far the patch is working for me.
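
A slightly different way to read these encodings, going by the two register forms listed in the Intel SDM: F3 0F 7E /r puts the destination in ModRM.reg and the source in ModRM.rm, while 66 0F D6 /r puts the source in ModRM.reg and the destination in ModRM.rm. The 2-byte VEX prefix (0xC5) can extend only ModRM.reg to registers 8-15, so an assembler that prefers the shorter prefix would pick D6 exactly when the source is xmm8-15 and the destination is xmm0-7. The sketch below encodes that rule. The selection policy is an assumption inferred from the bytes in this report, not something the report states, and encode_vmovq_xmm is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Encode "vmovq %xmm<src>, %xmm<dst>" (AT&T operand order) into out[].
   Returns the number of bytes written (4 or 5).
   Hypothetical helper; the form-selection rule is an assumption. */
static int encode_vmovq_xmm(unsigned src, unsigned dst, uint8_t out[5])
{
    assert(src < 16 && dst < 16);

    /* Load form  F3 0F 7E /r: ModRM.reg = dst, ModRM.rm = src.
       Store form 66 0F D6 /r: ModRM.reg = src, ModRM.rm = dst.
       Use the store form only when it allows the 2-byte prefix. */
    int      use_store = (src >= 8 && dst < 8);
    unsigned reg   = use_store ? src : dst;
    unsigned rm    = use_store ? dst : src;
    unsigned pp    = use_store ? 1 : 2;          /* 1 = 66, 2 = F3 */
    uint8_t  opc   = use_store ? 0xD6 : 0x7E;
    uint8_t  modrm = (uint8_t)(0xC0 | ((reg & 7) << 3) | (rm & 7));
    unsigned not_r = (reg < 8);                  /* VEX stores R inverted */
    unsigned not_b = (rm < 8);                   /* VEX stores B inverted */

    if (rm < 8) {
        /* 2-byte VEX: ~R, vvvv = 1111 (unused), L = 0, pp */
        out[0] = 0xC5;
        out[1] = (uint8_t)((not_r << 7) | (0xF << 3) | pp);
        out[2] = opc;
        out[3] = modrm;
        return 4;
    }

    /* 3-byte VEX: ~R ~X ~B, map 00001 (0F); then W = 0, vvvv = 1111, L = 0, pp */
    out[0] = 0xC4;
    out[1] = (uint8_t)((not_r << 7) | (1u << 6) | (not_b << 5) | 0x01);
    out[2] = (uint8_t)((0xF << 3) | pp);
    out[3] = opc;
    out[4] = modrm;
    return 5;
}
```

Under this rule, (src, dst) = (9, 1) produces 0xC5 0x79 0xD6 0xC9, the bytes from the description. The pairs reported as failing, (8, 0) and (15, 7), also select opcode D6, while the pairs reported as working, (15, 8), (0, 8) and (8, 9), select 7E. In this reading the "+8" on the source register comes from the inverted R bit in the VEX prefix rather than from the opcode itself.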
Comment 4 Mario Salazar de Torres 2018-08-30 11:31:03 UTC
Comment on attachment 114656
Patch for the issue

>diff --git a/VEX/priv/guest_amd64_toIR.c b/VEX/priv/guest_amd64_toIR.c
>index 9073e1d..9229e53 100644
>--- a/VEX/priv/guest_amd64_toIR.c
>+++ b/VEX/priv/guest_amd64_toIR.c
>@@ -26876,15 +26876,19 @@ Long dis_ESC_0F__VEX (
>          UChar modrm = getUChar(delta);
>          UInt  rG    = gregOfRexRM(pfx,modrm);
>          if (epartIsReg(modrm)) {
>-            /* fall through, awaiting test case */
>-            /* dst: lo half copied, hi half zeroed */
>+            // In this case is VEX.128.66.0F.WIG D6 /r = VMOVQ xmm8-15/m64, xmm0-7
>+            UInt rE = eregOfRexRM(pfx,modrm) + 8;
>+            DIP("vmovq %s,%s\n", nameXMMReg(rG), nameIReg64(rE));
>+            putIReg64(rE, getXMMRegLane64(rG, 0));
>+            delta += 1;
>          } else {
>             addr = disAMode ( &alen, vbi, pfx, delta, dis_buf, 0 );
>             storeLE( mkexpr(addr), getXMMRegLane64( rG, 0 ));
>             DIP("vmovq %s,%s\n", nameXMMReg(rG), dis_buf );
>             delta += alen;
>-            goto decode_success;
>          }
>+
>+         goto decode_success;
>       }
>       break;
>
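
The comment removed by the patch ("dst: lo half copied, hi half zeroed") describes the intended register-to-register behaviour. As a rough model of that behaviour, independent of Valgrind's IR, the sketch below treats each vector register as four 64-bit lanes. It assumes the usual rule that a VEX.128-encoded instruction also clears bits 255:128 of the destination, and that the destination of the register form is itself an XMM register. YmmReg and model_vmovq_reg are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Toy register file: lane[0] holds bits 63:0, lane[3] holds bits 255:192.
   Hypothetical model, not Valgrind's guest state. */
typedef struct { uint64_t lane[4]; } YmmReg;

/* Model of "vmovq %xmm<src>, %xmm<dst>": copy the low 64 bits of src
   into dst and zero every other bit of dst. */
static void model_vmovq_reg(YmmReg regs[16], unsigned src, unsigned dst)
{
    assert(src < 16 && dst < 16);

    uint64_t lo = regs[src].lane[0];   /* read first: src may equal dst */
    regs[dst].lane[0] = lo;            /* low half copied */
    regs[dst].lane[1] = 0;             /* bits 127:64 zeroed */
    regs[dst].lane[2] = 0;             /* VEX encoding: bits 255:128 zeroed */
    regs[dst].lane[3] = 0;
}
```

A test case for the register variant would be expected to observe exactly this: the source register unchanged, and the destination holding the source's low quadword with all higher bits clear.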
Comment 5 Mark Wielaard 2024-06-30 18:22:03 UTC
commit 10a22445d747817932692b1c1ee3faa726121cb4
Author: Mark Wielaard <mark@klomp.org>
Date:   Sun Jun 30 20:17:32 2024 +0200

    Implement VMOVQ xmm1, xmm2/m64
    
    We implemented the memory variant already, but not the reg variant.
    Add a separate avx-vmovq testcase, because avx-1 is already really big.
    
    https://bugs.kde.org/show_bug.cgi?id=391148
    https://bugs.kde.org/show_bug.cgi?id=417572
    https://bugs.kde.org/show_bug.cgi?id=489088