Bug 476662 - vex amd64->IR: unhandled instruction bytes: 0x66 0x9D (popf)
Status: CONFIRMED
Alias: None
Product: valgrind
Classification: Developer tools
Component: memcheck (other bugs)
Version First Reported In: 3.21.0
Platform: Ubuntu Other
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
Reported: 2023-11-07 10:36 UTC by Tiago Martinho
Modified: 2023-11-09 14:57 UTC

Description Tiago Martinho 2023-11-07 10:36:52 UTC
SUMMARY
Hi there!

I hope this message finds you well. Valgrind recently aborted my program with SIGILL while it was executing the assembly instruction popfw (bytes 0x66 0x9D), and I'm not sure how to resolve it.

Here's the crash message I received:

--------------------------------------------------------------------------------

==4353== Memcheck, a memory error detector
==4353== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==4353== Using Valgrind-3.21.0 and LibVEX; rerun with -h for copyright info
==4353== Command: ./a.out
==4353== 
vex amd64->IR: unhandled instruction bytes: 0x66 0x9D 0x44 0x19 0xC3 0xB8 0x0 0x0 0x0 0x0
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR:   PFX.66=1 PFX.F2=0 PFX.F3=0
==4353== valgrind: Unrecognised instruction at address 0x109133.
==4353==    at 0x109133: main (test.cpp:4)
==4353== Your program just tried to execute an instruction that Valgrind
==4353== did not recognise.  There are two possible reasons for this.
==4353== 1. Your program has a bug and erroneously jumped to a non-code
==4353==    location.  If you are running Memcheck and you just saw a
==4353==    warning about a bad jump, it's probably your program's fault.
==4353== 2. The instruction is legitimate but Valgrind doesn't handle it,
==4353==    i.e. it's Valgrind's fault.  If you think this is the case or
==4353==    you are not sure, please let us know and we'll try to fix it.
==4353== Either way, Valgrind will now raise a SIGILL signal which will
==4353== probably kill your program.
==4353== 
==4353== Process terminating with default action of signal 4 (SIGILL): dumping core
==4353==  Illegal opcode at address 0x109133
==4353==    at 0x109133: main (test.cpp:4)
==4353== 
==4353== HEAP SUMMARY:
==4353==     in use at exit: 0 bytes in 0 blocks
==4353==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==4353== 
==4353== All heap blocks were freed -- no leaks are possible
==4353== 
==4353== For lists of detected and suppressed errors, rerun with: -s
==4353== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Illegal instruction

--------------------------------------------------------------------------------

After some research online, I found a Valgrind changelog entry that mentions the implementation of this instruction (https://valgrind.org/docs/manual/dist.news.old.html release 3.3.1). The plain popf instruction (0x9D) works fine, but popfw (0x66 0x9D, i.e. popf with the operand-size prefix) is reported as unhandled.

For reference, here's a disassembly of the code I was running:

--------------------------------------------------------------------------------

GNU gdb (Ubuntu 14.0.50.20230907-0ubuntu1) 14.0.50.20230907-git
Reading symbols from ./a.out...
(gdb) disassemble/r main
Dump of assembler code for function main():
0x0000000000001129 <+0>:	f3 0f 1e fa        	endbr64
0x000000000000112d <+4>:	55                 	push   %rbp
0x000000000000112e <+5>:	48 89 e5           	mov    %rsp,%rbp
0x0000000000001131 <+8>:	66 50              	push   %ax
0x0000000000001133 <+10>:	9d                 	popf
0x0000000000001134 <+11>:	9d                 	popf
0x0000000000001135 <+12>:	66 9d              	popfw
0x0000000000001137 <+14>:	44 19 c3           	sbb    %r8d,%ebx
0x000000000000113a <+17>:	b8 00 00 00 00     	mov    $0x0,%eax
0x000000000000113f <+22>:	5d                 	pop    %rbp
0x0000000000001140 <+23>:	c3                 	ret
End of assembler dump.
(gdb) quit

--------------------------------------------------------------------------------

I'm currently using Valgrind version 3.21.0 on an Ubuntu Docker container:

root@XXXXXX:/# valgrind --version
valgrind-3.21.0

I should mention that I'm not an expert in Valgrind, C, or assembly, so please forgive me if I've made any mistakes or overlooked something.

I would really appreciate any help or guidance you can provide. Thanks so much in advance for your time and assistance!

Warm regards,
Tiago Martinho


STEPS TO REPRODUCE
1. Compile a program that contains the "popfw" instruction, for example:
int main()
{
    asm("popfw");
}
2. Run Valgrind on the resulting binary.

OBSERVED RESULT
Valgrind reports an unrecognised instruction and the program is killed with SIGILL.

EXPECTED RESULT
Valgrind does not crash and does not report popfw as an unrecognised instruction.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: Ubuntu 22.04

ADDITIONAL INFORMATION
Comment 1 Mark Wielaard 2023-11-07 13:29:41 UTC
This looks like the size == 2 issue in VEX/priv/guest_amd64_toIR.c (dis_ESC_NONE):

   case 0x9D: /* POPF */
      /* Note.  There is no encoding for a 32-bit popf in 64-bit mode.
         So sz==4 actually means sz==8. */
      if (haveF2orF3(pfx)) goto decode_failure;
      vassert(sz == 2 || sz == 4 || sz == 8);
      if (sz == 4) sz = 8;
      if (sz != 8) goto decode_failure; // until we know a sz==2 test case exists

So this is a case where sz == 2. The question is whether it is a valid example.
Comment 2 Mark Wielaard 2023-11-08 10:36:39 UTC
Is this only an issue with this hand assembly?
It would be interesting to see real code that uses this.
Comment 3 Tiago Martinho 2023-11-08 14:57:05 UTC
(In reply to Mark Wielaard from comment #2)
> Is this only an issue with this hand assembly?
> It would be interesting to see real code that uses this.

Hi! I got this code from a library I depend on. Unfortunately I do not have access to the source code, but the disassembled code does have this instruction. I tried to give an example so that the issue could be reproduced.

Thanks!
Comment 4 Paul Floyd 2023-11-08 16:45:30 UTC
What is the library? Is it public?
Comment 5 Tiago Martinho 2023-11-09 14:57:15 UTC
(In reply to Paul Floyd from comment #4)
> What is the library? Is it public?

Unfortunately it's not a public library and I do not have access to the source code.