Created attachment 159661 [details] Example program containing the RORX instruction SUMMARY A program with the "RORX imm8, m32/64, r32/64" instruction will crash Valgrind with an off by one between the assumed next %rip and the actual next %rip. STEPS TO REPRODUCE 1. Assemble the example program which uses the RORX instruction 2. Run with Valgrind OBSERVED RESULT $ ./vg-in-place -v --tool=none ./../mwe ==2174== Nulgrind, the minimal Valgrind tool ==2174== Copyright (C) 2002-2017, and GNU GPL'd, by Nicholas Nethercote. ==2174== Using Valgrind-3.22.0.GIT-3df8a00a4e-20230609 and LibVEX; rerun with -h for copyright info ==2174== Command: ./../mwe ==2174== --2174-- Valgrind options: --2174-- -v --2174-- --tool=none --2174-- Contents of /proc/version: --2174-- Linux version 6.3.5_1 (voidlinux@voidlinux) (gcc (GCC) 12.2.0, GNU ld (GNU Binutils) 2.39) #1 SMP PREEMPT_DYNAMIC Wed May 31 02:29:35 UTC 2023 --2174-- --2174-- Arch and hwcaps: AMD64, LittleEndian, amd64-cx16-lzcnt-rdtscp-sse3-ssse3-avx-avx2-bmi-f16c-rdrand-rdseed --2174-- Page sizes: currently 4096, max supported 4096 --2174-- Valgrind library directory: /tmp/tmp/valgrind/./.in_place --2174-- Reading syms from /tmp/tmp/valgrind/none/none-amd64-linux --2174-- object doesn't have a dynamic symbol table --2174-- Scheduler: using generic scheduler lock implementation. ==2174== embedded gdbserver: reading from /tmp/vgdb-pipe-from-vgdb-to-2174-by-redoste-on-??? ==2174== embedded gdbserver: writing to /tmp/vgdb-pipe-to-vgdb-from-2174-by-redoste-on-??? ==2174== embedded gdbserver: shared mem /tmp/vgdb-pipe-shared-mem-vgdb-2174-by-redoste-on-??? ==2174== ==2174== TO CONTROL THIS PROCESS USING vgdb (which you probably ==2174== don't want to do, unless you know exactly what you're doing, ==2174== or are doing some strange experiment): ==2174== /tmp/tmp/valgrind/./.in_place/../../bin/vgdb --pid=2174 ...command... ==2174== ==2174== TO DEBUG THIS PROCESS USING GDB: start GDB like this ==2174== /path/to/gdb ./../mwe ==2174== and then give GDB the following command ==2174== target remote | /tmp/tmp/valgrind/./.in_place/../../bin/vgdb --pid=2174 ==2174== --pid is optional if only one valgrind process is running ==2174== assumed next %rip = 0x401009 actual next %rip = 0x40100A vex: the `impossible' happened: disInstr_AMD64: disInstr miscalculated next %rip vex storage: T total 0 bytes allocated vex storage: P total 0 bytes allocated valgrind: the 'impossible' happened: LibVEX called failure_exit(). host stacktrace: ==2174== at 0x5802873A: show_sched_status_wrk (m_libcassert.c:406) ==2174== by 0x58028857: report_and_quit (m_libcassert.c:477) ==2174== by 0x58028AAB: vgPlain_core_panic_at (m_libcassert.c:553) ==2174== by 0x58028ACA: vgPlain_core_panic (m_libcassert.c:563) ==2174== by 0x5803EC24: failure_exit (m_translate.c:761) ==2174== by 0x580F319A: vpanic (main_util.c:253) ==2174== by 0x581801CE: disInstr_AMD64 (guest_amd64_toIR.c:32698) ==2174== by 0x5810B759: disassemble_basic_block_till_stop.constprop.0 (guest_generic_bb_to_IR.c:956) ==2174== by 0x5810C2F4: bb_to_IR (guest_generic_bb_to_IR.c:1365) ==2174== by 0x580F0252: LibVEX_FrontEnd (main_main.c:583) ==2174== by 0x580F0BAC: LibVEX_Translate (main_main.c:1235) ==2174== by 0x580414AB: vgPlain_translate (m_translate.c:1831) ==2174== by 0x580131B4: vgPlain_scheduler (scheduler.c:1136) ==2174== by 0x5807EF19: run_a_thread_NORETURN (syswrap-linux.c:102) sched status: running_tid=1 Thread 1: status = VgTs_Runnable (lwpid 2174) ==2174== at 0x401000: ??? (in /tmp/tmp/mwe) client stack range: [0x1FFEFFF000 0x1FFF000FFF] client SP: 0x1FFF0000C0 valgrind stack range: [0x100268E000 0x100278DFFF] top usage: 5960 of 1048576 EXPECTED RESULT $ ./vg-in-place -v --tool=none ./../mwe ==2890== Nulgrind, the minimal Valgrind tool ==2890== Copyright (C) 2002-2017, and GNU GPL'd, by Nicholas Nethercote. ==2890== Using Valgrind-3.22.0.GIT-3df8a00a4e-20230609X and LibVEX; rerun with -h for copyright info ==2890== Command: ./../mwe ==2890== --2890-- Valgrind options: --2890-- -v --2890-- --tool=none --2890-- Contents of /proc/version: --2890-- Linux version 6.3.5_1 (voidlinux@voidlinux) (gcc (GCC) 12.2.0, GNU ld (GNU Binutils) 2.39) #1 SMP PREEMPT_DYNAMIC Wed May 31 02:29:35 UTC 2023 --2890-- --2890-- Arch and hwcaps: AMD64, LittleEndian, amd64-cx16-lzcnt-rdtscp-sse3-ssse3-avx-avx2-bmi-f16c-rdrand-rdseed --2890-- Page sizes: currently 4096, max supported 4096 --2890-- Valgrind library directory: /tmp/tmp/valgrind/./.in_place --2890-- Reading syms from /tmp/tmp/valgrind/none/none-amd64-linux --2890-- object doesn't have a dynamic symbol table --2890-- Scheduler: using generic scheduler lock implementation. ==2890== embedded gdbserver: reading from /tmp/vgdb-pipe-from-vgdb-to-2890-by-redoste-on-??? ==2890== embedded gdbserver: writing to /tmp/vgdb-pipe-to-vgdb-from-2890-by-redoste-on-??? ==2890== embedded gdbserver: shared mem /tmp/vgdb-pipe-shared-mem-vgdb-2890-by-redoste-on-??? ==2890== ==2890== TO CONTROL THIS PROCESS USING vgdb (which you probably ==2890== don't want to do, unless you know exactly what you're doing, ==2890== or are doing some strange experiment): ==2890== /tmp/tmp/valgrind/./.in_place/../../bin/vgdb --pid=2890 ...command... ==2890== ==2890== TO DEBUG THIS PROCESS USING GDB: start GDB like this ==2890== /path/to/gdb ./../mwe ==2890== and then give GDB the following command ==2890== target remote | /tmp/tmp/valgrind/./.in_place/../../bin/vgdb --pid=2890 ==2890== --pid is optional if only one valgrind process is running ==2890== ==2890== SOFTWARE/OS VERSIONS $ uname -a Linux hostname 6.3.5_1 #1 SMP PREEMPT_DYNAMIC Wed May 31 02:29:35 UTC 2023 x86_64 GNU/Linux ADDITIONAL INFORMATION I'm not completely sure about the fix but a patch for VEX/priv/guest_amd64_toIR.c correcting the off by one on RORX disassembly is attached.
Created attachment 159662 [details] Example patch for guest_amd64_toIR
Created attachment 165688 [details] Show more context (rip and instr bytes) for this panic case
Created attachment 165690 [details] Add rorx rip-relative testcase to bmi.c Adding these instructions to the bmi testcase: 1abe: c4 e3 fb f0 0d 60 45 rorx $0x43,0x4560(%rip),%rcx # 6028 <g_ulong_arg> 1ac5: 00 00 43 1bf9: c4 e3 7b f0 0d 2d 44 rorx $0x43,0x442d(%rip),%ecx # 6030 <g_uint_arg> 1c00: 00 00 43
I tested the attachment "Example patch for guest_amd64_toIR" https://bugs.kde.org/attachment.cgi?id=159662 It perfectly fixes the problem. Without fix (but extended context-printing) the extended bmi testcase fails like this: shrx32 0000000000000000 0000000000000000 -> 0000000000000000 current %rip = 0x109ABE assumed next %rip = 0x109AC7 actual next %rip = 0x109AC8 instruction bytes: 0xC4 0xE3 0xFB 0xF0 0xD 0x60 0x45 0x0 0x0 0x43 vex: the `impossible' happened: disInstr_AMD64: disInstr miscalculated next %rip vex storage: T total 171126616 bytes allocated vex storage: P total 512 bytes allocated valgrind: the 'impossible' happened: LibVEX called failure_exit(). host stacktrace: ==20396== at 0x5804383A: show_sched_status_wrk (m_libcassert.c:407) ==20396== by 0x58043957: report_and_quit (m_libcassert.c:478) ==20396== by 0x58043BAB: panic (m_libcassert.c:554) ==20396== by 0x58043BAB: vgPlain_core_panic_at (m_libcassert.c:559) ==20396== by 0x58043BCA: vgPlain_core_panic (m_libcassert.c:564) ==20396== by 0x58058034: failure_exit (m_translate.c:761) ==20396== by 0x5813068A: vpanic (main_util.c:253) ==20396== by 0x581BBDDD: disInstr_AMD64 (guest_amd64_toIR.c:32714) ==20396== by 0x58148E76: disassemble_basic_block_till_stop.constprop.0 (guest_generic_bb_to_IR.c:956) ==20396== by 0x5814965C: bb_to_IR (guest_generic_bb_to_IR.c:1365) ==20396== by 0x5812D6AF: LibVEX_FrontEnd (main_main.c:583) ==20396== by 0x5812E00C: LibVEX_Translate (main_main.c:1235) ==20396== by 0x5805A791: vgPlain_translate (m_translate.c:1831) ==20396== by 0x58097F3B: handle_chain_me (scheduler.c:1164) ==20396== by 0x5809A42B: vgPlain_scheduler (scheduler.c:1531) ==20396== by 0x580E5569: thread_wrapper (syswrap-linux.c:102) ==20396== by 0x580E5569: run_a_thread_NORETURN (syswrap-linux.c:155) sched status: running_tid=1 Thread 1: status = VgTs_Runnable (lwpid 20396) ==20396== at 0x1099FC: do_rorx64 (bmi.c:379) ==20396== by 0x10AEDC: main (bmi.c:1012) client stack range: [0x1FFEFFD000 0x1FFF000FFF] client SP: 0x1FFEFFF258 valgrind stack range: [0x1002DEB000 0x1002EEAFFF] top usage: 10960 of 1048576
Created attachment 165691 [details] Show more context (rip, instr bytes, vex-disasm trace) for this panic case Even more context for the error case (idea how to rerun disInstr_AMD64_WRK taken from the expect_CAS case below): current %rip = 0x109ABE assumed next %rip = 0x109AC7 actual next %rip = 0x109AC8 instruction bytes: 0xC4 0xE3 0xFB 0xF0 0xD 0x60 0x45 0x0 0x0 0x43 0x109ABE: rorx 67,17760(%rip),%rcx vex: the `impossible' happened: disInstr_AMD64: disInstr miscalculated next %rip
The patches look straightforward. It'll take me a little while to write a testcase and feature test.
Didn't look at patch 3. Even better.
Thanks for contributing the patches! commit 7aab480167b4480dbd0aae7009184e1f2b36cc28 Author: Matthias Schwarzott <zzam@gentoo.org> Date: Fri Feb 9 08:04:51 2024 +0100 Bug 471036 - Add test for rorx with rip-relative address commit 0e223c150c3385956e10d08b7f5bfb5a2ace9cd0 Author: Matthias Schwarzott <zzam@gentoo.org> Date: Fri Feb 9 08:22:58 2024 +0100 Bug 471036 - Print more context for amd64 disasm rip mismatch commit dc5316bc70059e4e0e127395e06c71e1d20936e0 Author: Paul Floyd <pjfloyd@wanadoo.fr> Date: Sat Feb 10 08:19:20 2024 +0100 Bug 471036 - disInstr_AMD64: disInstr miscalculated next %rip on RORX imm8, m32/64, r32/6 Patch contributed by redoste@redoste.xyz