Summary: | inconsistent RDTSCP support on x86_64 | ||
---|---|---|---|
Product: | [Developer tools] valgrind | Reporter: | bugzilla |
Component: | vex | Assignee: | Julian Seward <jseward> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | giecrilj, lukasz.wojnilowicz, philippe.waroquiers, pjfloyd, tom |
Priority: | NOR | ||
Version First Reported In: | 3.12.0 | ||
Target Milestone: | --- | ||
Platform: | RedHat Enterprise Linux | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: | |||
Attachments: |
g++ source for testing RDTSCP support
modified with RDTCSP in separate non-inlined function
valgrind -v --tool=memcheck ./rdtscp2
/proc/cpuinfo of Intel Core2Duo
[PATCH] Don't use SSE4.2 on Core2Duo |
As far as I can see RDTSCP was implemented in VEX r2701 for BZ#251569. Are you trying to use it in 32 bit code?

No, this is on a 12-core 64-bit system, apparently running under libvirt.

/etc/redhat-release = CentOS release 6.6
/proc/cpuinfo =
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 13
model name : QEMU Virtual CPU version (cpu64-rhel6)
stepping : 3
microcode : 1
cpu MHz : 2933.436
cache size : 4096 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 4
wp : yes
flags : fpu de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pse36 clflush mmx fxsr sse sse2 syscall nx lm unfair_spinlock pni cx16 hypervisor lahf_lm
bogomips : 5866.87
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
...etc.

Sure, but is the program compiled as 64 bit or 32 bit? It's using 32 bit register names in the assembly but that might be normal for RDTSCP which is why I asked how you were compiling it.

Ah sorry I misunderstood your original report...

You're saying that valgrind aborts on the instruction even though you don't try and execute it. My guess is that it's happening because that will be reported at translation time and valgrind translates instructions in blocks so it may translate an instruction that never gets executed.

(In reply to Tom Hughes from comment #3)
> Sure, but is the program compiled as 64 bit or 32 bit? It's using 32 bit
> register names in the assembly but that might be normal for RDTSCP which is
> why I asked how you were compiling it.

64 bit platform and tools. The register names are only specifying "clobbers" to the assembler template. The 'e' prefix for CPUID is appropriate (CPUID clobbers ecx). The 'r' prefix on a register name indicates 64 bit.

(In reply to Tom Hughes from comment #4)
> Ah sorry I misunderstood your original report...
>
> You're saying that valgrind aborts on the instruction even though you don't
> try and execute it. My guess is that it's happening because that will be
> reported at translation time and valgrind translates instructions in blocks
> so it may translate an instruction that never gets executed.

This is contrary to how the processor works. A program can have potentially any number of regions in the code segment that do not contain valid opcodes and are never executed (despite routinely making their way into the processor's prefetch/decode queue). An illegal instruction exception only arises from an actual attempted execution.

But let's suppose your guess about valgrind's behavior is correct. How would one rewrite this test program to ensure that the inclusion (but not execution) of the RDTSCP opcode would not provoke this problem under valgrind?

I'm not saying it isn't a bug, just explaining what I think is causing it.

What I do know is it's not likely to be easy to fix, but it probably needs Julian to comment in more detail about whether it might be fixable and whether there is any way to work around it.

I would guess that putting the RDTSCP in a separate function from the check might work, so long as the compiler doesn't optimise them back together...

(In reply to Tom Hughes from comment #7)
> I'm not saying it isn't a bug, just explaining what I think is causing it.
>
> What I do know is it's not likely to be easy to fix, but it probably needs
> Julian to comment in more detail about whether it might be fixable and
> whether there is any way to work around it.
>
> I would guess that putting the RDTSCP in a separate function from the check
> might work, so long as the compiler doesn't optimise them back together...

In the second attachment rdtscp2.cpp, the instruction is relegated to a separate function rdtscp(), with inlining disabled. Execution proceeds through the negative flow path (the output "RDTSCP not supported" proves this), meaning we never call that function. But we still get the SIGILL from valgrind:

RDTSCP not supported
3:28pm mrec-build2.812 ~/dev% ~/bin/valgrind ./a.out
==14720== Memcheck, a memory error detector
==14720== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==14720== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==14720== Command: ./a.out
==14720==
vex amd64->IR: unhandled instruction bytes: 0xF 0x1 0xF9 0xC9 0xC3 0x55 0x48 0x89 0xE5 0x48
vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F
vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
==14720== valgrind: Unrecognised instruction at address 0x400818.
==14720== at 0x400818: rdtscp() (in /home/joev/dev/a.out)
==14720== by 0x400848: main (in /home/joev/dev/a.out)
==14720== Your program just tried to execute an instruction that Valgrind
==14720== did not recognise. There are two possible reasons for this.
==14720== 1. Your program has a bug and erroneously jumped to a non-code
==14720== location. If you are running Memcheck and you just saw a
==14720== warning about a bad jump, it's probably your program's fault.
==14720== 2. The instruction is legitimate but Valgrind doesn't handle it,
==14720== i.e. it's Valgrind's fault. If you think this is the case or
==14720== you are not sure, please let us know and we'll try to fix it.
==14720== Either way, Valgrind will now raise a SIGILL signal which will
==14720== probably kill your program.
==14720==
==14720== Process terminating with default action of signal 4 (SIGILL)
==14720== Illegal opcode at address 0x400818
==14720== at 0x400818: rdtscp() (in /home/joev/dev/a.out)
==14720== by 0x400848: main (in /home/joev/dev/a.out)
==14720==
==14720== HEAP SUMMARY:
==14720== in use at exit: 0 bytes in 0 blocks
==14720== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==14720==
==14720== All heap blocks were freed -- no leaks are possible
==14720==
==14720== For counts of detected and suppressed errors, rerun with: -v
==14720== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)
Illegal instruction

Created attachment 103244 [details]
modified with RDTCSP in separate non-inlined function
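The workaround suggested earlier in the thread (keeping the RDTSCP opcode in its own non-inlined function, separate from the support check) can be sketched as follows. This is not the attached rdtscp2.cpp, which is not reproduced in this report; the names `read_tsc_rdtscp`, `read_ticks` and `read_ticks_portable` are hypothetical, and the `steady_clock` fallback is an assumption added so the sketch has something to return when RDTSCP is not used. As the log above shows, this separation alone did not avoid the SIGILL in this report.

```cpp
#include <chrono>
#include <cstdint>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
// Hypothetical helper: the only place the RDTSCP opcode appears.
// noinline keeps the compiler from merging it into the caller.
__attribute__((noinline)) static std::uint64_t read_tsc_rdtscp() {
    std::uint32_t lo = 0, hi = 0, aux = 0;
    // RDTSCP writes EDX:EAX (timestamp) and ECX (IA32_TSC_AUX).
    __asm__ volatile("rdtscp" : "=a"(lo), "=d"(hi), "=c"(aux));
    (void)aux;
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}
#define SKETCH_HAVE_RDTSCP_PATH 1
#else
#define SKETCH_HAVE_RDTSCP_PATH 0
#endif

// Assumed fallback for CPUs (or builds) without RDTSCP.
static std::uint64_t read_ticks_portable() {
    return static_cast<std::uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
}

// Only calls into the RDTSCP function when the caller says it is supported.
std::uint64_t read_ticks(bool cpu_has_rdtscp) {
#if SKETCH_HAVE_RDTSCP_PATH
    if (cpu_has_rdtscp) {
        return read_tsc_rdtscp();
    }
#else
    (void)cpu_has_rdtscp;
#endif
    return read_ticks_portable();
}
```

The caller is expected to pass the result of its own CPUID check as `cpu_has_rdtscp`; if that check is answered inconsistently by the environment, isolating the opcode does not help.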
I guess that the problem is because VEX (somewhat) examines the cpu it is running on, to advertise to the guest program another model of cpu, chosen in a limited nr of predefined models : see guest_amd64_toIR.c handling of the CPUID instruction. I am however wondering what VEX advertises on this qemu cpu. According to the VEX code, in your case, it should advertise a basic cpu that has no RDTSCP.

Can you run
valgrind --trace-flags=10000000 --trace-notbelow=1 --tool=none cpuid|&grep -i 'dirty.*cpuid'
and see what this gives ?

I am also wondering if m_machine.c sets have_rdtscp to True. Can you also do:
valgrind --tool=none -v -v -v -d -d -d date|&grep 'arch ='

(for me, these 2 commands give:
DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_avx_and_cx16{0x3817bd10}(BBPTR)

--7119:1: main ... arch = AMD64, hwcaps = amd64-cx16-lzcnt-rdtscp-sse3-avx-avx2-bmi

I find the above bizarre: the reported arch has sse3/cx16/avx2 but the called dirty helper is amd64g_dirtyhelper_CPUID_avx_and_cx16, while I was expecting amd64g_dirtyhelper_CPUID_avx2

No that's not the problem at all. Yes we may sometimes advertise different flags from the real CPU but the issue here is that we advertise that we don't support an instruction and the client program acts on that but valgrind still tries to translate the instruction (because it is translating a whole block) and faults on translating it because it thinks it is emulating a CPU that doesn't have it.

So the issue is that valgrind is translating (and faulting on) an instruction that is never going to be executed. At least that is the conclusion I came to.

(In reply to Tom Hughes from comment #11)
> No that's not the problem at all. Yes we may sometimes advertise different
> flags from the real CPU but the issue here is that we advertise that we
> don't support an instruction

Do we ?
See below extract from comment 1: > Strangely (due to VEX simulating a different CPU stepping) if we comment out > the code that executes RDTCSP, the program under valgrind then reports the > instruction as being supported. So, I am wondering what Valgrind really detects and reports. Maybe there is something strange there (as in my case, even if my strange case is avx2 related, not rdtscp related : for me, it reports an avx2 flag, but calls a non avx2 dirty helper. (In reply to Philippe Waroquiers from comment #10) > I guess that the problem is because VEX (somewhat) examines the > cpu it is running on, to advertise to the guest program another model of > cpu, chosen in a limited nr of predefined models : see guest_amd64_toIR.c > handling of the CPUID instruction. > I am however wondering what VEX advertises on this qemu cpu. > According to the VEX code, in your case, it should advertise a basic cpu > that has no RDTSCP. > > Can you run > valgrind --trace-flags=10000000 --trace-notbelow=1 --tool=none cpuid|&grep > -i 'dirty.*cpuid' > and see what this gives ? > > I am also wondering if m_machine.c sets have_rdtscp to True. > Can you also do: > valgrind --tool=none -v -v -v -d -d -d date|&grep 'arch =' > > > (for me, these 2 commands give: > DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: > amd64g_dirtyhelper_CPUID_avx_and_cx16{0x3817bd10}(BBPTR) > > --7119:1: main ... 
arch = AMD64, hwcaps = > amd64-cx16-lzcnt-rdtscp-sse3-avx-avx2-bmi > > I find the above bizarre: the reported arch has sse3/cx16/avx2 but the > called dirty helper > is amd64g_dirtyhelper_CPUID_avx_and_cx16, while I was expecting > amd64g_dirtyhelper_CPUID_avx2 There was no cpuid utility available on our host, so we substituted an internal 'procinfo' utility that emits similar details; I hope that it gives you the information you wanted for that case: % ./valgrind --trace-flags=10000000 --trace-notbelow=1 --tool=none procinfo |& grep -i 'dirty.*cpuid' DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: 
amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8) WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR) Illegal instruction 4:24pm mrec-build2.873 ~/bin% ./valgrind --tool=none -v -v -v -d -d -d date | & grep 'arch =' --28512:1: main ... 
arch = AMD64, hwcaps = amd64-cx16-sse3
% ./valgrind --version
valgrind-3.12.0
% lsb_release -i
Distributor ID: CentOS
% lsb_release -r
Release: 6.6
% uname -a
Linux mrec-build2 2.6.32-504.12.2.el6.x86_64 #1 SMP Wed Mar 11 22:03:14 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

(In reply to bugzilla from comment #13)
> (In reply to Philippe Waroquiers from comment #10)
> There was no cpuid utility available on our host, so we substituted an
> internal 'procinfo' utility that emits similar details; I hope that it gives
> you the information you wanted for that case:
>
> % ./valgrind --trace-flags=10000000 --trace-notbelow=1 --tool=none procinfo
> |& grep -i 'dirty.*cpuid'
> DIRTY 1:I1 MoFX-gst(16,8) WrFX-gst(40,8) MoFX-gst(24,8)
> WrFX-gst(32,8) ::: amd64g_dirtyhelper_CPUID_sse42_and_cx16{0x380eea60}(BBPTR)

So, valgrind pretends to your program to be an sse42/cx16 machine, having RDTSCP

> 4:24pm mrec-build2.873 ~/bin% ./valgrind --tool=none -v -v -v -d -d -d date
> | & grep 'arch ='
> --28512:1: main ... arch = AMD64, hwcaps = amd64-cx16-sse3

But has not detected RDTSCP on the 'real cpu' hwcaps. And when it decodes the instruction, it examines the hwcaps, and not what it has pretended to be to the guest application.

In other words, when your application calls the CPUID instruction, valgrind executes amd64g_dirtyhelper_CPUID_sse42_and_cx16, which reports that RDTSCP is available. Then your application (correctly) assumes it can call RDTSCP, but then Valgrind refuses to decode it, because the hwcaps it has derived from the cpuid call indicate there is no RDTSCP (which is the case: your QEMU simulated cpu does not have RDTSCP).

What I still do not understand is that valgrind calls amd64g_dirtyhelper_CPUID_sse42_and_cx16 only if hwcaps contains SSE3 and CX16. These 2 flags are reported by the '.... | grep 'arch =' command. However, your cat /proc/cpuinfo shows a cx16 flag but does not show an sse3 flag. So, I am wondering by which miracle m_machine.c has found the sse3 indicator by calling cpuid. Maybe there is a bug in QEMU cpuid instruction ?

What is your procinfo procedure giving ? Does this report the same flags as cat /proc/cpuinfo ? In particular, does it tell that sse3 is available ?

It would be nice if you could install the cpuid rpm : as far as I can see, it should be available under centos. Then we can check the consistency between
cat /proc/cpuinfo (no sse3 found)
valgrind (that seems to find sse3)
your procinfo program : ????
cpuid : ....

If the (wrong) detection of sse3 is really the root cause of wrongly pretending being RDTSCP, you might bypass the problem in m_machine.c by assigning False to have_sse3, rather than deriving it from ecx. So, in the amd64 section, replace
have_sse3 = (ecx & (1<<0)) != 0; /* True => have sse3 insns */
by
have_sse3 = False;

If this solves the problem, then we can be reasonably sure the decoding is not the problem, but it is purely related to cpu model. If after that patch, we still have a decoding problem, then we might have both some cpu model problem and/or a basic problem that valgrind decodes an instruction that it will not execute.

In order to reproduce: { valgrind kontact; }
vex amd64->IR: unhandled instruction bytes: 0xF 0x1 0xF9 0x48 0xC1 0xE2 0x20 0x48 0x9 0xD0
vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F
vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
==28292== valgrind: Unrecognised instruction at address 0x389653d2.
==28292== at 0x389653D2: hwy::platform::TimerResolution() (in /usr/lib64/libhwy.so.1.0.1)

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
stepping : 11
microcode : 0xba
cpu MHz : 2394.207
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl cpuid aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm lahf_lm pti dtherm
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown
bogomips : 4788.41
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

Created attachment 156080 [details]
valgrind -v --tool=memcheck ./rdtscp2

Code from comment #9 compiled with command: "gcc -lstdc++ rdtscp2.cpp -o rdtscp2"
where: "gcc (GCC) 12.2.1 20221121 (Red Hat 12.2.1-4)"
gives: "vex amd64->IR: unhandled instruction bytes: 0xF 0x1 0xF9 0x90 0x5D 0xC3 0x55 0x48 0x89 0xE5"
when executed by: "valgrind -v --tool=memcheck ./rdtscp2"
where: valgrind-3.20.0
on: "Intel(R) Core(TM)2 Duo CPU P8600 @ 2.40GHz"

Terminal log attached. Please fix it.

Created attachment 156081 [details]
/proc/cpuinfo of Intel Core2Duo
Created attachment 156085 [details]
[PATCH] Don't use SSE4.2 on Core2Duo
Attached patch fixes the bug. Please commit it.
Shouldn't this be something like

   else if ((archinfo->hwcaps & VEX_HWCAPS_AMD64_SSSE3) &&
            (archinfo->hwcaps & VEX_HWCAPS_AMD64_CX16) &&
            (archinfo->hwcaps & VEX_HWCAPS_AMD64_RDTSCP)) {
      fName = "amd64g_dirtyhelper_CPUID_sse42_and_cx16";
      fAddr = &amd64g_dirtyhelper_CPUID_sse42_and_cx16;
   }
   else if ((archinfo->hwcaps & VEX_HWCAPS_AMD64_SSSE3) &&
            (archinfo->hwcaps & VEX_HWCAPS_AMD64_CX16)) {
      fName = "amd64g_dirtyhelper_CPUID_sse3_and_cx16";
      fAddr = &amd64g_dirtyhelper_CPUID_sse3_and_cx16;
   }
   else {

As it stands the patch drops sse3 && cx16 && !rdtscp from amd64g_dirtyhelper_CPUID_sse42_and_cx16 to baseline.

I guess so. Originally I didn't dig deeper to find out that amd64g_dirtyhelper_CPUID_sse3_and_cx16 exists.

commit 54982ab5c5325a02304eccb0e16a51ad6ef9a0e3 (HEAD -> master, origin/master, origin/HEAD)
Author: Paul Floyd <pjfloyd@wanadoo.fr>
Date: Mon Apr 17 22:57:39 2023 +0200

    Forgot to add the modified file for 374596

and

commit 41a7f59a8838a042813ac20fe1472e55e9bd5697
Author: Paul Floyd <pjfloyd@wanadoo.fr>
Date: Mon Apr 17 21:53:23 2023 +0200

    Bug 374596 - inconsistent RDTSCP support on x86_64 |
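The helper selection proposed in the review comment above can be modelled as a small pure function, which makes the three outcomes easy to check. This is only an illustrative sketch: the flag constants below are hypothetical stand-ins (not the real `VEX_HWCAPS_AMD64_*` values), `pick_cpuid_helper` is an invented name, and the `"baseline"` return value is a placeholder for the final `else` branch, whose contents are not shown in the thread.

```cpp
#include <cstdint>
#include <string>

// Hypothetical flag bits for illustration only; the real
// VEX_HWCAPS_AMD64_* constants have different values.
constexpr std::uint32_t kHwcapSsse3  = 1u << 0;
constexpr std::uint32_t kHwcapCx16   = 1u << 1;
constexpr std::uint32_t kHwcapRdtscp = 1u << 2;

// Returns the name of the CPUID helper that would be chosen under the
// proposed ordering: the sse42 model (which advertises RDTSCP) is used
// only when the host hwcaps also include RDTSCP.
std::string pick_cpuid_helper(std::uint32_t hwcaps) {
    const bool ssse3_and_cx16 =
        (hwcaps & kHwcapSsse3) != 0 && (hwcaps & kHwcapCx16) != 0;
    if (ssse3_and_cx16 && (hwcaps & kHwcapRdtscp) != 0) {
        return "amd64g_dirtyhelper_CPUID_sse42_and_cx16";
    }
    if (ssse3_and_cx16) {
        return "amd64g_dirtyhelper_CPUID_sse3_and_cx16";
    }
    return "baseline";  // placeholder for the elided final else branch
}
```

With this ordering, a host with SSSE3 and CX16 but no RDTSCP (the Core2 case reported above) gets the sse3 helper instead of either the sse42 helper or the baseline.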
Created attachment 103212 [details]
g++ source for testing RDTSCP support

The attached test program attempts to determine support for, and then use, the RDTSCP instruction. On a CPU that does not support RDTSCP it correctly reports that the instruction is not supported and does not execute it. Otherwise it executes the instruction without error.

Under valgrind, on a CPU that does not support RDTSCP, the opcode is reported as unsupported even though the program never executes it:

vex amd64->IR: unhandled instruction bytes: 0xF 0x1 0xF9 0xBE 0xD8 0x9 0x40 0x0 0xBF 0x80

Strangely (due to VEX simulating a different CPU stepping), if we comment out the code that executes RDTSCP, the program under valgrind then reports the instruction as being supported.

Expected behavior: under valgrind on a CPU that does not support RDTSCP the program should not crash. valgrind (vex) should simulate the instruction successfully since it advertises support for it.
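For readers without access to the attachment, a support check of the kind described above might look like the sketch below. It is not the attached source. It assumes GCC or Clang on x86_64 and uses `__get_cpuid` from `<cpuid.h>`; RDTSCP support is reported in CPUID leaf 0x80000001, EDX bit 27. The function names are hypothetical, and on other compilers or architectures the sketch simply reports "not supported".

```cpp
#include <cstdint>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>
#define SKETCH_HAVE_CPUID_H 1
#else
#define SKETCH_HAVE_CPUID_H 0
#endif

// Extended feature leaf and the EDX bit that reports RDTSCP.
constexpr std::uint32_t kExtFeatureLeaf = 0x80000001u;
constexpr std::uint32_t kRdtscpEdxBit   = 1u << 27;

// Pure decode step: does this EDX value advertise RDTSCP?
bool edx_reports_rdtscp(std::uint32_t edx) {
    return (edx & kRdtscpEdxBit) != 0;
}

// Queries the CPU the program sees. Under valgrind this is answered by
// the emulated CPUID, which is exactly what this bug is about.
bool cpu_reports_rdtscp() {
#if SKETCH_HAVE_CPUID_H
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    // __get_cpuid returns 0 when the requested leaf is not available.
    if (__get_cpuid(kExtFeatureLeaf, &eax, &ebx, &ecx, &edx) == 0) {
        return false;
    }
    return edx_reports_rdtscp(edx);
#else
    return false;
#endif
}
```

A program would call `cpu_reports_rdtscp()` once and only take the RDTSCP code path when it returns true.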