Bug 323431 - Unhandled AMD XOP instruction vpcmov (vex amd64->IR: unhandled instruction bytes: 0x8F 0xE8 0x78 0xA2 0xC1 0x40 0xC5 0xFB)
Status: REPORTED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general
Version: 3.9.0.SVN
Platform: unspecified Linux
Importance: NOR wishlist
Target Milestone: ---
Assignee: Paul Floyd
Duplicates: 327285
Blocks: 339596
Reported: 2013-08-12 21:12 UTC by Tommy Janjusic
Modified: 2024-09-28 06:55 UTC (History)

Attachments
crash output (16.04 KB, application/octet-stream)
2013-08-12 21:12 UTC, Tommy Janjusic
support vpcmov (27.10 KB, patch)
2019-07-21 00:40 UTC, Anthony Romano

Description Tommy Janjusic 2013-08-12 21:12:04 UTC
Created attachment 81678 [details]
crash output

I'm running into unhandled instructions when compiling with the -O2 flag (the program runs when the flag is omitted). This happens with both the PGI C compiler and the GNU C compiler.

$ uname -a
Linux nid02382 2.6.32.59-0.7.1_1.0401.6845-cray_gem_c #1 SMP Thu Feb 21 21:03:43 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

$ cat /proc/cpuinfo | grep flags
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_goo
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow cons

model name  : AMD Opteron(TM) Processor 6274
Comment 1 Julian Seward 2013-09-12 14:17:32 UTC
Almost certainly an AMD-specific instruction.
Comment 2 Mark Wielaard 2013-09-12 20:18:48 UTC
It is vpcmov, part of the AMD XOP and FMA4 instructions.
https://en.wikipedia.org/wiki/XOP_instruction_set
http://support.amd.com/us/Embedded_TechDocs/43479.pdf
Comment 3 Tommy Janjusic 2013-09-12 20:30:20 UTC
Not sure if you guys need more info or potential patch tests from me. If so, let me know.
I posted a similar crash output here: http://old.nabble.com/question-on-valgrind's-configure-and-build-process-tc35866077.html
Comment 4 Tom Hughes 2013-11-07 17:20:22 UTC
*** Bug 327285 has been marked as a duplicate of this bug. ***
Comment 5 Thomas Eschenbacher 2016-01-09 09:35:18 UTC
I am affected by the same issue here:

> uname -a
Linux lisa 4.3.3-gentoo #1 SMP PREEMPT Fri Jan 8 07:01:39 CET 2016 x86_64 AMD A10-7870K Radeon R7, 12 Compute Cores 4C+8G AuthenticAMD GNU/Linux

> cat /proc/cpuinfo 
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 21
model           : 56
model name      : AMD A10-7870K Radeon R7, 12 Compute Cores 4C+8G
stepping        : 1
cpu MHz         : 1700.000
cache size      : 2048 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 2
apicid          : 16
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb bpext arat cpb hw_pstate proc_feedback npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold vmmcall fsgsbase bmi1 xsaveopt
bugs            : fxsave_leak sysret_ss_attrs

Any news about this bug?
Comment 6 Thomas Eschenbacher 2016-01-09 09:37:30 UTC
using valgrind-3.11.0
Comment 7 Mark Wielaard 2016-01-09 10:54:51 UTC
This is an AMD-specific XOP instruction, VPCMOV, described in http://support.amd.com/TechDocs/43479.pdf
Comment 8 Anthony Romano 2019-07-21 00:40:52 UTC
Created attachment 121654 [details]
support vpcmov

Hi, I've been hitting this one recently and it's in a lot of my system libraries, so I added vpcmov support.

After I wrote this patch I poked around the bug tracker and realized I wrote a broken XOP decoding patch for TBM bextr under a different account nearly six years ago. It was ignored for a while and by the time it got any attention from maintainers I had stopped relying on libvex internals for work. The bug was later consolidated to another issue (https://bugs.kde.org/show_bug.cgi?id=381819); there was modest traffic about 2 years ago, but I wasn't copied on it. Since there wasn't any progress at the time, I changed my compiler settings to use bdver1 to avoid XOP TBM instructions so it never came up again for me. My understanding is there's now a VEX encoding for bextr.

The newly attached patch piggybacks on PFX_VEX, introducing ESC_M8 instead of incorrectly reusing the VEX ESC constants, and adds PFX_XOP. It correctly handles the POP case, where XOP decoding should be abandoned. Adding the alternative XOP bextr encoding from the other issue should be simple once there's an XOP decode path.
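
For readers unfamiliar with the encoding, here is a rough sketch of the XOP-versus-POP decision and the prefix fields, applied to the bytes from the bug title. It is not the code from the attached patch and does not use any VEX internals; the XopPrefix type and decode_xop_prefix function are hypothetical names made up for this illustration. It assumes the three-byte XOP prefix layout (0x8F, then inverted R/X/B plus a 5-bit map select, then W, inverted vvvv, L and pp) and the rule that a map select below 8 means the 0x8F byte is an ordinary POP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for decoded XOP prefix fields (illustration only). */
typedef struct {
    int     is_xop;   /* 1 if the bytes start an XOP instruction, 0 otherwise */
    uint8_t r, x, b;  /* register-extension bits, already un-inverted */
    uint8_t map;      /* opcode map select; XOP requires a value >= 8 */
    uint8_t w;        /* operand-order / width bit */
    uint8_t vvvv;     /* extra source register, already un-inverted */
    uint8_t l;        /* vector length bit: 0 = 128-bit, 1 = 256-bit */
    uint8_t pp;       /* implied legacy prefix selector */
    uint8_t opcode;   /* opcode byte that follows the prefix */
} XopPrefix;

/* Decode the three-byte XOP prefix plus the opcode byte. Returns a struct
 * with is_xop == 0 when the sequence is too short, does not start with
 * 0x8F, or is really a POP (map select < 8), in which case the caller
 * should fall back to the normal decoder. */
static XopPrefix decode_xop_prefix(const uint8_t *bytes, size_t len)
{
    XopPrefix p = {0};
    assert(bytes != NULL);

    if (len < 4 || bytes[0] != 0x8F)
        return p;

    /* For POP r/m (8F /0) the ModRM reg field is 0, so bits 4:3 of the
     * second byte are 0 and the 5-bit map select is below 8. */
    uint8_t map = bytes[1] & 0x1F;
    if (map < 8)
        return p;

    p.is_xop = 1;
    p.r      = ((bytes[1] >> 7) & 1) ^ 1;       /* stored inverted */
    p.x      = ((bytes[1] >> 6) & 1) ^ 1;       /* stored inverted */
    p.b      = ((bytes[1] >> 5) & 1) ^ 1;       /* stored inverted */
    p.map    = map;
    p.w      = (bytes[2] >> 7) & 1;
    p.vvvv   = (uint8_t)(~(bytes[2] >> 3)) & 0xF; /* stored inverted */
    p.l      = (bytes[2] >> 2) & 1;
    p.pp     = bytes[2] & 3;
    p.opcode = bytes[3];
    return p;
}
```

Fed the first four bytes from the title (0x8F 0xE8 0x78 0xA2), this reports map 8, opcode 0xA2, and W, vvvv, L and pp all zero, whereas a sequence such as 0x8F 0xC0 is rejected as a POP.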

Thanks.
Comment 9 Mark Wielaard 2021-02-20 21:14:36 UTC
Note there is a more generic XOP bug, also with patch at https://bugs.kde.org/show_bug.cgi?id=339596
Comment 10 Paul Floyd 2024-09-23 08:27:29 UTC
I'll give the patch a go.
Comment 11 Paul Floyd 2024-09-24 07:35:40 UTC
(In reply to Paul Floyd from comment #10)
> I'll give the patch a go.

After reworking it to resolve merge conflicts.
Comment 12 Paul Floyd 2024-09-25 07:49:33 UTC
The merge conflict is minor. The resteer params need to be removed.

There's still a small issue with CPU detection. The testcase builds and runs on an Intel CPU under Valgrind, but the XOP test executable generates SIGILL when run standalone.
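
A standalone guard for such a test could look roughly like the sketch below. It is only an illustration, not part of the patch: the function names are hypothetical, it assumes the GCC/Clang <cpuid.h> helper __get_cpuid on an x86 host, and it assumes XOP is reported in ECX bit 11 of CPUID leaf 0x80000001 on AMD CPUs.

```c
#include <cpuid.h>   /* GCC/Clang helper header, x86 only (assumption) */
#include <string.h>

/* Pure helper: test the XOP bit (bit 11) in ECX of leaf 0x80000001. */
static int ecx_has_xop(unsigned int ecx)
{
    return (int)((ecx >> 11) & 1u);
}

/* Return 1 if the CPUID vendor string is "AuthenticAMD", else 0. */
static int cpu_is_amd(void)
{
    unsigned int eax, ebx, ecx, edx;
    char vendor[13];

    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return 0;
    /* The vendor string is laid out in EBX, EDX, ECX order. */
    memcpy(vendor + 0, &ebx, 4);
    memcpy(vendor + 4, &edx, 4);
    memcpy(vendor + 8, &ecx, 4);
    vendor[12] = '\0';
    return strcmp(vendor, "AuthenticAMD") == 0;
}

/* Return 1 only on an AMD CPU that advertises XOP, else 0. The vendor
 * check comes first because the bit is only meaningful on AMD parts. */
static int cpu_has_xop(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!cpu_is_amd())
        return 0;
    if (!__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx))
        return 0;
    return ecx_has_xop(ecx);
}
```

A test binary could call cpu_has_xop() at startup and skip the XOP instructions when it returns 0, rather than executing them and trapping.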
Comment 13 Paul Floyd 2024-09-28 06:55:53 UTC
I also need to look more at #339596 (which has XOP and FMA4 but FMA4 already got added in #369000).

As Mark has commented, finding suitable hardware is difficult. I just looked at an AMD EPYC 7773X box and it no longer has XOP.

Here is a patch for the feature test, for any eventual regression test:

diff --git a/tests/x86_amd64_features.c b/tests/x86_amd64_features.c
index 488f155b6..33fd55d2d 100644
--- a/tests/x86_amd64_features.c
+++ b/tests/x86_amd64_features.c
@@ -127,6 +127,10 @@ static Bool go(char* cpu)
      level = 1;
      cmask = (1 << 27) | (1 << 28);
      require_xgetbv = True;
+   } else if (strcmp (cpu,  "amd64-xop" ) == 0) {
+     level = 0x80000001;
+     cmask = 1 << 11;
+     require_amd = True;
    } else if (strcmp (cpu,  "amd64-fma4" ) == 0) {
      level = 0x80000001;
      cmask = 1 << 16;