Summary: | Support for SHA instruction on Ryzen | ||
---|---|---|---|
Product: | [Developer tools] valgrind | Reporter: | Eric Hoffman <ehoffman> |
Component: | vex | Assignee: | Julian Seward <jseward> |
Status: | REPORTED --- | ||
Severity: | normal | CC: | hy110001, katyaberezyaka, michal.privoznik, pjfloyd, rurban, sam, tom |
Priority: | NOR | ||
Version: | 3.14 SVN | ||
Target Milestone: | --- | ||
Platform: | Other | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: |
Description
Eric Hoffman
2018-09-12 15:18:17 UTC
That's why we remove that flag from the CPUID response - because we don't support it.

Ok, that answers question 1 (by design), but question 2 still remains (and cpuid will certainly follow the fix). I have not looked at the code yet, but is there a reason why it's not supported? Is it because of implementation issues, or because it's "just not yet implemented"? This could probably be classified as 'feature implementation' rather than a bug then, I guess...

Best regards,
Eric

Well, the most obvious reason would be that nobody has submitted an implementation yet... If that's an AMD-specific instruction, then in general I'm not sure we have much in the way of support for those - not sure how much of that is deliberate and how much is just that the Intel ones are much more popular.

These new SHA extensions are supported on AMD since EPYC, on Intel since Goldmont (2017), and on recent ARM and POWER8 processors.
https://software.intel.com/en-us/articles/intel-sha-extensions

How to add vex support for it? Sounds trivial.
binutils/objdump has been able to handle it for a long time.

> How to add vex support for it? Sounds trivial.
> binutils/objdump can do it for a long time.

I started with that at https://github.com/rurban/valgrind
Linux names it sha_ni, FreeBSD SHA1,SHA2; on Windows it's Family 3, cpu Model >= 92 on Intel and cpu Model >= 23 on AMD.

But my 30-minute self-introduction to the code is certainly not enough for adding the necessary logic stubs. There shouldn't be much logic needed, I think. It should be similar to the aesdec and crc insns, which already exist.
I haven't even found the location where hwcaps are set.

Just bought a new machine (Ryzen 9 3900X) and hit exactly this bug. I've tried to write a patch, but my VEX skills are poor.
Another case:

Setup:
processor : 0
vendor_id : AuthenticAMD
cpu family : 23
model : 49
model name : AMD EPYC 7302P 16-Core Processor
stepping : 0
microcode : 0x8301034
cpu MHz : 1499.828
cache size : 512 KB
physical id : 0
siblings : 32
core id : 0
cpu cores : 16
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 16
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate sme ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip rdpid overflow_recov succor smca
bugs : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass
bogomips : 6000.01
TLB size : 3072 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 43 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro

Issue:
vex amd64->IR: unhandled instruction bytes: 0xF 0x38 0xCC 0xFA 0xF 0x38 0xCB 0xD9 0xC5 0xF9
vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F38
vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
==10543== valgrind: Unrecognised instruction at address 0x1b9fbcc.
==10543== at 0x1B9FBCC: _mm_sha256msg1_epu32 (sha.rs:100)

(In reply to Reini Urban from comment #5)
> > How to add vex support for it? Sounds trivial.
> > binutils/objdump can do it for a long time.
>
> I started with that at https://github.com/rurban/valgrind
> linux names it sha_ni, freebsd SHA1,SHA2,
> on Windows it's Family 3, cpu Model >= 92 on Intel and cpu Model >= 23 on
> amd.
>
> But for adding the necessary logic stubs my 30 min self-intro into the code
> is certainly not enough. There shouldn't be much logic needed I think.
> Similar to the aesdec and crc insn, which do exist already.
> I haven't even found the location where hwcaps are set.

Thanks for the great work!
I tested the patch "amd64: WIP start implementing the amd64 SHA extensions" based on 3.16.1 and got the error described in the following comments:
https://github.com/rurban/valgrind/commit/f0fc15e32bba3fdd9d84e1ea7fd44916c4ff3d54
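For readers who want to see what the "unhandled instruction bytes" in the report above actually encode, the following is a small sketch of a decoder for the unprefixed register forms of the SHA instructions in the 0F 38 opcode map (C8 through CD). The opcode assignments come from the x86 instruction set reference; the types and the function name `decode_sha` are hypothetical and unrelated to VEX's real decoder. It assumes no REX prefix (as in the trace, REX=0), skips memory operands, and does not cover SHA1RNDS4, which lives in the 0F 3A map and takes an immediate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mnemonics for opcodes 0F 38 C8 .. 0F 38 CD, in opcode order. */
typedef enum {
    SHA_NONE = 0,
    SHA1NEXTE,    /* 0F 38 C8 */
    SHA1MSG1,     /* 0F 38 C9 */
    SHA1MSG2,     /* 0F 38 CA */
    SHA256RNDS2,  /* 0F 38 CB */
    SHA256MSG1,   /* 0F 38 CC */
    SHA256MSG2    /* 0F 38 CD */
} sha_op;

typedef struct {
    sha_op   op;
    unsigned dst_xmm;  /* ModRM.reg: first (destination) operand */
    unsigned src_xmm;  /* ModRM.rm: second (source) operand */
} sha_insn;

/* Decodes one register-form SHA instruction at p. Returns the number of
   bytes consumed (4), or 0 if the bytes are not such an instruction. */
static size_t decode_sha(const uint8_t *p, size_t len, sha_insn *out)
{
    assert(p != NULL && out != NULL);
    out->op = SHA_NONE;
    out->dst_xmm = out->src_xmm = 0;

    if (len < 4 || p[0] != 0x0F || p[1] != 0x38)
        return 0;
    if (p[2] < 0xC8 || p[2] > 0xCD)
        return 0;

    uint8_t modrm = p[3];
    if ((modrm >> 6) != 3)      /* memory operand: not handled here */
        return 0;

    out->op      = (sha_op)(p[2] - 0xC8 + SHA1NEXTE);
    out->dst_xmm = (modrm >> 3) & 7;
    out->src_xmm = modrm & 7;
    return 4;
}
```

Fed the first eight bytes from the error message, this reads 0F 38 CC FA as sha256msg1 with xmm7 as destination and xmm2 as source, followed by 0F 38 CB D9 as sha256rnds2 with xmm3 and xmm1, which is consistent with the `_mm_sha256msg1_epu32` frame in the backtrace. The trailing C5 F9 is the start of a VEX-encoded instruction and is outside the scope of this sketch.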