Created attachment 96622 [details] patch to decode both invalid instruction bytes Valgrind terminates for application which is using the Intel IPP library. vex amd64->IR: unhandled instruction bytes: 0xF2 0x49 0xF 0x5D 0x0 0x49 0x83 0xC0 vex amd64->IR: REX=1 REX.W=1 REX.R=0 REX.X=0 REX.B=1 vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F vex amd64->IR: PFX.66=0 PFX.F2=1 PFX.F3=0 I tried then to patch the code from SVN (r15755). My approach was similiar to bug #278744 and bug #307637. Then i started Valgrind again. This time I've got another unhandled instruction: vex amd64->IR: unhandled instruction bytes: 0xF2 0x49 0xF 0x5F 0x0 0x49 0x83 0xC0 vex amd64->IR: REX=1 REX.W=1 REX.R=0 REX.X=0 REX.B=1 vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F vex amd64->IR: PFX.66=0 PFX.F2=1 PFX.F3=0 Again I patched the code. Now Valgrind does not terminate anymore but it spills out a lot of invalid reads in the IPPs log function (which uses SSE) Invalid read of size 8 ippsLn_32f_A11 Although, I trust Valgrind a lot I doubt that our application is doing invalid reads here because I double checked with GCCs address sanitizer and it didn't report anything. Thus, I guess my patches are incorrect. I have attached the patches and would be happy if someone with more competence than me would have a look at them.
f2 49 0f 5d 00 rex.WB minsd (%r8),%xmm0 f2 49 0f 5f 00 rex.WB maxsd (%r8),%xmm0 I'm sure these insns are handled really. It's just the redundant rex.WB prefix that is causing them not to get decoded.
(In reply to Julian Seward from comment #1) > f2 49 0f 5d 00 rex.WB minsd (%r8),%xmm0 > f2 49 0f 5f 00 rex.WB maxsd (%r8),%xmm0 > > I'm sure these insns are handled really. It's just the redundant rex.WB > prefix > that is causing them not to get decoded. So what would you suggest? To be honest I only modified another patch without really understanding the code.
(In reply to Axel Müller from comment #2) > (In reply to Julian Seward from comment #1) > > f2 49 0f 5d 00 rex.WB minsd (%r8),%xmm0 > > f2 49 0f 5f 00 rex.WB maxsd (%r8),%xmm0 > > > > I'm sure these insns are handled really. It's just the redundant rex.WB > > prefix > > that is causing them not to get decoded. > So what would you suggest? To be honest I only modified another patch > without really understanding the code. I think you patch is ok. Maybe you could say: if (haveF2no66noF3(pfx) && (sz == 4 || /* ignore redundant REX.W */ sz == 8 && getRexW(pfx) == 1)) { But we would need a testcase. What generates these minsd/maxsd with extra REX.W?
Or maybe Julian means the "redundant" REXB is altering the e register field and so it needs further adjustment to get the correct register if REXB is set. Anyway, a testcase and how to (or what) generates it would be appreciated.
Mark, I think the patch is OK. In these insns we have, redundantly: REX.W=1, which says that this insn is 64-bits wide w.r.t. how it interacts with the integer register set, which is irrelevant because it doesn't interact with the integer register set at all, apart from the (%r8) address, for which REX.W isn't relevant, and REX.B=1, which supplies the "extra" (4th) register-number bit for some 3rd register field which the insn doesn't have -- there are only 2 registers mentioned. As to where these come from, I have often wondered that. I suspect it is a buggy non-GNU assembler, but really I don't know. The hardware ignores these redundant prefixes but V is stricter. Fortunately GNU as is very good about not generating them.
The assembler code is from the Intel Performance Primitives (IPP) library when using the ippsLn_32f_A11 function (calculates natural logarithm).
Committed, vex r3274. Thanks for the patch.