Bug 357932 - vex amd64->IR: unhandled instruction bytes: 0xF2 0x49 0xF 0x5D and 0xF2 0x49 0xF 0x5F
Summary: vex amd64->IR: unhandled instruction bytes: 0xF2 0x49 0xF 0x5D and 0xF2 0x49...
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: vex (show other bugs)
Version: 3.10 SVN
Platform: Compiled Sources Linux
: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-01-13 16:13 UTC by Axel Müller
Modified: 2016-10-19 15:38 UTC (History)
1 user (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments
patch to decode both invalid instruction bytes (915 bytes, patch)
2016-01-13 16:13 UTC, Axel Müller
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Axel Müller 2016-01-13 16:13:21 UTC
Created attachment 96622 [details]
patch to decode both invalid instruction bytes

Valgrind terminates for application which is using the Intel IPP library.

vex amd64->IR: unhandled instruction bytes: 0xF2 0x49 0xF 0x5D 0x0 0x49 0x83 0xC0
vex amd64->IR:   REX=1 REX.W=1 REX.R=0 REX.X=0 REX.B=1
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F
vex amd64->IR:   PFX.66=0 PFX.F2=1 PFX.F3=0

I tried then to patch the code from SVN (r15755). My approach was similiar to bug #278744 and bug #307637.

Then i started Valgrind again. This time I've got another unhandled instruction:
vex amd64->IR: unhandled instruction bytes: 0xF2 0x49 0xF 0x5F 0x0 0x49 0x83 0xC0
vex amd64->IR:   REX=1 REX.W=1 REX.R=0 REX.X=0 REX.B=1
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F
vex amd64->IR:   PFX.66=0 PFX.F2=1 PFX.F3=0

Again I patched the code. Now Valgrind does not terminate anymore but it spills out a lot of invalid reads in the IPPs log function (which uses SSE)
Invalid read of size 8
ippsLn_32f_A11

Although, I trust Valgrind a lot I doubt that our application is doing invalid reads here because I double checked with GCCs address sanitizer and it didn't report anything. Thus, I guess my patches are incorrect. I have attached the patches and would be happy if someone with more competence than me would have a look at them.
Comment 1 Julian Seward 2016-07-20 17:47:29 UTC
f2 49 0f 5d 00       	rex.WB minsd (%r8),%xmm0
f2 49 0f 5f 00       	rex.WB maxsd (%r8),%xmm0

I'm sure these insns are handled really.  It's just the redundant rex.WB prefix
that is causing them not to get decoded.
Comment 2 Axel Müller 2016-09-09 11:45:04 UTC
(In reply to Julian Seward from comment #1)
> f2 49 0f 5d 00       	rex.WB minsd (%r8),%xmm0
> f2 49 0f 5f 00       	rex.WB maxsd (%r8),%xmm0
> 
> I'm sure these insns are handled really.  It's just the redundant rex.WB
> prefix
> that is causing them not to get decoded.
So what would you suggest? To be honest I only modified another patch without really understanding the code.
Comment 3 Mark Wielaard 2016-09-22 13:52:35 UTC
(In reply to Axel Müller from comment #2)
> (In reply to Julian Seward from comment #1)
> > f2 49 0f 5d 00       	rex.WB minsd (%r8),%xmm0
> > f2 49 0f 5f 00       	rex.WB maxsd (%r8),%xmm0
> > 
> > I'm sure these insns are handled really.  It's just the redundant rex.WB
> > prefix
> > that is causing them not to get decoded.
> So what would you suggest? To be honest I only modified another patch
> without really understanding the code.

I think you patch is ok. Maybe you could say:

      if (haveF2no66noF3(pfx)
          && (sz == 4
                 || /* ignore redundant REX.W */ sz == 8 && getRexW(pfx) == 1)) {

But we would need a testcase. What generates these minsd/maxsd with extra REX.W?
Comment 4 Mark Wielaard 2016-09-22 13:59:10 UTC
Or maybe Julian means the "redundant" REXB is altering the e register field and so it needs further adjustment to get the correct register if REXB is set.

Anyway, a testcase and how to (or what) generates it would be appreciated.
Comment 5 Julian Seward 2016-09-23 09:13:16 UTC
Mark, I think the patch is OK.  In these insns we have, redundantly:

REX.W=1, which says that this insn is 64-bits wide w.r.t. how it interacts
with the integer register set, which is irrelevant because it doesn't interact
with the integer register set at all, apart from the (%r8) address, for which
REX.W isn't relevant,

and REX.B=1, which supplies the "extra" (4th) register-number bit for some
3rd register field which the insn doesn't have -- there are only 2 registers
mentioned.

As to where these come from, I have often wondered that.  I suspect it is
a buggy non-GNU assembler, but really I don't know.  The hardware ignores these
redundant prefixes but V is stricter.  Fortunately GNU as is very good about
not generating them.
Comment 6 Axel Müller 2016-10-08 11:47:41 UTC
The assembler code is from the Intel Performance Primitives (IPP) library when using the ippsLn_32f_A11 function (calculates natural logarithm).
Comment 7 Julian Seward 2016-10-19 15:38:17 UTC
Committed, vex r3274.  Thanks for the patch.