Bug 348387 - Valgrind does not recognize a variant of the VCMPPD instruction
Status: RESOLVED DUPLICATE of bug 342571
Alias: None
Product: valgrind
Classification: Developer tools
Component: vex
Version: 3.9.0
Platform: unspecified Linux
Importance: NOR grave
Target Milestone: ---
Assignee: Julian Seward
 
Reported: 2015-05-29 09:44 UTC by christian.hoff
Modified: 2015-08-13 12:52 UTC (History)

Description christian.hoff 2015-05-29 09:44:48 UTC
Hello,

I am getting an "unrecognized instruction" error while running my application with Valgrind:

.vex amd64->IR: unhandled instruction bytes: 0xC4 0x41 0x5 0xC2 0xDA 0x19 0xC4 0xC1
vex amd64->IR:   REX=0 REX.W=0 REX.R=1 REX.X=0 REX.B=1
vex amd64->IR:   VEX=1 VEX.L=1 VEX.nVVVV=0xF ESC=0F
vex amd64->IR:   PFX.66=1 PFX.F2=0 PFX.F3=0
==26330== valgrind: Unrecognised instruction at address 0x24876a3f.
==26330==    at 0x24876A3F: mkl_vml_kernel_dAcos_E9HAynn (in /opt/hp93000rt/el7/x86_64/imkl_3/libmath_library.so)
==26330==    by 0x242409CA: vdAcos (in /opt/hp93000rt/el7/x86_64/imkl_3/libmath_library.so)

To find out which instruction is located at this address, I started GDB:
 > gdb /opt/hp93000rt/el7/x86_64/imkl_3/libmath_library.so
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-51.el7
(gdb) x/i 0x7e3a3f     (NOTE: address was calculated by subtracting the load address of the "libmath_library.so" from the instruction address)
   0x7e3a3f <mkl_vml_kernel_dAcos_E9HAynn+1919>:	vcmpnge_uqpd %ymm10,%ymm15,%ymm11

According to the documentation at "http://www.felixcloutier.com/x86/CMPPD.html", "vcmpnge_uqpd" seems to be a special form of the "VCMPPD" instruction.
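
The address arithmetic mentioned in the GDB note can be sketched as follows. This is only an illustration: the load address is not printed anywhere in the report, so the value computed here is inferred from the two addresses shown above, and the helper name `load_base` is hypothetical.

```python
# Values copied from the Valgrind and GDB output above.
RUNTIME_ADDR = 0x24876A3F   # address reported by Valgrind
FILE_OFFSET = 0x7E3A3F      # address passed to GDB's x/i
SYMBOL_OFFSET = 1919        # the "+1919" shown by GDB


def load_base(runtime_addr: int, file_offset: int) -> int:
    """Return the load address implied by a runtime address and its file-relative address."""
    return runtime_addr - file_offset


# Inferred, not taken from the report: where the library was mapped in this run.
base = load_base(RUNTIME_ADDR, FILE_OFFSET)
# Start of mkl_vml_kernel_dAcos_E9HAynn relative to the library file.
function_start = FILE_OFFSET - SYMBOL_OFFSET

print(hex(base))            # 0x24093000
print(hex(function_start))  # 0x7e32c0
print(base % 0x1000 == 0)   # True: the inferred base is page-aligned
```

A page-aligned result is a quick sanity check that the subtraction used a plausible load address.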

As can be seen from the stack trace above, the unrecognized instruction comes from the file "libmath_library.so", which belongs to the Intel Math Kernel Library that my application is using. It is possible that the Intel Math Kernel Library was compiled with an Intel compiler rather than GCC, which would explain why few users run into this error.

Please fix this.

Reproducible: Always

Steps to Reproduce:
See "Details" above

Actual Results:  
Unrecognized instruction error is shown
Comment 1 Tom Hughes 2015-05-29 10:14:55 UTC
Are you sure you are using 3.9.0? It looks to me like that was implemented in VEX r2404 against BZ#273475, which was in the 3.8.0 release.
Comment 2 christian.hoff 2015-05-29 11:10:31 UTC
Hello Tom,

thank you for looking into this. Yes, I double-checked and I am only using Valgrind 3.9.0.

I came to the same conclusion as you: It's an AVX instruction and should work since Valgrind 3.8.0. But as I said: The library that contains the invalid instruction might have been compiled with an Intel compiler. So maybe GCC never generates that instruction and thus the bug remained undetected.

Best regards,

  Christian
Comment 3 christian.hoff 2015-05-29 11:38:13 UTC
I just reran Valgrind with --demangle="no" --sym-offsets="yes" to rule out the possibility that I have calculated the relative address wrongly. This time it failed at a (slightly) different location:

.vex amd64->IR: unhandled instruction bytes: 0xC5 0xCD 0xC2 0x9C 0x24 0x80 0x2 0x0
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=1 VEX.L=1 VEX.nVVVV=0x6 ESC=0F
vex amd64->IR:   PFX.66=1 PFX.F2=0 PFX.F3=0
==1494== valgrind: Unrecognised instruction at address 0x244765d0.
==1494==    at 0x244765D0: mkl_vml_kernel_dAcos_E9HAynn+784 (in /opt/hp93000rt/el7/x86_64/imkl_3/libmath_library.so)
==1494==    by 0x23E409CA: vdAcos+90 (in /opt/hp93000rt/el7/x86_64/imkl_3/libmath_library.so)

Again I ran GDB to find the instruction:
 > gdb
(gdb) x/i mkl_vml_kernel_dAcos_E9HAynn+784
   0x7e35d0 <mkl_vml_kernel_dAcos_E9HAynn+784>:	vcmpnge_uqpd 0x280(%rsp),%ymm6,%ymm3

It's the same instruction as before, only the byte sequence is different.
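
To illustrate why the two byte sequences describe the same instruction, here is a minimal sketch that decodes the VEX prefix of both dumps. It assumes the standard 2-byte (0xC5) and 3-byte (0xC4) VEX layouts; the function `decode_vex` is hypothetical and is not part of Valgrind.

```python
def decode_vex(code: bytes) -> dict:
    """Decode a 2-byte (0xC5) or 3-byte (0xC4) VEX prefix and the opcode byte after it."""
    if code[0] == 0xC4:
        b1, b2 = code[1], code[2]
        fields = {
            "R": ((b1 >> 7) & 1) ^ 1,   # R, X, B are stored inverted
            "X": ((b1 >> 6) & 1) ^ 1,
            "B": ((b1 >> 5) & 1) ^ 1,
            "map": b1 & 0x1F,           # 1 selects the 0F opcode map
            "W": (b2 >> 7) & 1,
        }
        last, length = b2, 3
    elif code[0] == 0xC5:
        b1 = code[1]
        # The 2-byte form implies X=0, B=0, W=0 and the 0F opcode map.
        fields = {"R": ((b1 >> 7) & 1) ^ 1, "X": 0, "B": 0, "map": 1, "W": 0}
        last, length = b1, 2
    else:
        raise ValueError("not a VEX prefix")
    fields["vvvv"] = ~(last >> 3) & 0xF  # second source register, stored inverted
    fields["L"] = (last >> 2) & 1        # 1 means 256-bit (ymm) operands
    fields["pp"] = last & 3              # 1 means an implied 0x66 prefix
    fields["opcode"] = code[length]
    return fields


first = decode_vex(bytes([0xC4, 0x41, 0x05, 0xC2, 0xDA, 0x19]))
second = decode_vex(bytes([0xC5, 0xCD, 0xC2, 0x9C, 0x24, 0x80, 0x02, 0x00]))

for name, fields in (("first", first), ("second", second)):
    print(name, {key: hex(value) for key, value in fields.items()})
```

Both dumps decode to opcode 0xC2 in the 0F map with L=1 and an implied 0x66 prefix. The register fields (vvvv of 0xF and 0x6) agree with the VEX.nVVVV values in the two Valgrind logs, so only the operands and the prefix form differ.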

Hope this helps.

Best regards,

  Christian
Comment 4 Tom Hughes 2015-05-29 11:43:26 UTC
My comment was not theoretical (and I'm not sure if 3.8.0 is complete for AVX anyway); it was based on looking at the instruction decoder for that specific instruction and identifying the commit where support for it was added.

I wouldn't worry about the compiler - most likely an instruction like that has come from handwritten assembly or compiler intrinsics anyway.

The problem is probably that only some operand modes are being handled - the basic instruction is definitely recognised.
Comment 5 Julian Seward 2015-05-29 12:01:16 UTC
The insn is decoded, but findSSECmpOp() doesn't provide a comparison
op for the case "nge_uq", which is 0x19 (look at the code).  Also,
that is consistent with the insn bytes, 0xC4 0x41 0x5 0xC2 0xDA 0x19,
which has 0x19 as the last byte (the imm8 field).

Try this: in findSSECmpOp, copy case 0x9 (NGE_US) for case 0x19
(NGE_UQ) and see if that works.  Unless the instruction is dealing
with NaNs it should behave identically to the NGE_US variant.
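
As a rough illustration of the point above, the sketch below maps the imm8 value to its predicate name and models the NGE comparison result. It is not the code of findSSECmpOp; the table follows the predicate encodings as I understand them from the Intel instruction reference, and the names `CMP_PREDICATES` and `nge` are hypothetical.

```python
import math

# Predicate names for VEX-encoded VCMPPD, indexed by the low 5 bits of imm8.
CMP_PREDICATES = [
    "EQ_OQ", "LT_OS", "LE_OS", "UNORD_Q", "NEQ_UQ", "NLT_US", "NLE_US", "ORD_Q",
    "EQ_UQ", "NGE_US", "NGT_US", "FALSE_OQ", "NEQ_OQ", "GE_OS", "GT_OS", "TRUE_UQ",
    "EQ_OS", "LT_OQ", "LE_OQ", "UNORD_S", "NEQ_US", "NLT_UQ", "NLE_UQ", "ORD_S",
    "EQ_US", "NGE_UQ", "NGT_UQ", "FALSE_OS", "NEQ_OS", "GE_OQ", "GT_OQ", "TRUE_US",
]


def nge(a: float, b: float) -> bool:
    """Model the NGE result: true when a >= b does not hold, including when either is NaN."""
    return not (a >= b)


imm8 = 0x19  # last byte of 0xC4 0x41 0x5 0xC2 0xDA 0x19
print(CMP_PREDICATES[imm8 & 0x1F])  # NGE_UQ
print(CMP_PREDICATES[0x09])         # NGE_US

# The boolean result modelled here is the same for both variants. The S/Q
# suffix only describes whether a quiet NaN operand raises an invalid-operation
# exception, which this sketch does not model.
print(nge(1.0, 2.0))       # True
print(nge(2.0, 1.0))       # False
print(nge(math.nan, 1.0))  # True
```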
Comment 6 christian.hoff 2015-06-08 09:04:52 UTC
Hello Julian,

thank you for the investigation and the proposed fix. To try it out, however, I would need to recompile Valgrind from source, which I have never done before.

I am currently checking with my company whether we can allocate some time for fixing Valgrind to work with the Intel Math Kernel Library (IMKL). We also suspect that, once this error is fixed, Valgrind will run into further "illegal instructions". So the question is whether fixing Valgrind to work with the IMKL is worth the effort for us. I will let you know when we have come to a conclusion.

Best regards,

  Christian
Comment 7 Julian Seward 2015-08-13 12:52:05 UTC
Fixed, vex r3170.

*** This bug has been marked as a duplicate of bug 342571 ***