Bug 255963

Summary: unhandled instruction bytes: 0x66 0xF 0x3A 0x9 0xDB 0x0 (ROUNDPD)
Product: [Developer tools] valgrind
Reporter: Arnaud Desitter <arnaud.desitter>
Component: vex
Assignee: Julian Seward <jseward>
Status: RESOLVED FIXED
Severity: crash
CC: tom
Priority: NOR    
Version: 3.6.0   
Target Milestone: ---   
Platform: Compiled Sources   
OS: Linux   
Bug Depends on:    
Bug Blocks: 253451    
Attachments:
- test case that uses ROUNDPD
- ROUNDPD implementation for x86_64

Description Arnaud Desitter 2010-11-03 14:09:15 UTC
Valgrind 3.6.0 on x86-64 Linux emits message (1) below when running a proprietary program. Valgrind 3.5.0 runs the same program without problems.

Further information:
- Running with --tool=none also produces the crash.
- The code in question has been generated by the Intel Fortran compiler 11.x with optimisation on. valgrind 3.6.0 can process the unoptimised version of the code. 

Let me know if I can help.

(1)
vex amd64->IR: unhandled instruction bytes: 0x66 0xF 0x3A 0x9 0xDB 0x0
==4624== valgrind: Unrecognised instruction at address 0x1cbc04e.
==4624== Your program just tried to execute an instruction that Valgrind
==4624== did not recognise.  There are two possible reasons for this.
==4624== 1. Your program has a bug and erroneously jumped to a non-code
==4624==    location.  If you are running Memcheck and you just saw a
==4624==    warning about a bad jump, it's probably your program's fault.
==4624== 2. The instruction is legitimate but Valgrind doesn't handle it,
==4624==    i.e. it's Valgrind's fault.  If you think this is the case or
==4624==    you are not sure, please let us know and we'll try to fix it.
==4624== Either way, Valgrind will now raise a SIGILL signal which will
==4624== probably kill your program.
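For readers decoding the reported bytes by hand: 66 0F 3A 09 /r ib is the encoding of ROUNDPD xmm1, xmm2/m128, imm8. In the bytes above, the ModRM byte 0xDB therefore selects xmm3 as both source and destination, and the immediate is 0. The following is a minimal C sketch of that decoding. The struct and function names are hypothetical, it handles only the register form without a REX prefix, and it is not how the VEX decoder is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of a register-to-register ROUNDPD encoding (66 0F 3A 09 /r ib).
   Hypothetical type, for illustration only. */
typedef struct {
    unsigned dst_xmm;  /* ModRM.reg: destination XMM register number */
    unsigned src_xmm;  /* ModRM.rm:  source XMM register number      */
    unsigned imm8;     /* rounding-control immediate                 */
} roundpd_fields;

/* Returns 1 and fills *out if the buffer starts with a register-form
   ROUNDPD (no REX prefix); returns 0 otherwise. */
int decode_roundpd_reg(const uint8_t *bytes, size_t len, roundpd_fields *out)
{
    assert(bytes != NULL && out != NULL);

    if (len < 6)
        return 0;
    /* Operand-size prefix 66, then the three-byte opcode 0F 3A 09. */
    if (bytes[0] != 0x66 || bytes[1] != 0x0F ||
        bytes[2] != 0x3A || bytes[3] != 0x09)
        return 0;

    uint8_t modrm = bytes[4];
    if ((modrm >> 6) != 3)   /* mod != 11b: memory operand, not handled here */
        return 0;

    out->dst_xmm = (modrm >> 3) & 7;
    out->src_xmm = modrm & 7;
    out->imm8    = bytes[5];
    return 1;
}
```

Applied to the six reported bytes, this sketch yields dst_xmm = 3, src_xmm = 3 and imm8 = 0.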
Comment 1 Julian Seward 2010-11-03 15:10:02 UTC
That's the SSE4.2 ROUNDPD instruction, not implemented yet.  The
reason 3.5.0 handles it is that 3.5.0 claims only SSSE3 support in
its CPUID implementation, whereas 3.6.0 claims SSE4.2, and I guess
your program is doing CPUID probing on startup.
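Probing of this kind typically reads the feature bits returned in ECX by CPUID leaf 1. Below is a rough C sketch, assuming GCC or Clang (for <cpuid.h> and __get_cpuid). The enum and function names are made up for illustration, and this is not the code that libsvml or Valgrind actually uses. Run under Valgrind, such a probe sees whatever Valgrind's CPUID implementation claims, not what the host CPU reports.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang header providing __get_cpuid */
#endif

/* Feature bits in ECX of CPUID leaf 1. */
#define CPUID1_ECX_SSE3   (1u << 0)
#define CPUID1_ECX_SSSE3  (1u << 9)
#define CPUID1_ECX_SSE41  (1u << 19)
#define CPUID1_ECX_SSE42  (1u << 20)

/* Hypothetical ordering of SSE generations, lowest first. */
typedef enum {
    SSE_PRE_SSE3, SSE_SSE3, SSE_SSSE3, SSE_SSE41, SSE_SSE42
} sse_level;

/* Returns ECX of CPUID leaf 1, or 0 when CPUID is not available
   (including non-x86 builds). */
uint32_t cpuid_leaf1_ecx(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return ecx;
#endif
    return 0;
}

/* Maps the leaf-1 ECX feature bits to the highest SSE level claimed. */
sse_level highest_sse_level(uint32_t ecx)
{
    if (ecx & CPUID1_ECX_SSE42) return SSE_SSE42;
    if (ecx & CPUID1_ECX_SSE41) return SSE_SSE41;
    if (ecx & CPUID1_ECX_SSSE3) return SSE_SSSE3;
    if (ecx & CPUID1_ECX_SSE3)  return SSE_SSE3;
    return SSE_PRE_SSE3;
}

/* Human-readable name for a level, e.g. for logging. */
const char *sse_level_name(sse_level lvl)
{
    static const char *const names[] = {
        "pre-SSE3", "SSE3", "SSSE3", "SSE4.1", "SSE4.2"
    };
    assert((unsigned)lvl < sizeof names / sizeof names[0]);
    return names[lvl];
}
```

Calling sse_level_name(highest_sse_level(cpuid_leaf1_ecx())) natively and again under Valgrind is one way to see which level each environment claims.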

One way to continue working with 3.6.0 until this is fixed is to
make its CPUID implementation claim only SSE3 support, by changing
the fName/fAddr pair at guest_amd64_toIR.c:17624 back to the
settings for "Core-2-like machine".  (If you look at the source,
it's obvious what to do).
Comment 2 Arnaud Desitter 2010-11-03 17:09:33 UTC
I modified valgrind as explained and this indeed addresses the problem. Thank you very much for providing a workaround so quickly.

As an aside, ROUNDPD is an SSE4.1 instruction. The valgrind release notes are not quite accurate ("SSE4.2 is supported in 64-bit mode", "Some exceptions: SSE4.2 AES instructions are not supported"). No big deal.

I did not expect at all my application to take different code paths depending on some CPUID probing. Is there any way to get a stack trace of the crash?
Comment 3 Julian Seward 2010-11-03 17:16:31 UTC
> I did not expect at all my application to take different code paths depending
> on some CPUID probing.

This is pretty normal with the Intel compilers, when optimizing --
they can/will generate multiple code versions for (eg) vectorizable
loops, and choose which one to use at run time depending on results of
CPUID.
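The dispatch pattern described here can be sketched in C as follows. This is an illustration only: the two kernels are plain C stand-ins rather than real vectorized code, all names are hypothetical, and __builtin_cpu_supports is a GCC/Clang builtin, not what the Intel compilers emit.

```c
#include <assert.h>
#include <stddef.h>

/* Signature shared by all versions of the (hypothetical) kernel. */
typedef void (*scale_fn)(double *x, size_t n, double k);

/* Baseline version: would be compiled for the lowest supported CPU. */
void scale_baseline(double *x, size_t n, double k)
{
    assert(x != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        x[i] *= k;
}

/* Stand-in for a version that a compiler would build with SSE4.1
   instructions; here it is ordinary C so the sketch runs anywhere. */
void scale_sse41_standin(double *x, size_t n, double k)
{
    assert(x != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        x[i] *= k;
}

/* Selection policy, kept separate from the probe so it can be tested. */
scale_fn select_scale(int cpu_has_sse41)
{
    return cpu_has_sse41 ? scale_sse41_standin : scale_baseline;
}

/* Run-time probe; returns 0 on compilers or targets without the builtin. */
int probe_sse41(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse4.1") != 0;
#else
    return 0;
#endif
}
```

A program would call select_scale(probe_sse41()) once at startup and use the returned pointer from then on. That is why the same binary can take a different path when the CPUID results change.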

> Is there any way to get a stack trace of the crash?

Not sure.  Maybe try "-v --trace-signals=yes"
Comment 4 Arnaud Desitter 2010-11-03 17:45:46 UTC
Thanks: "--trace-signals=yes" gives me the relevant stack trace. For the record, the offending function is __svml_exp2.N, which comes from "libsvml.a", the Intel short vector math library.

Many thanks for everything.
Comment 5 Arnaud Desitter 2010-11-03 23:28:32 UTC
Created attachment 53113
test case that uses ROUNDPD
Comment 6 Arnaud Desitter 2010-11-03 23:36:27 UTC
I attached a test case that demonstrates the problem. The executable was built with "ifort qq.f", where "ifort" is the Intel Fortran compiler 11.1.

As an aside, forcing the executable to use only SSE2 instructions (-xSSE2) makes no difference, so libsvml.a uses dynamic CPUID probing unconditionally. I therefore expect that most numerical codes compiled with a recent enough Intel compiler will fail under valgrind 3.6.0.
Comment 7 Julian Seward 2010-11-05 01:16:56 UTC
Created attachment 53150
ROUNDPD implementation for x86_64

Can you try this patch?  (You'll need to restore the CPUID
implementation to what it was originally, of course.)
Comment 8 Arnaud Desitter 2010-11-05 16:01:48 UTC
The patch addresses the problem and enables the application to run successfully under valgrind. Thank you very much.
Comment 9 Julian Seward 2011-01-10 16:18:56 UTC
Fixed in vex r2072, at least for the immediate-rounding-mode case.
Comment 10 Julian Seward 2011-01-11 19:28:35 UTC
Fixed (r2074) for all rounding mode cases.
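For context on the two fixes: ROUNDPD takes its rounding mode either from bits 1:0 of the immediate (when bit 2 of imm8 is clear) or from the RC field of MXCSR (when bit 2 is set). The sketch below assumes that "immediate-rounding-mode case" refers to the first of these. It is a hypothetical C helper that only picks the effective mode and is not taken from the VEX patches.

```c
#include <assert.h>
#include <stdint.h>

/* 2-bit rounding-control encoding shared by imm8[1:0] and MXCSR.RC. */
typedef enum {
    RC_NEAREST_EVEN = 0,
    RC_DOWN         = 1,  /* toward negative infinity */
    RC_UP           = 2,  /* toward positive infinity */
    RC_TOWARD_ZERO  = 3
} rounding_control;

/* Picks the rounding mode ROUNDPD would use for a given immediate and
   MXCSR value. imm8 bit 3 (precision-exception mask) does not affect
   the mode and is ignored here. */
rounding_control effective_rounding(uint8_t imm8, uint32_t mxcsr)
{
    /* Bits 7:4 of the immediate are reserved; this sketch expects 0. */
    assert((imm8 & 0xF0) == 0);

    if (imm8 & 0x04)
        /* Bit 2 set: use MXCSR.RC, which lives in MXCSR bits 14:13. */
        return (rounding_control)((mxcsr >> 13) & 3);

    /* Bit 2 clear: the mode is given directly by imm8 bits 1:0. */
    return (rounding_control)(imm8 & 3);
}
```

For the instruction in this report, imm8 is 0, so the mode comes from the immediate and is round-to-nearest-even.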