255963 – unhandled instruction bytes: 0x66 0xF 0x3A 0x9 0xDB 0x0 (ROUNDPD)

Status:	RESOLVED FIXED

Alias:	None

Product:	valgrind
Classification:	Developer tools
Component:	vex
Version:	3.6.0
Platform:	Compiled Sources Linux

Importance:	NOR crash
Target Milestone:	---
Assignee:	Julian Seward

URL:
Keywords:

Depends on:
Blocks:	253451

Reported:	2010-11-03 14:09 UTC by Arnaud Desitter
Modified:	2011-08-10 12:50 UTC

Attachments
test case that uses ROUNDPD (250.97 KB, application/octet-stream) 2010-11-03 23:28 UTC, Arnaud Desitter
ROUNDPD implementation for x86_64 (2.80 KB, patch) 2010-11-05 01:16 UTC, Julian Seward

Description Arnaud Desitter 2010-11-03 14:09:15 UTC

valgrind 3.6.0 on x64 Linux emits message (1) below when running a proprietary application. valgrind 3.5.0 runs the same application flawlessly.

Further information:
- valgrind --tool=none produces a crash.
- The code in question has been generated by the Intel Fortran compiler 11.x with optimisation on. valgrind 3.6.0 can process the unoptimised version of the code. 

Let me know if I can help.

(1)
vex amd64->IR: unhandled instruction bytes: 0x66 0xF 0x3A 0x9 0xDB 0x0
==4624== valgrind: Unrecognised instruction at address 0x1cbc04e.
==4624== Your program just tried to execute an instruction that Valgrind
==4624== did not recognise.  There are two possible reasons for this.
==4624== 1. Your program has a bug and erroneously jumped to a non-code
==4624==    location.  If you are running Memcheck and you just saw a
==4624==    warning about a bad jump, it's probably your program's fault.
==4624== 2. The instruction is legitimate but Valgrind doesn't handle it,
==4624==    i.e. it's Valgrind's fault.  If you think this is the case or
==4624==    you are not sure, please let us know and we'll try to fix it.
==4624== Either way, Valgrind will now raise a SIGILL signal which will
==4624== probably kill your program.

Comment 1 Julian Seward 2010-11-03 15:10:02 UTC

That's the SSE4.2 ROUNDPD instruction, not implemented yet.  The
reason 3.5.0 handles it is that 3.5.0 claims only SSSE3 support in
its CPUID implementation, whereas 3.6.0 claims SSE4.2, and I guess
your program is doing CPUID probing on startup.

One way to continue working with 3.6.0 until this is fixed is to
make its CPUID implementation claim only SSE3 support, by changing
the fName/fAddr pair at guest_amd64_toIR.c:17624 back to the
settings for "Core-2-like machine".  (If you look at the source,
it's obvious what to do).
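
For reference, the six bytes in the error message can be taken apart by hand: 0x66 is the operand-size prefix, 0x0F 0x3A is the three-byte opcode escape, 0x09 is the ROUNDPD opcode, 0xDB is the ModRM byte and 0x00 is the immediate. The following is a minimal C sketch of that decoding, not Valgrind's decoder; the names `decode_modrm` and `is_roundpd` are hypothetical, and REX prefixes and memory operands are deliberately ignored.

```c
#include <stdint.h>

/* Fields of an x86 ModRM byte: mod (bits 7:6), reg (bits 5:3), rm (bits 2:0). */
struct modrm {
    unsigned mod;
    unsigned reg;
    unsigned rm;
};

/* Split a ModRM byte into its three fields. */
static struct modrm decode_modrm(uint8_t byte)
{
    struct modrm m;
    m.mod = (byte >> 6) & 0x3;
    m.reg = (byte >> 3) & 0x7;
    m.rm  = byte & 0x7;
    return m;
}

/* Return 1 if the buffer starts with 66 0F 3A 09 and is long enough to
 * hold the register-to-register form (prefix, escape, opcode, ModRM, imm8).
 * Illustrative only: no REX prefix handling, no memory-operand forms. */
static int is_roundpd(const uint8_t *bytes, unsigned len)
{
    return len >= 6
        && bytes[0] == 0x66
        && bytes[1] == 0x0F
        && bytes[2] == 0x3A
        && bytes[3] == 0x09;
}
```

Applied to the bytes above, the ModRM byte 0xDB gives mod=3, reg=3, rm=3, so both operands are register xmm3 and the immediate is 0.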

Comment 2 Arnaud Desitter 2010-11-03 17:09:33 UTC

I modified valgrind as explained and this does indeed address the problem. Thank you very much for providing a workaround so quickly.

As an aside, ROUNDPD is an SSE4.1 instruction. The valgrind release notes are not quite accurate ("SSE4.2 is supported in 64-bit mode", "Some exceptions: SSE4.2 AES instructions are not supported"). No big deal.

I did not expect at all my application to take different code paths depending on some CPUID probing. Is there any way to get a stack trace of the crash?

Comment 3 Julian Seward 2010-11-03 17:16:31 UTC

> I did not expect at all my application to take different code paths depending
> on some CPUID probing.

This is pretty normal with the Intel compilers, when optimizing --
they can/will generate multiple code versions for (eg) vectorizable
loops, and choose which one to use at run time depending on results of
CPUID.

> Is there any way to get a stack trace of the crash?

Not sure.  Maybe try "-v --trace-signals=yes"
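
The kind of run-time dispatch described above can be sketched as follows. This is a hypothetical illustration in C, not what the Intel compiler or libsvml actually emits: the `choose_path` function and the `code_path` enum are invented for the example. The bit positions are the CPUID leaf 1 ECX feature flags. The function takes the ECX value as a parameter rather than executing CPUID itself, which also shows why changing what the CPUID implementation claims changes the path taken.

```c
#include <stdint.h>

/* CPUID leaf 1 feature flags reported in ECX. */
#define CPUID_ECX_SSE3   (1u << 0)
#define CPUID_ECX_SSSE3  (1u << 9)
#define CPUID_ECX_SSE41  (1u << 19)
#define CPUID_ECX_SSE42  (1u << 20)

/* Hypothetical set of code versions a compiler might generate. */
enum code_path { PATH_GENERIC, PATH_SSE3, PATH_SSSE3, PATH_SSE41 };

/* Pick the most capable code version the reported features allow. */
static enum code_path choose_path(uint32_t ecx)
{
    if (ecx & CPUID_ECX_SSE41)
        return PATH_SSE41;   /* may use ROUNDPD and other SSE4.1 instructions */
    if (ecx & CPUID_ECX_SSSE3)
        return PATH_SSSE3;
    if (ecx & CPUID_ECX_SSE3)
        return PATH_SSE3;
    return PATH_GENERIC;
}
```

With a CPUID that reports SSE3 and SSSE3 only, `choose_path` never selects the SSE4.1 version, so ROUNDPD is never reached. Once SSE4.1 is reported as well, the SSE4.1 version is selected.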

Comment 4 Arnaud Desitter 2010-11-03 17:45:46 UTC

Thanks: "--trace-signals=yes" gives me the relevant stack trace. For the record, the offending function is __svml_exp2.N, which comes from "libsvml.a", the Intel short vector library.

Many thanks for everything.

Comment 5 Arnaud Desitter 2010-11-03 23:28:32 UTC

Created attachment 53113 [details]
test case that uses ROUNDPD

Comment 6 Arnaud Desitter 2010-11-03 23:36:27 UTC

I attached a test case that demonstrates the problem. The executable was built with "ifort qq.f", where "ifort" is the Intel Fortran compiler 11.1.

As an aside, forcing the executable to use only SSE2 instructions (-xSSE2) makes no difference, so libsvml.a evidently uses dynamic CPUID probing unconditionally. I therefore expect that most numerical codes compiled with a recent enough Intel compiler will fail under valgrind 3.6.0.

Comment 7 Julian Seward 2010-11-05 01:16:56 UTC

Created attachment 53150 [details]
ROUNDPD implementation for x86_64

Can you try this patch?  (you'll need to restore the CPUID 
implementation to what it was originally, of course).

Comment 8 Arnaud Desitter 2010-11-05 16:01:48 UTC

The patch addresses the problem and enables the application to run successfully under valgrind. Thank you very much.

Comment 9 Julian Seward 2011-01-10 16:18:56 UTC

Fixed in vex r2072, at least for the immediate-rounding-mode case.

Comment 10 Julian Seward 2011-01-11 19:28:35 UTC

Fixed (r2074) for all rounding mode cases.
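
The distinction between the two fixes presumably follows the layout of the ROUNDPD immediate byte: bits 1:0 hold a rounding mode, bit 2 selects whether that mode or the one in MXCSR.RC is used, and bit 3 suppresses the precision (inexact) exception. The C sketch below decodes those fields. It is an illustration under that assumption, not the code from r2072 or r2074, and all names in it are hypothetical.

```c
#include <stdint.h>

/* Where the rounding mode comes from (imm8 bit 2). */
enum round_source { ROUND_FROM_IMM8 = 0, ROUND_FROM_MXCSR = 1 };

/* Rounding mode encoded in imm8 bits 1:0. */
enum round_mode {
    ROUND_NEAREST_EVEN = 0,
    ROUND_DOWN         = 1,
    ROUND_UP           = 2,
    ROUND_TRUNCATE     = 3
};

struct round_ctl {
    enum round_source source;
    enum round_mode   mode;             /* meaningful only if source is ROUND_FROM_IMM8 */
    int               suppress_inexact; /* imm8 bit 3 */
};

/* Decode the rounding-control immediate of ROUNDPD. */
static struct round_ctl decode_round_imm8(uint8_t imm8)
{
    struct round_ctl rc;
    rc.source = (imm8 & 0x4) ? ROUND_FROM_MXCSR : ROUND_FROM_IMM8;
    rc.mode = (enum round_mode)(imm8 & 0x3);
    rc.suppress_inexact = (imm8 >> 3) & 0x1;
    return rc;
}
```

The instruction from this report has an immediate of 0x00, which decodes to round-to-nearest-even taken from the immediate itself. An immediate with bit 2 set, such as 0x04, defers to MXCSR instead.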