Summary: | unhandled instruction bytes: 0x66 0xF 0x3A 0x9 0xDB 0x0 (ROUNDPD) | ||
---|---|---|---|
Product: | [Developer tools] valgrind | Reporter: | Arnaud Desitter <arnaud.desitter> |
Component: | vex | Assignee: | Julian Seward <jseward> |
Status: | RESOLVED FIXED | ||
Severity: | crash | CC: | tom |
Priority: | NOR | ||
Version: | 3.6.0 | ||
Target Milestone: | --- | ||
Platform: | Compiled Sources | ||
OS: | Linux | ||
Bug Blocks: | 253451 | ||
Attachments: | test case that uses ROUNDPD (53113); ROUNDPD implementation for x86_64 (53150) | ||
Description
Arnaud Desitter
2010-11-03 14:09:15 UTC
That's the SSE4.2 ROUNDPD instruction, not implemented yet. The reason 3.5.0 handles it is that 3.5.0 claims only SSSE3 support in its CPUID implementation, whereas 3.6.0 claims SSE4.2, and I guess your program is doing CPUID probing on startup.

One way to continue working with 3.6.0 until this is fixed is to make its CPUID implementation claim only SSSE3 support, by changing the fName/fAddr pair at guest_amd64_toIR.c:17624 back to the settings for "Core-2-like machine". (If you look at the source, it's obvious what to do.)

I modified valgrind as explained and this indeed addresses the problem. Thank you very much for providing a workaround so quickly.

As an aside, ROUNDPD is an SSE4.1 instruction. The valgrind release notes are not quite accurate ("SSE4.2 is supported in 64-bit mode", "Some exceptions: SSE4.2 AES instructions are not supported"). No big deal.

I did not expect at all my application to take different code paths depending on some CPUID probing. Is there any way to get a stack trace of the crash?

> I did not expect at all my application to take different code paths depending
> on some CPUID probing.

This is pretty normal with the Intel compilers when optimizing: they can/will generate multiple code versions for (e.g.) vectorizable loops, and choose which one to use at run time depending on the results of CPUID.

> Is there any way to get a stack trace of the crash?

Not sure. Maybe try "-v --trace-signals=yes".

Thanks: "--trace-signals=yes" gives me the relevant stack trace. For the record, the offending function is __svml_exp2.N, which comes from "libsvml.a", the Intel short vector library. Many thanks for everything.

Created attachment 53113: test case that uses ROUNDPD
I attached a test case that demonstrates the problem. The executable was built with "ifort qq.f", where "ifort" is the Intel Fortran compiler 11.1.

As an aside, forcing the executable to use only SSE2 instructions (-xSSE2) makes no difference, so libsvml.a uses dynamic CPUID probing unconditionally. I therefore expect that most numerical codes compiled with a recent enough Intel compiler will fail under valgrind 3.6.0.

Created attachment 53150: ROUNDPD implementation for x86_64
Can you try this patch? (you'll need to restore the CPUID
implementation to what it was originally, of course).
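
For readers following along, the interaction between the CPUID setting and the missing instruction can be modelled roughly as below. This is only an illustrative sketch: `pick_exp2_variant`, `run_under_emulator`, the variant names and the instruction sets are hypothetical and are not taken from libsvml or valgrind. Only the two claims "3.5.0 advertises up to SSSE3" and "3.6.0 advertises up to SSE4.2" come from this report.

```python
# Hypothetical model of CPUID-based dispatch under an emulator.
# None of these names come from libsvml or valgrind.

# Feature sets advertised by the emulated CPUID, as described in this report.
CPUID_VALGRIND_3_5_0 = {"sse", "sse2", "sse3", "ssse3"}
CPUID_VALGRIND_3_6_0 = CPUID_VALGRIND_3_5_0 | {"sse4.1", "sse4.2"}

# Instructions the emulator can translate before the fix (assumed subset).
IMPLEMENTED_BEFORE_FIX = {"addpd", "mulpd", "cvtpd2dq"}


def pick_exp2_variant(features):
    """Choose a code path the way a runtime dispatcher might (assumption)."""
    if "sse4.1" in features:
        return "exp2_sse41", {"mulpd", "roundpd"}
    return "exp2_sse2", {"mulpd", "cvtpd2dq", "addpd"}


def run_under_emulator(features, implemented):
    """Return the chosen variant, or fail on an instruction that cannot be translated."""
    name, used = pick_exp2_variant(features)
    missing = sorted(used - implemented)
    if missing:
        raise RuntimeError(f"unhandled instruction: {missing[0].upper()}")
    return name
```

In this model the SSSE3-only CPUID selects the older path and runs, the SSE4.2 CPUID selects the ROUNDPD path and fails, and adding `"roundpd"` to the implemented set (the effect of the patch, with CPUID restored) lets the SSE4.1 path run.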
The patch addresses the problem and enables the application to run successfully under valgrind. Thank you very much.

Fixed, at least for the immediate-rounding-mode case, in vex r2072.

Fixed (r2074) for all rounding mode cases.
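
For reference, the bytes in the summary and the rounding-mode cases can be illustrated with a small sketch. It assumes the ROUNDPD encoding from the Intel manuals (66 0F 3A 09 /r ib, with imm8 bits 1:0 giving the rounding mode and bit 2 selecting MXCSR.RC instead), and it presumes that "immediate-rounding-mode case" means bit 2 clear. The helper names `decode_roundpd` and `roundpd` are made up for illustration and have nothing to do with the vex patch. The emulation handles finite inputs only and does not preserve the sign of zero.

```python
import math

ROUNDPD_PREFIX = (0x66, 0x0F, 0x3A, 0x09)

# Rounding-control encoding shared by imm8[1:0] and MXCSR.RC.
ROUNDERS = {
    0b00: round,       # to nearest, ties to even (Python's round on floats)
    0b01: math.floor,  # toward negative infinity
    0b10: math.ceil,   # toward positive infinity
    0b11: math.trunc,  # toward zero
}


def decode_roundpd(insn):
    """Decode a register-to-register ROUNDPD without a REX prefix."""
    if len(insn) != 6 or tuple(insn[:4]) != ROUNDPD_PREFIX:
        raise ValueError("not a 6-byte ROUNDPD encoding")
    modrm, imm8 = insn[4], insn[5]
    if modrm >> 6 != 0b11:
        raise ValueError("memory operand forms are not handled in this sketch")
    return {
        "dst": f"xmm{(modrm >> 3) & 7}",
        "src": f"xmm{modrm & 7}",
        "mode": imm8 & 0b11,
        "use_mxcsr": bool(imm8 & 0b100),
    }


def roundpd(lanes, imm8, mxcsr_rc=0b00):
    """Round both double lanes using the mode from imm8 or from MXCSR.RC."""
    mode = mxcsr_rc if imm8 & 0b100 else imm8 & 0b11
    return tuple(float(ROUNDERS[mode](x)) for x in lanes)
```

Under these assumptions the reported bytes decode to a ROUNDPD with xmm3 as both source and destination, rounding to nearest from the immediate, which would fall under the first of the two fixes.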