valgrind 3.6.0 on x64 Linux emits the message (1) when processing a proprietary code. valgrind 3.5.0 works flawlessly.

Further information:
- valgrind --tool=none produces a crash.
- The code in question has been generated by the Intel Fortran compiler 11.x with optimisation on. valgrind 3.6.0 can process the unoptimised version of the code.

Let me know if I can help.

(1)
vex amd64->IR: unhandled instruction bytes: 0x66 0xF 0x3A 0x9 0xDB 0x0
==4624== valgrind: Unrecognised instruction at address 0x1cbc04e.
==4624== Your program just tried to execute an instruction that Valgrind
==4624== did not recognise. There are two possible reasons for this.
==4624== 1. Your program has a bug and erroneously jumped to a non-code
==4624== location. If you are running Memcheck and you just saw a
==4624== warning about a bad jump, it's probably your program's fault.
==4624== 2. The instruction is legitimate but Valgrind doesn't handle it,
==4624== i.e. it's Valgrind's fault. If you think this is the case or
==4624== you are not sure, please let us know and we'll try to fix it.
==4624== Either way, Valgrind will now raise a SIGILL signal which will
==4624== probably kill your program.
That's the SSE4.2 ROUNDPD instruction, not implemented yet. The reason 3.5.0 handles it is that 3.5.0 claims only SSSE3 support in its CPUID implementation, whereas 3.6.0 claims SSE4.2, and I guess your program is doing CPUID probing on startup. One way to continue working with 3.6.0 until this is fixed is to make its CPUID implementation claim only SSE3 support, by changing the fName/fAddr pair at guest_amd64_toIR.c:17624 back to the settings for "Core-2-like machine". (If you look at the source, it's obvious what to do).
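For readers who want to check the identification, the reported bytes can be decoded by hand: 66 0F 3A 09 is the ROUNDPD opcode, the next byte is ModRM, and the last byte is the imm8 rounding control. The following is a minimal sketch in C, not Valgrind's decoder; the struct and function names are made up for illustration, and it only handles the register-to-register form without a REX prefix.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type for a register-to-register ROUNDPD. */
struct roundpd_insn {
    unsigned dst_xmm;  /* ModRM.reg: destination XMM register */
    unsigned src_xmm;  /* ModRM.rm: source XMM register */
    uint8_t  imm8;     /* rounding-control immediate */
};

/* Returns 1 and fills *out if bytes hold 66 0F 3A 09 /r ib with a
   register operand (ModRM.mod == 3); returns 0 otherwise.
   Memory operands and REX prefixes are deliberately not handled. */
static int decode_roundpd_reg(const uint8_t *bytes, size_t len,
                              struct roundpd_insn *out)
{
    static const uint8_t opcode[4] = { 0x66, 0x0F, 0x3A, 0x09 };

    if (len < 6 || memcmp(bytes, opcode, sizeof opcode) != 0)
        return 0;

    uint8_t modrm = bytes[4];
    if ((modrm >> 6) != 3)
        return 0;

    out->dst_xmm = (modrm >> 3) & 7;
    out->src_xmm = modrm & 7;
    out->imm8    = bytes[5];
    return 1;
}
```

Fed the six bytes from the vex message, this yields destination xmm3, source xmm3 and imm8 0, i.e. "roundpd $0, %xmm3, %xmm3" in AT&T syntax.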
I modified valgrind as explained and this indeed addresses the problem. Thank you very much for providing a workaround so quickly.

As an aside, ROUNDPD is an SSE4.1 instruction. The valgrind release notes are not quite accurate ("SSE4.2 is supported in 64-bit mode", "Some exceptions: SSE4.2 AES instructions are not supported"). No big deal.

I did not expect at all my application to take different code paths depending on some CPUID probing. Is there any way to get a stack trace of the crash?
> I did not expect at all my application to take different code paths depending
> on some CPUID probing.

This is pretty normal with the Intel compilers, when optimizing -- they can/will generate multiple code versions for (eg) vectorizable loops, and choose which one to use at run time depending on results of CPUID.

> Is there any way to get a stack trace of the crash?

Not sure. Maybe try "-v --trace-signals=yes"
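As a rough illustration of this kind of run-time dispatch, the sketch below reads the feature bits in ECX of CPUID leaf 1 and picks a code path from them. It is not what the Intel runtime actually does: choose_path, probe_leaf1_ecx and the PATH_* names are hypothetical, and the probe relies on the GCC/Clang <cpuid.h> helper. Under Valgrind the CPUID result is the emulated one, which is why a program can take a different path there than on the bare machine.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  /* GCC/Clang helper; other compilers differ */
#endif

/* Feature bits in ECX of CPUID leaf 1. */
#define CPUID_ECX_SSE3   (1u << 0)
#define CPUID_ECX_SSSE3  (1u << 9)
#define CPUID_ECX_SSE41  (1u << 19)
#define CPUID_ECX_SSE42  (1u << 20)

/* Hypothetical code-path names for a routine built in two versions. */
enum code_path { PATH_SSE2, PATH_SSE41 };

/* Picks the SSE4.1 version only when the CPU (or the emulator
   answering CPUID) claims SSE4.1 support. */
static enum code_path choose_path(unsigned ecx)
{
    return (ecx & CPUID_ECX_SSE41) ? PATH_SSE41 : PATH_SSE2;
}

/* Returns ECX of leaf 1, or 0 when CPUID is unavailable. */
static unsigned probe_leaf1_ecx(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax, ebx, ecx, edx;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return ecx;
#endif
    return 0;
}
```

A caller would evaluate choose_path(probe_leaf1_ecx()) once at startup and route later calls accordingly. With a CPUID that claims only up to SSSE3, the SSE4.1 path (and with it ROUNDPD) is never reached.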
Thanks: "--trace-signals=yes" gives me the relevant stack trace. For the record, the offending function is __svml_exp2.N, which comes from "libsvml.a", the Intel short vector library. Many thanks for everything.
Created attachment 53113: test case that uses ROUNDPD
I attached a test case that demonstrates the problem. The executable was built with "ifort qq.f", where "ifort" is the Intel Fortran compiler 11.1.

As an aside, forcing the executable to use only SSE2 instructions (-xSSE2) makes no difference, so libsvml.a uses dynamic CPUID probing unconditionally. I therefore expect that most numerical codes compiled with a recent enough Intel compiler will fail under valgrind 3.6.0.
Created attachment 53150: ROUNDPD implementation for x86_64

Can you try this patch? (You'll need to restore the CPUID implementation to what it was originally, of course.)
The patch addresses the problem and enables the application to run successfully under valgrind. Thank you very much.
Fixed, at least for the immediate-rounding-mode case, vex r2072.
Fixed (r2074) for all rounding mode cases.
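For context on the two steps of the fix: the imm8 operand of ROUNDPD either carries the rounding mode itself or, when bit 2 is set, defers to the rounding mode currently held in MXCSR. Reading "the immediate-rounding-mode case" as the bit-2-clear form is an assumption on my part, not something the commits are quoted as saying. The sketch below only decodes the imm8 fields; the type and function names are hypothetical.

```c
#include <stdint.h>

/* Where the rounding mode comes from (imm8 bit 2). */
enum round_source { ROUND_FROM_IMM8, ROUND_FROM_MXCSR };

/* Rounding mode encoded in imm8 bits 1:0. */
enum round_mode {
    ROUND_NEAREST_EVEN = 0,
    ROUND_DOWN         = 1,  /* toward minus infinity */
    ROUND_UP           = 2,  /* toward plus infinity */
    ROUND_TRUNCATE     = 3   /* toward zero */
};

/* Hypothetical container for the decoded fields. */
struct round_control {
    enum round_source source;
    enum round_mode   mode;              /* meaningful only for ROUND_FROM_IMM8 */
    int               suppress_inexact;  /* imm8 bit 3: precision exception masked */
};

static struct round_control decode_round_imm8(uint8_t imm8)
{
    struct round_control rc;
    rc.source = (imm8 & 0x4) ? ROUND_FROM_MXCSR : ROUND_FROM_IMM8;
    rc.mode = (enum round_mode)(imm8 & 0x3);
    rc.suppress_inexact = (imm8 >> 3) & 1;
    return rc;
}
```

The instruction from the original report has imm8 0, which decodes to round-to-nearest-even taken from the immediate.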