Version: 3.6.0
OS: Linux

I see a similar issue when trying to use valgrind 3.6.0 on Ubuntu 10.10 (via the natty packaging of it, here: https://launchpad.net/ubuntu/natty/amd64/valgrind/1:3.6.0-0ubuntu1). When I try to valgrind g++ from the gcc-snapshot package (via the natty version of gcc-snapshot, here: https://launchpad.net/ubuntu/natty/amd64/gcc-snapshot/20110106-1ubuntu1):

matt@matt-desktop:~/src/devirt$ valgrind --trace-children=yes -q /usr/lib/gcc-snapshot/bin/g++ -O3 -fwhole-program -flto s.cpp
vex amd64->IR: unhandled instruction bytes: 0x66 0xF 0x3A 0x61 0x7 0x0
==13008== valgrind: Unrecognised instruction at address 0xe0ea54.

s.cpp contains:

#include <string>
int main() {
  std::string s("bob");
  return s.length();
}

I can't continue testing GCC trunk with valgrind until this is fixed. I would prefer to continue using valgrind 3.6.0, as it is *much* faster than 3.5.x in my typical scenarios.

Reproducible: Always

Steps to Reproduce:
valgrind --trace-children=yes -q /usr/lib/gcc-snapshot/bin/g++ -O3 -fwhole-program -flto s.cpp

Actual Results:
vex amd64->IR: unhandled instruction bytes: 0x66 0xF 0x3A 0x61 0x7 0x0
==2901== valgrind: Unrecognised instruction at address 0xe0f5a4.
==2901== Your program just tried to execute an instruction that Valgrind
==2901== did not recognise.  There are two possible reasons for this.
==2901== 1. Your program has a bug and erroneously jumped to a non-code
==2901==    location.  If you are running Memcheck and you just saw a
==2901==    warning about a bad jump, it's probably your program's fault.
==2901== 2. The instruction is legitimate but Valgrind doesn't handle it,
==2901==    i.e. it's Valgrind's fault.  If you think this is the case or
==2901==    you are not sure, please let us know and we'll try to fix it.
==2901== Either way, Valgrind will now raise a SIGILL signal which will
==2901== probably kill your program.
s.cpp:1:0: internal compiler error: Illegal instruction

Expected Results:
shouldn't crash
This is PCMPESTRI, of which some variants are implemented, but
not all. I can't tell from the failure message which one I need
to implement here. Can you rerun with the patch below?

Also, an objdump -d of the failing instruction would be a useful
cross-check.

Index: VEX/priv/guest_amd64_toIR.c
===================================================================
--- VEX/priv/guest_amd64_toIR.c (revision 2076)
+++ VEX/priv/guest_amd64_toIR.c (working copy)
@@ -18392,13 +18392,15 @@
   decode_failure:
    /* All decode failures end up here. */
    vex_printf("vex amd64->IR: unhandled instruction bytes: "
-              "0x%x 0x%x 0x%x 0x%x 0x%x 0x%x\n",
+              "0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x 0x%x\n",
               (Int)getUChar(delta_start+0),
               (Int)getUChar(delta_start+1),
               (Int)getUChar(delta_start+2),
               (Int)getUChar(delta_start+3),
               (Int)getUChar(delta_start+4),
-              (Int)getUChar(delta_start+5) );
+              (Int)getUChar(delta_start+5),
+              (Int)getUChar(delta_start+6),
+              (Int)getUChar(delta_start+7) );
 
    /* Tell the dispatcher that this insn cannot be decoded, and so has
       not been executed, and (is currently) the next to be executed.
(In reply to comment #1)
> This is PCMPESTRI, of which some variants are implemented, but
> not all. I can't tell from the failure message which one I need
> to implement here. Can you rerun with the patch below?
>
> Also, an objdump -d of the failing instruction would be a useful
> cross-check.

this is the sse-4.2 enabled fragment of gcc's lexer. there're only 2 pcmpestri opcodes in the whole cc1plus binary.

00000000011e4e00 <search_line_sse42(unsigned char const*, unsigned char const*)>:
 11e4e00: 40 f6 c7 0f             test   $0xf,%dil
 11e4e04: 75 32                   jne    11e4e38 <search_line_sse42(unsigned char const*, unsigned char const*)+0x38>
 11e4e06: 66 0f 6f 05 a2 ba 2f    movdqa 0x2fbaa2(%rip),%xmm0        # 14e08b0 <repl_chars+0x40>
 11e4e0d: 00
 11e4e0e: ba 10 00 00 00          mov    $0x10,%edx
 11e4e13: b8 04 00 00 00          mov    $0x4,%eax
 11e4e18: 48 83 ef 10             sub    $0x10,%rdi
 11e4e1c: 0f 1f 40 00             nopl   0x0(%rax)
 11e4e20: 48 83 c7 10             add    $0x10,%rdi
 11e4e24: 66 0f 3a 61 07 00       pcmpestri $0x0,(%rdi),%xmm0
 11e4e2a: 73 f4                   jae    11e4e20 <search_line_sse42(unsigned char const*, unsigned char const*)+0x20>
 11e4e2c: 48 8d 04 0f             lea    (%rdi,%rcx,1),%rax
 11e4e30: c3                      retq
 11e4e31: 0f 1f 80 00 00 00 00    nopl   0x0(%rax)
 11e4e38: 48 89 f0                mov    %rsi,%rax
 11e4e3b: 48 29 f8                sub    %rdi,%rax
 11e4e3e: 48 83 f8 0f             cmp    $0xf,%rax
 11e4e42: 7e 28                   jle    11e4e6c <search_line_sse42(unsigned char const*, unsigned char const*)+0x6c>
 11e4e44: 66 0f 6f 05 64 ba 2f    movdqa 0x2fba64(%rip),%xmm0        # 14e08b0 <repl_chars+0x40>
 11e4e4b: 00
 11e4e4c: ba 10 00 00 00          mov    $0x10,%edx
 11e4e51: b8 04 00 00 00          mov    $0x4,%eax
 11e4e56: 66 0f 3a 61 07 00       pcmpestri $0x0,(%rdi),%xmm0
 11e4e5c: 48 83 f9 0f             cmp    $0xf,%rcx
 11e4e60: 76 ca                   jbe    11e4e2c <search_line_sse42(unsigned char const*, unsigned char const*)+0x2c>
 11e4e62: 48 83 c7 10             add    $0x10,%rdi
 11e4e66: 48 83 e7 f0             and    $0xfffffffffffffff0,%rdi
 11e4e6a: eb a2                   jmp    11e4e0e <search_line_sse42(unsigned char const*, unsigned char const*)+0xe>
 11e4e6c: 48 89 f8                mov    %rdi,%rax
 11e4e6f: 25 ff 0f 00 00          and    $0xfff,%eax
 11e4e74: 48 3d f0 0f 00 00       cmp    $0xff0,%rax
 11e4e7a: 76 c8                   jbe    11e4e44 <search_line_sse42(unsigned char const*, unsigned char const*)+0x44>
 11e4e7c: e9 ef fe ff ff          jmpq   11e4d70 <search_line_sse2(unsigned char const*, unsigned char const*)>
 11e4e81: 66 66 66 66 66 66 2e    data32 data32 data32 data32 data32 nopw %cs:0x0(%rax,%rax,1)
 11e4e88: 0f 1f 84 00 00 00 00
 11e4e8f: 00
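
For readers following the disassembly: both failing instructions use imm8 = 0x00, which selects unsigned bytes, "equal any" aggregation, positive polarity and the least significant match index. A rough scalar sketch of that one variant is below. It is an illustrative model only, not Valgrind's helper function; the names pcmpestri_equal_any_u8 and clamp_len are hypothetical. In the listing, xmm0 holds the character set (length in eax, here 4) and (%rdi) is the 16-byte block being searched (length in edx, here 16); the result index lands in ecx and CF is set when something matched, which is why the loop uses jae to continue on "no match".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The explicit-length forms use |length| saturated to 16 for byte data. */
static int clamp_len(int len)
{
    if (len < 0)
        len = (len < -16) ? 16 : -len;
    return (len > 16) ? 16 : len;
}

/* Hypothetical scalar model of PCMPESTRI with imm8 == 0x00 only:
   return the index of the first byte of data[0..data_len) that equals
   any byte of set[0..set_len), or 16 if there is none.  *carry models
   CF: 1 if a match was found, 0 otherwise. */
static int pcmpestri_equal_any_u8(const uint8_t set[16], int set_len,
                                  const uint8_t data[16], int data_len,
                                  int *carry)
{
    int i, j;

    assert(carry != NULL);
    set_len = clamp_len(set_len);
    data_len = clamp_len(data_len);

    for (i = 0; i < data_len; i++) {
        for (j = 0; j < set_len; j++) {
            if (data[i] == set[j]) {
                *carry = 1;
                return i;
            }
        }
    }
    *carry = 0;
    return 16;
}
```

A caller would pass the 16-byte constant as set and each aligned 16-byte block as data, advancing by 16 while *carry stays 0. The contents of the xmm0 constant are not shown in this thread, so any concrete character set used with this sketch is an assumption.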
This should work (it passes my testing). Try it and LMK if it
works for you.

Index: VEX/priv/guest_amd64_toIR.c
===================================================================
--- VEX/priv/guest_amd64_toIR.c (revision 2079)
+++ VEX/priv/guest_amd64_toIR.c (working copy)
@@ -15583,6 +15583,7 @@
             any cases for which the helper function has not been
             verified. */
          switch (imm) {
+            case 0x00:
             case 0x02: case 0x08: case 0x0A: case 0x0C: case 0x12:
             case 0x1A: case 0x3A: case 0x44: case 0x4A:
                break;
Index: VEX/priv/guest_generic_x87.c
===================================================================
--- VEX/priv/guest_generic_x87.c (revision 2075)
+++ VEX/priv/guest_generic_x87.c (working copy)
@@ -715,6 +715,7 @@
       even if they would probably work.  Life is too short to have
       unvalidated cases in the code base. */
    switch (imm8) {
+      case 0x00:
       case 0x02: case 0x08: case 0x0A: case 0x0C: case 0x12:
       case 0x1A: case 0x3A: case 0x44: case 0x4A:
          break;
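
To make the whitelist values in the patch easier to read, here is a small sketch that splits a PCMPxSTRx control byte into its fields, following the SSE4.2 encoding of imm8. The struct and function names (PcmpstrControl, decode_pcmpstr_imm8, aggregation_name) are hypothetical and do not appear in the VEX sources.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of the imm8 control byte for the index-returning forms. */
typedef struct {
    unsigned data_format; /* bits 1:0: 0 = unsigned bytes, 1 = unsigned words,
                                       2 = signed bytes,   3 = signed words   */
    unsigned aggregation; /* bits 3:2: 0 = equal any,  1 = ranges,
                                       2 = equal each, 3 = equal ordered      */
    unsigned polarity;    /* bits 5:4: 0 = positive, 1 = negative,
                                       2 = masked positive, 3 = masked negative */
    unsigned msb_index;   /* bit 6:    0 = least significant index, 1 = most  */
} PcmpstrControl;

/* Split imm8 into the fields above (bit 7 is reserved and ignored here). */
static PcmpstrControl decode_pcmpstr_imm8(uint8_t imm8)
{
    PcmpstrControl c;
    c.data_format = imm8 & 3u;
    c.aggregation = (imm8 >> 2) & 3u;
    c.polarity    = (imm8 >> 4) & 3u;
    c.msb_index   = (imm8 >> 6) & 1u;
    return c;
}

/* Human-readable name for the aggregation field. */
static const char *aggregation_name(unsigned aggregation)
{
    static const char *const names[4] = {
        "equal any", "ranges", "equal each", "equal ordered"
    };
    assert(aggregation < 4);
    return names[aggregation];
}
```

Under this decoding, the newly added 0x00 is unsigned bytes, equal any, positive polarity, least significant index, while for example 0x4A is signed bytes, equal each, positive polarity, most significant index.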
(In reply to comment #3)
> This should work (it passes my testing). Try it and LMK if it
> works for you.

works fine for me.
Fixed, vex r2080.