valgrind fails to execute a legitimate handcrafted x86 assembly instruction. We use gcc-4.4 to compile the binary.

The instruction is:

    __asm__ __volatile__("xaddl %1,%0\n\t"
                         "pushf\n\t"
                         "pop %2"
                         : "=rm"(to32), "=r"(*(int32_t *)vfrom), "=rm"(eflags)
                         : "0"(to32), "1"(*(int32_t *)vfrom)
                         : "cc");

The result within gdb is:

    1: x/i $pc
    0x5558ce19 <ia32_operationL_rmr+1854>: xadd %eax,%esi
    (gdb) x/4bx $pc
    0x5558ce19 <ia32_operationL_rmr+1854>: 0x0f 0xc1 0xc6 0x9c

The valgrind error message is:

    ==20006== Memcheck, a memory error detector.
    ==20006== Copyright (C) 2002-2008, and GNU GPL'd, by Julian Seward et al.
    ==20006== Using LibVEX rev 1884, a library for dynamic binary translation.
    ==20006== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP.
    ==20006== Using valgrind-3.4.1, a dynamic binary instrumentation framework.
    ==20006== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al.
    ==20006== For more details, rerun with: -v
    ==20006==
    ...
    Testing sigsegv : vex x86->IR: unhandled instruction bytes: 0xF 0xC1 0xC6 0x9C
    ==20006== valgrind: Unrecognised instruction at address 0x47fbe19.
    ==20006== Your program just tried to execute an instruction that Valgrind
    ==20006== did not recognise. There are two possible reasons for this.
    ==20006== 1. Your program has a bug and erroneously jumped to a non-code
    ==20006== location. If you are running Memcheck and you just saw a
    ==20006== warning about a bad jump, it's probably your program's fault.
    ==20006== 2. The instruction is legitimate but Valgrind doesn't handle it,
    ==20006== i.e. it's Valgrind's fault. If you think this is the case or
    ==20006== you are not sure, please let us know and we'll try to fix it.
    ==20006== Either way, Valgrind will now raise a SIGILL signal which will
    ==20006== probably kill your program.
    ==20006==
    ==20006== Process terminating with default action of signal 4 (SIGILL)
    ==20006== Illegal opcode at address 0x47FBE19
    ==20006== at 0x47FBE19: ia32_operationL_rmr (ia32_operation.c:611)
    ...
valgrind version is:

    bash-3.00$ valgrind --version
    valgrind-3.4.1

The instruction appears to be legitimate: it executes fine in our testsuite and is properly disassembled by gdb. The problem is that valgrind raises a SIGILL signal, which kills the program.
I'm surprised to hear this. Can you send a complete test program that shows the problem?
Here is a working minimal test program that reproduces the problem.

    #include <inttypes.h>

    typedef uint64_t addr_t;

    int main()
    {
        int32_t var = 0;
        int32_t *vfrom = &var;
        addr_t eflags;
        uint32_t to32;

        *vfrom = 456;
        to32 = 123;
        eflags = 0;

        __asm__ __volatile__(
            "xaddl %1,%0\n\t"
            "pushf\n\t"
            "pop %2"
            : "=rm"(to32),"=r"(*(int32_t *)vfrom), "=rm"(eflags)
            : "0"(to32), "1"(*(int32_t *)vfrom)
            : "cc");

        return 42;
    }

To compile it:

    gcc -O2 -m32 -Wall -Werror a.c

System used:

    Linux 2.6.9-78.10.ELsmp #1 SMP Wed Sep 17 13:58:24 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux

Compiler:

    bash-3.00$ gcc --version
    gcc (GCC) 4.4.0
    Copyright (C) 2009 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions. There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

    bash-3.00$ valgrind --version
    valgrind-3.4.1

Error log:

    bash-3.00$ valgrind ./a.out
    ==10108== Memcheck, a memory error detector.
    ==10108== Copyright (C) 2002-2008, and GNU GPL'd, by Julian Seward et al.
    ==10108== Using LibVEX rev 1884, a library for dynamic binary translation.
    ==10108== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP.
    ==10108== Using valgrind-3.4.1, a dynamic binary instrumentation framework.
    ==10108== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al.
    ==10108== For more details, rerun with: -v
    ==10108==
    vex x86->IR: unhandled instruction bytes: 0xF 0xC1 0xD0 0x9C
    ==10108== valgrind: Unrecognised instruction at address 0x804835e.
    ==10108== Your program just tried to execute an instruction that Valgrind
    ==10108== did not recognise. There are two possible reasons for this.
    ==10108== 1. Your program has a bug and erroneously jumped to a non-code
    ==10108== location. If you are running Memcheck and you just saw a
    ==10108== warning about a bad jump, it's probably your program's fault.
    ==10108== 2. The instruction is legitimate but Valgrind doesn't handle it,
    ==10108== i.e. it's Valgrind's fault. If you think this is the case or
    ==10108== you are not sure, please let us know and we'll try to fix it.
    ==10108== Either way, Valgrind will now raise a SIGILL signal which will
    ==10108== probably kill your program.
    ==10108==
    ==10108== Process terminating with default action of signal 4 (SIGILL)
    ==10108== Illegal opcode at address 0x804835E
    ==10108== at 0x804835E: main (in /prj/ipd_spg/users/sn24/tmp/build-tlmhce-2.0-lin32/a.out)
    ==10108==
    ==10108== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 12 from 1)
    ==10108== malloc/free: in use at exit: 0 bytes in 0 blocks.
    ==10108== malloc/free: 0 allocs, 0 frees, 0 bytes allocated.
    ==10108== For counts of detected errors, rerun with: -v
    ==10108== All heap blocks were freed -- no leaks are possible.
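
For reference when reading the two "unhandled instruction bytes" lines: 0x0F 0xC1 is the xadd opcode, and the byte that follows (0xC6 in the first trace, 0xD0 in this one) is the ModRM byte. The sketch below splits a ModRM byte into its three fields. The struct and function names are made up for illustration and are not taken from Valgrind or VEX.

```c
#include <stdint.h>

/* Fields of an x86 ModRM byte (names are illustrative only). */
struct modrm {
    uint8_t mod;  /* bits 7..6: 3 means the r/m operand is a register */
    uint8_t reg;  /* bits 5..3: register operand */
    uint8_t rm;   /* bits 2..0: register or memory operand */
};

/* Split a ModRM byte into its mod, reg and rm fields. */
static struct modrm decode_modrm(uint8_t byte)
{
    struct modrm m;
    m.mod = (uint8_t)(byte >> 6);
    m.reg = (uint8_t)((byte >> 3) & 7);
    m.rm  = (uint8_t)(byte & 7);
    return m;
}
```

With the 32-bit register numbering (0 = eax, 2 = edx, 6 = esi), 0xC6 decodes to mod = 3, reg = eax, rm = esi, which matches the gdb disassembly `xadd %eax,%esi`, and 0xD0 decodes to mod = 3, reg = edx, rm = eax. In both traces mod is 3, so both operands are registers.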
Did you manage to at least reproduce/confirm the issue with my minimal test program ?
Created attachment 39040 [details]
fixes the xadd insn emulation to support {r,r} operands

NOTE: this patch was made against valgrind-3.5.0, because I was unable to check out the SVN repository last night. Sorry if it doesn't apply cleanly against current SVN.

Note also that I didn'
Ok, in the meantime I investigated the issue a bit more and found that valgrind does not fully emulate the xadd instruction. The xadd instruction with {m,r} operands works, but not with {r,r} operands. I attached a patch (comment #4) that implements it.

The bug can be shown with the following very simple program (works on 32/64 bits, with -O0 or -O2 optimisation; the program returns 42):

    int main()
    {
        long d = 20, s = 2;
        asm volatile(
            "xadd %1,%0"
            :"=r"(d),"=r"(s)
            :"0"(d),"1"(s)
            :"flags"
        );
        return s + d;
    }
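
To make the expected return value of 42 explicit, here is a rough portable C model of what xadd does to its two operands: the destination receives the sum, and the source receives the old destination. This is only a sketch of the architectural behaviour (flags are not modelled), and the helper names are hypothetical; it is not the code from the patch.

```c
#include <stdint.h>

/* Portable model of 32-bit XADD (hypothetical helper, not Valgrind code):
   dest receives dest + src, src receives the old value of dest.
   EFLAGS updates are deliberately not modelled here. */
static void xadd32_model(uint32_t *dest, uint32_t *src)
{
    uint32_t old_dest = *dest;
    *dest = old_dest + *src;   /* unsigned add wraps modulo 2^32 */
    *src  = old_dest;
}

/* Mirrors the small reproducer: d = 20, s = 2, result is d + s. */
static uint32_t xadd_reproducer_model(void)
{
    uint32_t d = 20, s = 2;
    xadd32_model(&d, &s);      /* d becomes 22, s becomes 20 */
    return d + s;
}
```

Under this model d ends up as 22 and s as 20, so the sum is 42, which is what the test program returns when run natively.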
I finally managed to check out the current SVN of valgrind/VEX. Here are the two clean, rebased patches...
Created attachment 39057 [details] xadd_r_r valgrind patch 1/2
Created attachment 39058 [details] xadd_r_r VEX patch 2/2
Can somebody evaluate my patch ? Thanks..
hello ?
ping ? Anybody out there to try and confirm this bug ?
still unconfirmed ? PING valgrind (12.34.56.78) 56(84) bytes of data. 64 bytes from bugreport (123.4.56.78): icmp_seq=0 ttl=62 time=1 month
Sorry for the delay. The VEX patch looks fine. The Valgrind patch (test program) is not really Helgrind-specific and so it should really go in none/tests/x86 and none/tests/amd64 (2 copies). But apart from that it also looks fine. Can you do that?
Thank you for your feedback. I'll try to move the xadd test to the two places you mentioned.
Created attachment 43220 [details]
valgrind tests of xadd in none/tests/{x86|amd64}

Based on valgrind-3.5.0, but should also apply to recent SVN.

Note: the xadd test is only built, not automatically run, by make regtest. Please fix.
Note: tested only on linux/x86 and linux/amd64. Please test on OSX, etc.
Added the "valgrind tests of xadd in none/tests/{x86|amd64}" attachment, which adds "xadd" tests to none/tests/x86 and none/tests/amd64 (it should be applied instead of "xadd_r_r valgrind patch 1/2"). make regtest properly builds the two binaries (x86 or amd64) but does not automatically run them (please fix it). Also, I only tested on linux/x86 and linux/amd64; please test for non-regression on OSX, other archs, etc.
Committed as vex r1981 and valgrind r11127. Thanks for the patches.