Summary: | print unrecognized instruction on MIPS | ||
---|---|---|---|
Product: | [Developer tools] valgrind | Reporter: | John Reiser <jreiser> |
Component: | memcheck | Assignee: | Petar Jovanovic <mips32r2> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | dejanjevtic87, jseward, mips32r2, Quanfu.Wang |
Priority: | NOR | ||
Version: | 3.8.0 | ||
Target Milestone: | --- | ||
Platform: | unspecified | ||
OS: | Linux | ||
URL: | http://news.gmane.org/find-root.php?message_id=%3c39afce22.93f.13ab5cffca8.Coremail.kx%5fruan%40163.com%3e | ||
Latest Commit: | Version Fixed In: |
Description
John Reiser
2012-10-31 15:12:24 UTC
Valgrind for MIPS should print the instruction opcode.

What does your command line look like? Can you run your code with --sigill-diagnostics=yes? Check out the code from the current repository.

The original mail to [valgrind-users] from kx_ruan@163.com with subject "Unrecognised instruction in _dl_sysdep_start", dated 10/31/2012 (about 4 months ago), is copied at the end of this comment. The command line is "./bin/valgrind /bin/true". No, _I_ cannot run with "--sigill-diagnostics=yes", or run anything at all in _his_ environment.

The point of this bug report is that memcheck is unfriendly to users: the original report lacks essential information that is reasonably needed to get help and advice from [valgrind-users], and which memcheck could easily provide. If memcheck printed the instruction bits that could not be decoded, then other users could decode those bits and offer *informed* suggestions about changing the environment (such as re-compiling libc with a more generic target that does not use the offending instruction, or discovering an actual bug where VEX forgot a specific case) in order to get memcheck working. But memcheck did not print the instruction itself, only its address in the instantiated shared library. Using gdb to find such an instruction is cumbersome. So by omitting the instruction bits, memcheck creates a dissatisfied user (memcheck does not work, and the user cannot recover quickly) and disappointed onlookers who cannot reasonably help.
----- original mail from kx_ruan@163.com -----
I cross compiled valgrind-3.8.1 for my MIPS platform with:
# CC=/opt/cross-uxl/bin/mipsel-uxl-linux-gnu-gcc \
CC=/opt/cross-uxl/bin/mipsel-uxl-linux-gnu-gcc \
NM=/opt/cross-uxl/bin/mipsel-uxl-linux-gnu-nm \
STRIP=/opt/cross-uxl/bin/mipsel-uxl-linux-gnu-strip \
AR=/opt/cross-uxl/bin/mipsel-uxl-linux-gnu-ar \
RANLIB=/opt/cross-uxl/bin/mipsel-uxl-linux-gnu-ranlib \
./configure --host=mipsel-linux --prefix=/opt/nfsdir/ --with-pagesize=4

my mips tool chain version:
/opt/valgrind-3.8.1# mipsel-uxl-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=mipsel-uxl-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/opt/cross-uxl/libexec/gcc/mipsel-uxl-linux-gnu/4.6.3/lto-wrapper
Target: mipsel-uxl-linux-gnu
Configured with: //home/zyf/work/uxl//build//src/gcc-4.6.3/configure --build=i686-build_pc-linux-gnu --host=i686-build_pc-linux-gnu --target=mipsel-uxl-linux-gnu --prefix=/opt/cross-uxl --with-sysroot=/opt/cross-uxl/mipsel-uxl-linux-gnu/sysroot --enable-languages=c,c++ --disable-multilib --with-arch=74kf1_1 --with-abi=32 --with-tune=74kf1_1 --with-pkgversion='crosstool-NG 1.13.2' --disable-sjlj-exceptions --disable-__cxa_atexit --disable-libmudflap --disable-libgomp --disable-libssp --with-gmp=//home/zyf/work/uxl//build//mipsel-uxl-linux-gnu/build/static --with-mpfr=//home/zyf/work/uxl//build//mipsel-uxl-linux-gnu/build/static --with-mpc=//home/zyf/work/uxl//build//mipsel-uxl-linux-gnu/build/static --with-ppl=//home/zyf/work/uxl//build//mipsel-uxl-linux-gnu/build/static --with-cloog=//home/zyf/work/uxl//build//mipsel-uxl-linux-gnu/build/static --with-libelf=//home/zyf/work/uxl//build//mipsel-uxl-linux-gnu/build/static --with-host-libstdcxx='-static-libgcc -Wl,-Bstatic,-lstdc++ -lm -L//home/zyf/work/uxl//build//mipsel-uxl-linux-gnu/build/static/lib -lpwl' --enable-threads=posix --enable-target-optspace --without-long-double-128 --with-local-prefix=/opt/cross-uxl/mipsel-uxl-linux-gnu/sysroot --disable-nls --enable-c99 --enable-long-long
Thread model: posix
gcc version 4.6.3 (crosstool-NG 1.13.2)

when I run valgrind on my embedded system, I meet the below issue:
/opt # ./bin/valgrind /bin/true
==527== Memcheck, a memory error detector
==527== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==527== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==527== Command: /bin/true
==527==
==527== valgrind: Unrecognised instruction at address 0x4016348.
==527== at 0x4016348: _dl_sysdep_start (in /lib/ld-2.13.so)
==527== by 0x4001F48: _dl_start_final (in /lib/ld-2.13.so)
==527== Your program just tried to execute an instruction that Valgrind
[snip]
----- end original mail -----

@John Reiser
It is very hard to understand what you are actually complaining about. Memcheck *does* report the content of the instruction it cannot handle. It skips that part only if you run it with "--sigill-diagnostics=no". Here is what I have just done.

$ cat ./unknown_instr.c
#include <stdio.h>
int main() {
   printf("hello!\n");
   asm volatile (".word 0x446e6700");
   return 42;
}
$ gcc unknown_instr.c -o unknown_instr.exe
$ ./unknown_instr.exe
hello!
Illegal instruction
$ ../valgrind/install-dir/bin/valgrind ./unknown_instr.exe
==5325== Memcheck, a memory error detector
==5325== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==5325== Using Valgrind-3.9.0.SVN and LibVEX; rerun with -h for copyright info
==5325== Command: ./unknown_instr.exe
==5325==
==5325== Invalid write of size 4
==5325== at 0x4000960: _dl_start_user (in /lib/mipsel-linux-gnu/ld-2.13.so)
==5325== by 0x40008E0: __start (in /lib/mipsel-linux-gnu/ld-2.13.so)
==5325== Address 0x7eeb368c is just below the stack ptr. To suppress, use: --workaround-gcc296-bugs=yes
==5325==
hello!
vex mips->IR: unhandled instruction bytes: 0x0 0x67 0x6E 0x44
==5325== valgrind: Unrecognised instruction at address 0x400680.
==5325== at 0x400680: main (in /home/petarj/radni/ubrisi/unknown_instr.exe)
==5325== Your program just tried to execute an instruction that Valgrind
==5325== did not recognise. There are two possible reasons for this.
==5325== 1. Your program has a bug and erroneously jumped to a non-code
==5325== location. If you are running Memcheck and you just saw a
==5325== warning about a bad jump, it's probably your program's fault.
==5325== 2. The instruction is legitimate but Valgrind doesn't handle it,
==5325== i.e. it's Valgrind's fault. If you think this is the case or
==5325== you are not sure, please let us know and we'll try to fix it.
==5325== Either way, Valgrind will now raise a SIGILL signal which will
==5325== probably kill your program.
==5325==
==5325== Process terminating with default action of signal 4 (SIGILL)
==5325== Illegal opcode at address 0x400680
==5325== at 0x400680: main (in /home/petarj/radni/ubrisi/unknown_instr.exe)
==5325==
==5325== HEAP SUMMARY:
==5325== in use at exit: 0 bytes in 0 blocks
==5325== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==5325==
==5325== All heap blocks were freed -- no leaks are possible
==5325==
==5325== For counts of detected and suppressed errors, rerun with: -v
==5325== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 15 from 7)
Illegal instruction

As you can see from the above, memcheck correctly prints the content of the unknown instruction:
"vex mips->IR: unhandled instruction bytes: 0x0 0x67 0x6E 0x44"

I am complaining that the diagnostic given by valgrind is not good enough to help fix the problem. The diagnostic should identify the offending instruction by value [as well as by address] so that an appropriate fix can be surmised by the user or an on-looker: either change the [library] code to use a different instruction, or implement a case that VEX forgot.
Specifically:
1) The line "vex mips->IR: unhandled instruction bytes: 0x0 0x67 0x6E 0x44" was not there last October using valgrind-3.8.1, and does not appear today using any valgrind release. (I wasn't there in person, but the original reporter did not find such a line. See Comment 2 for the original text.)
2) The offending bytes of the instruction stream are not printed in the two logical places where they could be printed, namely (from comment 3):
==5325== valgrind: Unrecognised instruction at address 0x400680.
and
==5325== Illegal opcode at address 0x400680

What I want is the opcode bytes to be printed on the same line as the address, such as:
==5325== Illegal opcode at address 0x400680, value: 0x44 0x6e 0x67 0x00

What is important is that the value of the offending bytes be printed on a line that is in the same format as other valgrind messages (has a prefix of "==<pid>== ", so that it is obvious that the line was generated by valgrind and pertains to this process), and that the byte values be very closely associated with the address. I suppose that putting the byte values on a line adjacent to the line that contains the address would be OK, but using the same line is preferable.

The default case for illegal opcode for all architectures should be to print the address and then the 8 bytes which reside at that address and the 7 byte addresses which follow, all on the same line. The bytes are printed in the byte order of the target (guest), not the host.

To sum up what you have just said: today's version of Valgrind prints the unrecognized opcode, but not exactly where you think it should be. I do not have a strong opinion on this. You may want to ask other developers on the mailing list what they think about this, but this surely is not a MIPS-specific issue, as it seemed from your initial report.

I hit the same issue as well: the output does not say which instruction is illegal.
./valgrind ./test
==3269== Memcheck, a memory error detector
==3269== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==3269== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==3269== Command: ./test
==3269==
==3269== valgrind: Unrecognised instruction at address 0x40c038.
==3269== at 0x40C038: _dl_aux_init (in /tmp/wqf/Install/bin/test)
==3269== by 0x400400: (below main) (in /tmp/wqf/Install/bin/test)
==3269== Your program just tried to execute an instruction that Valgrind
==3269== did not recognise. There are two possible reasons for this.
==3269== 1. Your program has a bug and erroneously jumped to a non-code
==3269== location. If you are running Memcheck and you just saw a
==3269== warning about a bad jump, it's probably your program's fault.
==3269== 2. The instruction is legitimate but Valgrind doesn't handle it,
==3269== i.e. it's Valgrind's fault. If you think this is the case or
==3269== you are not sure, please let us know and we'll try to fix it.
==3269== Either way, Valgrind will now raise a SIGILL signal which will
==3269== probably kill your program.
==3269==
==3269== Process terminating with default action of signal 4 (SIGILL)
==3269== Illegal opcode at address 0x40C038
==3269== at 0x40C038: _dl_aux_init (in /tmp/wqf/Install/bin/test)
==3269== by 0x400400: (below main) (in /tmp/wqf/Install/bin/test)
==3269==
==3269== HEAP SUMMARY:
==3269== in use at exit: 0 bytes in 0 blocks
==3269== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==3269==
==3269== All heap blocks were freed -- no leaks are possible
==3269==
==3269== For counts of detected and suppressed errors, rerun with: -v
==3269== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Illegal instruction

@Quanfu Wang
Can you check out the code from the current repository?
You can download the latest version of Valgrind with:
$ svn co svn://svn.valgrind.org/valgrind/trunk valgrind

When I tried the latest code, as Dejan Jevtic suggested, I got the following dump:
192@:/tmp/wqf/Install/bin # ./valgrind ./test
==2941== Memcheck, a memory error detector
==2941== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==2941== Using Valgrind-3.9.0.SVN and LibVEX; rerun with -h for copyright info
==2941== Command: ./test -v
==2941==
vex mips->IR: unhandled instruction bytes: 0x7C 0x65 0x18 0xA
==2941== valgrind: Unrecognised instruction at address 0x4016fd8.
==2941== at 0x4016FD8: ??? (in /lib/ld-2.11.1.so)
==2941== by 0x4000E5C: ??? (in /lib/ld-2.11.1.so)
==2941== Your program just tried to execute an instruction that Valgrind
==2941== did not recognise. There are two possible reasons for this.
==2941== 1. Your program has a bug and erroneously jumped to a non-code
==2941== location. If you are running Memcheck and you just saw a
==2941== warning about a bad jump, it's probably your program's fault.
==2941== 2. The instruction is legitimate but Valgrind doesn't handle it,
==2941== i.e. it's Valgrind's fault. If you think this is the case or
==2941== you are not sure, please let us know and we'll try to fix it.
==2941== Either way, Valgrind will now raise a SIGILL signal which will
==2941== probably kill your program.
==2941==
==2941== Process terminating with default action of signal 4 (SIGILL)
==2941== Illegal opcode at address 0x4016FD8
==2941== at 0x4016FD8: ??? (in /lib/ld-2.11.1.so)
==2941== by 0x4000E5C: ??? (in /lib/ld-2.11.1.so)
==2941==
==2941== HEAP SUMMARY:
==2941== in use at exit: 0 bytes in 0 blocks
==2941== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==2941==
==2941== All heap blocks were freed -- no leaks are possible
==2941==
==2941== For counts of detected and suppressed errors, rerun with: -v
==2941== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Illegal instruction
192@:/tmp/wqf/Install/bin # Timeout waiting for PADO packets
Unable to complete PPPoE Discovery
192@:/tmp/wqf/Install/bin # ./valgrind ./test --vgdb=yes
==2985== Memcheck, a memory error detector
==2985== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==2985== Using Valgrind-3.9.0.SVN and LibVEX; rerun with -h for copyright info
==2985== Command: ./test --vgdb=yes
==2985==
vex mips->IR: unhandled instruction bytes: 0x7C 0x65 0x18 0xA
==2985== valgrind: Unrecognised instruction at address 0x4016fd8.
==2985== at 0x4016FD8: ??? (in /lib/ld-2.11.1.so)
==2985== by 0x4000E5C: ??? (in /lib/ld-2.11.1.so)
==2985== Your program just tried to execute an instruction that Valgrind
==2985== did not recognise. There are two possible reasons for this.
==2985== 1. Your program has a bug and erroneously jumped to a non-code
==2985== location. If you are running Memcheck and you just saw a
==2985== warning about a bad jump, it's probably your program's fault.
==2985== 2. The instruction is legitimate but Valgrind doesn't handle it,
==2985== i.e. it's Valgrind's fault. If you think this is the case or
==2985== you are not sure, please let us know and we'll try to fix it.
==2985== Either way, Valgrind will now raise a SIGILL signal which will
==2985== probably kill your program.
==2985==
==2985== Process terminating with default action of signal 4 (SIGILL)
==2985== Illegal opcode at address 0x4016FD8
==2985== at 0x4016FD8: ??? (in /lib/ld-2.11.1.so)
==2985== by 0x4000E5C: ??? (in /lib/ld-2.11.1.so)
==2985==
==2985== HEAP SUMMARY:
==2985== in use at exit: 0 bytes in 0 blocks
==2985== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==2985==
==2985== All heap blocks were freed -- no leaks are possible
==2985==
==2985== For counts of detected and suppressed errors, rerun with: -v
==2985== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Illegal instruction

So "disassemble" this instruction and find out what it is:
vex mips->IR: unhandled instruction bytes: 0x7C 0x65 0x18 0xA
One way is to generate a small file
----- foo.S
foo:
   .byte 0x7C,0x65,0x18,0xA
   .byte 0x0a,0x18,0x65,0x7c
   nop
-----
where the second word is just in case byte order has been reversed, and the 'nop' is to allow for running off the end. Then process with:
$ gcc -c foo.S
$ gdb foo.o
(gdb) x/3i 0

@John Reiser
We got the instruction:
[quanfuw@aont2 wqf]$ /opt/tools/broadlight/sysroot/broadlight_lilac-glibc_small/x86-linux2/mips-wrs-linux-gnu-mips_74k_softfp-glibc_small-gdb dasm.o
GNU gdb (Wind River Linux Sourcery G++ 4.4a-323) 7.2.50.20100908-cvs
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=i686-pc-linux-gnu --target=mips-wrs-linux-gnu".
For bug reporting instructions, please see:
<support@windriver.com>...
Reading symbols from /tmp/wqf/dasm.o...(no debugging symbols found)...done.
(gdb)
(gdb) x/3i 0
0x0 <foo>: lwx v1,a1(v1)
0x4 <foo+4>: j 0x86195f0
0x8 <foo+8>: nop
(gdb)

According to https://www.mips.com/media/files/MD00566-2B-MIPSDSP-QRC-01.00.pdf, "lwx v1,a1(v1)" is v1 = mem(v1 + a1), which is ordinary double indexing: fetch from memory at an address which is the sum of two registers. This should be easy to implement in VEX/priv/guest_mips_toIR.c.
Find the correct decoding case, and insert code based on the code for double indexing in i686 or amd64.

(In reply to comment #10)
> @John Reiser
> We got the instruction:
> [quanfuw@aont2 wqf]$
> /opt/tools/broadlight/sysroot/broadlight_lilac-glibc_small/x86-linux2/mips-
> wrs-linux-gnu-mips_74k_softfp-glibc_small-gdb dasm.o
> GNU gdb (Wind River Linux Sourcery G++ 4.4a-323) 7.2.50.20100908-cvs
> Copyright (C) 2010 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "--host=i686-pc-linux-gnu
> --target=mips-wrs-linux-gnu".
> For bug reporting instructions, please see:
> <support@windriver.com>...
> Reading symbols from /tmp/wqf/dasm.o...(no debugging symbols found)...done.
> (gdb)
> (gdb) x/3i 0
> 0x0 <foo>: lwx v1,a1(v1)
> 0x4 <foo+4>: j 0x86195f0
> 0x8 <foo+8>: nop
> (gdb)

@Quanfu Wang
Valgrind for MIPS does not support the DSP ASE yet, but it will very soon. So if your program makes use of instructions from the DSP ASE instruction set, you will have to wait for the patches to be ready. I think this issue should be closed.

As there are no more comments, I see no reason not to mark this as resolved/fixed. Please open a new issue if you come across this page and think you are seeing a similar issue.

@Quanfu Wang
MIPS32 DSP ASE has been implemented in Valgrind. Download the latest trunk and try it.

Please, when closing a bug report as RESOLVED/FIXED, put in the revision number(s) for the fix. Thanks.

(In reply to comment #15)
> Please, when closing a bug report as RESOLVED/FIXED, put in the
> revision number(s) for the fix. Thanks.

@Julian
There was no fix for the reported issue; I do not think it was valid when it was reported.
It certainly was valid when reported, and the evidence is plain:
==527== valgrind: Unrecognised instruction at address 0x4016348. ##### WHAT WAS IT? #####
==527== at 0x4016348: _dl_sysdep_start (in /lib/ld-2.13.so)
==527== by 0x4001F48: _dl_start_final (in /lib/ld-2.13.so)
==527== Your program just tried to execute an instruction that Valgrind
See Comment 4 in this current bug report for a line-by-line analysis of the exact literal report by valgrind-3.8.1 as of last October 31, 2012.

The opcode value (instruction word or bytes) did not appear in the report, and this made it impossible for a remote responder [me] to determine if the problem could be worked-around by re-compiling the target for a different sub-architecture, or otherwise provide advice to triage the situation.

The user interface that is used to report detected exceptions matters very much. The report begins with the PID surrounded by double equal signs. Any line that does not contain those characters is NOT part of the report. Users copy+paste what LOOKS LIKE the entire complaint, and a line such as "vex mips->IR: unhandled instruction bytes: 0x0 0x67 0x6E 0x44" that does not contain the PID surrounded by double equal signs does not LOOK LIKE part of the report. Furthermore, a line such as "vex mips->IR ..." (even without the PID surrounded by double equal signs) was NOT EVEN THERE last October: see Comment 6 for a complete, unmodified console log that does not show such a line. THE OPCODE WAS NOT THERE.

(In reply to comment #17)
> It certainly was valid when reported, and the evidence is plain:
> ==527== valgrind: Unrecognised instruction at address 0x4016348. ##### WHAT
> WAS IT? #####
> ==527== at 0x4016348: _dl_sysdep_start (in /lib/ld-2.13.so)
> ==527== by 0x4001F48: _dl_start_final (in /lib/ld-2.13.so)
> ==527== Your program just tried to execute an instruction that Valgrind
> See Comment 4 in this current bug report for a line-by-line analysis of the
> exact literal report by valgrind-3.8.1 as of last October 31, 2012.
>
> The opcode value (instruction word or bytes) did not appear in the report,
> and this made it impossible for a remote responder [me] to determine if the
> problem could be worked-around by re-compiling the target for a different
> sub-architecture, or otherwise provide advice to triage the situation.
>
> The user interface that is used to report detected exceptions matters very
> much. The report begins with the PID surrounded by double equal signs. Any
> line that does not contain those characters is NOT part of the report.
> Users copy+paste what LOOKS LIKE the entire complaint, and a line such as
> "vex mips->IR: unhandled instruction bytes: 0x0 0x67 0x6E 0x44" that does
> not contain the PID surrounded by double equal signs does not LOOK LIKE part
> of the report. Furthermore, a line such as "vex mips->IR ..." (even without
> the PID surrounded by double equal signs) was NOT EVEN THERE last October:
> see Comment 6 for a complete, unmodified console log that does not show such
> a line. THE OPCODE WAS NOT THERE.

@John Reiser
I am not trying to spawn a discussion here. I could not reproduce the issue when it was brought to my attention (March 3). I guess it existed at some point before, and I believe you were able to reproduce it last Halloween. |