Summary: | disInstr(arm): unhandled instruction: 0xF1010200, valgrind: Unrecognised instruction on Raspbian | ||
---|---|---|---|
Product: | [Developer tools] valgrind | Reporter: | Sum <keansum> |
Component: | memcheck | Assignee: | Julian Seward <jseward> |
Status: | RESOLVED INTENTIONAL | ||
Severity: | normal | CC: | alexander.ressler, mark, minfrin, mugnyte, noloader, noone.junkmail, peter.maydell, pjfloyd, sam, vasily.golubev, Werner.Frick |
Priority: | NOR | ||
Version: | 3.7.0 | ||
Target Milestone: | --- | ||
Platform: | Debian stable | ||
OS: | Linux | ||
URL: | http://stackoverflow.com/questions/17430731/valgrind-returning-an-unhandled-instruction-on-raspberry-pi | ||
Latest Commit: | Version Fixed In: | ||
Attachments: |
std::cout << std::endl valgrind error
int redirected to std::cout valgrind error
test1.cpp objdump
test2.cpp objdump
Valgrind Trunk with patch make check failure
Valgrind Trunk with patch test1.cpp failure
Valgrind Trunk with patch test2.cpp failure
test1 binary
test2 binary
test1.cpp
test2.cpp
libcofi_rpi.so objdump
valgrind trunk configure step error |
Description
Sum
2013-07-29 07:33:12 UTC
Created attachment 81411 [details]
std::cout << std::endl valgrind error
Created attachment 81412 [details]
int redirected to std::cout valgrind error
Can you provide the output of "objdump -d /path/to/your/compiled/binary"? It will be helpful to identify which instruction means 0xF1010200.

Unfortunately, I haven't device, so I can't compile your example directly. But it looks like this instruction is: f101 0200 add.w r2, r1, #0. For me it works OK with latest Valgrind. Can you test it with trunk version http://valgrind.org/downloads/repository.html ? In the case of error, please, attach compiled binary with example here.

> disInstr(arm): unhandled instruction: 0xF1010200

This is "SETEND BE" (encoding A1), which means "switch to big-endian mode". So (a) this program is doing something pretty weird and (b) I'm not surprised valgrind isn't supporting it. (Vasily: that is the Thumb decoding, and we're in ARM mode here.)

Looking at the backtrace I suspect this is the following memcmp-for-rpi implementation: https://github.com/bavison/arm-mem/blob/master/memcmp.S#L214

(I'm a bit dubious that that is actually the fastest way to do memcmp on this CPU, since as far as I'm aware SETEND is a fairly slow instruction.)

Created attachment 81434 [details]
test1.cpp objdump
Created attachment 81435 [details]
test2.cpp objdump
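For readers who want to check the decoding given above, here is a minimal sketch of how the word 0xF1010200 matches the ARM-mode SETEND pattern. It assumes the A1 encoding is the fixed pattern 0xF1010000 with the E (endianness) flag in bit 9; the names SETEND_MASK, SETEND_BITS and decode_setend are made up for this illustration and are not part of Valgrind or any ARM tooling.

```python
# Illustrative only: recognise the ARM-mode (A32) SETEND instruction.
# Assumption: encoding A1 is 0xF1010000 with the E flag in bit 9.
SETEND_MASK = 0xFFFFFDFF  # every bit except E (bit 9) must match
SETEND_BITS = 0xF1010000  # fixed bits of the encoding
E_BIT = 1 << 9            # 1 = big-endian, 0 = little-endian


def decode_setend(word):
    """Return 'SETEND BE', 'SETEND LE', or None if word is not SETEND."""
    if (word & SETEND_MASK) != SETEND_BITS:
        return None
    return "SETEND BE" if (word & E_BIT) else "SETEND LE"


# The word reported by disInstr(arm) in this bug.
print(decode_setend(0xF1010200))
```

The same 32 bits read as a Thumb-2 instruction give the add.w decoding suggested earlier, which is why the processor mode matters when interpreting the raw word.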
Hi Vasily, sure, please find the two objdumps of compiled form of test1.cpp and test2.cpp.

(In reply to comment #4)
> Unfortunately, I haven't device, so I can't compile your example directly.
> But it looks like this instruction is: f101 0200 add.w r2, r1, #0. For me
> it works OK with latest Valgrind. Can you test it with trunk version
> http://valgrind.org/downloads/repository.html ? In the case of error,
> please, attach compiled binary with example here.

Hi, when I do the configure step I get this error:

checking for a supported CPU... no (armv6l)
configure: error: Unsupported host architecture. Sorry

... now I will follow instructions to apply the patch: http://www.raspberrypi.org/phpBB3/viewtopic.php?f=66&t=7689

Created attachment 81437 [details]
Valgrind Trunk with patch make check failure
Created attachment 81438 [details]
Valgrind Trunk with patch test1.cpp failure
Created attachment 81439 [details]
Valgrind Trunk with patch test2.cpp failure
Created attachment 81440 [details]
test1 binary
Created attachment 81441 [details]
test2 binary
Created attachment 81442 [details]
test1.cpp
Created attachment 81443 [details]
test2.cpp
I was able to follow the instructions here to apply the patch: http://www.raspberrypi.org/phpBB3/viewtopic.php?f=66&t=7689 .... except for make check failing (log: "Valgrind Trunk with patch make check failure"). Compiled binaries are also attached (test1 binary, test2 binary) along with the example C++ files (test1.cpp, test2.cpp).

The version of Valgrind checked out from svn is:

$ valgrind --version
valgrind-3.7.0.SVN

I am not sure how to modify the patch so the latest version is used. Thanks Vasily and Peter Maydell for all your help!

Hello, Mr. Sum. Please, attach here also the output of objdump -d /usr/lib/arm-linux-gnueabihf/libcofi_rpi.so. And one question - why 3.7.0? Did you use the instructions from http://valgrind.org/downloads/repository.html ? Current trunk is valgrind-3.9.0.SVN.

Created attachment 81474 [details]
libcofi_rpi.so objdump
Created attachment 81475 [details]
valgrind trunk configure step error
Hi Mr Vasily. Thanks for your reply. Please find attached the output from (libcofi_rpi.so objdump): objdump -d /usr/lib/arm-linux-gnueabihf/libcofi_rpi.so ... it's a tar.bz2 file as it is relatively large.

Yes I used the instructions from http://valgrind.org/downloads/repository.html. I got an error at the configure step:

./configure
checking for a supported CPU... no (armv6l)
configure: error: Unsupported host architecture. Sorry

I've attached the output from configure "valgrind trunk configure step error". As I was not able to run the configure command, that's when I tried the patch at http://www.raspberrypi.org/phpBB3/viewtopic.php?f=66&t=7689

I'll try again to see if I can compile it for the latest 3.9.0 tomorrow. Thanks.

(In reply to comment #5)
> > disInstr(arm): unhandled instruction: 0xF1010200
>
> This is "SETEND BE" (encoding A1), which means "switch to big-endian mode".
> So (a) this program is doing something pretty weird and (b) I'm not
> surprised valgrind isn't supporting it.

Urk! Can't we persuade the RPI people not to ship such a bizarre hack? I have exceedingly little enthusiasm to try and support this.

(In reply to comment #22)
> Urk! Can't we persuade the RPI people not to ship such a bizarre
> hack? I have exceedingly little enthusiasm to try and support this.

Me neither :-) [it doesn't work in QEMU either]. I'm surprised that it's faster, given that as I understand it SETEND is a pretty slow instruction in hardware. I assume whoever implemented it benchmarked it though...

(In reply to comment #22)
> I have exceedingly little enthusiasm to try and support this.

On further consideration it's not merely a question of "little enthusiasm", but more like "would require major rework of the ARM-level JIT machinery to fix". So I'm going to WONTFIX this. Please, Raspbian people, take this horrible hack out of your libc.

Hi Mr Julian, Mr Peter and Vasily, thanks for your time and the information.
Given the comments from the Raspbian folks on this: http://www.raspberrypi.org/phpBB3/viewtopic.php?f=66&t=60166

It seems the SETEND instruction is open for use, and a valid instruction. It's still in v5.03 of their toolchain: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489e/Cjacabbf.html

So I'm guessing ARMv6/ARMv7 are not actually supported platforms. This doesn't completely jibe with the supported platforms page or rankings (circa 2011): http://valgrind.org/info/platforms.html

I would suggest reopening or clarifying on the Platforms page. I appreciate your work folks, really. But if this is WONTFIX, please don't advertise ARM is supported. I would put in a kind word to consider it worthy of supporting, but you know your priorities best.

This keeps cropping up, for example most recently in bug 366464. Maybe I should explain more why this isn't supported. It's because we don't have a feasible way to do it.

Valgrind's JIT instruments code blocks as they are first visited, and the endianness of the current block is "baked in" to the instrumentation. So there are two options:

(1) when a SETEND instruction is executed, throw away all the JITted code that Valgrind has created, and JIT new code blocks with the new endianness.

(2) JIT code blocks in an endian-agnostic way and have a runtime test for each memory access, to decide on whether to call a big or little endian instrumentation helper function.

(1) gives zero performance overhead for code that doesn't use SETEND but a gigantic (completely infeasible) hit for code that does. (2) makes endian changes free, but penalises all memory traffic regardless of whether SETEND is actually used.

So I don't find either of those acceptable. And I can't think of any other way to implement it.

Truth be told, I don't believe this is really even necessary, either.
In the old days, on x86 (32-bit) linux and ppc32-linux (note: 32-bit, little- and big-endian respectively) glibc used platform-specific code -- sometimes in C, sometimes in assembly -- to implement str* functions, and these normally process data in 32 bit chunks. For example strlen on x86 was done with 32 bit loads and some tricks to do with carry bit propagation, by adding magic constants 0x80808080 and/or 0xFEFEFEFF to the loaded values.

So I don't get why rpi has to be special about this. Why can't it just follow existing practice?

*** Bug 366464 has been marked as a duplicate of this bug. ***

The way QEMU's JIT handles this kind of thing is that we track each translated code block by (start PC, cpu state flags), where the flags track the subset of the CPU's current state that we've baked into the translation. One of those state flags is "is CPSR.E set?", so when we later come to check whether we've already translated a code block we won't return one that was translated assuming the "wrong" endianness. (We also use this to be able to generate code that makes assumptions about the current setting of VFP vector length and stride, Thumb condexec bits, and some other stuff that only matters for kernel-mode code emulation.)

This avoids the downsides of your options (1) and (2), though it does require that you're doing lookup of code blocks by something other than raw PC, which QEMU does anyway.

I agree that rpi's memset implementation is a bit weird, but on the hardware they use (ARM1176) SETEND is pretty nearly free and it turns out to be fastest. They're not going to change it now, it's more likely that the rpi1 will vanish into the mists of history first.

(In reply to Peter Maydell from comment #29)
> The way QEMU's JIT handles this kind of thing [..]

Thanks for the explanation. I was indeed wondering how QEMU handled this. Yes ..
if I could redo the basic JIT architecture, I would indeed be very tempted to add some facility for speculative and multiversioned block translations. I think that would be useful from a performance standpoint.

If your JIT architecture doesn't permit a QEMU-style approach I would be tempted to go with "implement SETEND to throw away JITted code and print a warning about poor performance". At the moment people trying to valgrind code that uses it find valgrind doesn't run their code at all, which you could define as infinitely slow :-)

Alternatively, if valgrind could do a redirection of memcmp() in the offending .so file to its own implementation (the way it already does for a bunch of other functions) that would be a very raspi-specific hack but would cover 90%+ of the complaints I suspect (and you could combine this with the slow-SETEND implementation to handle the last 10%). I don't have a raspi though so this is all just commentary from the peanut gallery.

(In reply to Peter Maydell from comment #31)
> Alternatively, if valgrind could do a redirection of memcmp() in the
> offending .so file to its own implementation (the way it already does for a
> bunch of other functions) that would be a very raspi-specific hack but would
> cover 90%+ of the complaints I suspect

valgrind should already intercept the memcmp from glibc. This one however is in a different library libcofi_rpi.so which looks like some kind of hack to interpose some standard libc functions. It seems this is actually preloaded somehow. So it might be as simple as removing the preload hack when running under valgrind?

(In reply to Mark Wielaard from comment #32)
> valgrind should already intercept the memcmp from glibc. This one however is
> in a different library libcofi_rpi.so which looks like some kind of hack to
> interpose some standard libc functions. It seems this is actually preloaded
> somehow. So it might be as simple as removing the preload hack when running
> under valgrind?
The idea would be to do something which works out of the box; otherwise you won't stop the trickle of bug reports (and probably larger set of users who just decide valgrind doesn't work without reporting a bug).

(In reply to Peter Maydell from comment #33)
> (In reply to Mark Wielaard from comment #32)
> > valgrind should already intercept the memcmp from glibc. This one however is
> > in a different library libcofi_rpi.so which looks like some kind of hack to
> > interpose some standard libc functions. It seems this is actually preloaded
> > somehow. So it might be as simple as removing the preload hack when running
> > under valgrind?
>
> The idea would be to do something which works out of the box; otherwise you
> won't stop the trickle of bug reports (and probably larger set of users who
> just decide valgrind doesn't work without reporting a bug).

Sure. But that requires someone with a raspi and knowledge of what this libcofi_rpi.so hackery really is.

(In reply to Julian Seward from comment #27)
> This keeps cropping up, for example most recently in bug 366464. Maybe
> I should explain more why this isn't supported....
>
> Truth be told, I don't believe this is really even necessary, either...

Here's another alternative that was not available in 2013: run a different distro on the RPI, like openSUSE or CentOS.

Here's a compelling reason to do so for some RPI devices, like the Raspberry Pi 3 (if the incompatibilities were not enough): performance. The RPI3 uses an ARM-32 armhf image and it's under-performing on the ARMv8/Cortex-A53 processor it has. Also see https://stackoverflow.com/questions/41956400/thread-performance-issues-for-java-on-raspberry-pi?noredirect=1#comment71101862_41956400 .

*** Bug 358620 has been marked as a duplicate of this bug. ***

Thanks for your explanation. After all this history and comments like "This keeps cropping up..." why don't you put a comment on the valgrind download page saying that Raspbian is not supported?
Everybody could benefit from it and save a lot of time. Apart from that, I really do appreciate your work on valgrind. Great tool.

Having smashed headlong into this issue yet again, I have raised the following issue in an effort to get the RPi people to fix this bug: https://github.com/RPi-Distro/repo/issues/68

For the record, moving /etc/ld.so.preload out of the way and in the process disabling the RPI's memcpy optimisations causes valgrind to run correctly on the RPi.

This is also fixed in more recent Raspberry Pi OS versions of libarmmem. See https://stackoverflow.com/questions/17430731/valgrind-returning-an-unhandled-instruction-on-raspberry-pi and the contained link https://github.com/bavison/arm-mem/pull/5 |
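
As a footnote to the JIT discussion earlier in this thread, the idea of looking up translated blocks by (start PC, CPU state flags) rather than by PC alone can be sketched roughly as follows. This is a toy model, not code from QEMU or Valgrind: TranslationCache, fake_translate and the single big_endian flag are hypothetical names chosen for illustration, and a real JIT would track more state than one bit.

```python
# Toy model of a translation cache keyed by (start PC, state flags).
# All names here are hypothetical; this is not QEMU or Valgrind code.
class TranslationCache:
    def __init__(self, translate):
        self._translate = translate  # callback: (pc, big_endian) -> block
        self._blocks = {}            # (pc, big_endian) -> translated block
        self.translations = 0        # how many times we had to translate

    def lookup(self, pc, big_endian):
        """Return the block for this PC under the current endianness."""
        key = (pc, big_endian)
        block = self._blocks.get(key)
        if block is None:
            # No translation exists for this exact state: make one.
            block = self._translate(pc, big_endian)
            self._blocks[key] = block
            self.translations += 1
        return block


def fake_translate(pc, big_endian):
    # Stand-in for real code generation; endianness is "baked in".
    return f"block@{pc:#x}/{'BE' if big_endian else 'LE'}"


cache = TranslationCache(fake_translate)
cache.lookup(0x8000, False)  # first visit, little-endian
cache.lookup(0x8000, True)   # after SETEND BE: new entry, nothing flushed
cache.lookup(0x8000, False)  # after SETEND LE: earlier block is reused
print(cache.translations)
```

In this model an endianness switch costs at most one extra translation per block actually executed in the other mode, and code that never uses SETEND pays only for the slightly wider lookup key.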