Bug 308573 - Internal Valgrind error on 64-bit instruction executed in 32-bit mode
Summary: Internal Valgrind error on 64-bit instruction executed in 32-bit mode
Status: CLOSED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: vex (show other bugs)
Version: unspecified
Platform: unspecified Linux
: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-10-17 19:23 UTC by Carl Love
Modified: 2012-11-02 14:48 UTC (History)
2 users (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments
Fix missing checks for 64-bit instructions operating in 32-bit mode (3.98 KB, patch)
2012-10-18 20:08 UTC, Carl Love
Details
VEX: Add and use Ijk_SigILL (8.32 KB, patch)
2012-10-30 23:01 UTC, Mark Wielaard
Details
coregrind: Handle VEX_TRC_JMP_SIGILL (921 bytes, patch)
2012-10-30 23:02 UTC, Mark Wielaard
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Carl Love 2012-10-17 19:23:22 UTC
Valgrind is giving a type mismatch when the POWER PC rdicl is executed in 32-bit mode.  The rdicl instruction (64-bit rotate left, mask result) is a 64-bit category instruction.  If the rdicl instruction is executed on real 32-bit hardware a SIGILL signal is generated.  However, on 64-bit hardware running a 32-bit binary, the instruction executes, rotating the contents of the 64-bit register, without an error.  It is only possible to have a 64-bit category instruction in a 32-bit binary for a hand coded source file, the compiler will not generate such code.  The situation was recently found where the OPENSSL hand coded library was using the rdicl instruction to "probe" the hardware to determine if it was 32-bit hardware or 64-bit hardware.  We believe the coder expected to get the SIGILL signal when the code ran in 32-bit mode on real 64-bit hardware.  However that is not the case.  The IBM compiler team feels this is a bug in the library.  Appropriate bugs have been filed against OPENSSL.  The compiler team says that 99.999% of the time the use of the 64-bit instruction in 32-bit binaries is a bug, they feel we should just issue a SIGILL in Valgrind.  In Valgrind, most the 64-bit instructions test for 64-bit mode and call the decode error which causes Valgrind to print a message saying SIGILL.  There are a few instructions such as the rdicl where the test is missing.  We see two fixes for the 64-bit instruction execution for 32-bit binaries in Valgrind:
1)  Put the needed 64-bit mode tests in to make sure all 64-bit instructions generate SIGILL.  The downside is Valgrind will not run the same as reall hardware and in the very rare case where the programmer really, really knows what they are doing (i.e. they do have valid code)  their code will not run on Valgrind.
2)  Put checks into the 64-bit Valgrind code to detect that the mode is 32-bit, do not flag an instruction decode error, just return a 32-bit zero result and print a warning that the 64-bit instruction has been ignored in the 32-bit binary.  The solution will result in the OPENSSL code running just as it does in the real HW.  Note the OPENSSL code doesn't care what the result of the instruction is, just did it generate an error or not.  The vex_printf() can be used to print a warning but I am not sure this is really the best way.  There is something about using OFFB_EMNOTE to flag an emulation warning but I haven't had any luck yet getting any warning messages generated.

We are leaning toward solution 2 as it matches the real hardware operation.  Would be interested in feedback on what is the prefered way to generate the emulation warning message in Valgrind.

Reproducible: Always

Steps to Reproduce:
1.do a 32-bit compile of a hand coded assembly routine with 64-bit instructions
2.
3.
Comment 1 Carl Love 2012-10-17 19:29:22 UTC
adding Maynard to the cc list
Comment 2 Mark Wielaard 2012-10-17 19:39:12 UTC
This is also https://bugzilla.redhat.com/show_bug.cgi?id=810992 which has an example of the hand coded assembly that openssl uses that triggers this bug in comment number 6:
https://bugzilla.redhat.com/show_bug.cgi?id=810992#c6
Comment 3 Carl Love 2012-10-18 20:08:14 UTC
Created attachment 74638 [details]
Fix missing checks for 64-bit instructions operating in 32-bit mode

The attached patch adds the missing 32-bit mode checks to the 64-bit instructions.  The number of regression test passes and failures is not changed by this patch.

Please review the patch and post comments.  Thanks.
Comment 4 Julian Seward 2012-10-26 09:59:25 UTC
Looks fine to me.  Please commit.
Comment 5 Carl Love 2012-10-29 20:44:13 UTC
Committed the attached patch to fix the VEX code.  Vex committed revision 2558

Committed a fix to the memcheck/tests/ppc32/power_ISA2_05.c test to remove the 64-bit class instruction from the 32-bit test.  The expected output for the test was also updated.  Valgrind committed revision 13091.
Comment 6 Mark Wielaard 2012-10-30 21:43:02 UTC
The fix works, but is there any way to suppress all the output?

disInstr(ppc): unhandled instruction: 0x78000022
                 primary 30(0x1E), secondary 34(0x22)
==30163== valgrind: Unrecognised instruction at address 0xfd0d294.
==30163==    at 0xFD0D294: ??? (in /usr/lib/libcrypto.so.1.0.1c)
==30163==    by 0xFD93B93: OPENSSL_add_all_algorithms_noconf (in /usr/lib/libcrypto.so.1.0.1c)
==30163==    by 0x100005DB: main (in /home/test/testppc)
==30163== Your program just tried to execute an instruction that Valgrind
==30163== did not recognise.  There are two possible reasons for this.
==30163== 1. Your program has a bug and erroneously jumped to a non-code
==30163==    location.  If you are running Memcheck and you just saw a
==30163==    warning about a bad jump, it's probably your program's fault.
==30163== 2. The instruction is legitimate but Valgrind doesn't handle it,
==30163==    i.e. it's Valgrind's fault.  If you think this is the case or
==30163==    you are not sure, please let us know and we'll try to fix it.
==30163== Either way, Valgrind will now raise a SIGILL signal which will
==30163== probably kill your program.

The above is slightly misleading, the instruction was recognized, valgrind just decided (correctly) to throw a SIGILL in response. It is also printed when using -q to run silently, making it not very silent anymore :)
Comment 7 Mark Wielaard 2012-10-30 23:01:32 UTC
Created attachment 74889 [details]
VEX: Add and use Ijk_SigILL

> The above is slightly misleading, the instruction was recognized, valgrind just decided (correctly)
> to throw a SIGILL in response. It is also printed when using -q to run silently, making it not very
> silent anymore :)

How about making it possible to be explicit about (expected) illegal instructions like in this patch?
Comment 8 Mark Wielaard 2012-10-30 23:02:59 UTC
Created attachment 74890 [details]
coregrind: Handle VEX_TRC_JMP_SIGILL

coregrind part for VEX change
Comment 9 Mark Wielaard 2012-11-02 14:48:36 UTC
(In reply to comment #7)
> How about making it possible to be explicit about (expected) illegal
> instructions like in this patch?

After some discussion that wasn't the right way to go about this.
See https://bugs.kde.org/show_bug.cgi?id=309425 for an alternative (none-ppc-specific) solution.