278808 – PPC32 Special Instruction sequence clobbers R0 instead of preserving R0

Bug 278808 - PPC32 Special Instruction sequence clobbers R0 instead of preserving R0

Summary: PPC32 Special Instruction sequence clobbers R0 instead of preserving R0

Status:	RESOLVED FIXED

Alias:	None

Product:	valgrind
Classification:	Developer tools
Component:	general (show other bugs)
Version:	unspecified
Platform:	Unlisted Binaries Linux

Importance:	NOR normal
Target Milestone:	---
Assignee:	Julian Seward

URL:
Keywords:

Depends on:
Blocks:

Reported:	2011-07-29 19:00 UTC by Dave Lerner
Modified:	2014-11-13 14:30 UTC (History)
CC List:	4 users (show)

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:

Attachments
Add an attachment

Note You need to log in before you can comment on or make changes to this bug.

Description Dave Lerner 2011-07-29 19:00:00 UTC

The powerpc "Special Instruction" sequence for ppc32 used to flag special actions on by the vex translater is incorrect. The intended rotation is encoded wrong and becomes a shift that clobbers r0 instead of a rotation that preserves r0.  Instead of 
	rlwinm rA,rS,<n>,0,0
what was intened was
	rlwinm rA,rS,<n>,0,31
where the total of <n> in the four instructions adds to 64.  (See F.4.2 Operations on Words "Rotate lef imeediate" equivalent in 'The Programming Environment' reference).

This bug causes ppc32 memcheck/tests/sh-mem* tests to fail for my toolchain and target, incorrectly reporting when running valgrind memcheck/tests/sh-mem that the testapp is not RUNNING_ON_VALGRIND.

A simple patch which has been tested adds R0 to the list of clobbered registers, so while not preserving R0, at least the compiler knows it's clobbered. With the fix, RUNNING_ON_VALGRIND puts the 'true' result in the correct thread by reloading the thread id instead of assuming it was preserved across the macro in R0.

--- a/include/valgrind.h
+++ b/include/valgrind.h
@@ -406,7 +406,7 @@ typedef
                      "mr %0,3"     /*result*/                     \
                      : "=b" (_zzq_result)                         \
                      : "b" (_zzq_default), "b" (_zzq_ptr)         \
-                     : "cc", "memory", "r3", "r4");               \
+                     : "cc", "memory", "r3", "r4", "r0");         \
     _zzq_rlval = _zzq_result;                                     \
   }

@@ -419,7 +419,7 @@ typedef
                      "mr %0,3"                                    \
                      : "=b" (__addr)                              \
                      :                                            \
-                     : "cc", "memory", "r3"                       \
+                     : "cc", "memory", "r3", "r0"                 \
                     );                                            \
     _zzq_orig->nraddr = __addr;                                   \
   }

A cleaner patch would be to change the 4 assembler rlwinm instructions from a mask of 0 to a mask of 31, actually preserving R0, but this would require research looking for code that tests hex equivalents of the current opcodes.
(Found in version reported as 3.6.1 not in the version drop-down on the bug report form)

Comment 1 Florian Krohm 2011-10-21 02:51:55 UTC

Maynard, can you take a look? Would be good to get this in for 3.7.0

Dave, do you remember what toolchain produced the failure?

Comment 2 Julian Seward 2011-10-21 06:56:05 UTC

(In reply to comment #1)
> Maynard, can you take a look? Would be good to get this in for 3.7.0

It would be nice if it's possible.  However, I'd like for it to be fixed
by changing the rlwinms so they really are rotates, so that the combined
sequence leaves r0 unchanged, rather than by telling gcc that the sequence
clobbers r0.  A check that the 64-bit version isn't similarly broken would
be good.

Dave, are you 110% sure about your diagnosis?  I'm sure I checked this
pretty damn carefully, and I'm surprised this problem hasn't cropped up
before now.

Comment 3 Maynard Johnson 2011-10-21 17:47:43 UTC

(In reply to comment #1)
> Maynard, can you take a look? Would be good to get this in for 3.7.0
I don't have a 32-bit ppc platform to test on (my work is wholly with ppc64 servers), but I manually built the memcheck/tests/sh-mem testcase as 32-bit and ran it under valgrind on a SLES 11 SP1 system.  Works fine there.
> 
> Dave, do you remember what toolchain produced the failure?

Comment 4 Allison Karlitskaya 2013-11-20 22:04:11 UTC

(In reply to comment #2)
> Dave, are you 110% sure about your diagnosis?  I'm sure I checked this
> pretty damn carefully, and I'm surprised this problem hasn't cropped up
> before now.

We've hit this bug in GNOME twice now:

  https://bugzilla.gnome.org/show_bug.cgi?id=702648
  https://bugzilla.gnome.org/show_bug.cgi?id=710983

I independently came to the conclusion that it was because r0 was getting clobbered without being declared as such and tested a fix.  I then found this bug based on that analysis.  That should count for at least some confirmation.

The bug cropped up for us because we started using valgrind.h from GLib.  We're pretty widely-used, on many platforms, and with a good testsuite.  I think maybe this exact combination of circumstances just never came together before...

Comment 5 Julian Seward 2014-02-10 12:32:53 UTC

Fixed, valgrind rev 13797, vex rev 2816.  I just did the obvious thing
and changed the sequence to rlwinm rA,rS,<n>,0,31 as you pointed out,
and changed VEX to recognise that instead.  I also checked that the
64-bit version is actually correct, which it is.

One bad side effect of this now is that apps compiled against the old
version of valgrind.h (even with your clobbers-r0 annotation) will now
segfault when running on the fixed version of Valgrind, since the
client requests compiled into the app will no longer be recognized and
hence honoured by V.  I'm not sure what can be done about that, since
what this is is a bug in a public interface (effectively).

Comment 6 Mark Wielaard 2014-09-13 12:32:44 UTC

I am pondering to add the following code to help detect old code that might be using the old magic instructions on ppc32:

diff --git a/priv/guest_ppc_toIR.c b/priv/guest_ppc_toIR.c
index adabf64..8f7e4aa 100644
--- a/priv/guest_ppc_toIR.c
+++ b/priv/guest_ppc_toIR.c
@@ -18783,10 +18783,26 @@ DisResult disInstr_PPC_WRK (
       UInt word2 = mode64 ? 0x78006800 : 0x5400683E;
       UInt word3 = mode64 ? 0x7800E802 : 0x5400E83E;
       UInt word4 = mode64 ? 0x78009802 : 0x5400983E;
+      Bool is_special_preamble = False;
       if (getUIntPPCendianly(code+ 0) == word1 &&
           getUIntPPCendianly(code+ 4) == word2 &&
           getUIntPPCendianly(code+ 8) == word3 &&
           getUIntPPCendianly(code+12) == word4) {
+         is_special_preamble = True;
+      } else if (! mode64 &&
+                 getUIntPPCendianly(code+ 0) == 0x54001800 &&
+                 getUIntPPCendianly(code+ 4) == 0x54006800 &&
+                 getUIntPPCendianly(code+ 8) == 0x5400E800 &&
+                 getUIntPPCendianly(code+12) == 0x54009800) {
+         static Bool reported = False;
+         if (!reported) {
+            vex_printf("disInstr(ppc): old ppc32 instruction magic detected. Code might clobber r0.\n");
+            vex_printf("disInstr(ppc): source needs to be recompiled against latest valgrind.h.\n");
+            reported = True;
+         }
+         is_special_preamble = True;
+      }
+      if (is_special_preamble) {
          /* Got a "Special" instruction preamble.  Which one is it? */
          if (getUIntPPCendianly(code+16) == 0x7C210B78 /* or 1,1,1 */) {
             /* %R3 = client_request ( %R4 ) */

Unfortunalty there are old binaries out there that might be build against an old valgrind.h. This made worse by various projects embedding a private copy of the valgrind.h source. See e.g. http://codesearch.debian.net/search?q=This+file+is+part+of+Valgrind&path=valgrind.h

Comment 7 Mark Wielaard 2014-11-13 14:30:10 UTC

After discussion on irc I pushed the warning/detection of old ppc32 magic instruction sequences from comment #6 as VEX svn r2990.