Bug 228060

Summary: ARMv5 CP15 operations are not supported.
Product: [Developer tools] valgrind
Reporter: Jacob Bramley <Jacob.Bramley+kde>
Component: vex
Assignee: Julian Seward <jseward>
Status: REPORTED
Severity: crash
CC: cpigat242, glider, konstantin.s.serebryany
Priority: NOR
Version: unspecified
Target Milestone: ---
Platform: unspecified
OS: Linux
Attachments:
  Minimal test program to exercise ARMv5 CP15 operations.
  A patch that adds the memory fence support to Valgrind on ARM

Description Jacob Bramley 2010-02-22 15:44:25 UTC
Created attachment 41011
Minimal test program to exercise ARMv5 CP15 operations.

Valgrind supports ARMv5 binaries, but unfortunately does not properly interpret the few CP15 operations available to user-space binaries in ARMv5. (Lack of support for the privileged-mode CP15 operations is not important.)

The complaint is as follows:

disInstr(arm): unhandled instruction: 0xEE070F95
                 cond=14(0xE) 27:20=224(0xE0) 4:4=1 3:0=5(0x5)

There are two such instructions:

    @ ARMv5 interpretation:     Flush prefetch buffer.
    @ ARMv6+ interpretation:    Instruction synchronization barrier.
    mcr     15, 0, r0, c7, c5, 4        @ 0xEE070F95

    @ ARMv5 interpretation:     Drain write buffer.
    @ ARMv6+ interpretation:    Data synchronization barrier.
    mcr     15, 0, r0, c7, c10, 4       @ 0xEE070F9A
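
For reference, the two encodings above can be taken apart by hand. The following is a minimal sketch of an MCR/MRC field decoder, assuming the standard A32 layout (cond, opc1, CRn, Rt, coproc, opc2, CRm). The names `McrFields` and `decode_mcr` are illustrative only and are not taken from Valgrind.

```c
#include <stdint.h>

/* Fields of an A32 MCR/MRC coprocessor register transfer.
   Type and function names are hypothetical, not Valgrind's. */
typedef struct {
    unsigned cond;    /* bits 31:28, condition code          */
    unsigned opc1;    /* bits 23:21                          */
    unsigned is_mrc;  /* bit 20: 0 = MCR, 1 = MRC            */
    unsigned crn;     /* bits 19:16                          */
    unsigned rt;      /* bits 15:12, ARM core register       */
    unsigned coproc;  /* bits 11:8, coprocessor number       */
    unsigned opc2;    /* bits 7:5                            */
    unsigned crm;     /* bits 3:0                            */
} McrFields;

/* Returns 1 and fills *out if insn has the MCR/MRC bit pattern
   (bits 27:24 == 1110 and bit 4 == 1); returns 0 otherwise. */
static int decode_mcr(uint32_t insn, McrFields *out)
{
    if (((insn >> 24) & 0xFu) != 0xEu || ((insn >> 4) & 1u) != 1u)
        return 0;

    out->cond   = (insn >> 28) & 0xFu;
    out->opc1   = (insn >> 21) & 0x7u;
    out->is_mrc = (insn >> 20) & 0x1u;
    out->crn    = (insn >> 16) & 0xFu;
    out->rt     = (insn >> 12) & 0xFu;
    out->coproc = (insn >>  8) & 0xFu;
    out->opc2   = (insn >>  5) & 0x7u;
    out->crm    =  insn        & 0xFu;
    return 1;
}
```

Fed 0xEE070F95, this yields coproc 15, CRn 7, CRm 5, opc2 4 and Rt 0, matching "mcr 15, 0, r0, c7, c5, 4". 0xEE070F9A differs only in CRm, which is 10.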

----

ARMv6 and ARMv7 have numerous similar operations, but support for those isn't required by Valgrind at this stage as they're above the ARMv5 baseline.

The attached example program should run fine in Valgrind on a supported ARM platform. (There are a few commented-out lines which are ARMv6 and ARMv7 instructions, for reference.)

Thanks,
Jacob
Comment 1 Alexander Potapenko 2010-02-26 13:48:18 UTC
I can confirm this problem on an NVidia Tegra board.
Every program I run crashes with the "unhandled instruction" error:

 $ valgrind --tool=none false          
==2310== Nulgrind, the minimal Valgrind tool
==2310== Copyright (C) 2002-2009, and GNU GPL'd, by Nicholas Nethercote.
==2310== Using Valgrind-3.6.0.SVN and LibVEX; rerun with -h for copyright info
==2310== Command: false
==2310== 
disInstr(arm): unhandled instruction: 0xEE070FBA
                 cond=14(0xE) 27:20=224(0xE0) 4:4=1 3:0=10(0xA)
==2310== valgrind: Unrecognised instruction at address 0xffff0fa0.
...

Google tells me 0xFFFF0FA0 is the entry point of the kernel's __kernel_dmb user helper, which performs the memory barrier for userspace applications. So the EE070FBA instruction probably belongs to the set Jacob has posted.

The jump to 0xFFFF0FA0 is probably done from /lib/ld-2.9.so
I haven't seen this problem with /lib/ld-2.10.1.so on another board.

$ uname -a
Linux chrome-arm4 2.6.29 #1 SMP Thu Feb 11 13:35:07 PST 2010 armv7l GNU/Linux
Comment 2 Jacob Bramley 2010-02-26 14:04:57 UTC
"0xEE070FBA" is this:
mcr	15, 0, r0, cr7, cr10, {5}

That's actually an ARMv6 "Data Memory Barrier" operation, and Valgrind doesn't claim to support ARMv6 (yet). I've seen cases where system and compiler libraries use ARMv6 instructions, causing programs to fail even when they are compiled for ARMv5.
Comment 3 Alexander Potapenko 2010-03-31 11:32:41 UTC
Just for the record:
the data memory barrier instruction in the Linux kernel is used at the __kuser_cmpxchg and __kernel_dmb entry points iff the CONFIG_SMP option is set (i.e. it does not depend on the ld.so version or the kernel version).

A possible workaround is to disable this option or to comment out the "mcr     p15, 0, r0, c7, c10, 5  @ dmb" lines in arch/arm/kernel/entry-armv.S.
However, a better solution is to patch Valgrind to support DMB.
Comment 4 Alexander Potapenko 2010-03-31 11:42:03 UTC
Created attachment 42399
A patch that adds the memory fence support to Valgrind on ARM
Comment 5 Alexander Potapenko 2010-03-31 11:43:08 UTC
Comment on attachment 42399
A patch that adds the memory fence support to Valgrind on ARM 

Attached is the patch for the DMB instruction on ARM.
"mcr p15, 0, r0, c7, c10, 5" is translated into the IRStmt_MBE(Imbe_Fence) IR statement, which in turn emits the same assembly code on the host ARM system.

I don't know whether it's totally correct, but Valgrind does not complain on the Tegra's SMP-enabled kernel anymore.
Comment 6 Julian Seward 2010-05-04 10:50:00 UTC
An extended version of the patch was committed as vex r1979.
It handles DMB, DSB, ISB and their CP15 equivalents.
Comment 7 Alexander Potapenko 2011-06-08 10:51:22 UTC
We've occasionally run into another variant of the CP15 barrier:

disInstr(arm): unhandled instruction: 0xEE073FBA
                 cond=14(0xE) 27:20=224(0xE0) 4:4=1 3:0=10(0xA)
...
==21980== Process terminating with default action of signal 4 (SIGILL)
==21980==  Illegal opcode at address 0x57FCD90
==21980==    at 0x57FCD90: NvOsAtomicExchangeAdd32 (in /usr/lib/libnvos.so)

EE073FBA is "mcr p15, 0, r3, c7, c10, 5", so it looks like this can be used as a memory barrier as well.
It would be nice to handle all possible rN values.
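
One way to accept every rN at once is to clear the Rt field (bits 15:12) before comparing against the known encodings. The following is a rough sketch of that idea, not the code that went into VEX. `BarrierKind` and `classify_cp15_barrier` are hypothetical names, and the sketch assumes the value in Rt is irrelevant to the barrier's effect.

```c
#include <stdint.h>

/* Hypothetical classification of the CP15 barrier operations
   discussed in this report. */
typedef enum {
    BARRIER_NONE,
    BARRIER_ISB,   /* mcr p15, 0, rN, c7, c5, 4  */
    BARRIER_DSB,   /* mcr p15, 0, rN, c7, c10, 4 */
    BARRIER_DMB    /* mcr p15, 0, rN, c7, c10, 5 */
} BarrierKind;

static BarrierKind classify_cp15_barrier(uint32_t insn)
{
    /* Mask out Rt (bits 15:12) so that r0, r3, r4, r6, ... all map
       to the same pattern. Only cond == 0xE (always) is matched. */
    switch (insn & 0xFFFF0FFFu) {
    case 0xEE070F95u: return BARRIER_ISB;
    case 0xEE070F9Au: return BARRIER_DSB;
    case 0xEE070FBAu: return BARRIER_DMB;
    default:          return BARRIER_NONE;
    }
}
```

With this, 0xEE070FBA and 0xEE073FBA both classify as a DMB. Whether Rt = r15 should also be accepted is left open here; a real decoder may want to reject it.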
Comment 8 Alexander Potapenko 2011-06-09 09:56:51 UTC
More barrier examples found by Ami Fischman:
==========================================================
--- guest_arm_toIR.c.DIST       2011-06-08 10:02:07.610142793 -0700
+++ guest_arm_toIR.c    2011-06-08 10:18:13.900458024 -0700
@@ -13930,6 +13930,21 @@
          stmt( IRStmt_MBE(Imbe_Fence) );
          DIP("mcr 15, 0, r0, c7, c10, 5 (data memory barrier)\n");
          goto decode_success;
+     case 0xEE073FBA: /* v7 */
+         /* mcr p15, 0, r3, c7, c10, 5 */
+         stmt( IRStmt_MBE(Imbe_Fence) );
+         DIP("mcr p15, 0, r3, c7, c10, 5\n");
+         goto decode_success;
+     case 0xEE074FBA: /* v7 */
+         /* mcr p15, 0, r4, c7, c10, 5 */
+         stmt( IRStmt_MBE(Imbe_Fence) );
+         DIP("mcr p15, 0, r4, c7, c10, 5\n");
+         goto decode_success;
+     case 0xEE076FBA: /* v7 */
+         /* mcr p15, 0, r6, c7, c10, 5 */
+         stmt( IRStmt_MBE(Imbe_Fence) );
+         DIP("mcr p15, 0, r6, c7, c10, 5\n");
+         goto decode_success;
       case 0xEE070F95: /* v6 */
          /* mcr 15, 0, r0, c7, c5, 4 (v6) equiv to ISB (v7).
             Instruction Synchronisation Barrier (or Flush Prefetch
==========================================================

I wonder if it's correct to handle these in such a way.