Bug 276897 - ARM v6 legacy patches
Summary: ARM v6 legacy patches
Status: REPORTED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general
Version: 3.7 SVN
Platform: Unlisted Binaries Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-07-01 15:57 UTC by JW
Modified: 2013-09-27 12:30 UTC


Attachments
patch to apply on r11823 (5.76 KB, patch)
2011-07-01 15:57 UTC, JW
patch speeds dispatch, works on all ARM CPU (8.50 KB, patch)
2011-09-29 19:41 UTC, John Reiser
patch for armv5 (14.27 KB, patch)
2011-10-03 04:16 UTC, John Reiser
GENOFFSET(ARM,arm,TPIDRURO) in VEX/auxprogs/genoffsets.c (364 bytes, patch)
2011-10-03 05:19 UTC, John Reiser
fix dispatch for thumb mode (10.45 KB, patch)
2011-10-08 19:51 UTC, John Reiser
patch improves TT_FAST cache hits in ARM mode (89.02 KB, patch)
2011-10-09 17:06 UTC, John Reiser
patch improves TT_FAST cache hits in ARM mode (11.48 KB, patch)
2011-10-09 17:09 UTC, John Reiser
patch inner loops for speed and ARM cache hits (no support for v5/v6) (9.89 KB, patch)
2011-10-27 17:23 UTC, John Reiser
patch inner loops for speed and ARM cache hits (no support for v5/v6) (9.86 KB, patch)
2011-10-27 18:00 UTC, John Reiser

Description JW 2011-07-01 15:57:05 UTC
Created attachment 61533 [details]
patch to apply on r11823

With minor modifications valgrind runs on ARMv6.

Attached is a patch which will:

- choose at configuration time either armv6 or cortex build
  . arm7 toolchain will default to cortex as before
  . arm (generic) toolchain will configure for armv6

- replace armv7-only instructions with slower armv6 variants in the scheduler assembly files

- pay attention to the 16k alignment requirements for MAP_FIXED mmap calls on armv6 platforms
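
To illustrate the third point: a MAP_FIXED request has to name an address the kernel will accept, and the description above puts that granularity at 16k on armv6. The following is only a minimal sketch in C of the rounding involved, not code taken from the attached patch. The names ARMV6_FIXED_MAP_ALIGN, align_up_16k and is_aligned_16k are hypothetical, and the 16 KiB value is simply the figure quoted above.

```c
#include <stdint.h>

/* Assumption: 16 KiB is the required alignment, as stated in the report. */
#define ARMV6_FIXED_MAP_ALIGN ((uintptr_t)16 * 1024)

/* Round addr up to the next multiple of ARMV6_FIXED_MAP_ALIGN.
   An address that is already aligned is returned unchanged. */
static uintptr_t align_up_16k(uintptr_t addr)
{
    return (addr + ARMV6_FIXED_MAP_ALIGN - 1) & ~(ARMV6_FIXED_MAP_ALIGN - 1);
}

/* Return nonzero if addr could be used as-is for a fixed mapping
   under the assumed 16 KiB rule. */
static int is_aligned_16k(uintptr_t addr)
{
    return (addr & (ARMV6_FIXED_MAP_ALIGN - 1)) == 0;
}
```

A caller would round a candidate address with align_up_16k before passing it to mmap with MAP_FIXED. The sketch does not guard against overflow for addresses within 16 KiB of the top of the address space.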

I am not an expert in anything regarding valgrind or autotools, so re-write as you like...
Comment 1 John Reiser 2011-09-29 19:41:57 UTC
Created attachment 64077 [details]
patch speeds dispatch, works on all ARM CPU

This patch doubles the speed of instruction dispatching, and works on all ARM CPUs.
Comment 2 John Reiser 2011-10-03 04:16:44 UTC
Created attachment 64147 [details]
patch for armv5

I have gotten valgrind(memcheck) to run on armv5te.
There is lots of testing ahead.  So far what works is
"valgrind /bin/date".
Work was done natively on a sheevaplug (1200 BogoMIPS, 512MB RAM)
running Debian testing (wheezy).

Starting from JW's patch, I recoded the dispatch inner loop for speed,
avoided instructions not present on armv5te (movt, movw, ldd, floating point), and found a way to handle Thread Local Storage (tls).
I believe that the code might run on armv4 if the few 'bx' instructions
were re-written as "mov pc,rx", but I have no way to test that these
are the only changes needed.
I also helped find and fix misaligned fetches in readelf.c and readdwarf.c
(already committed to SVN by Tom Hughes).


My environment:
----- /proc/cpuinfo
Processor       : Feroceon 88FR131 rev 1 (v5l)
BogoMIPS        : 1192.75
Features        : swp half thumb fastmult edsp
CPU implementer : 0x56  
CPU architecture: 5TE
CPU variant     : 0x2
CPU part        : 0x131
CPU revision    : 1

Hardware        : Marvell SheevaPlug Reference Board
-----
512MB RAM
-----
Debian testing (wheezy)
Linux sheevaplug 2.6.32-5-kirkwood #1 Wed Jan 12 15:27:07 UTC 2011 armv5tel GNU/Linux
gcc (Debian 4.6.1-4) 4.6.1
libc6  2.13-21
-----
Comment 3 John Reiser 2011-10-03 05:19:28 UTC
Created attachment 64149 [details]
GENOFFSET(ARM,arm,TPIDRURO) in VEX/auxprogs/genoffsets.c

Need an OFFSET for TPIDRURO.
Comment 4 John Reiser 2011-10-08 19:51:21 UTC
Created attachment 64345 [details]
fix dispatch for thumb mode

"valgrind --tool=memcheck ./hello" now works in both ARM mode and Thumb mode (gcc -mthumb).
Comment 5 John Reiser 2011-10-09 17:06:13 UTC
Created attachment 64365 [details]
patch improves TT_FAST cache hits in ARM mode

Use all cache slots in ARM mode, instead of just the even-numbered ones.  By using ARM conditional execution and dual issue, this takes no more cycles in the dispatcher loop.
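
Why only the even-numbered slots get used can be seen with a small sketch. It ASSUMES the fast-cache index is formed by shifting the guest address right by one bit, which suits 2-byte-aligned Thumb code; the thread does not spell out the actual hash, so treat this as an illustration rather than the patch itself. FAST_BITS, slot_shift1, slot_shift2 and count_odd_slots are hypothetical names.

```c
#include <stdint.h>

#define FAST_BITS 15u                        /* hypothetical table size */
#define FAST_MASK ((1u << FAST_BITS) - 1u)

/* Assumed baseline hash: drop only bit 0 of the guest address. */
static uint32_t slot_shift1(uint32_t guest_addr)
{
    return (guest_addr >> 1) & FAST_MASK;
}

/* Illustrative ARM-mode hash: drop both alignment bits, so that
   consecutive 4-byte-aligned addresses map to consecutive slots. */
static uint32_t slot_shift2(uint32_t guest_addr)
{
    return (guest_addr >> 2) & FAST_MASK;
}

/* Count how many of n consecutive 4-byte-aligned addresses,
   starting at base, land in an odd-numbered slot. */
static unsigned count_odd_slots(uint32_t (*hash)(uint32_t),
                                uint32_t base, unsigned n)
{
    unsigned odd = 0;
    for (unsigned i = 0; i < n; i++)
        odd += hash(base + 4u * i) & 1u;
    return odd;
}
```

With the one-bit shift, ARM-mode addresses always have bit 1 clear, so the low bit of the index is always zero and half the table is never touched. With the two-bit shift, half of the addresses in any consecutive run land in odd slots.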
Comment 6 John Reiser 2011-10-09 17:09:27 UTC
Created attachment 64366 [details]
patch improves TT_FAST cache hits in ARM mode

The right subset patch this time.  (The previous one was everything I'm working on, this one is TT_FAST cache only.)
Comment 7 John Reiser 2011-10-27 17:23:55 UTC
Created attachment 64945 [details]
patch inner loops for speed and ARM cache hits (no support for v5/v6)

armv7 and above still can benefit from faster dispatching and twice as many usable cache slots in ARM mode.  This patch contains only those changes (and thus will not run on armv5 or armv6.)  [Compiles, but UNTESTED because I have no armv7.]
Comment 8 John Reiser 2011-10-27 18:00:13 UTC
Created attachment 64946 [details]
patch inner loops for speed and ARM cache hits (no support for v5/v6)

Fix bug due to patch editing glitch.
Comment 9 Tomalak Geret'kal 2012-04-29 19:01:17 UTC
Could someone confirm the order in which these patches should be applied? Common sense tells me that they should be applied in order, omitting those marked as obsolete (rendered with strikethrough), but on this thread [1] you indicate that #6 requires #2 to be reverted. However, #6 has no ./configure changes.

[1]: http://old.nabble.com/valgrind-support-for-ARMv5TE-with-MMU-td32332813.html
Comment 10 Jint George 2012-07-12 14:27:50 UTC
Hi, 
I am trying to run valgrind on an ARM v6 architecture. I applied the patch "attachment 61533 [details]" and then built valgrind. When I try to execute valgrind on the target (an ARM v6 platform running Linux), the following error is shown: "Illegal Instruction set".

Has anyone had the same issue, or do I need to take a different approach? Please advise.
Comment 12 Arturs Galapovs 2013-01-24 16:17:00 UTC
I am also getting a SIGILL when running valgrind.

--913-- Valgrind options:
--913--    -v
--913--    --vgdb=no
--913--    --db-command=/usr/bin/gdb
--913--    --db-attach=yes
--913--    --tool=memcheck
--913-- Contents of /proc/version:
--913--   Linux version 3.0.17 (gcc version 4.5.3 (crosstool-NG 1.15.2 - apollo_linux 0.1.0) ) #1 PREEMPT Tue May 15 08:00:00 MST 2012
--913-- Arch and hwcaps: ARM, ARMv6
--913-- Page sizes: currently 4096, max supported 4096
--913-- Valgrind library directory: /usr/lib/valgrind
--913-- Reading syms from /usr/bin/Tests (0x8000)
--913-- Reading syms from /lib/ld-2.13.so (0x4000000)
--913-- Reading syms from /usr/lib/valgrind/memcheck-arm-linux (0x38000000)
--913--    object doesn't have a dynamic symbol table
--913-- Reading suppressions file: /usr/lib/valgrind/default.supp
==913==
==913== Process terminating with default action of signal 4 (SIGILL)
==913==  Illegal opcode at address 0x380D03F4
==913==    at 0x4000780: ??? (in /lib/ld-2.13.so)
Comment 13 Arturs Galapovs 2013-01-25 11:46:50 UTC
(In reply to comment #12)

I tried passing a more specific ARM architecture in valgrind's configure.in (-march=armv6j -mcpu=arm1136jf-s). The result is the same: SIGILL.
Comment 14 Arturs Galapovs 2013-02-25 19:42:40 UTC
I managed to run valgrind with an additional modification. The problem was the absence of a floating-point co-processor in our ARMv6 processor. If someone is interested in the hack I applied to make it work, the link is below:
https://docs.google.com/file/d/0B1bM8kFpEvB-b1hIZFlHb0dHNEU/edit

I am not attaching this patch directly because it is the quickest hack I could think of.
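
One way to check up front whether a target lacks floating-point hardware is to look at the Features line of /proc/cpuinfo, like the one quoted in comment 2, which lists no vfp entry. A rough sketch in C follows. It is not part of the linked hack; features_has is a hypothetical helper name, and it assumes the kernel reports VFP support with the word "vfp" on that line.

```c
#include <string.h>

/* Return 1 if token appears as a whole whitespace-delimited word in
   the value part of a "Features : ..." line, otherwise 0.
   Partial matches (e.g. "thum" against "thumb") do not count. */
static int features_has(const char *line, const char *token)
{
    const char *p = strchr(line, ':');
    size_t tlen = strlen(token);

    if (p == NULL || tlen == 0)
        return 0;
    p++;                                      /* step past the colon */
    while (*p != '\0') {
        p += strspn(p, " \t\n");              /* skip separators */
        size_t wlen = strcspn(p, " \t\n");    /* length of this word */
        if (wlen == tlen && strncmp(p, token, tlen) == 0)
            return 1;
        p += wlen;
    }
    return 0;
}
```

Applied to the line "Features : swp half thumb fastmult edsp" from comment 2, features_has(line, "vfp") returns 0, which would be consistent with a CPU that has no floating-point unit.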
Comment 15 Peter 2013-09-27 12:30:22 UTC
(In reply to comment #14)

Hi,
When I applied the patch, the SIGILL problem disappeared, but now there is another problem.
When I try to run any program under Valgrind, for instance ls or echo, Valgrind reports that it caused a segmentation fault.
Example:

root@mygw:~/valgrind# ./coregrind/valgrind ls
==3588== Memcheck, a memory error detector
==3588== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==3588== Using Valgrind-3.7.0.SVN and LibVEX; rerun with -h for copyright info
==3588== Command: ls
==3588== 
==3588== Conditional jump or move depends on uninitialised value(s)
==3588==    at 0x400AD2C: ??? (in /lib/ld-2.7.so)
==3588== 
==3588== Conditional jump or move depends on uninitialised value(s)
==3588==    at 0x400AE7C: ??? (in /lib/ld-2.7.so)
==3588== 
==3588== Conditional jump or move depends on uninitialised value(s)
==3588==    at 0x400B028: ??? (in /lib/ld-2.7.so)
==3588== 
==3588== Invalid read of size 4
==3588==    at 0x400DB6C: ??? (in /lib/ld-2.7.so)
==3588==  Address 0x4001cfa0 is not stack'd, malloc'd or (recently) free'd
==3588== 
==3588== 
==3588== Process terminating with default action of signal 11 (SIGSEGV)
==3588==  Access not within mapped region at address 0x4001CFA0
==3588==    at 0x400DB6C: ??? (in /lib/ld-2.7.so)
==3588==  If you believe this happened as a result of a stack
==3588==  overflow in your program's main thread (unlikely but
==3588==  possible), you can try to increase the size of the
==3588==  main thread stack using the --main-stacksize= flag.
==3588==  The main thread stack size used in this run was 8388608.
==3588== 
==3588== Process terminating with default action of signal 11 (SIGSEGV)
==3588==  Access not within mapped region at address 0x4001CFA0
==3588==    at 0x400DB6C: ??? (in /lib/ld-2.7.so)
==3588==  If you believe this happened as a result of a stack
==3588==  overflow in your program's main thread (unlikely but
==3588==  possible), you can try to increase the size of the
==3588==  main thread stack using the --main-stacksize= flag.
==3588==  The main thread stack size used in this run was 8388608.
==3588== 
==3588== HEAP SUMMARY:
==3588==     in use at exit: 0 bytes in 0 blocks
==3588==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==3588== 
==3588== All heap blocks were freed -- no leaks are possible
==3588== 
==3588== For counts of detected and suppressed errors, rerun with: -v
==3588== Use --track-origins=yes to see where uninitialised values come from
==3588== ERROR SUMMARY: 5 errors from 4 contexts (suppressed: 0 from 0)
Segmentation fault
root@mygw:~/valgrind#