Bug 295403

Summary: Memory access below SP with some STRD instructions.
Product: [Developer tools] valgrind
Component: vex
Reporter: Jacob Bramley <Jacob.Bramley+kde>
Assignee: Julian Seward <jseward>
CC: matt.cowell
Status: REPORTED ---
Severity: normal
Priority: NOR
Version: unspecified
Target Milestone: ---
Platform: unspecified
OS: Linux
Latest Commit:
Version Fixed In:
Attachments:
 * Minimal test case. Build with: -nostartfiles -nodefaultlibs -static
 * Increase allowed offsets for ARM early writeback of SP base register in strd

Description Jacob Bramley 2012-03-06 09:20:16 UTC
Created attachment 69321 [details]
Minimal test case. Build with: -nostartfiles -nodefaultlibs -static

The following instruction (for example) will generate a warning from memcheck:

strd	r0, r1, [sp, #-8]!

VEX produces IR in the following order:
 * Calculate address.
 * Write memory.
 * Update base register (sp).

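Because the store is performed while sp still holds its old value, memcheck sees a write below the stack pointer. The same thing happens with register-indexed stores, but not with STR, STM or VSTM. I see exactly the same behaviour in both ARM and Thumb (except that Thumb doesn't support the register-indexed form).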
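
To illustrate why the ordering matters, here is a minimal sketch in C. It is a toy model, not VEX or memcheck code: the function names and the simple "address lower than sp" test are assumptions made for illustration only. It counts how many of the two 4-byte writes of a pre-decrement STRD would be seen as below the stack pointer under each ordering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy check (assumption): a write counts as "below SP" if its address
 * is lower than the current stack pointer value. */
static bool write_is_below_sp(uint32_t sp, uint32_t addr)
{
    return addr < sp;
}

/* Order described in the report: calculate address, write memory,
 * then update the base register. Returns the number of flagged writes. */
static int strd_store_then_writeback(uint32_t *sp, int32_t offset)
{
    assert(sp != NULL);
    uint32_t addr = *sp + (uint32_t)offset;       /* calculate address   */
    int flagged = 0;
    flagged += write_is_below_sp(*sp, addr);      /* first 4-byte write  */
    flagged += write_is_below_sp(*sp, addr + 4);  /* second 4-byte write */
    *sp = addr;                                   /* update sp last      */
    return flagged;
}

/* Alternative order: update the base register before writing memory. */
static int strd_writeback_then_store(uint32_t *sp, int32_t offset)
{
    assert(sp != NULL);
    uint32_t addr = *sp + (uint32_t)offset;       /* calculate address   */
    *sp = addr;                                   /* update sp first     */
    int flagged = 0;
    flagged += write_is_below_sp(*sp, addr);      /* first 4-byte write  */
    flagged += write_is_below_sp(*sp, addr + 4);  /* second 4-byte write */
    return flagged;
}
```

In this model, an offset of -8 makes the first ordering flag both 4-byte writes, while the second ordering flags neither, because sp has already been lowered by the time the stores happen.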

Memcheck produces something like this:
==20445== Invalid write of size 4
==20445==    at 0x80A8: ??? (in /work/misc/check-sp-update/sp-update)
==20445==  Address 0xbd82e6ec is just below the stack ptr.  To suppress, use: --workaround-gcc296-bugs=yes

Comment 1 Matt Cowell 2016-10-21 21:33:19 UTC
Created attachment 101695 [details]
Increase allowed offsets for ARM early writeback of SP base register in strd

GCC 5.4 (and likely all versions from 4.8 onwards) has a larger (unlimited?) range of offsets when using strd to allocate the stack frame, at least when compiling with -mcpu=cortex-a15.  ld.so and libc compiled with GCC 5.4 have offsets up to #-40, for example: "strd    r3, r4, [sp, #-40]!".

Without this fix, hundreds of "Invalid write of size 4" ... "below stack pointer" errors are logged, starting in ld-*.so, which of course leads to millions of "uninitialised value" errors being logged, and valgrind becomes useless on ARMv7 / Cortex A15.

This patch simply removes the check for a -8 or -16 byte offset, since any offset should be allowable when allocating a stack frame.
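
As a rough sketch of what such a change amounts to, the C below contrasts a predicate that only allows early writeback for offsets of -8 or -16 with one that allows any offset. This is not the actual VEX source or the attached patch: the StrdInfo struct, its field names and both function names are hypothetical, and any conditions beyond "store, pre-indexed with writeback, sp as base, negative offset" (for example register overlap checks) are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a decoded STRD; all field names are illustrative. */
typedef struct {
    bool     is_store;               /* STRD rather than LDRD               */
    bool     pre_indexed_writeback;  /* the "[rn, #imm]!" addressing form   */
    bool     subtract;               /* offset is subtracted from the base  */
    unsigned base_reg;               /* 13 means sp                         */
    unsigned imm;                    /* offset magnitude in bytes           */
} StrdInfo;

/* Before (assumed shape): early writeback only for offsets of -8 or -16. */
static bool early_writeback_before(const StrdInfo *i)
{
    assert(i != NULL);
    return i->is_store && i->pre_indexed_writeback
        && i->base_reg == 13 && i->subtract
        && (i->imm == 8 || i->imm == 16);
}

/* After (assumed shape): the offset-size check is dropped. */
static bool early_writeback_after(const StrdInfo *i)
{
    assert(i != NULL);
    return i->is_store && i->pre_indexed_writeback
        && i->base_reg == 13 && i->subtract;
}
```

Under these assumptions, "strd r3, r4, [sp, #-40]!" is rejected by the first predicate and accepted by the second, while stores with a base register other than sp are unaffected.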