Bug 309921

Summary: PCMPISTRI validity bit propagation is imprecise
Product: [Developer tools] valgrind Reporter: Patrick J. LoPresti <lopresti>
Component: memcheckAssignee: Julian Seward <jseward>
Status: RESOLVED FIXED    
Severity: normal    
Priority: NOR    
Version: 3.9.0.SVN   
Target Milestone: ---   
Platform: Compiled Sources   
OS: Linux   
Latest Commit: Version Fixed In:
Sentry Crash Report:
Attachments: Test case illustrating pcmpistri validity bit propagation failure
Refactoring PCMPISTRI 0x3A
Fix (harmless) bug in test case
[VEX] Fix PCMPISTRI 0x3A (and 0x38)
[VEX] Simplified (and corrected) patch
[VEX] Yet another version of patch
Patch to add regtest for pcmpistri

Description Patrick J. LoPresti 2012-11-11 20:37:06 UTC
The Intel C compiler uses the (hideously-complex-whatever-happened-to-RISC) PCMPISTRI instruction in various optimized string routines.

I am attaching an illustrative test case.  Hopefully, there are only a few permutations of this instruction that actually need precise validity bit tracking...

Reproducible: Always

Steps to Reproduce:
1. gcc -O3 -o test test.c
2. valgrind --partial-loads-ok=yes ./test
3. :-(
Actual Results:  
"Syscall param exit_group(status) contains uninitialised byte(s)"

Expected Results:  
No warnings, since this program always exits with status 12, because the bytes beyond the end of the (aligned) load are never used.
Comment 1 Patrick J. LoPresti 2012-11-11 20:38:10 UTC
Created attachment 75181 [details]
Test case illustrating pcmpistri validity bit propagation failure
Comment 2 Patrick J. LoPresti 2012-11-12 01:19:06 UTC
Created attachment 75187 [details]
Refactoring PCMPISTRI 0x3A

This patch factors out the PCMPISTRI 0x3A case (and 0x38, which is equivalent) and simplifies the code.  This C reference code can serve as a basis, I hope, for generating expanded IR.

Actually if you trace the code paths though, it is not too bad...  A bunch of the code is just marshalling data through the dirty helper interface; the actual computation can all be expressed as straightforward vector and scalar operations, I think.

Note that this patch does not break any regression tests, so I am pretty confident it is a valid refactoring.
Comment 3 Patrick J. LoPresti 2012-11-12 04:59:23 UTC
Created attachment 75190 [details]
Fix (harmless) bug in test case

PCMPISTRI clobbers condition codes
Comment 4 Patrick J. LoPresti 2012-11-12 05:15:30 UTC
Created attachment 75191 [details]
[VEX] Fix PCMPISTRI 0x3A (and 0x38)

I decided to go ahead and "just do it".

This patch replaces the call to a "dirty helper" with memcheck-friendly IR for PCMPISTRI 0x3A/0x38 (aka. "strlen").  It also includes the refactored C code so you can compare the IR to the original C...  Note that all of this code is dead except for the diffs against guest_amd64_toIR.c, and part of that is #ifdef'd out (the "dirty helper" version).

All regression tests still pass, plus the test case attached to this bug now works.
Comment 5 Patrick J. LoPresti 2012-11-12 23:43:21 UTC
Created attachment 75204 [details]
[VEX] Simplified (and corrected) patch

This is nearly identical to patch 75191, but without all of the dead code (which was only useful as a reference anyway).

Also it fixes a bug: Shl32(1, Ctz32(0)) is undefined.  This had the effect of causing the regression tests to pass on one run and then fail on another...  Just renaming the source code directory could make a difference.  Undefined behavior is nasty.

Anyway, this patch now passes the regression tests consistently.
Comment 6 Patrick J. LoPresti 2012-11-13 00:05:30 UTC
Created attachment 75205 [details]
[VEX] Yet another version of patch

OK, so Mux0X works but kills memcheck's validity bit tracking.

This final (?) version of the patch fixes that.  It still passes all regression tests; it fixes the test case on this bug; and, together with the patches on bug 294285, it seems to do very, very well on the optimized output from the Intel C compiler.  Finally!!
Comment 7 Patrick J. LoPresti 2012-11-19 22:05:14 UTC
Created attachment 75363 [details]
Patch to add regtest for pcmpistri

This patch adds a regression test "insn-pcmpistri" to memcheck/tests/amd64.  It fails on the current Valgrind but succeeds with patch 75205 applied.
Comment 8 Julian Seward 2012-11-20 15:25:59 UTC
Committed, r2562, r13132.  Thanks.