| Summary: | False positive on x86/amd64 with ZF taken directly from addition |
|---|---|
| Product: | [Developer tools] valgrind |
| Reporter: | Alexander Monakov <amonakov> |
| Component: | memcheck |
| Assignee: | Paul Floyd <pjfloyd> |
| Status: | RESOLVED FIXED |
| Severity: | normal |
| CC: | pjfloyd |
| Priority: | NOR |
| Version: | 3.24 GIT |
| Target Milestone: | --- |
| Platform: | Other |
| OS: | Linux |
| Latest Commit: | |
| Version Fixed In: | |
| Sentry Crash Report: | |
| Attachments: | tests all add/sub Z/NZ |
Description
Alexander Monakov
2024-08-26 13:33:19 UTC
---

Paul Floyd (comment #1):

I can reproduce similar errors on amd64.

I'm not too sure what the side effects of the inline asm are. The 4 bytes of r should be U1U1 (where U is uninitialized). --expensive-definedness-checks=yes doesn't seem to help.

---

Alexander Monakov:

(In reply to Paul Floyd from comment #1)
> I can reproduce similar errors on amd64.
>
> I'm not too sure what the side effects of the inline asm is.

It just pretends to modify the variable 'r' to make GCC emit 'test edx, edx' before the branch. Since you've reproduced the issue, and it's the difference between two variants of machine code that matters, feel free to forget the C testcase.

> The 4 bytes of r should be U1U1 (where U is uninitialized).

UUU1 actually: s[1] in f is UU, s[0] is U1.

The following patch helps. I'm quite unsure what the numbers in comments (like "8," in the context visible here) are counting.

```diff
diff --git a/VEX/priv/guest_amd64_helpers.c b/VEX/priv/guest_amd64_helpers.c
index da1cabc3c..033167037 100644
--- a/VEX/priv/guest_amd64_helpers.c
+++ b/VEX/priv/guest_amd64_helpers.c
@@ -1098,6 +1098,13 @@ IRExpr* guest_amd64_spechelper ( const HChar* function_name,
                            binop(Iop_Add64, cc_dep1, cc_dep2),
                            mkU64(0)));
       }
+      if (isU64(cc_op, AMD64G_CC_OP_ADDL) && isU64(cond, AMD64CondZ)) {
+         /* long add, then Z --> test ((int)(dst+src) == 0) */
+         return unop(Iop_1Uto64,
+                     binop(Iop_CmpEQ32,
+                           unop(Iop_64to32, binop(Iop_Add64, cc_dep1, cc_dep2)),
+                           mkU32(0)));
+      }
       /* 8, */
       if (isU64(cc_op, AMD64G_CC_OP_ADDQ) && isU64(cond, AMD64CondS)) {
```

---

Paul Floyd (comment #4):

The patch looks OK to me. I'll check whether there should be all 4 variants: addq, addl, subq and subl.

The numbers 0, 4, 8 etc. in the comments are the values of the condition code.

---

Alexander Monakov (comment #5):

(In reply to Paul Floyd from comment #4)
> The patch looks ok to me. I'll check if there should be all 4 addq addl subq
> and subl variants.

There are also CondZ and CondNZ variants.

---

Paul Floyd (comment #6):

(In reply to Alexander Monakov from comment #5)
> There are also CondZ and CondNZ variants.

It looks like these have been added as needed, not systematically to cover all cases.

---

Alexander Monakov (comment #7):

(In reply to Paul Floyd from comment #6)
> It looks like these have been added as needed, not systematically to cover
> all cases.

In GCC we cannot leave compiling-under-valgrind broken until a release of Valgrind with the fix for a false positive appears. We apply a workaround (in this case, initializing a few bytes more) and move on. As far as vectorized code in libcpp is concerned, a fix for this particular case is not needed anymore. If all false positives are fixed in this manner — covering just one case on just one architecture, when the original project already worked around it — all it may amount to is accumulation of dead weight.

---

Paul Floyd:

(In reply to Alexander Monakov from comment #7)
> In GCC we cannot leave compiling-under-valgrind broken until a release of
> Valgrind with the fix for a false positive appears. We apply a workaround
> (in this case, initializing a few bytes more) and move on.

This item is my next priority. I'd like to be sure of getting it into 3.24.

There are two things that I don't yet understand. I can't get the addb or subb versions to work. Also, I'm not sure what triggers the NZ versions: add and sub followed by jne seem to also trigger the Z version.

Created attachment 173550 [details]
tests all add/sub Z/NZ
Here's the test I've been using. I don't think there should be any errors: the lowest bit of the result of every add/sub operation is always defined and equal to 1, so the Z flag should always be defined and zero.

In Valgrind I've added all combinations of sub/add and Z/NZ.

---

Paul Floyd (comment #10):

The byte operations still generate errors, so instead of 8 errors I'm now getting 4.
---

Paul Floyd:

(In reply to Paul Floyd from comment #10)
> The byte operations still generate errors. So instead of 8 errors I'm still
> getting 4 errors.

I see now. The auto expensiveness checks only apply down to word operations, not byte ones.

---

Paul Floyd (comment #12):

Could you see if my try branch users/paulf/try-bug492210 works OK?

---

Alexander Monakov (comment #13):

(In reply to Paul Floyd from comment #12)
> Could you see if my try branch users/paulf/try-bug492210 works OK?

I briefly tried it with a standalone test for vectorized helpers from libcpp and the false positives are gone.

By the way, I noticed that the repeated comment in your patch will need corrections:

```c
+         /* long add, then Z --> test ((char)(dst+src) == 0) */
```

It's "long" only in the first instance; for ADDW/ADDB it's word/byte respectively.

---

Paul Floyd (comment #14):

(In reply to Alexander Monakov from comment #13)
> It's "long" only in the first instance, for ADDW/ADDB it's word/byte
> respectively.

I don't think that the comments are very consistent in this respect. They should probably all be long long.

```c
/* long add, then Z --> test ((char)(dst+src) == 0) */
return unop(Iop_1Uto64,
            binop(Iop_CmpEQ8,
                  unop(Iop_64to8, binop(Iop_Add64, cc_dep1, cc_dep2)),
                  mkU8(0)));
```

- binop(Iop_Add64, cc_dep1, cc_dep2) is the long long add
- unop(Iop_64to8, ...) is the char cast
- binop(Iop_CmpEQ8, ..., mkU8(0)) is the test == 0
- and finally unop(Iop_1Uto64, ...) is the conversion to the 1-bit Z flag

---

Alexander Monakov (comment #15):

(In reply to Paul Floyd from comment #14)
> I don't think that the comments are very consistent in this respect. They
> should probably all be long long.

I think you are misunderstanding what the comments refer to: the part before "-->" refers to what's tested in the "if". Have a look at other such comments in the file.
---

Paul Floyd:

(In reply to Alexander Monakov from comment #15)
> I think you are misunderstanding what the comments refer to: the part before
> "-->" refers to what's tested in the "if". Have a look at other such
> comments in the file.

Which "if"? You mean the condition that's being generated? For me that's clear enough from the Q/L/W/B suffixes. As far as I can see it's all mixed up. It should either be all amd64 terminology (quad, long or maybe doubleword, word, byte) or LP64 C (long or long long, int, short, char). Byte and word seem OK, but quad/long long and long/int are confusing.

---

Paul Floyd:

```
commit 7e78a9a990c990631adb3f7c7edfd8cda418547f (HEAD -> master, origin/master, origin/HEAD)
Author: Paul Floyd <pjfloyd@wanadoo.fr>
Date:   Sun Sep 15 09:52:56 2024 +0200

    Bug 492210 - False positive on x86/amd64 with ZF taken directly from addition

    Also adds similar checks for short and char equivalents to the original
    int reproducer.

    Initial fix provided by Alexander Monakov <amonakov@gmail.com>

    Two versions of the testcase, one with default options and one with
    --expensive-definedness-checks=yes because the byte operations subb and
    addb need the flag turned on explicitly.
```