$ valgrind netstat

On s390x that will generate:

==44073== Conditional jump or move depends on uninitialised value(s)
==44073==    at 0x4935F12B9C: tsearch (in /lib64/libc-2.12.so)

No such warnings are given on other arches.

This is the assembly for the test that triggers the warning:

174
175       /* If the parent of this node is also red, we have to do
176          rotations.  */
177       if (parentp != NULL && (*parentp)->red)
   0x0000004935f12a64 <+180>:   ltgr   %r8,%r8
   0x0000004935f12a68 <+184>:   je     0x4935f12af4 <__tsearch+324>
   0x0000004935f12a6c <+188>:   lg     %r2,0(%r8)
   0x0000004935f12a72 <+194>:   ltg    %r3,24(%r2)
   0x0000004935f12a78 <+200>:   jhe    0x4935f12af4 <__tsearch+324>
   0x0000004935f12b90 <+480>:   lg     %r3,0(%r10)
   0x0000004935f12b96 <+486>:   ltg    %r4,24(%r3)
=> 0x0000004935f12b9c <+492>:   jhe    0x4935f12c08 <__tsearch+600>
178         {

And this is the node_t initialization code:

281       q = (struct node_t *) malloc (sizeof (struct node_t));
   0x0000004935f12b2a <+378>:   lghi   %r2,32
   0x0000004935f12b2e <+382>:   brasl  %r14,0x4935e28528 <malloc@plt>
282       if (q != NULL)
   0x0000004935f12b34 <+388>:   ltgr   %r2,%r2
   0x0000004935f12b38 <+392>:   je     0x4935f12c08 <__tsearch+600>
283         {
284           *nextp = q;                   /* link new node to old */
   0x0000004935f12b3c <+396>:   stg    %r2,0(%r11)
285           q->key = key;                 /* initialize new node */
   0x0000004935f12b42 <+402>:   lg     %r4,168(%r15)
   0x0000004935f12b48 <+408>:   stg    %r4,0(%r2)
286           q->red = 1;
   0x0000004935f12b5e <+430>:   oi     24(%r2),128
287           q->left = q->right = NULL;
   0x0000004935f12b4e <+414>:   lghi   %r5,0
   0x0000004935f12b52 <+418>:   stg    %r5,16(%r2)
   0x0000004935f12b58 <+424>:   stg    %r5,8(%r2)
288

In gdb this is:

Program received signal SIGTRAP, Trace/breakpoint trap.
0x00000048b3380b9c in maybe_split_for_insert (key=0x7feffad38,
    vrootp=<value optimized out>, compar=0x48b339b518 <known_compare>)
    at tsearch.c:177
177       if (parentp != NULL && (*parentp)->red)
(gdb) print &(*parentp)->red
$1 = (unsigned int *) 0x40296f8
(gdb) monitor get_vbits 0x40296f8
7f

Reproducible: Always

Also reported here: https://bugzilla.redhat.com/show_bug.cgi?id=816244
Thanks for the analysis. At first I thought this looked like a compiler bug, but gcc is correct. OI sets the MSB (the leftmost bit on s390), so only that one bit becomes defined. LTG+JHE then checks for "zero or greater than zero", a signed test whose outcome depends only on the sign bit. So only that first bit is actually examined, even though the full doubleword is loaded. (A test-under-mask instruction would be easier for valgrind to handle here.) The easiest solution might be a suppression, but it would be better to fix this at the IR level. I'm currently not sure how to do that in an elegant way.
I don't know how you are doing condition-code stuff on s390, but if it is thunk-based (as in the x86, amd64 and arm front ends) then you can often get rid of false positives resulting from partially defined condition codes by adding a spechelper function (e.g. guest_x86_spechelper). When set up properly, such a function allows the IR optimiser to convert calls to the "evaluate flags thunk" routine(s) into the actual test that is required (e.g. x >=unsigned y), etc. I mention this because this technique is completely crucial for getting sane (low-noise) behaviour on x86/amd64/arm, and it looks like you might have a similar situation here.
(In reply to comment #1)
> OI sets the MSB (leftmost on s390) so only one bit gets defined.
> LTG+JHE checks for zero or greater than zero. So only the first bit gets

Can you show what (post-optimisation) IR the LTG+JHE gets converted to, with --tool=none and --trace-flags=10001000 ?

FWIW I am pretty sure gcc's amd64 backend plays similar tricks, and we can rewrite all of this stuff out in iropt, so that memcheck doesn't see any false dependencies and so isn't confused.
It is thunk based and we use a spechelper. The generic spechelper used a 64-bit compare, though.

------------------------ Front end ------------------------

lg      %r1,0(%r3)
   ------ IMark(0x4153560, 6, 0) ------
   t1 = 0x0:I64
   t0 = Add64(Add64(t1,GET:I64(216)),0x0:I64)
   PUT(200) = LDbe:I64(t0)
   PUT(336) = 0x4153566:I64

ltg     %r3,24(%r1)
   ------ IMark(0x4153566, 6, 0) ------
   t3 = 0x18:I64
   t2 = Add64(Add64(t3,GET:I64(200)),0x0:I64)
   t4 = LDbe:I64(t2)
   PUT(216) = t4
   PUT(352) = 0xF:I64
   PUT(360) = t4
   PUT(368) = 0x0:I64
   PUT(376) = 0x0:I64
   PUT(336) = 0x415356C:I64

jhe     .+100
   ------ IMark(0x415356C, 4, 0) ------
   t5 = s390_calculate_cond[mcx=0x13]{0x4011ced98}(0xA:I64,GET:I64(352),GET:I64(360),GET:I64(368),GET:I64(376)):I32
   if (CmpNE32(t5,0x0:I32)) { PUT(336) = 0x41535D0:I64; exit-Boring }
   PUT(336) = 0x4153570:I64
   PUT(336) = GET:I64(336); exit-Boring

GuestBytes 4153560 16 E3 10 30 00 00 04 E3 30 10 18 00 02 A7 A4 00 32  00726FBA

------------------------ After tree-building ------------------------

IRSB {
   t0:I64 t1:I64 t2:I64 t3:I64 t4:I64 t5:I32 t6:I64 t7:I64 t8:I64
   t9:I64 t10:I64 t11:I64 t12:I64 t13:I64 t14:I64 t15:I64 t16:I64
   t17:I32 t18:I1 t19:I64 t20:I32 t21:I1

   ------ IMark(0x4153560, 6, 0) ------
   t9 = LDbe:I64(GET:I64(216))
   PUT(200) = t9
   PUT(336) = 0x4153566:I64
   ------ IMark(0x4153566, 6, 0) ------
   t4 = LDbe:I64(Add64(0x18:I64,t9))
   PUT(216) = t4
   PUT(352) = 0xF:I64
   PUT(360) = t4
   PUT(368) = 0x0:I64
   PUT(376) = 0x0:I64
   PUT(336) = 0x415356C:I64
   ------ IMark(0x415356C, 4, 0) ------
   if (CmpLE64S(0x0:I64,t4)) { PUT(336) = 0x41535D0:I64; exit-Boring }
   PUT(336) = 0x4153570:I64; exit-Boring
}
(In reply to comment #4)
> if (CmpLE64S(0x0:I64,t4)) { PUT(336) = 0x41535D0:I64; exit-Boring }

IIUC, CmpLE64S(0x0:I64,t4) depends only on the top bit of t4, right?

The amd64 front end has to handle the same game from gcc. Here's how it does it. I reckon if you add an extra spec rule for this special case where one of the operands is zero, you can rewrite it out.

   if (isU64(cc_op, AMD64G_CC_OP_SUBB) && isU64(cond, AMD64CondS)
       && isU64(cc_dep2, 0)) {
      /* byte sub/cmp of zero, then S --> test (dst-0 <s 0)
                                      --> test dst <s 0
                                      --> (ULong)dst[7]
         This is yet another scheme by which gcc figures out if the
         top bit of a byte is 1 or 0.  See also LOGICB/CondS below. */
      /* Note: isU64(cc_dep2, 0) is correct, even though this is
         for an 8-bit comparison, since the args to the helper
         function are always U64s. */
      return binop(Iop_And64,
                   binop(Iop_Shr64,cc_dep1,mkU8(7)),
                   mkU64(1));
   }
Indeed, the following seems to fix the problem.

--- VEX/priv/guest_s390_helpers.c       (revision 2549)
+++ VEX/priv/guest_s390_helpers.c       (working copy)
@@ -1522,7 +1522,11 @@
       return unop(Iop_1Uto32, binop(Iop_CmpLT64S, mkU64(0), cc_dep1));
    }
    if (cond == 8 + 2 || cond == 8 + 2 + 1) {
-      return unop(Iop_1Uto32, binop(Iop_CmpLE64S, mkU64(0), cc_dep1));
+      /* special case =0 || >0 to handle some gcc magic that only checks
+       * the first bit. Fixes 308427
+       */
+      return unop(Iop_64to32, binop(Iop_Xor64,
+                  binop(Iop_Shr64,cc_dep1,mkU8(63)),
+                  mkU64(1)));
    }
    if (cond == 8 + 4 + 2 || cond == 8 + 4 + 2 + 1) {
       return mkU32(1);

Needs regression testing and test cases, of course.
should be fixed with VEX r2551
(In reply to comment #7)
> should be fixed with VEX r2551

Tested against the original testcase: the uninitialised-value warnings are no longer issued. The new memcheck/tests/s390x/ltgjhe testcase also passes. Thanks.