Bug 387766 - asm shifts cause false positive "Conditional jump or move depends on uninitialised value"
Summary: asm shifts cause false positive "Conditional jump or move depends on uninitia...
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: memcheck (show other bugs)
Version: 3.14 SVN
Platform: Other Linux
: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-12-10 10:45 UTC by Markus Trippelsdorf
Modified: 2017-12-11 11:36 UTC (History)
2 users (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Markus Trippelsdorf 2017-12-10 10:45:34 UTC
Running gcc trunk under valgrind produces many false positives, e.g.:

==20511== Conditional jump or move depends on uninitialised value(s)                                                                                        
==20511==    at 0xBBA16D: update_costs_from_copies(ira_allocno*, bool, bool) (in /home/trippels/gcc_build_dir_/gcc/cc1)
==20511==    by 0xBBB0ED: assign_hard_reg(ira_allocno*, bool) (in /home/trippels/gcc_build_dir_/gcc/cc1)
==20511==    by 0xBBF527: color_allocnos() (in /home/trippels/gcc_build_dir_/gcc/cc1)
==20511==    by 0xBC011B: color_pass(ira_loop_tree_node*) (in /home/trippels/gcc_build_dir_/gcc/cc1)
==20511==    by 0xBA7EAE: ira_traverse_loop_tree(bool, ira_loop_tree_node*, void (*)(ira_loop_tree_node*), void (*)(ira_loop_tree_node*)) (ira-build.c:1781)
==20511==    by 0xBB91C2: ira_color() (in /home/trippels/gcc_build_dir_/gcc/cc1)
==20511==    by 0xBA33B4: ira (ira.c:5286)                              
==20511==    by 0xBA33B4: (anonymous namespace)::pass_ira::execute(function*) (ira.c:5584)
==20511==    by 0xC5FD98: execute_one_pass(opt_pass*) (passes.c:2497)
==20511==    by 0xC60574: execute_pass_list_1(opt_pass*) (passes.c:2586)               
==20511==    by 0xC60586: execute_pass_list_1(opt_pass*) (passes.c:2587)
==20511==    by 0xC605B8: execute_pass_list(function*, opt_pass*) (passes.c:2597)
==20511==    by 0x9A2DC2: cgraph_node::expand() (cgraphunit.c:2139)
==20511==  Uninitialised value was created by a client request                            
==20511==    at 0x8F411D: base_pool_allocator<memory_block_pool>::allocate() (alloc-pool.h:419)                                                                                    
==20511==    by 0xBAA640: allocate (alloc-pool.h:502)
==20511==    by 0xBAA640: ira_create_allocno(int, bool, ira_loop_tree_node*) (ira-build.c:486)
==20511==    by 0xBAAB33: create_insn_allocnos(rtx_def*, rtx_def*, bool) (ira-build.c:1852)
==20511==    by 0xBAAF74: create_bb_allocnos (ira-build.c:1918)
==20511==    by 0xBAAF74: create_loop_tree_node_allocnos(ira_loop_tree_node*) (ira-build.c:1964)
==20511==    by 0xBA81AB: ira_traverse_loop_tree(bool, ira_loop_tree_node*, void (*)(ira_loop_tree_node*), void (*)(ira_loop_tree_node*)) (ira-build.c:1799)
==20511==    by 0xBACD38: create_allocnos (ira-build.c:2060)
==20511==    by 0xBACD38: ira_build() (ira-build.c:3420)
==20511==    by 0xBA331F: ira (ira.c:5273)
==20511==    by 0xBA331F: (anonymous namespace)::pass_ira::execute(function*) (ira.c:5584)
==20511==    by 0xC5FD98: execute_one_pass(opt_pass*) (passes.c:2497)
==20511==    by 0xC60574: execute_pass_list_1(opt_pass*) (passes.c:2586)
==20511==    by 0xC60586: execute_pass_list_1(opt_pass*) (passes.c:2587)
==20511==    by 0xC605B8: execute_pass_list(function*, opt_pass*) (passes.c:2597)
==20511==    by 0x9A2DC2: cgraph_node::expand() (cgraphunit.c:2139)

(gcc was configured with --enable-valgrind-annotations)

alloc-pool.h:419 reads:
 419   VALGRIND_DISCARD (VALGRIND_MAKE_MEM_UNDEFINED (header, size)); 

On x86 gcc trunk doesn't tune for partial_flag_reg_stall anymore.
This causes redundant test instructions after shifts to be removed. 

Asm diff of update_costs_from_copies():
@@ -7464,7 +7461,6 @@                       
        movl    12(%rbp), %edx              
        sall    $10, %edx                   
        sarl    $16, %edx                   
-       testl   %edx, %edx                  
        js      .L1438                      
        movq    40(%rsp), %r9               
        movq    72(%rsp), %r11
Comment 1 Jakub Jelinek 2017-12-10 10:49:50 UTC
But:

int
foo (int x)
{
  asm volatile ("sarl $16, %0; js 1f; incl %0; 1:" : "+r" (x));
  return x;
}

int
bar (int x)
{
  asm volatile ("sarl $16, %0; testl %0, %0; js 1f; incl %0; 1:" : "+r" (x));
  return x;
}

int
main ()
{
  __builtin_printf ("%x %x\n", foo (0x87654321), bar (0x87654321));
  __builtin_printf ("%x %x\n", foo (0x12345678), bar (0x12345678));
  return 0;
}

doesn't show any difference when running without and with valgrind.
Comment 2 Ivo Raisr 2017-12-10 12:21:21 UTC
(In reply to Markus Trippelsdorf from comment #0)
> Running gcc trunk under valgrind produces many false positives, e.g.:

Please provide a simple standalone reproducer. For example, could a part of update_costs_from_copies() be isolated to reproduce the problem?

There is also a patch for Valgrind which makes it to spend some extra time to accurately track definedness:
https://bugs.kde.org/show_bug.cgi?id=387664

Please could you try to build Valgrind with this patch and then run it with --expensive-definedness-checks=yes (or --expensive-definedness-checks=auto).
Comment 3 Markus Trippelsdorf 2017-12-10 12:55:32 UTC
(In reply to Ivo Raisr from comment #2)
> (In reply to Markus Trippelsdorf from comment #0)
> > Running gcc trunk under valgrind produces many false positives, e.g.:
> 
> Please provide a simple standalone reproducer. For example, could a part of
> update_costs_from_copies() be isolated to reproduce the problem?
> 
> There is also a patch for Valgrind which makes it to spend some extra time
> to accurately track definedness:
> https://bugs.kde.org/show_bug.cgi?id=387664
> 
> Please could you try to build Valgrind with this patch and then run it with
> --expensive-definedness-checks=yes (or --expensive-definedness-checks=auto).

Your patch fixes the issue. Thanks.
I will try to come up with a testcase tomorrow.
Comment 4 Markus Trippelsdorf 2017-12-10 13:11:15 UTC
(In reply to Markus Trippelsdorf from comment #3)
> (In reply to Ivo Raisr from comment #2)
> > (In reply to Markus Trippelsdorf from comment #0)
> > > Running gcc trunk under valgrind produces many false positives, e.g.:
> > 
> > Please provide a simple standalone reproducer. For example, could a part of
> > update_costs_from_copies() be isolated to reproduce the problem?
> > 
> > There is also a patch for Valgrind which makes it to spend some extra time
> > to accurately track definedness:
> > https://bugs.kde.org/show_bug.cgi?id=387664
> > 
> > Please could you try to build Valgrind with this patch and then run it with
> > --expensive-definedness-checks=yes (or --expensive-definedness-checks=auto).
> 
> Your patch fixes the issue. Thanks.

Actually no. Trunk is fine without any patch at all (and without --expensive-definedness-checks=yes). So it looks like the issue is already fixed.
Comment 5 Markus Trippelsdorf 2017-12-10 13:16:00 UTC
It was fixed a few days ago by commit 8a2acb304db99c:
 amd64: add a spec rule for SHRL/SARL then CondS.  gcc-8 has been seen to generate such things.
Comment 6 Julian Seward 2017-12-11 11:16:53 UTC
Verified .. gcc version 8.0.0 20171210 runs clean on x86_64 linux when
configured with --enable-valgrind-annotations, the trunk, and the
https://bugs.kde.org/show_bug.cgi?id=387664 patch, which will land soon.
I didn't try without the patch.

I should note in passing, that I previously tried the 20171203 snapshot
without --enable-valgrind-annotations, since I was not aware of it.
I got a lot of complaints, all originating from sparseset_bit_p() in
gcc/sparseset.h, because this deliberately does a comparison on
undefined values.  I fixed this using the patch below.

I mention this because I am concerned that --enable-valgrind-annotations
might "fix" the problem by zero-initialising various allocations that create
memory used in sparseset_bit_p().  IMO that is the wrong solution, because
it will "hide" any legitimate uninitialised values that would otherwise have
been read from such allocations -- and so you reduce Memcheck's ability
to detect errors in gcc.  That said, this is only speculation -- I haven't
actually checked what --enable-valgrind-annotations does.

My fix follows.  Clearly you'd actually want to ifdef the macro call so
that this actually builds on targets that don't have <valgrind/memcheck.h>.


--- gcc-8-20171203/gcc/sparseset.h~	2017-01-01 13:07:43.000000000 +0100
+++ gcc-8-20171203/gcc/sparseset.h	2017-12-06 10:21:00.447480896 +0100
@@ -130,24 +130,27 @@
 static inline SPARSESET_ELT_TYPE
 sparseset_size (sparseset s)
 {
   return s->size;
 }
 
 /* Return true if e is a member of the set S, otherwise return false.  */
 
+#include <valgrind/memcheck.h>
+
 static inline bool
 sparseset_bit_p (sparseset s, SPARSESET_ELT_TYPE e)
 {
   SPARSESET_ELT_TYPE idx;
 
   gcc_checking_assert (e < s->size);
 
   idx = s->sparse[e];
+  VALGRIND_MAKE_MEM_DEFINED(&idx, sizeof(idx));
 
   return idx < s->members && s->dense[idx] == e;
 }
 
 /* Low level insertion routine not meant for use outside of sparseset.[ch].
    Assumes E is valid and not already a member of the set S.  */
 
 static inline void
Comment 7 Markus Trippelsdorf 2017-12-11 11:36:20 UTC
The ENABLE_VALGRIND_ANNOTATIONS logic is in gcc/system.h:1143.

sparseset allocations are already handled in gcc/sparseset.c:

 25 /* Allocate and clear a n_elms SparseSet.  */
 26
 27 sparseset
 28 sparseset_alloc (SPARSESET_ELT_TYPE n_elms)
 29 {
 30   unsigned int n_bytes = sizeof (struct sparseset_def)
 31                          + ((n_elms - 1) * 2 * sizeof (SPARSESET_ELT_TYPE));
 32
 33   sparseset set = XNEWVAR (struct sparseset_def, n_bytes);
 34
 35   /* Mark the sparseset as defined to silence some valgrind uninitialized
 36      read errors when accessing set->sparse[n] when "n" is not, and never has
 37      been, in the set.  These uninitialized reads are expected, by design and
 38      harmless.  */
 39   VALGRIND_DISCARD (VALGRIND_MAKE_MEM_DEFINED (set, n_bytes));