Bug 352130 - helgrind reports false races for printfs using mempcpy manipulating FILE* state
Summary: helgrind reports false races for printfs using mempcpy manipulating FILE* state
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: helgrind (other bugs)
Version First Reported In: unspecified
Platform: Other Linux
: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-09-01 21:05 UTC by Mark Wielaard
Modified: 2022-04-08 13:26 UTC (History)
1 user (show)

See Also:
Latest Commit:
Version Fixed/Implemented In:
Sentry Crash Report:


Attachments
test that calls printfs in a thread and in main (253 bytes, text/x-csrc)
2022-02-28 04:23 UTC, Godmar Back
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Mark Wielaard 2015-09-01 21:05:58 UTC
helgrind doesn't see/know about the glibc internal locking done for FILE *state and so suppresses any races in glibc itself. From glibc-2.34567-NPTL-helgrind.supp:

# - suppress anything that has its top frame in libc.so.
#   This really isn't clever, since it could hide some 
#   legitimate races.  But the problem is, if we don't do
#   this, then loads of errors to do with stdio are reported, because
#   H fails to see glibc's internal locking/unlocking of FILE*s
#   as required by POSIX.  A better solution is needed.

{
   helgrind-glibc2X-004
   Helgrind:Race
   obj:*/lib*/libc-2.*so*
}

Newer glibc might use mempcpy to manipulate FILE* state since:

2012-03-29  David S. Miller  <davem@davemloft.net>

       * libio/fileops.c (_IO_new_file_xsputn): Don't try to optimize
       small copies by hand.

The code now just uses mempcpy to do the copying.

This causes a simple program like:

#include <stdio.h>
#include <pthread.h>

pthread_t thread;

void* thread3 (void* d)
{
  int count3 = 0;

  while(count3 < 100){
    printf("Thread 3: %d\n", count3++);
  }
  return NULL;
}

void* thread2 (void* d)
{
  int count2 = 0;

  while(count2 < 1000){
	  //
    printf("Thread 2: %d\n", count2++);
  }
  return NULL;
}

int main (){

  pthread_create (&thread, NULL, thread2, NULL);
  pthread_create (&thread, NULL, thread3, NULL);

  //Thread 1
  int count1 = 0;

  while(count1 < 10){
    printf("Thread 1: %d\n", count1++);
  }

  pthread_join(thread,NULL);
  return 0;
}

$ gcc -g -o threads -lpthread threads.c
$ valgrind --tool=helgrind ./threads

To produce:

==6367== ----------------------------------------------------------------
==6367== 
==6367== Possible data race during write of size 1 at 0x4022009 by thread #1
==6367== Locks held: none
==6367==    at 0x4C3422C: __GI_mempcpy (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==6367==    by 0x50CF0E1: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1307)
==6367==    by 0x509F66F: vfprintf (vfprintf.c:1323)
==6367==    by 0x50A9E78: printf (printf.c:33)
==6367==    by 0x40077D: main (threads.c:36)
==6367==  Address 0x4022009 is not stack'd, malloc'd or on a free list
==6367== 
==6367== ----------------------------------------------------------------
==6367== 
==6367== Possible data race during write of size 1 at 0x402200A by thread #1
==6367== Locks held: none
==6367==    at 0x4C34249: __GI_mempcpy (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==6367==    by 0x50CF0E1: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1307)
==6367==    by 0x50A31EC: vfprintf (vfprintf.c:1641)
==6367==    by 0x50A9E78: printf (printf.c:33)
==6367==    by 0x40077D: main (threads.c:36)
==6367==  Address 0x402200a is not stack'd, malloc'd or on a free list
==6367== 
==6367== ----------------------------------------------------------------
==6367== 
==6367== Possible data race during write of size 1 at 0x402200B by thread #1
==6367== Locks held: none
==6367==    at 0x4C3422C: __GI_mempcpy (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==6367==    by 0x50CF0E1: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1307)
==6367==    by 0x50A2F0D: vfprintf (vfprintf.c:1673)
==6367==    by 0x50A9E78: printf (printf.c:33)
==6367==    by 0x40077D: main (threads.c:36)
==6367==  Address 0x402200b is not stack'd, malloc'd or on a free list
==6367== 


Reproducible: Always




Proposed patch:

diff --git a/glibc-2.34567-NPTL-helgrind.supp b/glibc-2.34567-NPTL-helgrind.supp
index b61a339..ed105b8 100644
--- a/glibc-2.34567-NPTL-helgrind.supp
+++ b/glibc-2.34567-NPTL-helgrind.supp
@@ -23,6 +23,10 @@
 #   this, then loads of errors to do with stdio are reported, because
 #   H fails to see glibc's internal locking/unlocking of FILE*s
 #   as required by POSIX.  A better solution is needed.
+#
+# - some of the stdio functions in newer glibc manipulate stdio
+#   FILE*s state through mempcpy, which we intercept, so we also need
+#   to suppress such manipulations.
 
 #{
 #   helgrind-glibc2X-001
@@ -41,6 +45,14 @@
 }
 
 {
+   helgrind-glibc-io-xsputn-mempcpy
+   Helgrind:Race
+   fun:__GI_mempcpy
+   fun:_IO_*xsputn*
+   obj:*/lib*/libc-2.*so*
+}
+
+{
    helgrind-glibc2X-005
    Helgrind:Race
    obj:*/lib*/libpthread-2.*so*
Comment 1 Mark Wielaard 2015-09-04 10:52:16 UTC
valgrind svn r15622
Comment 2 Godmar Back 2022-02-28 04:23:04 UTC
Created attachment 147180 [details]
test that calls printfs in a thread and in main
Comment 3 Godmar Back 2022-02-28 04:30:14 UTC
I am seeing the same or a very similar issues in valgrind 3.18.1 on Ubuntu 20.04 LTS with GNU LibC 2.5
The attached test program results in:
```c
==952377== Helgrind, a thread error detector
==952377== Copyright (C) 2007-2017, and GNU GPL'd, by OpenWorks LLP et al.
==952377== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==952377== Command: ./ptest
==952377== 
thread1
==952377== ---Thread-Announcement------------------------------------------
==952377== 
==952377== Thread #1 is the program's root thread
==952377== 
==952377== ---Thread-Announcement------------------------------------------
==952377== 
==952377== Thread #2 was created
==952377==    at 0x49BE282: clone (clone.S:71)
==952377==    by 0x48812EB: create_thread (createthread.c:101)
==952377==    by 0x4882E0F: pthread_create@@GLIBC_2.2.5 (pthread_create.c:817)
==952377==    by 0x4846FBA: pthread_create_WRK (hg_intercepts.c:445)
==952377==    by 0x48480BD: pthread_create@* (hg_intercepts.c:478)
==952377==    by 0x109203: main (in /home/gback/cs3214/dutchblitz/ptest)
==952377== 
==952377== ----------------------------------------------------------------
==952377== 
==952377== Possible data race during write of size 1 at 0x5292193 by thread #1
==952377== Locks held: none
==952377==    at 0x484C812: mempcpy (vg_replace_strmem.c:1668)
==952377==    by 0x492E7B1: _IO_new_file_xsputn (fileops.c:1236)
==952377==    by 0x492E7B1: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1197)
==952377==    by 0x4923677: puts (ioputs.c:40)
==952377==    by 0x10920F: main (in /home/gback/cs3214/dutchblitz/ptest)
==952377==  Address 0x5292193 is 3 bytes inside a block of size 1,024 alloc'd
==952377==    at 0x483E8D5: malloc (vg_replace_malloc.c:381)
==952377==    by 0x4920E83: _IO_file_doallocate (filedoalloc.c:101)
==952377==    by 0x493104F: _IO_doallocbuf (genops.c:347)
==952377==    by 0x49300AF: _IO_file_overflow@@GLIBC_2.2.5 (fileops.c:745)
==952377==    by 0x492E834: _IO_new_file_xsputn (fileops.c:1244)
==952377==    by 0x492E834: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1197)
==952377==    by 0x4923677: puts (ioputs.c:40)
==952377==    by 0x1091C4: thread1 (in /home/gback/cs3214/dutchblitz/ptest)
==952377==    by 0x48471B2: mythread_wrapper (hg_intercepts.c:406)
==952377==    by 0x4882608: start_thread (pthread_create.c:477)
==952377==    by 0x49BE292: clone (clone.S:95)
==952377==  Block was alloc'd by thread #2
==952377== 
main
==952377== 
==952377== Use --history-level=approx or =none to gain increased speed, at
==952377== the cost of reduced accuracy of conflicting-access information
==952377== For lists of detected and suppressed errors, rerun with: -s
==952377== ERROR SUMMARY: 4 errors from 1 contexts (suppressed: 19 from 19)
```

The bug fix Mark proposed in 2015 included a suppression. I have been looking for related suppressions in the 3.18.1 codebase, finding
```
{
   helgrind-glibc-io-xsputn-mempcpy
   Helgrind:Race
   fun:__GI_mempcpy
   fun:_IO_*xsputn*
   obj:*/lib*/libc-2.*so*
}
```
and
```

   Ubuntu804-hack-1
   Memcheck:Overlap
   fun:mempcpy
   fun:_IO_default_xsputn
   obj:/lib*/libc-2.*so*
}
```

Running with `--gen-suppressions=yes` however, generates:
```
{
   <insert_a_suppression_name_here>
   Helgrind:Race
   fun:mempcpy
   fun:_IO_new_file_xsputn
   fun:_IO_file_xsputn@@GLIBC_2.2.5
   fun:puts
   fun:main
}
```
which appears incompatible with those suppressions if I read this correctly.
Should this bug be reopened or a new bug be filed, or is there some other explanation?
Comment 4 Mark Wielaard 2022-04-08 13:26:40 UTC
> which appears incompatible with those suppressions if I read this correctly.
> Should this bug be reopened or a new bug be filed, or is there some other
> explanation?

The analysis is correct, a new bug was created:
https://bugs.kde.org/show_bug.cgi?id=450962