Bug 450962 - helgrind reports false races for printfs using mempcpy manipulating FILE* state in glibc-2.26+
Summary: helgrind reports false races for printfs using mempcpy manipulating FILE* sta...
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: helgrind (other bugs)
Version First Reported In: unspecified
Platform: Other Linux
: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-02-28 11:56 UTC by Jesus Checa
Modified: 2022-04-11 13:00 UTC (History)
1 user (show)

See Also:
Latest Commit:
Version Fixed/Implemented In:
Sentry Crash Report:


Attachments
valgrind run in RHEL 8.5 with glibc 2.28 (8.32 KB, text/x-log)
2022-02-28 11:56 UTC, Jesus Checa
Details
Patch adding suppression + testcase (3.16 KB, patch)
2022-02-28 12:01 UTC, Jesus Checa
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Jesus Checa 2022-02-28 11:56:52 UTC
Created attachment 147187 [details]
valgrind run in RHEL 8.5 with glibc 2.28

Description of the issue:

Helgrind reports again false races for printfs as described in bug 352130. This happens in newer glibc versions.
More specifically I've reproduced this in RHEL-8.5 with glibc-2.28-164.el8.x86_64 and valgrind-3.17.0-5.el8.x86_64,
and also in Fedora 34 with glibc-2.33-21.fc34.x86_64 and valgrind-3.18.1-1.fc34.x86_64.

In RHEL-8.5 (with glibc 2.28) I've built and run the reproducer that Mark wrote originally in bug 352130,
which produces the output shown in the attachment.

It seems that in newer glibc versions mempcpy is used instead __GI_mempcpy, making
the suppresion from the original bug obsolete. I tracked down the changes to
__GI_mempcpy to the following glibc's commit:
commit 5c3e322d3be3803636e38bcaf083fb59b3a34f0c
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Wed Jun 14 12:10:57 2017 -0700

    x86-64: Implement memmove family IFUNC selectors in C
    
    Implement memmove family IFUNC selectors in C.
    
    All internal calls within libc.so can use IFUNC on x86-64 since unlike
    x86, x86-64 supports PC-relative addressing to access the GOT entry so
    that it can call via PLT without using an extra register.  For libc.a,
    we can't use IFUNC for functions which are called before IFUNC has been
    initialized.  Use IFUNC internally reduces the icache footprint since
    libc.so and other codes in the process use the same implementations.
    This patch uses IFUNC for memmove family functions within libc.

From there, in glibc repo:
$ git branch --contains 5c3e322d3be3803636e38bcaf083fb59b3a34f0c
  master
  release/2.26/master
  release/2.27/master
  release/2.28/master
  release/2.29/master
  release/2.30/master
  release/2.31/master
  release/2.32/master
  release/2.33/master
  release/2.34/master

To check my assumption, I grabbed glibc 2.25 and 2.26 sources, and built/installed
into /opt/glibc-2.2[5|6], built the reproducer and patched the elf to use my custom
built glibc:
[root@localhost ~]# gcc -g t.c -o t -lpthread
[root@localhost ~]# cp t t-glibc2.25
[root@localhost ~]# cp t t-glibc2.26
[root@localhost ~]# patchelf --set-interpreter /opt/glibc-2.25/lib/ld-linux-x86-64.so.2 --set-rpath /opt/glibc-2.25/lib t-glibc2.25
[root@localhost ~]# patchelf --set-interpreter /opt/glibc-2.26/lib/ld-linux-x86-64.so.2 --set-rpath /opt/glibc-2.26/lib t-glibc2.26
[root@localhost ~]# ldd t-glibc2.2{5,6}
t-glibc2.25:
        linux-vdso.so.1 (0x00007ffce19ea000)
        libpthread.so.0 => /opt/glibc-2.25/lib/libpthread.so.0 (0x00007fe09abc3000)
        libc.so.6 => /opt/glibc-2.25/lib/libc.so.6 (0x00007fe09a824000)
        /opt/glibc-2.25/lib/ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 (0x00007fe09ade1000)
t-glibc2.26:
        linux-vdso.so.1 (0x00007ffe3a5c3000)
        libpthread.so.0 => /opt/glibc-2.26/lib/libpthread.so.0 (0x00007f7fa3f38000)
        libc.so.6 => /opt/glibc-2.26/lib/libc.so.6 (0x00007f7fa3b86000)
        /opt/glibc-2.26/lib/ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 (0x00007f7fa4156000)
[root@localhost ~]# valgrind --tool=helgrind ./t-glibc2.25 > /dev/null
==115881== Helgrind, a thread error detector
==115881== Copyright (C) 2007-2017, and GNU GPL'd, by OpenWorks LLP et al.
==115881== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info
==115881== Command: ./t-glibc2.25
==115881== 
==115881== 
==115881== Use --history-level=approx or =none to gain increased speed, at
==115881== the cost of reduced accuracy of conflicting-access information
==115881== For lists of detected and suppressed errors, rerun with: -s
==115881== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2898 from 51)
[root@localhost ~]# valgrind --tool=helgrind ./t-glibc2.25 > /dev/null                                                  
==115881== Helgrind, a thread error detector                                                                           
==115881== Copyright (C) 2007-2017, and GNU GPL'd, by OpenWorks LLP et al.                                             
==115881== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info
==115881== Command: ./t-glibc2.25               
==115881==                                                                                                             
==115881==                                                                                                             
==115881== Use --history-level=approx or =none to gain increased speed, at                                             
==115881== the cost of reduced accuracy of conflicting-access information
==115881== For lists of detected and suppressed errors, rerun with: -s
==115881== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2898 from 51)
[root@localhost ~]# valgrind --tool=helgrind ./t-glibc2.26 > /dev/null                                                                                                                                                               [59/20969]
==115884== Helgrind, a thread error detector                                                                                                                                                                                                   
==115884== Copyright (C) 2007-2017, and GNU GPL'd, by OpenWorks LLP et al.                                                                                                                                                                     
==115884== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info                                                                                                                                                                  
==115884== Command: ./t-glibc2.26                                                                                                                                                                                                              
==115884==                                                                                                                                                                                                                                     
==115884== ---Thread-Announcement------------------------------------------                                                                                                                                                                    
==115884==                                                                                                                                                                                                                                     
==115884== Thread #3 was created                                                                                                                                                                                                               
==115884==    at 0x515522E: clone (clone.S:71)                                                                                                                                                                                                 
==115884==    by 0x4E4B1FE: create_thread (createthread.c:100)                                                                                                                                                                                 
==115884==    by 0x4E4CC04: pthread_create@@GLIBC_2.2.5 (pthread_create.c:822)                                                                                                                                                                 
==115884==    by 0x4C38626: pthread_create_WRK (hg_intercepts.c:436)                                                                                                                                                                           
==115884==    by 0x4C39720: pthread_create@* (hg_intercepts.c:469)                                                                                                                                                                             
==115884==    by 0x40071A: main (t.c:30)                                                                                                                                                                                                       
==115884==                                                                                                                                                                                                                                     
==115884== ---Thread-Announcement------------------------------------------
==115884== 
==115884== Thread #2 was created
==115884==    at 0x515522E: clone (clone.S:71)
==115884==    by 0x4E4B1FE: create_thread (createthread.c:100)
==115884==    by 0x4E4CC04: pthread_create@@GLIBC_2.2.5 (pthread_create.c:822)
==115884==    by 0x4C38626: pthread_create_WRK (hg_intercepts.c:436)
==115884==    by 0x4C39720: pthread_create@* (hg_intercepts.c:469)
==115884==    by 0x400701: main (t.c:29)
==115884== 
==115884== ----------------------------------------------------------------
==115884== 
==115884== Possible data race during write of size 1 at 0x5C167DB by thread #3
==115884== Locks held: none
==115884==    at 0x4C3DFEA: mempcpy (vg_replace_strmem.c:1562)
==115884==    by 0x50DB114: _IO_new_file_xsputn (fileops.c:1311)
==115884==    by 0x50DB114: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1271)
==115884==    by 0x50B00A3: vfprintf (vfprintf.c:1328)
==115884==    by 0x50B8958: printf (printf.c:33)
==115884==    by 0x400694: thread3 (t.c:11)
==115884==    by 0x4C3881E: mythread_wrapper (hg_intercepts.c:398)
==115884==    by 0x4E4C45D: start_thread (pthread_create.c:465)
==115884==    by 0x515523E: clone (clone.S:95)
==115884==  Address 0x5c167db is 1,611 bytes inside a block of size 4,096 alloc'd
==115884==    at 0x4C2FEA5: malloc (vg_replace_malloc.c:380)
==115884==    by 0x50CF5A8: _IO_file_doallocate (filedoalloc.c:101)
==115884==    by 0x50DCDDF: _IO_doallocbuf (genops.c:398)
==115884==    by 0x50DC087: _IO_file_overflow@@GLIBC_2.2.5 (fileops.c:812)
==115884==    by 0x50DB1BE: _IO_new_file_xsputn (fileops.c:1323)
==115884==    by 0x50DB1BE: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1271)
==115884==    by 0x50B00A3: vfprintf (vfprintf.c:1328)
==115884==    by 0x50B8958: printf (printf.c:33)
==115884==    by 0x4006D0: thread2 (t.c:22)
==115884==    by 0x4C3881E: mythread_wrapper (hg_intercepts.c:398)
==115884==    by 0x4E4C45D: start_thread (pthread_create.c:465)
==115884==    by 0x515523E: clone (clone.S:95)
==115884==  Block was alloc'd by thread #2
==115884== 
==115884== ----------------------------------------------------------------
==115884== 
==115884== Possible data race during write of size 1 at 0x5C167DC by thread #3
==115884== Locks held: none
==115884==    at 0x4C3E007: mempcpy (vg_replace_strmem.c:1562)
==115884==    by 0x50DB114: _IO_new_file_xsputn (fileops.c:1311)
==115884==    by 0x50DB114: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1271)
==115884==    by 0x50B1752: vfprintf (vfprintf.c:1642)
==115884==    by 0x50B8958: printf (printf.c:33)
==115884==    by 0x400694: thread3 (t.c:11)
==115884==    by 0x4C3881E: mythread_wrapper (hg_intercepts.c:398)
==115884==    by 0x4E4C45D: start_thread (pthread_create.c:465)
==115884==    by 0x515523E: clone (clone.S:95)
==115884==  Address 0x5c167dc is 1,612 bytes inside a block of size 4,096 alloc'd
==115884==    at 0x4C2FEA5: malloc (vg_replace_malloc.c:380)
==115884==    by 0x50CF5A8: _IO_file_doallocate (filedoalloc.c:101)
==115884==    by 0x50DCDDF: _IO_doallocbuf (genops.c:398)
==115884==    by 0x50DC087: _IO_file_overflow@@GLIBC_2.2.5 (fileops.c:812)
==115884==    by 0x50DB1BE: _IO_new_file_xsputn (fileops.c:1323)
==115884==    by 0x50DB1BE: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1271)
==115884==    by 0x50B00A3: vfprintf (vfprintf.c:1328)
==115884==    by 0x50B8958: printf (printf.c:33)
==115884==    by 0x4006D0: thread2 (t.c:22)
==115884==    by 0x4C3881E: mythread_wrapper (hg_intercepts.c:398)
==115884==    by 0x4E4C45D: start_thread (pthread_create.c:465)
==115884==    by 0x515523E: clone (clone.S:95)
==115884==  Block was alloc'd by thread #2
==115884== 
==115884== ----------------------------------------------------------------
==115884== 
==115884== Possible data race during write of size 1 at 0x5C167DD by thread #3
==115884== Locks held: none
==115884==    at 0x4C3DFEA: mempcpy (vg_replace_strmem.c:1562)
==115884==    by 0x50DB114: _IO_new_file_xsputn (fileops.c:1311)
==115884==    by 0x50DB114: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1271)
==115884==    by 0x50B06F4: vfprintf (vfprintf.c:1674)
==115884==    by 0x50B8958: printf (printf.c:33)
==115884==    by 0x400694: thread3 (t.c:11)
==115884==    by 0x4C3881E: mythread_wrapper (hg_intercepts.c:398)
==115884==    by 0x4E4C45D: start_thread (pthread_create.c:465)
==115884==    by 0x515523E: clone (clone.S:95)
==115884==  Address 0x5c167dd is 1,613 bytes inside a block of size 4,096 alloc'd
==115884==    at 0x4C2FEA5: malloc (vg_replace_malloc.c:380)
==115884==    by 0x50CF5A8: _IO_file_doallocate (filedoalloc.c:101)
==115884==    by 0x50DCDDF: _IO_doallocbuf (genops.c:398)
==115884==    by 0x50DC087: _IO_file_overflow@@GLIBC_2.2.5 (fileops.c:812)
==115884==    by 0x50DB1BE: _IO_new_file_xsputn (fileops.c:1323)
==115884==    by 0x50DB1BE: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1271)
==115884==    by 0x50B00A3: vfprintf (vfprintf.c:1328)
==115884==    by 0x50B8958: printf (printf.c:33)
==115884==    by 0x4006D0: thread2 (t.c:22)
==115884==    by 0x4C3881E: mythread_wrapper (hg_intercepts.c:398)
==115884==    by 0x4E4C45D: start_thread (pthread_create.c:465)
==115884==    by 0x515523E: clone (clone.S:95)
==115884==  Block was alloc'd by thread #2
==115884== 
==115884== 
==115884== Use --history-level=approx or =none to gain increased speed, at
==115884== the cost of reduced accuracy of conflicting-access information
==115884== For lists of detected and suppressed errors, rerun with: -s
==115884== ERROR SUMMARY: 1410 errors from 3 contexts (suppressed: 2768 from 42)

I did a similar check using a RHEL-7 distro, where this false races are not reported with
the available packages valgrind-3.15.0-11.el7.x86_64 and glibc-2.17-325.el7_9.x86_64.
After building glibc 2.25 and 2.26 and patching the elf, helgrind didn't report races
with the glibc2.25 executable and did report them with the glibc 2.26 one.

Reproducible:

Always if glibc used is newer/equal to 2.26, in all valgrind versions.

Additional information:

Tested with x86_64, s390x, aarch64 and ppc64le. glibc did not change anything related
to mempcpy in ppc64le so the old suppression still applies there.
Comment 1 Jesus Checa 2022-02-28 12:01:31 UTC
Created attachment 147188 [details]
Patch adding suppression + testcase

Attaching a patch that adds the corresponding suppression and a testcase with Mark Wielaard's reproducer from bug 352130
Comment 2 Mark Wielaard 2022-04-08 13:37:11 UTC
(In reply to Jesus Checa from comment #1)
> Created attachment 147188 [details]
> Patch adding suppression + testcase
> 
> Attaching a patch that adds the corresponding suppression and a testcase
> with Mark Wielaard's reproducer from bug 352130

Looks correct. Thanks for the analysis and all the testing. Locally verified against glibc-2.34.
Pushed as:

commit 7b5867b1fda1dbadc1291988828f6d6ad9438144
Author: Mark Wielaard <mark@klomp.org>
Date:   Fri Apr 8 14:58:38 2022 +0200

    helgrind reports false races for printfs using mempcpy on FILE* state
    
    We already have a suppression for helgrind which is for when glibc
    uses __GI_mempcpy to manipulate internal FILE state (this was bug
    352130). But since glibc-2.26 mempcpy is used instead __GI_mempcpy,
    making the suppresion from the original bug obsolete.
    
    This patch adds a new suppression using mempcpy but doesn't replace
    the original suppression for older systems.
    
    Patch adding suppression + testcase by Jesus Checa <jcheca@redhat.com>
    
    https://bugs.kde.org/show_bug.cgi?id=450962
Comment 3 Mark Wielaard 2022-04-11 13:00:10 UTC
commit 4c27f0f7649eb02336dcd0af9d2f8977b20d3a9f
Author: Mark Wielaard <mark@klomp.org>
Date:   Mon Apr 11 14:45:49 2022 +0200

    Extend helgrind suppression for _IO_*xsputn* FILE* state manipulation
    
    commit 7b5867b1f "helgrind reports false races for printfs using
    mempcpy on FILE* state" extended the helgrind-glibc-io-xsputn
    suppression by also covering mempcpy (instead of __GI_mempcpy).
    The test added in that commit exposed a couple of other variants
    of this suppression where _IO_*xsputn* called memcpy (instead of
    mempcpy) and/or had an extra indirection/function in between.
    
    Replace the two two suppressions with one that covers all cases
    where _IO_*xsputn* *mem*cpy variants with possibly another ...
    function in between.
    
    https://bugs.kde.org/show_bug.cgi?id=450962