Created attachment 147187 [details] valgrind run in RHEL 8.5 with glibc 2.28 Description of the issue: Helgrind reports again false races for printfs as described in bug 352130. This happens in newer glibc versions. More specifically I've reproduced this in RHEL-8.5 with glibc-2.28-164.el8.x86_64 and valgrind-3.17.0-5.el8.x86_64, and also in Fedora 34 with glibc-2.33-21.fc34.x86_64 and valgrind-3.18.1-1.fc34.x86_64. In RHEL-8.5 (with glibc 2.28) I've built and run the reproducer that Mark wrote originally in bug 352130, which produces the output shown in the attachment. It seems that in newer glibc versions mempcpy is used instead __GI_mempcpy, making the suppresion from the original bug obsolete. I tracked down the changes to __GI_mempcpy to the following glibc's commit: commit 5c3e322d3be3803636e38bcaf083fb59b3a34f0c Author: H.J. Lu <hjl.tools@gmail.com> Date: Wed Jun 14 12:10:57 2017 -0700 x86-64: Implement memmove family IFUNC selectors in C Implement memmove family IFUNC selectors in C. All internal calls within libc.so can use IFUNC on x86-64 since unlike x86, x86-64 supports PC-relative addressing to access the GOT entry so that it can call via PLT without using an extra register. For libc.a, we can't use IFUNC for functions which are called before IFUNC has been initialized. Use IFUNC internally reduces the icache footprint since libc.so and other codes in the process use the same implementations. This patch uses IFUNC for memmove family functions within libc. From there, in glibc repo: $ git branch --contains 5c3e322d3be3803636e38bcaf083fb59b3a34f0c master release/2.26/master release/2.27/master release/2.28/master release/2.29/master release/2.30/master release/2.31/master release/2.32/master release/2.33/master release/2.34/master To check my assumption, I grabbed glibc 2.25 and 2.26 sources, and built/installed into /opt/glibc-2.2[5|6], built the reproducer and patched the elf to use my custom built glibc: [root@localhost ~]# gcc -g t.c -o t -lpthread [root@localhost ~]# cp t t-glibc2.25 [root@localhost ~]# cp t t-glibc2.26 [root@localhost ~]# patchelf --set-interpreter /opt/glibc-2.25/lib/ld-linux-x86-64.so.2 --set-rpath /opt/glibc-2.25/lib t-glibc2.25 [root@localhost ~]# patchelf --set-interpreter /opt/glibc-2.26/lib/ld-linux-x86-64.so.2 --set-rpath /opt/glibc-2.26/lib t-glibc2.26 [root@localhost ~]# ldd t-glibc2.2{5,6} t-glibc2.25: linux-vdso.so.1 (0x00007ffce19ea000) libpthread.so.0 => /opt/glibc-2.25/lib/libpthread.so.0 (0x00007fe09abc3000) libc.so.6 => /opt/glibc-2.25/lib/libc.so.6 (0x00007fe09a824000) /opt/glibc-2.25/lib/ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 (0x00007fe09ade1000) t-glibc2.26: linux-vdso.so.1 (0x00007ffe3a5c3000) libpthread.so.0 => /opt/glibc-2.26/lib/libpthread.so.0 (0x00007f7fa3f38000) libc.so.6 => /opt/glibc-2.26/lib/libc.so.6 (0x00007f7fa3b86000) /opt/glibc-2.26/lib/ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 (0x00007f7fa4156000) [root@localhost ~]# valgrind --tool=helgrind ./t-glibc2.25 > /dev/null ==115881== Helgrind, a thread error detector ==115881== Copyright (C) 2007-2017, and GNU GPL'd, by OpenWorks LLP et al. ==115881== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info ==115881== Command: ./t-glibc2.25 ==115881== ==115881== ==115881== Use --history-level=approx or =none to gain increased speed, at ==115881== the cost of reduced accuracy of conflicting-access information ==115881== For lists of detected and suppressed errors, rerun with: -s ==115881== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2898 from 51) [root@localhost ~]# valgrind --tool=helgrind ./t-glibc2.25 > /dev/null ==115881== Helgrind, a thread error detector ==115881== Copyright (C) 2007-2017, and GNU GPL'd, by OpenWorks LLP et al. ==115881== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info ==115881== Command: ./t-glibc2.25 ==115881== ==115881== ==115881== Use --history-level=approx or =none to gain increased speed, at ==115881== the cost of reduced accuracy of conflicting-access information ==115881== For lists of detected and suppressed errors, rerun with: -s ==115881== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2898 from 51) [root@localhost ~]# valgrind --tool=helgrind ./t-glibc2.26 > /dev/null [59/20969] ==115884== Helgrind, a thread error detector ==115884== Copyright (C) 2007-2017, and GNU GPL'd, by OpenWorks LLP et al. ==115884== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info ==115884== Command: ./t-glibc2.26 ==115884== ==115884== ---Thread-Announcement------------------------------------------ ==115884== ==115884== Thread #3 was created ==115884== at 0x515522E: clone (clone.S:71) ==115884== by 0x4E4B1FE: create_thread (createthread.c:100) ==115884== by 0x4E4CC04: pthread_create@@GLIBC_2.2.5 (pthread_create.c:822) ==115884== by 0x4C38626: pthread_create_WRK (hg_intercepts.c:436) ==115884== by 0x4C39720: pthread_create@* (hg_intercepts.c:469) ==115884== by 0x40071A: main (t.c:30) ==115884== ==115884== ---Thread-Announcement------------------------------------------ ==115884== ==115884== Thread #2 was created ==115884== at 0x515522E: clone (clone.S:71) ==115884== by 0x4E4B1FE: create_thread (createthread.c:100) ==115884== by 0x4E4CC04: pthread_create@@GLIBC_2.2.5 (pthread_create.c:822) ==115884== by 0x4C38626: pthread_create_WRK (hg_intercepts.c:436) ==115884== by 0x4C39720: pthread_create@* (hg_intercepts.c:469) ==115884== by 0x400701: main (t.c:29) ==115884== ==115884== ---------------------------------------------------------------- ==115884== ==115884== Possible data race during write of size 1 at 0x5C167DB by thread #3 ==115884== Locks held: none ==115884== at 0x4C3DFEA: mempcpy (vg_replace_strmem.c:1562) ==115884== by 0x50DB114: _IO_new_file_xsputn (fileops.c:1311) ==115884== by 0x50DB114: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1271) ==115884== by 0x50B00A3: vfprintf (vfprintf.c:1328) ==115884== by 0x50B8958: printf (printf.c:33) ==115884== by 0x400694: thread3 (t.c:11) ==115884== by 0x4C3881E: mythread_wrapper (hg_intercepts.c:398) ==115884== by 0x4E4C45D: start_thread (pthread_create.c:465) ==115884== by 0x515523E: clone (clone.S:95) ==115884== Address 0x5c167db is 1,611 bytes inside a block of size 4,096 alloc'd ==115884== at 0x4C2FEA5: malloc (vg_replace_malloc.c:380) ==115884== by 0x50CF5A8: _IO_file_doallocate (filedoalloc.c:101) ==115884== by 0x50DCDDF: _IO_doallocbuf (genops.c:398) ==115884== by 0x50DC087: _IO_file_overflow@@GLIBC_2.2.5 (fileops.c:812) ==115884== by 0x50DB1BE: _IO_new_file_xsputn (fileops.c:1323) ==115884== by 0x50DB1BE: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1271) ==115884== by 0x50B00A3: vfprintf (vfprintf.c:1328) ==115884== by 0x50B8958: printf (printf.c:33) ==115884== by 0x4006D0: thread2 (t.c:22) ==115884== by 0x4C3881E: mythread_wrapper (hg_intercepts.c:398) ==115884== by 0x4E4C45D: start_thread (pthread_create.c:465) ==115884== by 0x515523E: clone (clone.S:95) ==115884== Block was alloc'd by thread #2 ==115884== ==115884== ---------------------------------------------------------------- ==115884== ==115884== Possible data race during write of size 1 at 0x5C167DC by thread #3 ==115884== Locks held: none ==115884== at 0x4C3E007: mempcpy (vg_replace_strmem.c:1562) ==115884== by 0x50DB114: _IO_new_file_xsputn (fileops.c:1311) ==115884== by 0x50DB114: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1271) ==115884== by 0x50B1752: vfprintf (vfprintf.c:1642) ==115884== by 0x50B8958: printf (printf.c:33) ==115884== by 0x400694: thread3 (t.c:11) ==115884== by 0x4C3881E: mythread_wrapper (hg_intercepts.c:398) ==115884== by 0x4E4C45D: start_thread (pthread_create.c:465) ==115884== by 0x515523E: clone (clone.S:95) ==115884== Address 0x5c167dc is 1,612 bytes inside a block of size 4,096 alloc'd ==115884== at 0x4C2FEA5: malloc (vg_replace_malloc.c:380) ==115884== by 0x50CF5A8: _IO_file_doallocate (filedoalloc.c:101) ==115884== by 0x50DCDDF: _IO_doallocbuf (genops.c:398) ==115884== by 0x50DC087: _IO_file_overflow@@GLIBC_2.2.5 (fileops.c:812) ==115884== by 0x50DB1BE: _IO_new_file_xsputn (fileops.c:1323) ==115884== by 0x50DB1BE: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1271) ==115884== by 0x50B00A3: vfprintf (vfprintf.c:1328) ==115884== by 0x50B8958: printf (printf.c:33) ==115884== by 0x4006D0: thread2 (t.c:22) ==115884== by 0x4C3881E: mythread_wrapper (hg_intercepts.c:398) ==115884== by 0x4E4C45D: start_thread (pthread_create.c:465) ==115884== by 0x515523E: clone (clone.S:95) ==115884== Block was alloc'd by thread #2 ==115884== ==115884== ---------------------------------------------------------------- ==115884== ==115884== Possible data race during write of size 1 at 0x5C167DD by thread #3 ==115884== Locks held: none ==115884== at 0x4C3DFEA: mempcpy (vg_replace_strmem.c:1562) ==115884== by 0x50DB114: _IO_new_file_xsputn (fileops.c:1311) ==115884== by 0x50DB114: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1271) ==115884== by 0x50B06F4: vfprintf (vfprintf.c:1674) ==115884== by 0x50B8958: printf (printf.c:33) ==115884== by 0x400694: thread3 (t.c:11) ==115884== by 0x4C3881E: mythread_wrapper (hg_intercepts.c:398) ==115884== by 0x4E4C45D: start_thread (pthread_create.c:465) ==115884== by 0x515523E: clone (clone.S:95) ==115884== Address 0x5c167dd is 1,613 bytes inside a block of size 4,096 alloc'd ==115884== at 0x4C2FEA5: malloc (vg_replace_malloc.c:380) ==115884== by 0x50CF5A8: _IO_file_doallocate (filedoalloc.c:101) ==115884== by 0x50DCDDF: _IO_doallocbuf (genops.c:398) ==115884== by 0x50DC087: _IO_file_overflow@@GLIBC_2.2.5 (fileops.c:812) ==115884== by 0x50DB1BE: _IO_new_file_xsputn (fileops.c:1323) ==115884== by 0x50DB1BE: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1271) ==115884== by 0x50B00A3: vfprintf (vfprintf.c:1328) ==115884== by 0x50B8958: printf (printf.c:33) ==115884== by 0x4006D0: thread2 (t.c:22) ==115884== by 0x4C3881E: mythread_wrapper (hg_intercepts.c:398) ==115884== by 0x4E4C45D: start_thread (pthread_create.c:465) ==115884== by 0x515523E: clone (clone.S:95) ==115884== Block was alloc'd by thread #2 ==115884== ==115884== ==115884== Use --history-level=approx or =none to gain increased speed, at ==115884== the cost of reduced accuracy of conflicting-access information ==115884== For lists of detected and suppressed errors, rerun with: -s ==115884== ERROR SUMMARY: 1410 errors from 3 contexts (suppressed: 2768 from 42) I did a similar check using a RHEL-7 distro, where this false races are not reported with the available packages valgrind-3.15.0-11.el7.x86_64 and glibc-2.17-325.el7_9.x86_64. After building glibc 2.25 and 2.26 and patching the elf, helgrind didn't report races with the glibc2.25 executable and did report them with the glibc 2.26 one. Reproducible: Always if glibc used is newer/equal to 2.26, in all valgrind versions. Additional information: Tested with x86_64, s390x, aarch64 and ppc64le. glibc did not change anything related to mempcpy in ppc64le so the old suppression still applies there.
Created attachment 147188 [details] Patch adding suppression + testcase Attaching a patch that adds the corresponding suppression and a testcase with Mark Wielaard's reproducer from bug 352130
(In reply to Jesus Checa from comment #1) > Created attachment 147188 [details] > Patch adding suppression + testcase > > Attaching a patch that adds the corresponding suppression and a testcase > with Mark Wielaard's reproducer from bug 352130 Looks correct. Thanks for the analysis and all the testing. Locally verified against glibc-2.34. Pushed as: commit 7b5867b1fda1dbadc1291988828f6d6ad9438144 Author: Mark Wielaard <mark@klomp.org> Date: Fri Apr 8 14:58:38 2022 +0200 helgrind reports false races for printfs using mempcpy on FILE* state We already have a suppression for helgrind which is for when glibc uses __GI_mempcpy to manipulate internal FILE state (this was bug 352130). But since glibc-2.26 mempcpy is used instead __GI_mempcpy, making the suppresion from the original bug obsolete. This patch adds a new suppression using mempcpy but doesn't replace the original suppression for older systems. Patch adding suppression + testcase by Jesus Checa <jcheca@redhat.com> https://bugs.kde.org/show_bug.cgi?id=450962
commit 4c27f0f7649eb02336dcd0af9d2f8977b20d3a9f Author: Mark Wielaard <mark@klomp.org> Date: Mon Apr 11 14:45:49 2022 +0200 Extend helgrind suppression for _IO_*xsputn* FILE* state manipulation commit 7b5867b1f "helgrind reports false races for printfs using mempcpy on FILE* state" extended the helgrind-glibc-io-xsputn suppression by also covering mempcpy (instead of __GI_mempcpy). The test added in that commit exposed a couple of other variants of this suppression where _IO_*xsputn* called memcpy (instead of mempcpy) and/or had an extra indirection/function in between. Replace the two two suppressions with one that covers all cases where _IO_*xsputn* *mem*cpy variants with possibly another ... function in between. https://bugs.kde.org/show_bug.cgi?id=450962