Bug 339330

Summary: Feature request: Add support for C++11's std::atomic
Product: [Developer tools] valgrind Reporter: Michael Ensslin <michael>
Component: helgrindAssignee: Paul Floyd <pjfloyd>
Status: REPORTED ---    
Severity: wishlist CC: drahflow, jhasse, pjfloyd
Priority: NOR    
Version First Reported In: 3.9.0   
Target Milestone: ---   
Platform: Arch Linux   
OS: Linux   
Latest Commit: Version Fixed/Implemented In:
Sentry Crash Report:

Description Michael Ensslin 2014-09-23 14:28:56 UTC
Currently, helgrind is not aware of any of the atomic types from the C++11 standard library, producing heaps of false positives.

Reproducible: Always

Steps to Reproduce:
mic@mic-nb /tmp $ cat test.cpp
#include <atomic>
#include <cstdio>
#include <thread>

int main() {
	std::atomic_bool a{false};
	std::thread t{[&a]() {
		a = true;
		std::printf("%i\n", a.load());
	}};
	a = false;
	std::printf("%i\n", a.load());
	t.join();
}
mic@mic-nb /tmp $ g++ -std=c++11 -lpthread test.cpp
mic@mic-nb /tmp $ valgrind --tool=helgrind ./a.out

Actual Results:  
False positives are reported for each access to the atomic variable

Expected Results:  
Program reported as clean

helgrind output:

==22370== Helgrind, a thread error detector
==22370== Copyright (C) 2007-2013, and GNU GPL'd, by OpenWorks LLP et al.
==22370== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==22370== Command: ./a.out
==22370== 
1
==22370== ---Thread-Announcement------------------------------------------
==22370== 
==22370== Thread #1 is the program's root thread
==22370== 
==22370== ---Thread-Announcement------------------------------------------
==22370== 
==22370== Thread #2 was created
==22370==    at 0x596A3AE: clone (in /usr/lib/libc-2.20.so)
==22370==    by 0x4E425A4: do_clone.constprop.4 (in /usr/lib/libpthread-2.20.so)
==22370==    by 0x4E43A6C: pthread_create@@GLIBC_2.2.5 (in /usr/lib/libpthread-2.20.so)
==22370==    by 0x4C302CD: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==22370==    by 0x5112E78: std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>) (in /usr/lib/libstdc++.so.6.0.20)
==22370==    by 0x400EB5: std::thread::thread<main::{lambda()#1}>(main::{lambda()#1}&&) (in /tmp/a.out)
==22370==    by 0x400D81: main (in /tmp/a.out)
==22370== 
==22370== ----------------------------------------------------------------
==22370== 
==22370== Possible data race during write of size 1 at 0xFFEFFFBBF by thread #1
==22370== Locks held: none
==22370==    at 0x401FC1: std::__atomic_base<bool>::operator=(bool) (in /tmp/a.out)
==22370==    by 0x401D88: std::atomic_bool::operator=(bool) (in /tmp/a.out)
==22370==    by 0x400D92: main (in /tmp/a.out)
==22370== 
==22370== This conflicts with a previous write of size 1 by thread #2
==22370== Locks held: none
==22370==    at 0x401FC1: std::__atomic_base<bool>::operator=(bool) (in /tmp/a.out)
==22370==    by 0x401D88: std::atomic_bool::operator=(bool) (in /tmp/a.out)
==22370==    by 0x400D2F: main::{lambda()#1}::operator()() const (in /tmp/a.out)
==22370==    by 0x401D0B: void std::_Bind_simple<main::{lambda()#1} ()>::_M_invoke<>(std::_Index_tuple<>) (in /tmp/a.out)
==22370==    by 0x401C50: std::_Bind_simple<main::{lambda()#1} ()>::operator()() (in /tmp/a.out)
==22370==    by 0x401BCD: std::thread::_Impl<std::_Bind_simple<main::{lambda()#1} ()> >::_M_run() (in /tmp/a.out)
==22370==    by 0x5112D1F: execute_native_thread_routine (in /usr/lib/libstdc++.so.6.0.20)
==22370==    by 0x4C30466: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==22370==  Address 0xffefffbbf is on thread #1's stack
==22370==  in frame #2, created by main (???)
==22370== 
0
==22370== 
==22370== For counts of detected and suppressed errors, rerun with: -v
==22370== Use --history-level=approx or =none to gain increased speed, at
==22370== the cost of reduced accuracy of conflicting-access information
==22370== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 25 from 25)
Comment 1 Paul Floyd 2021-01-05 09:32:23 UTC
I think that this just needs a specific suppression, something like

{
   helgrind---std::atomic assignment
   Helgrind:Race
   fun:store
   fun:_ZNSt13__atomic_baseIbEaSEb
   fun:_ZNSt6atomicIbEaSEb
}
Comment 2 Paul Floyd 2021-01-06 10:32:11 UTC
That's not enough, at least for Fedora 33, where I get further hazards

==12068== Possible data race during write of size 1 at 0x4DC7E30 by thread #1
==12068== Locks held: none
==12068==    at 0x4846952: mempcpy (vg_replace_strmem.c:1562)
==12068==    by 0x4C68261: _IO_file_xsputn@@GLIBC_2.2.5 (in /usr/lib64/libc-2.32.so)
==12068==    by 0x4C53018: __vfprintf_internal (in /usr/lib64/libc-2.32.so)
==12068==    by 0x4C3F2AE: printf (in /usr/lib64/libc-2.32.so)
==12068==    by 0x40125E: main (std_atomic.cpp:12)
==12068==  Address 0x4dc7e30 is 0 bytes inside a block of size 1,024 alloc'd
==12068==    at 0x483B7A5: malloc (vg_replace_malloc.c:307)
==12068==    by 0x4C5B6E3: _IO_file_doallocate (in /usr/lib64/libc-2.32.so)
==12068==    by 0x4C6A09F: _IO_doallocbuf (in /usr/lib64/libc-2.32.so)
==12068==    by 0x4C69237: _IO_file_overflow@@GLIBC_2.2.5 (in /usr/lib64/libc-2.32.so)
==12068==    by 0x4C682E5: _IO_file_xsputn@@GLIBC_2.2.5 (in /usr/lib64/libc-2.32.so)
==12068==    by 0x4C53018: __vfprintf_internal (in /usr/lib64/libc-2.32.so)
==12068==    by 0x4C3F2AE: printf (in /usr/lib64/libc-2.32.so)
==12068==    by 0x4011FD: main::{lambda()#1}::operator()() const (std_atomic.cpp:9)
==12068==    by 0x4015D1: void std::__invoke_impl<void, main::{lambda()#1}>(std::__invoke_other, main::{lambda()#1}&&) (invoke.h:60)
==12068==    by 0x401586: std::__invoke_result<main::{lambda()#1}>::type std::__invoke<main::{lambda()#1}>(std::__invoke_result&&, (main::{lambda()#1}&&)...) (invoke.h:95)
==12068==    by 0x401533: void std::thread::_Invoker<std::tuple<main::{lambda()#1}> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) (thread:264)
==12068==    by 0x401507: std::thread::_Invoker<std::tuple<main::{lambda()#1}> >::operator()() (thread:271)
==12068==  Block was alloc'd by thread #2
Comment 3 Paul Floyd 2021-10-19 16:14:52 UTC
I just tried this on FreeBSD 13 with clang 11 and gcc 10.3 and there were no errors. That's probably because the libc implementation has mutex protection, though I don't understand why I don't see any 

std::base<bool>::operator=(bool)

hazards.

I'll recheck on Fedora 34 that the second hazard I saw really came from the std::atomic.
Comment 4 Paul Floyd 2023-04-20 10:23:04 UTC
I don't think that the 2nd hazard that I saw was related. It should be suppressed by

{
   helgrind-glibc2X-004
   Helgrind:Race
   obj:@GLIBC_LIBC_PATH@
}

(Note that this is a super broad suppression which suppresses every leaf function in glibc. Probably most are thread safe but POSIX doesn't require all to be
https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_09_01)

I guess that I was seeing that hazard before the suppression file got made glibc path agnostic. As far as I know printf is thread safe.

I just tried the original example again (RHEL 7.9, GCC 11.2) and get no error.

Getting back to my original idea of using a suppression. That's not so bad but it will only work with debug builds / builds with -fno-inline. In optimized builds "std::atomic<bool>::operator=(bool)" gets inlined.
Comment 5 Jens-W. Schicke-Uffmann 2025-03-18 14:11:11 UTC
This problem is not easily solved "just" by suppressions: std::atomic accesses don't create happens-before edges, and thus spurious data races are also reported for user code:

#include <thread>
#include <memory>
#include <atomic>

using namespace std;

int main(void) {
  std::atomic<std::shared_ptr<int>> value;

  thread reader([&value](){
    while(true) {
      auto ptr = value.load();
      if(ptr && *ptr == 1) { // line 13
        break;
      }
    }
  });

  value.store(make_shared<int>(1));
  reader.join();
  return 0;
}

compiled with g++-14 --std=c++20 -ggdb example.c++
and then valgrind --tool=helgrind a.out
reports apart from a lot of internals of std::atomic<std::shared_ptr<...>> also

==1512249== Possible data race during read of size 4 at 0x4DF0250 by thread #2
==1512249== Locks held: none
==1512249==    at 0x10921E: main::{lambda()#1}::operator()() const (example.c++:13)
==1512249==    by 0x109673: void std::__invoke_impl<void, main::{lambda()#1}>(std::__invoke_other, main::{lambda()#1}&&) (invoke.h:61)
==1512249==    by 0x109636: std::__invoke_result<main::{lambda()#1}>::type std::__invoke<main::{lambda()#1}>(main::{lambda()#1}&&) (invoke.h:96)
==1512249==    by 0x1095E3: void std::thread::_Invoker<std::tuple<main::{lambda()#1}> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) (std_thread.h:301)
==1512249==    by 0x1095B7: std::thread::_Invoker<std::tuple<main::{lambda()#1}> >::operator()() (std_thread.h:308)
==1512249==    by 0x10959B: std::thread::_State_impl<std::thread::_Invoker<std::tuple<main::{lambda()#1}> > >::_M_run() (std_thread.h:253)
==1512249==    by 0x4964223: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.33)
==1512249==    by 0x484F38A: ??? (in /usr/libexec/valgrind/vgpreload_helgrind-amd64-linux.so)
==1512249==    by 0x4B9F39B: start_thread (pthread_create.c:444)
==1512249==    by 0x4C2048F: clone (clone.S:100)
==1512249==  Address 0x4df0250 is 16 bytes inside a block of size 24 alloc'd
==1512249==    at 0x4842FD3: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_helgrind-amd64-linux.so)
==1512249==    by 0x10A986: std::__new_allocator<std::_Sp_counted_ptr_inplace<int, std::allocator<void>, (__gnu_cxx::_Lock_policy)2> >::allocate(unsigned long, void const*) (new_allocator.h:151)
==1512249==    by 0x10A59B: allocate (allocator.h:196)
==1512249==    by 0x10A59B: allocate (alloc_traits.h:515)
==1512249==    by 0x10A59B: std::__allocated_ptr<std::allocator<std::_Sp_counted_ptr_inplace<int, std::allocator<void>, (__gnu_cxx::_Lock_policy)2> > > std::__allocate_guarded<std::allocator<std::_Sp_counted_ptr_inplace<int, std::allocator<void>, (__gnu_cxx::_Lock_policy)2> > >(std::allocator<std::_Sp_counted_ptr_inplace<int, std::allocator<void>, (__gnu_cxx::_Lock_policy)2> >&) (allocated_ptr.h:98)
==1512249==    by 0x10A3FB: std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<int, std::allocator<void>, int>(int*&, std::_Sp_alloc_shared_tag<std::allocator<void> >, int&&) (shared_ptr_base.h:967)
==1512249==    by 0x10A22F: std::__shared_ptr<int, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<void>, int>(std::_Sp_alloc_shared_tag<std::allocator<void> >, int&&) (shared_ptr_base.h:1713)
==1512249==    by 0x109EEE: std::shared_ptr<int>::shared_ptr<std::allocator<void>, int>(std::_Sp_alloc_shared_tag<std::allocator<void> >, int&&) (shared_ptr.h:463)
==1512249==    by 0x109BF5: std::shared_ptr<int> std::make_shared<int, int>(int&&) (shared_ptr.h:1008)
==1512249==    by 0x1092A8: main (example.c++:19)
==1512249==  Block was alloc'd by thread #1
==1512249==
Comment 6 Paul Floyd 2025-03-18 16:27:59 UTC
I've long given up using suppressions.

I can't see how this can be fixed in Valgrind. There is instrumentation, but in this case I think that it would need to be done within the standard library or at least by hacking into the internals of std::atomic/std::shared_ptr.