Bug 472346

Summary: False positive mismatched frees
Product: [Developer tools] valgrind Reporter: Paul Floyd <pjfloyd>
Component: memcheck Assignee: Paul Floyd <pjfloyd>
Status: RESOLVED NOT A BUG    
Severity: wishlist CC: sam, zilla
Priority: NOR    
Version First Reported In: 3.22 GIT   
Target Milestone: ---   
Platform: Compiled Sources   
OS: Unspecified   

Description Paul Floyd 2023-07-18 10:21:03 UTC
Some details. RHEL 7.9, GCC 5.3 home rolled. Debug exe but it dlopens a 

Here are a couple of cases that I see

==12389== Mismatched free() / delete / delete []
==12389==    at 0x63E3164: free (vg_replace_malloc.c:974)
==12389==    by 0x1B4D75F5: __gnu_cxx::new_allocator ..snip ... ::deallocate ...snip... (new_allocator.h:110)
==12389==  Address 0x27246130 is 0 bytes inside a block of size 88 alloc'd
==12389==    at 0x63E0EF1: operator new(unsigned long) (vg_replace_malloc.c:472)
==12389==    by 0x2514CCD: __gnu_cxx::new_allocator...snip...::allocate(unsigned long, void const*) (ext/new_allocator.h:104)

The code for deallocate is
      // __p is not permitted to be a null pointer.
      void
      deallocate(pointer __p, size_type)
      { ::operator delete(__p); }

The mangled last function in the callstack is
_ZN9__gnu_cxx13new_allocatorISt13_Rb_tree_nodeISt4pairIKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt6vectorIS2_IS8_S8_ESaISB_EEEEE10deallocateEPSF_m
which is deallocating a std::map<std::string, std::vector<std::pair<std::string,std::string> > >

cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt6vectorIS2_IS8_S8_ESaISB_EEEEE10deallocateEPSF_m>:
   925d6:       55                      push   %rbp
   925d7:       48 89 e5                mov    %rsp,%rbp
   925da:       48 83 ec 20             sub    $0x20,%rsp
   925de:       48 89 7d f8             mov    %rdi,-0x8(%rbp)
   925e2:       48 89 75 f0             mov    %rsi,-0x10(%rbp)
   925e6:       48 89 55 e8             mov    %rdx,-0x18(%rbp)
   925ea:       48 8b 45 f0             mov    -0x10(%rbp),%rax
   925ee:       48 89 c7                mov    %rax,%rdi
   925f1:       e8 6a 3c 9c 00          callq  a56260 <_ZdlPv>
   925f6:       90                      nop
   925f7:       c9                      leaveq 
   925f8:       c3                      retq

There is something weird there. _ZdlPv in mangle-speak is _Z (C++) dl (delete) Pv (pointer to void). But Valgrind is redirecting this to free. Originally I thought that the ::operator delete(__p) was simply being inlined.
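
The manual decoding above can be cross-checked programmatically. This is a minimal sketch, assuming a GCC or Clang toolchain that provides abi::__cxa_demangle in <cxxabi.h>; the helper name demangle is hypothetical and not part of Valgrind or the code under test.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: returns the demangled form of an Itanium-ABI
// mangled name, or the input unchanged if demangling fails.
std::string demangle(const char* mangled)
{
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result
    // with malloc, so the caller must release it with free.
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || out == nullptr)
        return mangled;
    std::string result(out);
    std::free(out);
    return result;
}
```

Calling demangle("_ZdlPv") yields "operator delete(void*)", and demangle("_Znwm") yields "operator new(unsigned long)", which matches the frames memcheck prints for the replaced functions.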

A second error from the same testcase:

==31400== Mismatched free() / delete / delete []
==31400==    at 0x63E36C5: operator delete(void*) (vg_replace_malloc.c:1025)
==31400==    by 0x2543553: ...snip... user code
==31400==  Address 0x285f41d0 is 0 bytes inside a block of size 64 alloc'd
==31400==    at 0x63E078B: malloc (vg_replace_malloc.c:431)
==31400==    by 0x27E4E2F7: operator new(unsigned long) (new_op.cc:50)
==31400==    by 0x2778CA2A: ...snip... user code

This is definitely a Valgrind problem - operator delete is redirected but operator new isn't. I tried debugging this with vgdb, putting a breakpoint on _Znwm, and there were 4 sites. Need more tracing / debugging of redirs and symtab when that shared library gets dlopen'd and mmap'd.

Now a second exe which generates huge numbers of errors on startup. These are for things like

const set<string> options = { "option1", "option2"};

Temporary std::string objects get created for the options which cause errors like

==31386== Mismatched free() / delete / delete []
==31386==    at 0x188D7164: free (vg_replace_malloc.c:974)
==31386==    by 0x620C460: __static_initialization_and_destruction_0(int, int) (header.h:63)
==31386==    by 0x621BE96: _GLOBAL__sub_I_source.cc (source.cc:12969)
==31386==    by 0x163647CC: __libc_csu_init (in /path/to/exe)
==31386==    by 0x20E9A4E4: (below main) (in /usr/lib64/libc-2.17.so)
==31386==  Address 0x217dc590 is 0 bytes inside a block of size 37 alloc'd
==31386==    at 0x188D61E7: operator new[](unsigned long) (vg_replace_malloc.c:714)
==31386==    by 0x736C3DB: void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char const*>(char const*, char const*, std::forward_iterator_tag) (basic_string.tcc:223)
==31386==    by 0x1B24FA9E: _M_construct_aux<char const*> (basic_string.h:236)
==31386==    by 0x1B24FA9E: _M_construct<char const*> (basic_string.h:255)
==31386==    by 0x1B24FA9E: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) (basic_string.h:511)
==31386==    by 0x620C3D7: __static_initialization_and_destruction_0(int, int) (header.h:44)
==31386==    by 0x621BE96: _GLOBAL__sub_I_source.cc (source.cc:12969)
==31386==    by 0x163647CC: __libc_csu_init (in /path/to/exe)
==31386==    by 0x20E9A4E4: (below main) (in /usr/lib64/libc-2.17.so)

What I had originally thought was inlining might be tail call optimization.
In 000000000012b000 <_ZNSt7__cxx1112basic_stringIwSt11char_traitsIwESaIwEE9_M_createERmm>:
there is
  12b036:       e9 25 b1 f5 ff          jmpq   86160 <_Znwm@plt>
Comment 1 Paul Floyd 2024-03-19 09:26:19 UTC
For the second exe, the problem is with tcmalloc. As an optimization it just uses a single function for free and operator deletes. See the comments here https://github.com/gperftools/gperftools/issues/792.
Comment 2 Paul Floyd 2024-03-19 16:23:43 UTC
Not too surprisingly, the second problem is also a user error.

There's a 3rd party library that links statically with libstdc++ using a linker script to give all symbols local scope except for a few global symbols in the exported interface.

Then there is a header with inline functions that call new and delete. The new gets called from the context of the 3rd party lib and so uses the hidden static libstdc++ new. The delete gets called from the context of the main exe and uses dynamic libstdc++ which isn't hidden. So memcheck sees a 'malloc' for the allocation and a 'delete' for the deallocation, and complains. The static new can be seen from the debuginfo but it doesn't get redirected.
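
The pairing rule behind these reports can be written down as a small model. This is a simplified sketch and not Valgrind's actual source; the names AllocKind, FreeKind and is_mismatched are hypothetical. It assumes the kind recorded for a block is that of the outermost intercepted allocation function. When the hidden static operator new is not redirected, the first intercepted call is the malloc it makes internally.

```cpp
// Simplified model, NOT memcheck's implementation.
// Kind recorded when the allocation was intercepted.
enum class AllocKind { Malloc, New, NewArray };
// Kind of the intercepted deallocation call.
enum class FreeKind { Free, Delete, DeleteArray };

// Returns true when the deallocation does not belong to the same
// family as the allocation, which is the condition reported as
// "Mismatched free() / delete / delete []".
bool is_mismatched(AllocKind a, FreeKind f)
{
    switch (a) {
    case AllocKind::Malloc:   return f != FreeKind::Free;
    case AllocKind::New:      return f != FreeKind::Delete;
    case AllocKind::NewArray: return f != FreeKind::DeleteArray;
    }
    return true;  // not reached for valid enumerator values
}
```

In this model the 3rd party library case is is_mismatched(AllocKind::Malloc, FreeKind::Delete), and the tcmalloc case from the second exe is is_mismatched(AllocKind::NewArray, FreeKind::Free). Both return true even though the underlying memory is handled consistently at run time.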