Bug 435732 - memcheck/tests/leak_cpp_interior fails with gcc11
Summary: memcheck/tests/leak_cpp_interior fails with gcc11
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: memcheck (show other bugs)
Version: unspecified
Platform: Other Linux
: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-04-14 17:39 UTC by Khem Raj
Modified: 2021-11-14 08:23 UTC (History)
3 users (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Khem Raj 2021-04-14 17:39:54 UTC
SUMMARY

Test fails with gcc 11


STEPS TO REPRODUCE
1. compile whole system with gcc11
2. build valgrind from master
3. make tests

OBSERVED RESULT

memcheck/tests/leak_cpp_interior.stderr.diff
************
--- leak_cpp_interior.stderr.exp
+++ leak_cpp_interior.stderr.out
@@ -9,7 +9,7 @@
    definitely lost: x bytes in 1 blocks
    indirectly lost: 0 bytes in 0 blocks
      possibly lost: 0 bytes in 0 blocks
-   still reachable: x bytes in 8 blocks
+   still reachable: x bytes in 9 blocks
                       of which reachable via heuristic:
                         stdstring          : x bytes in 2 blocks
                         length64           : x bytes in 1 blocks
@@ -23,7 +23,7 @@
    definitely lost: x (+0) bytes in 1 (+0) blocks
    indirectly lost: 0 (+0) bytes in 0 (+0) blocks
      possibly lost: x (+x) bytes in 4 (+4) blocks
-   still reachable: x (-x) bytes in 4 (-4) blocks
+   still reachable: x (-x) bytes in 5 (-4) blocks
                       of which reachable via heuristic:
                         stdstring          : 0 (-x) bytes in 0 (-2) blocks
                         length64           : 0 (-x) bytes in 0 (-1) blocks
@@ -35,10 +35,10 @@
 LEAK SUMMARY:
    definitely lost: x (+0) bytes in 1 (+0) blocks
    indirectly lost: 0 (+0) bytes in 0 (+0) blocks
-     possibly lost: x (-x) bytes in 5 (+1) blocks
-   still reachable: x (+x) bytes in 3 (-1) blocks
+     possibly lost: x (-x) bytes in 4 (+0) blocks
+   still reachable: x (+x) bytes in 5 (+0) blocks
                       of which reachable via heuristic:
-                        newarray           : x (+x) bytes in 1 (+1) blocks
+                        newarray           : x (+x) bytes in 2 (+2) blocks
                         multipleinheritance: 0 (-x) bytes in 0 (-2) blocks
 To see details of leaked memory, give 'full' arg to leak_check

@@ -46,11 +46,11 @@
 LEAK SUMMARY:
    definitely lost: x (+0) bytes in 1 (+0) blocks
    indirectly lost: 0 (+0) bytes in 0 (+0) blocks
-     possibly lost: x (-x) bytes in 5 (+0) blocks
-   still reachable: x (+x) bytes in 3 (+0) blocks
+     possibly lost: x (+x) bytes in 5 (+1) blocks
+   still reachable: x (-x) bytes in 4 (-1) blocks
                       of which reachable via heuristic:
                         length64           : x (+x) bytes in 1 (+1) blocks
-                        newarray           : 0 (-x) bytes in 0 (-1) blocks
+                        newarray           : 0 (-x) bytes in 0 (-2) blocks
 To see details of leaked memory, give 'full' arg to leak_check

 leak_check summary heuristics stdstring
@@ -58,7 +58,7 @@
    definitely lost: x (+0) bytes in 1 (+0) blocks
    indirectly lost: 0 (+0) bytes in 0 (+0) blocks
      possibly lost: x (-x) bytes in 4 (-1) blocks
-   still reachable: x (+x) bytes in 4 (+1) blocks
+   still reachable: x (+x) bytes in 5 (+1) blocks

@@ -82,7 +82,7 @@
    definitely lost: x (+0) bytes in 1 (+0) blocks
    indirectly lost: 0 (+0) bytes in 0 (+0) blocks
      possibly lost: 0 (+0) bytes in 0 (+0) blocks
-   still reachable: x (+0) bytes in 8 (+0) blocks
+   still reachable: x (+0) bytes in 9 (+0) blocks
                       of which reachable via heuristic:
                         stdstring          : x (+0) bytes in 2 (+0) blocks
                         length64           : x (+0) bytes in 1 (+0) blocks
@@ -95,7 +95,7 @@
    definitely lost: x (+0) bytes in 1 (+0) blocks
    indirectly lost: 0 (+0) bytes in 0 (+0) blocks
      possibly lost: x (+x) bytes in 6 (+6) blocks
-   still reachable: x (-x) bytes in 2 (-6) blocks
+   still reachable: x (-x) bytes in 3 (-6) blocks
                       of which reachable via heuristic:
                         stdstring          : 0 (-x) bytes in 0 (-2) blocks
                         length64           : 0 (-x) bytes in 0 (-1) blocks
=== Test Summary ===
TOTAL: 736
PASSED: 695
FAILED: 1
SKIPPED: 40
EXPECTED RESULT

No differences

SOFTWARE/OS VERSIONS
Windows: 
macOS: 
Linux/KDE Plasma: Yocto ( master )
(available in About System)
KDE Plasma Version: 
KDE Frameworks Version: 
Qt Version: 

ADDITIONAL INFORMATION
Comment 1 Paul Floyd 2021-04-15 09:58:40 UTC
Just a few comments on this. I'm not too sure how the std::string heuristic works these days.


Here is the code from mc_leakcheck.c

if (HiS(LchStdString, heur_set)) {
      // Detects inner pointers to Std::String for layout being
      //     length capacity refcount char_array[] \0
      // where ptr points to the beginning of the char_array.
      // Note: we check definedness for length and capacity but
      // not for refcount, as refcount size might be smaller than
      // a SizeT, giving a uninitialised hole in the first 3 SizeT.
      if ( ptr == ch->data + 3 * sizeof(SizeT)
           && MC_(is_valid_aligned_word)(ch->data + sizeof(SizeT))) {
         const SizeT capacity = *((SizeT*)(ch->data + sizeof(SizeT)));
         if (3 * sizeof(SizeT) + capacity + 1 == ch->szB
            && MC_(is_valid_aligned_word)(ch->data)) {
            const SizeT length = *((SizeT*)ch->data);
            if (length <= capacity) {
               // ??? could check there is no null byte from ptr to ptr+length-1
               // ???    and that there is a null byte at ptr+length.
               // ???
               // ??? could check that ch->allockind is MC_AllocNew ???
               // ??? probably not a good idea, as I guess stdstring
               // ??? allocator can be done via custom allocator
               // ??? or even a call to malloc ????
               VG_(set_fault_catcher) (prev_catcher);
               return LchStdString;
            }
         }
      }
   }

Modern C++ does not use an std:string layout that looks like
//     length capacity refcount char_array[] \0

In fact, since C++11, refounts in strings has been forbidden. GCC was a bit slow implementing this, which was done for GCC 5. With GCC 6 or 7 (can't remember which) the default C++ compilation changed to C++14. This is fairly important as GCC maintained backwards compatibility using C++ namespaces, So it is possible to use either the old or the new layout [https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_abi.html]. GCC 11 might have switched to defaulting to C++17, which could change things again (but I'm not aware of any such changes).

So what are the newer layouts?

With GCC / libstdc++ it is like this


   // _Alloc_hider just contains
   //  pointer _M_p; // The actual data.
   _Alloc_hider      _M_dataplus;
   size_type         _M_string_length;

   enum { _S_local_capacity = 15 / sizeof(_CharT) };

   union
   {
     _CharT           _M_local_buf[_S_local_capacity + 1];
     size_type        _M_allocated_capacity;
   };

With clang/libc++, see here 

https://eu90h.github.io/cpp-strings.html

Roughly speaking, the std:string now uses something called the "short string optimization" which avoids any 

struct __long {
   size_type __cap_;
   size_type __size_;
   pointer   __data_;
};


struct __short {
   union {
	 unsigned char __size_;
	 value_type __lx;
   };
   value_type __data_[__min_cap];
};

where __min_cap is 23.


Either way, the actual data pointer is not where the current code assumes that it is. I don't know of a way to tell which ABI is being used.
Comment 2 Paul Floyd 2021-04-15 20:34:49 UTC
Hmm. The stuff that I wrote for C++11 was mostly right, but it is wrong for C++03 and earlier.

libstdc++ prior to C++11 jus has a pointer in std::string. This points to memory containing the layout  len/cap/refcount/sting data.

I need to check to see how this testcase is compiled and do some debugging.
Comment 3 Paul Floyd 2021-04-16 13:27:47 UTC
I can reproduce this problem with a much older g++, 5.3.0.

Running the testcase in place gives the following information

==13544== 72,704 bytes in 1 blocks are still reachable in loss record 10 of 10
==13544==    at 0x402D80A: malloc (vg_replace_malloc.c:380)
==13544==    by 0x40D14BF: pool (eh_alloc.cc:117)
==13544==    by 0x40D14BF: __static_initialization_and_destruction_0 (eh_alloc.cc:244)
==13544==    by 0x40D14BF: _GLOBAL__sub_I_eh_alloc.cc (eh_alloc.cc:307)
==13544==    by 0x400F902: _dl_init (in /usr/lib64/ld-2.17.so)
==13544==    by 0x4001159: ??? (in /usr/lib64/ld-2.17.so)

I would expect the freeres function to clean this up. Need to do more debugging to see why it is not being called (or if it is being called, why it isn't freeing this memory).
Comment 4 Mark Wielaard 2021-10-27 16:00:06 UTC
See also https://bugs.kde.org/show_bug.cgi?id=403802
Comment 5 Paul Floyd 2021-10-27 20:01:40 UTC
The doesn't reproduce with g++11 (FreeBSD Ports Collection) 11.2.0
Comment 6 Paul Floyd 2021-10-28 13:07:56 UTC
We should be looking at leak_cpp_interior.stderr.diff-64bit rather than leak_cpp_interior.stderr.diff (same for 403802)

With g++ 9.2 on RHEL 7 I get a similar failure, but without the still reachable that I get with g++ 5.3. 


Running the test standalone as follows

 ../../vg-in-place --leak-check-heuristics=multipleinheritance,stdstring,newarray,length64 --suppressions=libstdc++.supp --leak-check=full --show-reachable=yes ./leak_cpp_interior

it looks like the problem is due to mismatches of the suppressions in libstdc++.supp and those produced by different GCC/libstc++ versions.


The following diff may be a fix

diff --git a/memcheck/tests/libstdc++.supp b/memcheck/tests/libstdc++.supp
index 520e6613a..475e2d9fa 100644
--- a/memcheck/tests/libstdc++.supp
+++ b/memcheck/tests/libstdc++.supp
@@ -81,10 +81,7 @@
   fun:pool
   fun:__static_initialization_and_destruction_0
   fun:_GLOBAL__sub_I_eh_alloc.cc
-  fun:call_init.part.0
-  fun:call_init
-  fun:_dl_init
-  obj:*lib*/ld-2.*.so
+  obj:*lib*/libstdc++.so*
 }
 {
    malloc-leaks-cxx-stl-string-classes-debug2
Comment 7 Paul Floyd 2021-11-02 09:01:42 UTC
I think that an even broader suppression for the stdio buffer should be OK, as follows

diff --git a/memcheck/tests/libstdc++.supp b/memcheck/tests/libstdc++.supp
index 520e6613a..3cd2e628d 100644
--- a/memcheck/tests/libstdc++.supp
+++ b/memcheck/tests/libstdc++.supp
@@ -81,10 +81,6 @@
   fun:pool
   fun:__static_initialization_and_destruction_0
   fun:_GLOBAL__sub_I_eh_alloc.cc
-  fun:call_init.part.0
-  fun:call_init
-  fun:_dl_init
-  obj:*lib*/ld-2.*.so
 }
 {
    malloc-leaks-cxx-stl-string-classes-debug2
Comment 8 Paul Floyd 2021-11-12 23:08:14 UTC
I made the suppression even broader. It's gone from failing to passing for me on Fedora 34 / GCC 11.2.1
Comment 9 Paul Floyd 2021-11-13 07:41:51 UTC
In the nightly tests I see that this is failing for several platforms still (s390x Fedora 34, amd64 Fedora 34 to 36).

It could be that the difference is down to whether debuginfo is installed/available for libstfc++.
Comment 10 Paul Floyd 2021-11-13 11:36:02 UTC
I do see a difference because of debuginfo, and pushed a second change for that. If the nightlies are clean I'll close this (and 403802)
Comment 11 Paul Floyd 2021-11-14 08:23:56 UTC
In only see Philippe W's GCC 4.8.5 / Contos 7 test failing now, and I think that is due to some other issue. So this can be closed.