469146 – massif --ignore-fn does not ignore inlined functions

Bug 469146 - massif --ignore-fn does not ignore inlined functions

Summary: massif --ignore-fn does not ignore inlined functions

Status:	RESOLVED FIXED

Alias:	None

Product:	valgrind
Classification:	Developer tools
Component:	massif (other bugs)
Version First Reported In:	3.21 GIT
Platform:	Other All

Importance:	NOR normal
Target Milestone:	---
Assignee:	Nicholas Nethercote

URL:
Keywords:

Depends on:
Blocks:

Reported:	2023-04-29 16:20 UTC by Paul Floyd
Modified:	2023-05-06 05:50 UTC (History)
CC List:	2 users (show)

See Also:
Latest Commit:
Version Fixed/Implemented In:
Sentry Crash Report:

Attachments
Patch so that massif filter_IPs handles inline functions (3.69 KB, patch) 2023-04-29 19:45 UTC, Paul Floyd	Details
Update patch to handle stacked and inlined functions (4.01 KB, patch) 2023-05-02 20:12 UTC, Paul Floyd	Details
Show Obsolete (1) View All Add an attachment

Note You need to log in before you can comment on or make changes to this bug.

Description Paul Floyd 2023-04-29 16:20:37 UTC

As discussed on valgrind-developers

Comment 1 Paul Floyd 2023-04-29 19:45:36 UTC

Created attachment 158542 [details]
Patch so that massif filter_IPs handles inline functions

Looks better now

Does affect a few other regtests, but they also look better to me. Might need some twiddling if different compilers do different inlining.

I noticed that inline functions don't get mangled. Makes sense, there is no need to link them. Will need to add that to the doc.

I do have one extra testcase for this.

Comment 2 Carl Love 2023-05-01 21:23:29 UTC

Per comment in the Valgrind emailI see what is happening now. The stack in question is

-----------------------------------------------
==2756940==    at 0x48A4C8C: malloc (vg_replace_malloc.c:431)
==2756940==    by 0x58025633: ??? (m_trampoline.S:458)
==2756940==    by 0x4007D17: call_init (dl-init.c:70)
==2756940==    by 0x4007D17: _dl_init (dl-init.c:117)
==2756940==    by 0x40311E7: _dl_start_user (in 
/usr/lib/powerpc64-linux-gnu/ld64.so.1)

Note the identical addresses for call_init and _dl_init. I believe that 
means that call_init is inlined.

This bit of code in ms_main.c skips over call_init

    // top has no fnname => search for the first entry that has a fnname
    for (i = *top; i < n_ips && !top_has_fnname; i++) {
      top_has_fnname = VG_(get_fnname)(ep, ips[i], &fnname);
    }

The workaround is to add _dl_init to the ignore functions.
----------------------------------------------------
I tried adding the line 

prog: new-cpp
 vgopts: --stacks=no --time-unit=B --massif-out-file=massif.out
 vgopts: --ignore-fn=__part_load_locale --ignore-fn=__time_load_locale --ignore-fn=dwarf2_unwind_dyld_add_image_hook
+vgopts: --ignore-fn=dl_init

to massif/tests/new-cpp-vgtest, but it did not change the results.  Not sure if I didn't do it correctly?

Comment 3 Carl Love 2023-05-01 22:25:14 UTC

I tried down loading and applying the patch from Paul to the Valgrind mainline. 
Head of the tree was:

commit d97fed7c3e4aa7c910dfa0b6c5de12fd6cf08155 (tag: VALGRIND_3_21_0, origin/master, origin/HEAD)
Author: Mark Wielaard <mark@klomp.org>
Date:   Fri Apr 28 17:30:04 2023 +0200

    -> 3.21.0 final

With the this commit, i.e. no patch, I get the following results from make regtest on Power 10.

== 714 tests, 2 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdou\
tB failures, 2 post failures ==
memcheck/tests/bug340392                 (stderr)
memcheck/tests/linux/rfcomm              (stderr)
massif/tests/new-cpp                     (post)
massif/tests/overloaded-new              (post)

With the patch for massif to filter_IPs hadles inline functions, I am seeing new failures on Power 10.

= 714 tests, 2 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdou\
tB failures, 6 post failures ==
memcheck/tests/bug340392                 (stderr)
memcheck/tests/linux/rfcomm              (stderr)
massif/tests/alloc-fns-B                 (post)
massif/tests/deep-B                      (post)
massif/tests/deep-C                      (post)
massif/tests/deep-D                      (post)
massif/tests/new-cpp                     (post)
massif/tests/overloaded-new              (post)

The diff output for new-cpp looks unchanged from the base run (i.e. no patch).

The diff output for deep-B is as follows:
more deep-B.post.diff
--- deep-B.post.exp	2023-05-01 16:49:31.458553002 -0400
+++ deep-B.post.out	2023-05-01 18:15:57.327511581 -0400
@@ -46,13 +46,15 @@
   8          3,264            3,264            3,200            64            0
   9          3,672            3,672            3,600            72            0
 98.04% (3,600B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
-->98.04% (3,600B) 0x........: a5 (deep.c:23)
-  ->98.04% (3,600B) 0x........: a4 (deep.c:24)
-    ->98.04% (3,600B) 0x........: a3 (deep.c:25)
-      ->98.04% (3,600B) 0x........: a2 (deep.c:26)
-        ->98.04% (3,600B) 0x........: a1 (deep.c:27)
-          ->98.04% (3,600B) 0x........: main (deep.c:35)
-            
+->98.04% (3,600B) 0x........: a11 (deep.c:17)
+  ->98.04% (3,600B) 0x........: a10 (deep.c:18)
+    ->98.04% (3,600B) 0x........: a9 (deep.c:19)
+      ->98.04% (3,600B) 0x........: a8 (deep.c:20)
+        ->98.04% (3,600B) 0x........: a7 (deep.c:21)
+          ->98.04% (3,600B) 0x........: a6 (deep.c:22)
+            ->98.04% (3,600B) 0x........: a5 (deep.c:23)
+              ->98.04% (3,600B) 0x........: a4 (deep.c:24)
+                
 -------------------------------------------------------------------------------
-
   n        time(B)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
 -------------------------------------------------------------------------------
-

The output for deep-C at first glance looks identical.

Comment 4 Paul Floyd 2023-05-02 07:18:47 UTC

Bah. I still think I'm on the right lines for the cause. I was expecting the extra fails.

I'll do some more debugging of new-cpp.

Comment 5 Paul Floyd 2023-05-02 20:10:41 UTC

My mistake was to replace loops that iterated over the callstack (but skipped inlined functions) with a loops that iterate over inline functions only.

We need to do both.

Comment 6 Paul Floyd 2023-05-02 20:12:16 UTC

Created attachment 158635 [details]
Update patch to handle stacked and inlined functions

Just the code fix. Will also need to update expecteds, maybe add a testcase and update the doc.

Comment 7 Paul Floyd 2023-05-03 09:54:30 UTC

(In reply to Paul Floyd from comment #6)
> Created attachment 158635 [details]
> Update patch to handle stacked and inlined functions
> 
> Just the code fix. Will also need to update expecteds, maybe add a testcase
> and update the doc.

On second thoughts deep-B looks like a regression. The alloc-fn loop is wrong somehow.

Comment 8 Paul Floyd 2023-05-03 10:15:03 UTC

+   for (i = 0; i < n_ips && !top_has_fnname; ++i) {

That needs to be "for (i = *top;"

+      iipc = VG_(new_IIPC)(ep, ips[i]);
+      do {
+         top_has_fnname = VG_(get_fnname_inl)(ep, ips[i], &fnname, iipc);
+         if (top_has_fnname && VG_(strIsMemberXA)(alloc_fns, fnname)) {
+            VERB(4, "filtering alloc fn %s\n", fnname);
+            (*top)++;
+            (*n_ips_sel)--;
+         } else {
+            i = n_ips;
+            break;
+         }
+      } while (VG_(next_IIPC)(iipc));
+      VG_(delete_IIPC)(iipc);

That's a bit tricker. The code is working its way down the stack as long as functions have a name and are in the alloc-fn set.

So maybe something like

 // look for inline functions
 do {
     top_has_fnname = VG_(get_fnname_inl)(ep, ips[i], &fnname, iipc);
     is_allloc_fn = VG_(strIsMemberXA)(alloc_fns, fnname);
  } while (top_has_fnname && is_alloc_fn && VG_(next_IIPC)(iipc));
  // next step down stack
      if (top_has_fnname &&  is_alloc_fn) {
         VERB(4, "filtering alloc fn %s\n", fnname);
         (*top)++;
         (*n_ips_sel)--;
      } else {
         break;
      }

Comment 9 Paul Floyd 2023-05-03 20:18:06 UTC

Carl, I've just pushed the patchset to a try branch

remotes/origin/users/paulf/try-massif_inline

I'm getting clean results for massif on gcc203, which is

paulf@gcc203:~/scratch/valgrind$ lscpu
Architecture:            ppc64
  CPU op-mode(s):        32-bit, 64-bit
  Byte Order:            Big Endian
CPU(s):                  64
  On-line CPU(s) list:   0-63
Model name:              POWER8 (architected), altivec supported


Could you check it out and try it please?

Comment 10 Carl Love 2023-05-03 23:49:16 UTC

I cloned a test tree with:
git clone git://sourceware.org/git/binutils-gdb.git gdb-test-massif

I then looked for the remote

git fetch origin
git branch -v -a > junk
 
then looked in junk but don't see your tree?

Is there some other repository where I should be looking?

Comment 11 Paul Floyd 2023-05-04 05:21:21 UTC

git clone  https://sourceware.org/git/valgrind.git && git checkout users/paulf/try-massif_inline
should do the job

Comment 12 Paul Floyd 2023-05-04 09:56:58 UTC

Nick could you take a look at this please and give me your opinion?

The main benefit I see is that it will make the --ignore-fn more consistent with the massif.out. Specifically users will be able to copy and paste from the stack traces within the snapshots in massif.out directly to a script (or the terminal) without having to try to figure out if the name should be mangled or not. Previously it wouldn't be possible to filter out the inline functions in the stack traces.

Comment 13 Paul Floyd 2023-05-04 10:06:14 UTC

I forgot, one thing I was thinking of was to try to speed up the allocs/frees

iipc = VG_(new_IIPC)(ep, ips[i]);
...
VG_(delete_IIPC)(iipc);

that are in the two stack loops.  My plan is to split up VG_(new_IIPC) so that it uses a new function VG_(init_IIPC)

That would allow
InlIPCursor iipc; // stack allocation
...
VG_(init_IIPC)(&iipc, ep, ips[i]);
...
// no need for heap deallocation

Comment 14 Carl Love 2023-05-04 16:15:33 UTC

per comment 10, 11

Paul, I see I cloned GDB not valgrind when trying to install your test tree.  Sigh, had spent all day working on a GDB patch and failed to completely switch gears late in the day to look at this issue.  Sorry about that.

I was able to clone Valgrind and checkout your tree.  I am running the full regression tests now and will compare with mainline.

Comment 15 Carl Love 2023-05-04 18:29:25 UTC

Paul,

I did the following tests on a Power 10 system with Fedora release 36.

I ran the mainline Valgrind with the 3.21.0 final commit at the top (commit d97fed7c3e4aa7c910dfa0b6c5de12fd6cf08155 )
The results are:

== 714 tests, 2 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdoutB failures, 2 post failures ==
memcheck/tests/bug340392                 (stderr)
memcheck/tests/linux/rfcomm              (stderr)
massif/tests/new-cpp                     (post)
massif/tests/overloaded-new              (post)

I then ran your try-massif tree which has 6 of your patches on top of the mainline Valgrine 3.21.0 final commit.  So, we have a good base line  comparison for your changes.

I see the following failures:

== 715 tests, 2 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdou\
tB failures, 0 post failures ==
memcheck/tests/bug340392                 (stderr)
memcheck/tests/linux/rfcomm              (stderr)

Both failures also occur in the mainline run.  Your branch fixes the massif tests cleanly.

Comment 16 Paul Floyd 2023-05-04 19:52:33 UTC

> Both failures also occur in the mainline run.  Your branch fixes the massif
> tests cleanly.

That's good.

I had a look at implementing a reset function for the "inlined IP cursor". It's not too difficult but the changes to the massif loops are a bit messy, so I'l wait and see if Nick has something to say.

Comment 17 Nick Nethercote 2023-05-05 11:13:01 UTC

It has been a long time since I looked at this code. I don't have much to add other than the changes seem plausible

Comment 18 Paul Floyd 2023-05-06 05:50:15 UTC

commit 645de069956aa667b924026d2125433461766e4d (HEAD -> master, origin/master, origin/HEAD)
Author: Paul Floyd <pjfloyd@wanadoo.fr>
Date:   Sat May 6 07:45:47 2023 +0200

    Bug 469146 - massif --ignore-fn does not ignore inlined functions