As discussed on valgrind-developers
Created attachment 158542 [details] Patch so that massif filter_IPs handles inline functions Looks better now Does affect a few other regtests, but they also look better to me. Might need some twiddling if different compilers do different inlining. I noticed that inline functions don't get mangled. Makes sense, there is no need to link them. Will need to add that to the doc. I do have one extra testcase for this.
Per comment in the Valgrind emailI see what is happening now. The stack in question is ----------------------------------------------- ==2756940== at 0x48A4C8C: malloc (vg_replace_malloc.c:431) ==2756940== by 0x58025633: ??? (m_trampoline.S:458) ==2756940== by 0x4007D17: call_init (dl-init.c:70) ==2756940== by 0x4007D17: _dl_init (dl-init.c:117) ==2756940== by 0x40311E7: _dl_start_user (in /usr/lib/powerpc64-linux-gnu/ld64.so.1) Note the identical addresses for call_init and _dl_init. I believe that means that call_init is inlined. This bit of code in ms_main.c skips over call_init // top has no fnname => search for the first entry that has a fnname for (i = *top; i < n_ips && !top_has_fnname; i++) { top_has_fnname = VG_(get_fnname)(ep, ips[i], &fnname); } The workaround is to add _dl_init to the ignore functions. ---------------------------------------------------- I tried adding the line prog: new-cpp vgopts: --stacks=no --time-unit=B --massif-out-file=massif.out vgopts: --ignore-fn=__part_load_locale --ignore-fn=__time_load_locale --ignore-fn=dwarf2_unwind_dyld_add_image_hook +vgopts: --ignore-fn=dl_init to massif/tests/new-cpp-vgtest, but it did not change the results. Not sure if I didn't do it correctly?
I tried down loading and applying the patch from Paul to the Valgrind mainline. Head of the tree was: commit d97fed7c3e4aa7c910dfa0b6c5de12fd6cf08155 (tag: VALGRIND_3_21_0, origin/master, origin/HEAD) Author: Mark Wielaard <mark@klomp.org> Date: Fri Apr 28 17:30:04 2023 +0200 -> 3.21.0 final With the this commit, i.e. no patch, I get the following results from make regtest on Power 10. == 714 tests, 2 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdou\ tB failures, 2 post failures == memcheck/tests/bug340392 (stderr) memcheck/tests/linux/rfcomm (stderr) massif/tests/new-cpp (post) massif/tests/overloaded-new (post) With the patch for massif to filter_IPs hadles inline functions, I am seeing new failures on Power 10. = 714 tests, 2 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdou\ tB failures, 6 post failures == memcheck/tests/bug340392 (stderr) memcheck/tests/linux/rfcomm (stderr) massif/tests/alloc-fns-B (post) massif/tests/deep-B (post) massif/tests/deep-C (post) massif/tests/deep-D (post) massif/tests/new-cpp (post) massif/tests/overloaded-new (post) The diff output for new-cpp looks unchanged from the base run (i.e. no patch). The diff output for deep-B is as follows: more deep-B.post.diff --- deep-B.post.exp 2023-05-01 16:49:31.458553002 -0400 +++ deep-B.post.out 2023-05-01 18:15:57.327511581 -0400 @@ -46,13 +46,15 @@ 8 3,264 3,264 3,200 64 0 9 3,672 3,672 3,600 72 0 98.04% (3,600B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc. -->98.04% (3,600B) 0x........: a5 (deep.c:23) - ->98.04% (3,600B) 0x........: a4 (deep.c:24) - ->98.04% (3,600B) 0x........: a3 (deep.c:25) - ->98.04% (3,600B) 0x........: a2 (deep.c:26) - ->98.04% (3,600B) 0x........: a1 (deep.c:27) - ->98.04% (3,600B) 0x........: main (deep.c:35) - +->98.04% (3,600B) 0x........: a11 (deep.c:17) + ->98.04% (3,600B) 0x........: a10 (deep.c:18) + ->98.04% (3,600B) 0x........: a9 (deep.c:19) + ->98.04% (3,600B) 0x........: a8 (deep.c:20) + ->98.04% (3,600B) 0x........: a7 (deep.c:21) + ->98.04% (3,600B) 0x........: a6 (deep.c:22) + ->98.04% (3,600B) 0x........: a5 (deep.c:23) + ->98.04% (3,600B) 0x........: a4 (deep.c:24) + ------------------------------------------------------------------------------- - n time(B) total(B) useful-heap(B) extra-heap(B) stacks(B) ------------------------------------------------------------------------------- - The output for deep-C at first glance looks identical.
Bah. I still think I'm on the right lines for the cause. I was expecting the extra fails. I'll do some more debugging of new-cpp.
My mistake was to replace loops that iterated over the callstack (but skipped inlined functions) with a loops that iterate over inline functions only. We need to do both.
Created attachment 158635 [details] Update patch to handle stacked and inlined functions Just the code fix. Will also need to update expecteds, maybe add a testcase and update the doc.
(In reply to Paul Floyd from comment #6) > Created attachment 158635 [details] > Update patch to handle stacked and inlined functions > > Just the code fix. Will also need to update expecteds, maybe add a testcase > and update the doc. On second thoughts deep-B looks like a regression. The alloc-fn loop is wrong somehow.
+ for (i = 0; i < n_ips && !top_has_fnname; ++i) { That needs to be "for (i = *top;" + iipc = VG_(new_IIPC)(ep, ips[i]); + do { + top_has_fnname = VG_(get_fnname_inl)(ep, ips[i], &fnname, iipc); + if (top_has_fnname && VG_(strIsMemberXA)(alloc_fns, fnname)) { + VERB(4, "filtering alloc fn %s\n", fnname); + (*top)++; + (*n_ips_sel)--; + } else { + i = n_ips; + break; + } + } while (VG_(next_IIPC)(iipc)); + VG_(delete_IIPC)(iipc); That's a bit tricker. The code is working its way down the stack as long as functions have a name and are in the alloc-fn set. So maybe something like // look for inline functions do { top_has_fnname = VG_(get_fnname_inl)(ep, ips[i], &fnname, iipc); is_allloc_fn = VG_(strIsMemberXA)(alloc_fns, fnname); } while (top_has_fnname && is_alloc_fn && VG_(next_IIPC)(iipc)); // next step down stack if (top_has_fnname && is_alloc_fn) { VERB(4, "filtering alloc fn %s\n", fnname); (*top)++; (*n_ips_sel)--; } else { break; }
Carl, I've just pushed the patchset to a try branch remotes/origin/users/paulf/try-massif_inline I'm getting clean results for massif on gcc203, which is paulf@gcc203:~/scratch/valgrind$ lscpu Architecture: ppc64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Big Endian CPU(s): 64 On-line CPU(s) list: 0-63 Model name: POWER8 (architected), altivec supported Could you check it out and try it please?
I cloned a test tree with: git clone git://sourceware.org/git/binutils-gdb.git gdb-test-massif I then looked for the remote git fetch origin git branch -v -a > junk then looked in junk but don't see your tree? Is there some other repository where I should be looking?
git clone https://sourceware.org/git/valgrind.git && git checkout users/paulf/try-massif_inline should do the job
Nick could you take a look at this please and give me your opinion? The main benefit I see is that it will make the --ignore-fn more consistent with the massif.out. Specifically users will be able to copy and paste from the stack traces within the snapshots in massif.out directly to a script (or the terminal) without having to try to figure out if the name should be mangled or not. Previously it wouldn't be possible to filter out the inline functions in the stack traces.
I forgot, one thing I was thinking of was to try to speed up the allocs/frees iipc = VG_(new_IIPC)(ep, ips[i]); ... VG_(delete_IIPC)(iipc); that are in the two stack loops. My plan is to split up VG_(new_IIPC) so that it uses a new function VG_(init_IIPC) That would allow InlIPCursor iipc; // stack allocation ... VG_(init_IIPC)(&iipc, ep, ips[i]); ... // no need for heap deallocation
per comment 10, 11 Paul, I see I cloned GDB not valgrind when trying to install your test tree. Sigh, had spent all day working on a GDB patch and failed to completely switch gears late in the day to look at this issue. Sorry about that. I was able to clone Valgrind and checkout your tree. I am running the full regression tests now and will compare with mainline.
Paul, I did the following tests on a Power 10 system with Fedora release 36. I ran the mainline Valgrind with the 3.21.0 final commit at the top (commit d97fed7c3e4aa7c910dfa0b6c5de12fd6cf08155 ) The results are: == 714 tests, 2 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdoutB failures, 2 post failures == memcheck/tests/bug340392 (stderr) memcheck/tests/linux/rfcomm (stderr) massif/tests/new-cpp (post) massif/tests/overloaded-new (post) I then ran your try-massif tree which has 6 of your patches on top of the mainline Valgrine 3.21.0 final commit. So, we have a good base line comparison for your changes. I see the following failures: == 715 tests, 2 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdou\ tB failures, 0 post failures == memcheck/tests/bug340392 (stderr) memcheck/tests/linux/rfcomm (stderr) Both failures also occur in the mainline run. Your branch fixes the massif tests cleanly.
> Both failures also occur in the mainline run. Your branch fixes the massif > tests cleanly. That's good. I had a look at implementing a reset function for the "inlined IP cursor". It's not too difficult but the changes to the massif loops are a bit messy, so I'l wait and see if Nick has something to say.
It has been a long time since I looked at this code. I don't have much to add other than the changes seem plausible
commit 645de069956aa667b924026d2125433461766e4d (HEAD -> master, origin/master, origin/HEAD) Author: Paul Floyd <pjfloyd@wanadoo.fr> Date: Sat May 6 07:45:47 2023 +0200 Bug 469146 - massif --ignore-fn does not ignore inlined functions