Bug 367995 - Integration of memcheck with custom memory allocator
Summary: Integration of memcheck with custom memory allocator
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: memcheck
Version: 3.11.0
Platform: Compiled Sources Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Ivo Raisr
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-08-30 09:23 UTC by Ruurd Beerstra
Modified: 2017-01-22 17:20 UTC
3 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
Patch to support new types of mempool (32.55 KB, patch)
2016-08-30 09:23 UTC, Ruurd Beerstra
Details
metamempool.patch (33.89 KB, application/octet-stream)
2016-09-15 11:46 UTC, Ruurd Beerstra
Details
reviewed patch (35.14 KB, patch)
2016-09-15 19:43 UTC, Ivo Raisr
Details
proposed patch v3 (35.78 KB, patch)
2016-09-20 14:32 UTC, Ivo Raisr
Details
metamempoolv2.patch (31.64 KB, application/octet-stream)
2016-09-21 12:32 UTC, Ruurd Beerstra
Details
metamempoolv3.patch (32.02 KB, application/octet-stream)
2016-09-22 13:37 UTC, Ruurd Beerstra
Details

Description Ruurd Beerstra 2016-08-30 09:23:23 UTC
Created attachment 100849 [details]
Patch to support new types of mempool

The 'loosely defined' model for memory pools in valgrind is not sufficient for our custom allocator. There are 2 problems:
- Destroying a pool without explicitly freeing all items results in leaks being reported. Our allocator (and our apps) allow this, and the memory does get freed and reused.
- Our allocator uses itself to allocate the blocks of memory that the pools then allocate from. Valgrind considers this 'overlapping blocks', which is a fatal error.

To support this, I've added a new mempool create call. Internally, 2 extra flags are maintained: an "auto_free" flag that arranges auto-freeing of chunks from a mempool, and a "metablock" flag that suppresses the check for overlapping blocks.
I've added regression test cases for the new options.
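To give a feel for the intended client-side usage, here is a minimal sketch. It uses the request and flag names the patch eventually settled on (VALGRIND_CREATE_MEMPOOL_EXT with VALGRIND_MEMPOOL_AUTO_FREE and VALGRIND_MEMPOOL_METAPOOL); this first revision still spells the flags MEMPOOL_AUTO_FREE and MEMPOOL_METABLOCKS, and a real allocator is of course more involved.

/* Sketch: a pool whose backing blocks are themselves MALLOCLIKE blocks
   (METAPOOL) and whose chunks need not be freed one by one before the
   pool is destroyed (AUTO_FREE).  Error handling omitted. */
#include <stdlib.h>
#include <valgrind/valgrind.h>

#define BLOCK_SZ (64 * 1024)

struct pool { char *block, *used, *end; };

static struct pool *pool_create(void)
{
   struct pool *p = malloc(sizeof *p);
   VALGRIND_CREATE_MEMPOOL_EXT(p, /*rzB*/0, /*is_zeroed*/0,
                               VALGRIND_MEMPOOL_AUTO_FREE |
                               VALGRIND_MEMPOOL_METAPOOL);
   p->block = malloc(BLOCK_SZ);                  /* a "meta block"    */
   VALGRIND_MALLOCLIKE_BLOCK(p->block, BLOCK_SZ, /*rzB*/0, /*zeroed*/0);
   p->used = p->block;
   p->end  = p->block + BLOCK_SZ;
   return p;
}

static void *pool_alloc(struct pool *p, size_t n)
{
   if ((size_t)(p->end - p->used) < n)
      return NULL;                               /* simplistic sketch */
   void *q = p->used;                            /* bump allocation   */
   p->used += n;
   VALGRIND_MEMPOOL_ALLOC(p, q, n);
   return q;
}

static void pool_destroy(struct pool *p)
{
   VALGRIND_DESTROY_MEMPOOL(p);       /* auto-frees remaining chunks  */
   VALGRIND_FREELIKE_BLOCK(p->block, /*rzB*/0);
   free(p->block);
   free(p);
}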
I've been using the patched version for about 5 weeks on various Linux versions:
- SuSE 11.4
- SuSE 12.1
- RHAT 6.4
- RHAT 7.2 
And it works the way it should.
All regression tests work as before; the added tests exercise the new functionality.
I'd appreciate it if this patch made it into the next version.
Comment 1 Ivo Raisr 2016-08-31 13:44:05 UTC
Thank you for the patch and especially for the regression tests.

Overall it looks quite good but needs some polishing first.
I have the following questions and comments:

1. Please ensure all lines are shorter than 80 characters.
I even spotted some that are > 100 chars.
I understand the existing code is not 100% perfect and there can be some
exceptions, but the general rule is <= 80 chars.

2. include/valgrind.h
+#define MEMPOOL_METABLOCKS 2
I wonder why this is called MEMPOOL_METABLOCKS and not simply MEMPOOL_METAPOOL?
That would greatly simplify its description both in valgrind.h and mc-manual.xml,
and also greatly enhance understanding of its functionality.

3. describe_addr()
Why is it necessary to check for the meta mempool almost at the end?
Please add some commentary.

4. mempool_block_maybe_describe() 
We could do better here with respect to the meta mempool description.
Based on your experience, is it sufficient to differentiate between meta and non-meta pools
only by the allocation stack trace?
I think the meta pool deserves its own BlockKind value, which pp_addrinfo_WRK() would utilize.

5. memcheck/mc_include.h
+      int          is_mempool_block;
Use 'Bool' instead and 'True'/'False' for values.

6. MC_(detect_memory_leaks)()
+   // Another exception is that a custom allocator uses VALGRIND_MALLOCLIKE_BLOCK
+   // to allocate mempool blocks to allocate from, indicated by the metablock
+   // property of the mempool block.

I don't get the meaning of this sentence. Could you please break it into more swallowable chunks
or rephrase it?

7. memcheck/mc_leakcheck.c
+         if (! ((ch1->is_mempool_block && (mp1 = find_mp_of_chunk(ch1)) != NULL && mp1->metablock) ||
+                (ch2->is_mempool_block && (mp2 = find_mp_of_chunk(ch2)) != NULL && mp2->metablock)
+               )
Please do not use variable assignment in an 'if' condition. This makes the logic hard to follow.
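One possible shape for that, as a sketch (names follow the quoted hunk, not necessarily the final code):

   /* Hoist the assignments out of the condition (sketch). */
   MC_Mempool* mp1 = ch1->is_mempool_block ? find_mp_of_chunk(ch1) : NULL;
   MC_Mempool* mp2 = ch2->is_mempool_block ? find_mp_of_chunk(ch2) : NULL;
   Bool ch1_is_meta = mp1 != NULL && mp1->metablock;
   Bool ch2_is_meta = mp2 != NULL && mp2->metablock;

   if (! (ch1_is_meta || ch2_is_meta)) {
      /* ... report the overlap as before ... */
   }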

8. memcheck/mc_main.c  2016-08-16 11:20:22.000000000 +0200
          MC_(new_block) ( tid, p, sizeB, /*ignored*/0, is_zeroed, 
-                          MC_AllocCustom, MC_(malloc_list) );
+                          MC_AllocCustom, /*not pool*/0, MC_(malloc_list) );
Use 'False' instead of '0'. The comment is not in sync with the MC_(new_block)() declaration.

9. memcheck/mc_main.c, around  case VG_USERREQ__CREATE_MEMPOOL:
+         UInt flags     =       arg[4];
 -         MC_(create_mempool) ( pool, rzB, is_zeroed );
+         // The create_mempool function does not know these mempool flags, pass as booleans.
+         MC_(create_mempool) ( pool, rzB, is_zeroed, (flags & MEMPOOL_AUTO_FREE), (flags & MEMPOOL_METABLOCKS) );

The comment here is redundant.

10. memcheck/mc_malloc_wrappers.c
+   mc->is_mempool_block = is_mempool_block;
    VG_(HT_add_node)( table, mc );
-
    if (is_zeroed)

Why was that empty line removed?

11. void MC_(create_mempool)()
+      VG_(message)(Vg_UserMsg, "create_mempool(0x%lx, %u, %d, %d, %d)\n",
+                               pool, rzB, is_zeroed, auto_free, metablock);

Please improve this by clarifying what is what in the message, for example:
"create_mempool(<addr>, size=%u, zerod=%d, autofree=%d, metablock=%d)"

12. MC_(mempool_free)()
+   if (mp->auto_free) {
+      // All allocs in this memory block are not to been seen as leaked.
+      free_mallocs_in_mempool_block(mp,mc->data,mc->data + (mc->szB + 0UL));
+   }
The comment is redundant.
You are missing spaces after commas for the argument list.

13. free_mallocs_in_mempool_block()
+
+   if (mp->auto_free) {
+   ...

Why do you test for mp->auto_free again? This was already done in MC_(mempool_free)().

14. free_mallocs_in_mempool_block()
+      if (VG_(clo_verbosity) > 2) {
+         VG_(message)(Vg_UserMsg, "MEMPOOL_DELETE: pool from 0x%lx to 0x%lx (size %ld) auto-free\n",
+            StartAddr,EndAddr,(unsigned long int) (EndAddr - StartAddr));
+      }
and 
+               if (VG_(clo_verbosity) > 2) {
+                  VG_(message)(Vg_UserMsg, "Auto-free of 0x%lx size=%ld\n",mc->data,(long int) mc->szB);
+               }

Please get these messages in sync with the rest of this module so they are first-class citizens.
Also use spaces after commas. You cannot use %ld to print an unsigned long int. A naked 'unsigned long int' is not used in the Valgrind core - use something more appropriate, for example SizeT. Is the cast really necessary here?
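For instance, a sketch of one possible form (illustrative message wording; %lu matches SizeT/Addr on the platforms valgrind targets, so the cast can go):

   /* Illustrative wording; match the module's existing messages. */
   if (VG_(clo_verbosity) > 2)
      VG_(message)(Vg_UserMsg,
                   "mempool_free: auto-freeing pool 0x%lx..0x%lx (%lu bytes)\n",
                   StartAddr, EndAddr, EndAddr - StartAddr);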

15. free_mallocs_in_mempool_block()
+         while (!found && (mc = VG_(HT_Next)(MC_(malloc_list))) ) {
Please do not use variable assignment in a 'while' condition. This makes the logic hard to follow.

16. free_mallocs_in_mempool_block()
+   int found;
Use 'Bool' and 'True'/'False'.

17. memcheck/tests/leak-autofreepool.c
Why not have 'struct cell' and 'struct pool' typedef'd to something more easily used, for
example 'cell_t' and 'pool_t'?

+struct pool _PlainPool, *PlainPool = &_PlainPool;
+struct pool _MetaPool,  *MetaPool  = &_MetaPool;
Use static.

+static int    MetaPoolFlags = 0;
For consideration: You can use arguments to create_meta_pool() and remove this
unnecessary variable.

+   /* A pool-block is expected to have overhead, and the core of
What the comment calls 'overhead' is usually called 'metadata' in allocator lingo. Just for clarification.
Comment 2 Ivo Raisr 2016-08-31 14:06:14 UTC
Also, when running the newly introduced regression tests, I encountered this failure:
$ cat memcheck/tests/leak-autofreepool-5.stderr.diff 
--- leak-autofreepool-5.stderr.exp      2016-08-31 15:08:12.835033806 +0000
+++ leak-autofreepool-5.stderr.out      2016-08-31 15:38:03.503967988 +0000
@@ -11,7 +11,7 @@
    ...
 This is usually caused by using VALGRIND_MALLOCLIKE_BLOCK in an inappropriate way.
 
-Memcheck: mc_leakcheck.c:1847... (vgMemCheck_detect_memory_leaks): the 'impossible' happened.
+Memcheck: mc_leakcheck.c:1908... (vgMemCheck_detect_memory_leaks): the 'impossible' happened.
 
 host stacktrace:
    ...

Please re-check the .exp files.
Comment 3 Julian Seward 2016-09-15 10:12:59 UTC
Ruurd, the window for 3.12 is closing fast.  Can you redo the patch taking
account of Ivo's review comments?
Comment 4 Ruurd Beerstra 2016-09-15 11:15:06 UTC
Hi,

Yes, I've got it finished. The patched version has been running for a while on various systems.
But then "real work" intervened and it just sat there.
I'll submit the revised patch this afternoon.

Thank you for the heads-up!

	Ruurd

Comment 5 Ruurd Beerstra 2016-09-15 11:46:37 UTC
Created attachment 101097 [details]
metamempool.patch

Hi,

Thank you for taking the time to look (very thoroughly) into this.

>--- Comment #1 from Ivo Raisr <ivosh@ivosh.net> --- Thank you for the patch and especially for regression
>tests.
>
>Overall it looks quite good but needs some polishing first.
>I have the following questions and comments:
>
>1. Please ensure all lines are shorter than 80 characters.
>I even spotted some that are > 100 chars.
>I understand the existing code is not 100% perfect and there can be some exceptions, but the general rule
>is <= 80 chars.

For real?  Personally, I like to take advantage of modern technology like big screens.
But: Your source, your rules. I've changed it.

>2. include/valgrind.h
>+#define MEMPOOL_METABLOCKS 2
>I wonder why this is called MEMPOOL_METABLOCKS and not simply MEMPOOL_METAPOOL?
>That would greatly simplify its description both in valgrind.h and mc-manual.xml,
>and also greatly enhance understanding of its functionality.

You're right. My first stab at hacking valgrind was a fix at the block level.
As my understanding of valgrind grew, the code changed but the name did not. My bad. Fixed it.

>3. describe_addr()
>Why is it necessary to check for the meta mempool almost at the end?
>Please add some commentary.

Done.

>4. mempool_block_maybe_describe()
>We could do better here with respect to the meta mempool description.
>Based on your experience, is it sufficient to differentiate between meta and non-meta pools only
>by the allocation stack trace?
>I think the meta pool deserves its own BlockKind value, which
>pp_addrinfo_WRK() would utilize.

Our custom allocator allocates metapool blocks, and then doles out addresses from those blocks to the applications.
That makes every address handed to the application part of a metapool block, which is why describe_addr looks for that kind of description at the very end: all metablock descriptions are boring, namely the place in the allocator where the metablock was allocated.
The allocation stack is even misleading, as a random event in the application may trigger the condition in the allocator where it decides it needs a new block for the pool.
So a "better" metapool-block description would only ever point to the (uninteresting/misleading) place in the allocator itself.
So I did not change the description.

>5. memcheck/mc_include.h
>+      int          is_mempool_block;
>Use 'Bool' instead and 'True'/'False' for values.

Done. But I could not help noticing that this rule is violated often in the valgrind sources...

>6. MC_(detect_memory_leaks)()
>+   // Another exception is that a custom allocator uses
>VALGRIND_MALLOCLIKE_BLOCK
>+   // to allocate mempool blocks to allocate from, indicated by the metablock
>+   // property of the mempool block.
>
>I don't get the meaning of this sentence. Could you please break it into more swallowable chunks or rephrase it?

Done.

>7. memcheck/mc_leakcheck.c
>+         if (! ((ch1->is_mempool_block && (mp1 = find_mp_of_chunk(ch1)) != NULL && mp1->metablock) ||
>+                (ch2->is_mempool_block && (mp2 = find_mp_of_chunk(ch2)) != NULL && mp2->metablock)
>+               )
>Please do not use variable assignment in an 'if' condition. This makes the logic hard to follow.

Hmm. A matter of taste and style. It turns the 2 lines into about 15. There is something to be said for brevity, too.
Also, I saw plenty of places in the valgrind source where this technique is used.
But: Your source: Done. 

>8. memcheck/mc_main.c  2016-08-16 11:20:22.000000000 +0200
>          MC_(new_block) ( tid, p, sizeB, /*ignored*/0, is_zeroed, 
>-                          MC_AllocCustom, MC_(malloc_list) );
>+                          MC_AllocCustom, /*not pool*/0, MC_(malloc_list) );
>Use 'False' instead of '0'. The comment is not in sync with the MC_(new_block)() declaration.

I interpreted the in-line comments as "show what the constant means", rather than as a compulsory exact quote of the
name of the parameter (like the "ignored" in the same call).
The definition of the function is only a single keystroke/mouse click away, if I want to look at it.
But: Done.

>9. memcheck/mc_main.c, around  case VG_USERREQ__CREATE_MEMPOOL:
>+         UInt flags     =       arg[4];
> -         MC_(create_mempool) ( pool, rzB, is_zeroed );
>+         // The create_mempool function does not know these mempool flags, pass as booleans.
>+         MC_(create_mempool) ( pool, rzB, is_zeroed, (flags & MEMPOOL_AUTO_FREE), (flags & MEMPOOL_METABLOCKS) );
>
>The comment here is redundant.

For you, maybe. My first attempt simply passed the flags to the create_mempool function, which failed to compile.
It surprised me that I could not use those constants in the receiving function.
Extending the "flags" with more bits requires changing the signature of create_mempool.
So a comment is appropriate IMHO.

>10. memcheck/mc_malloc_wrappers.c
>+   mc->is_mempool_block = is_mempool_block;
>    VG_(HT_add_node)( table, mc );
>-
>    if (is_zeroed)
>
>Why was that empty line removed?

By accident. Restored.

>11. void MC_(create_mempool)()
>+      VG_(message)(Vg_UserMsg, "create_mempool(0x%lx, %u, %d, %d, %d)\n",
>+                               pool, rzB, is_zeroed, auto_free, metablock);
>
>Please improve this by clarifying what is what in the message, for example:
>"create_mempool(<addr>, size=%u, zerod=%d, autofree=%d, metablock=%d)"

I just extended the existing message.
The other functions also simply print their parameters without annotation (mempool_alloc, mempool_free, mempool_trim, etc).
But: Done.

>12. MC_(mempool_free)()
>+   if (mp->auto_free) {
>+      // All allocs in this memory block are not to been seen as leaked.
>+      free_mallocs_in_mempool_block(mp,mc->data,mc->data + (mc->szB + 0UL));
>+   }
>The comment is redundant.
>You are missing spaces after commas for the argument list.

IMHO: Comments are hardly ever redundant.
And CSCOPE finds 23507 instances of a comma not followed by a space in C-sources of valgrind :-)
But: Done.

>13. free_mallocs_in_mempool_block()
>+
>+   if (mp->auto_free) {
>+   ...
>
>Why do you test for mp->auto_free again? This was already done in MC_(mempool_free)().

Because what is true today can be false tomorrow. I am used to sprinkling sources with assertions and "superfluous" extra checks.
That has saved me on numerous occasions.
But, on reflection, I've changed it to a tl_assert instead of an IF that silently ignores a bad call.

>14. free_mallocs_in_mempool_block()
>+      if (VG_(clo_verbosity) > 2) {
>+         VG_(message)(Vg_UserMsg, "MEMPOOL_DELETE: pool from 0x%lx to 0x%lx (size %ld) auto-free\n",
>+            StartAddr,EndAddr,(unsigned long int) (EndAddr - StartAddr));
>+      }
>and 
>+               if (VG_(clo_verbosity) > 2) {
>+                  VG_(message)(Vg_UserMsg, "Auto-free of 0x%lx size=%ld\n",mc->data,(long int) mc->szB);
>+               }
>
>Please get these messages in sync with the rest of this module so they are first-class citizens.
>Also use spaces after commas. You cannot use %ld to print an unsigned long int.
>A naked 'unsigned long int' is not used in the Valgrind core - use something more appropriate, for example
>SizeT. Is the cast really necessary here?

Done. 

>15. free_mallocs_in_mempool_block()
>+         while (!found && (mc = VG_(HT_Next)(MC_(malloc_list))) ) {
>Please do not use variable assignment in a 'while' condition. This makes the logic hard to follow.

I disagree. So does the rest of the valgrind source: there are 173 occurrences of assignment in a while statement in the 3.11.0 C source.
All but 2 of the 21 HT_Next usages in the 3.11.0 source tree are of the same form: assignment in a while.
The only other 2 HT_Next uses are in libhb_core.c (line 4861): there it leads to code duplication (which, in my book, is a Truly Bad Thing).

>16. free_mallocs_in_mempool_block()
>+   int found;
>Use 'Bool' and 'True'/'False'.

Done.

>17. memcheck/tests/leak-autofreepool.c
>Why not have 'struct cell' and 'struct pool' typedef'd to something more easily used, for example
>'cell_t' and 'pool_t'?

Because the test was copy/pasted/modified from the existing leak-pool.c test.

>+struct pool _PlainPool, *PlainPool = &_PlainPool;
>+struct pool _MetaPool,  *MetaPool  = &_MetaPool;
>Use static.

Done. 

>+static int    MetaPoolFlags = 0;
>For consideration: You can use arguments to create_meta_pool() and remove this unnecessary variable.

That is how I would have written it, but the other tests in the test framework that I studied follow this pattern: pass a single argument which has a magic value.
The magic value is translated by the main function into a specific test case (in this case: flags in MetaPoolFlags).
For example, the existing leak-pool test calls leak-pool with values 0 - 5, with all .EXP files having the same magic value in the name.
I only learned enough about the test framework to get the extra regression test up & running, and tried to stay in the same vein as the existing tests.
Are you saying that is a Bad Thing?

>+   /* A pool-block is expected to have overhead, and the core of
>What the comment calls 'overhead' is usually called 'metadata' in allocator lingo. Just for clarification.

Our custom allocator calls it "overhead" :-)
Updated.

> Also, when running the newly introduced regression tests, I encountered this
>failure:
>$ cat memcheck/tests/leak-autofreepool-5.stderr.diff 
>--- leak-autofreepool-5.stderr.exp      2016-08-31 15:08:12.835033806 +0000
>+++ leak-autofreepool-5.stderr.out      2016-08-31 15:38:03.503967988 +0000

Embarrassing: the line number of the assert sneaked into the .exp file, and the (new) filter left that line number intact.
Two bugs that cancelled each other out.
It worked for me, but nowhere else, of course. Darn, but programming is hard :-)
I fixed the filter and the .exp file for test 5. It now says:

make[1]: Leaving directory `/home/rbeerstr/grind/valgrind-3.11.patch/memcheck/tests'
leak-autofreepool-0: valgrind   --leak-check=full --show-possibly-lost=no --track-origins=yes ./leak-autofreepool 0
leak-autofreepool-1: valgrind   --leak-check=full --show-possibly-lost=no --track-origins=yes ./leak-autofreepool 1
leak-autofreepool-2: valgrind   --leak-check=full --show-possibly-lost=no --track-origins=yes ./leak-autofreepool 2
leak-autofreepool-3: valgrind   --leak-check=full --show-possibly-lost=no --track-origins=yes ./leak-autofreepool 3
leak-autofreepool-4: valgrind   --leak-check=full --show-possibly-lost=no --track-origins=yes ./leak-autofreepool 4
leak-autofreepool-5: valgrind   --leak-check=full --show-possibly-lost=no --track-origins=yes ./leak-autofreepool 5

== 6 tests, 0 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==

Works for me (again :-)

Hope this version of the patch is acceptable,
	
	Regards,
	Ruurd Beerstra
Comment 6 Ivo Raisr 2016-09-15 11:56:07 UTC
Thank you for addressing my comments. I will try the new patch today or tomorrow.
Comment 7 Ivo Raisr 2016-09-15 19:43:31 UTC
Created attachment 101103 [details]
reviewed patch

Patch submitted by Ruurd with amended NEWS.
Comment 8 Ivo Raisr 2016-09-15 19:44:33 UTC
The attached patch is ready to land.
Regression tests pass on x86/Linux, amd64/Linux (Ubuntu), x86/Solaris and amd64/Solaris.
Comment 9 Julian Seward 2016-09-20 10:47:09 UTC
Ivo, thank you for reviewing this; Ruurd, thank you for addressing
Ivo's comments.

I looked at the revised patch.  I am generally a bit nervous about
mempool changes given that they are not much used and I am not sure
any of the Valgrind developers really understands that code any more.
But given that it looks plausible and it has some tests, I am OK with
it, provided the issues below can be cleared up.

I have some comments/questions:


(1)
When the new functionality is not in use (that is, neither
MEMPOOL_AUTO_FREE nor MEMPOOL_METAPOOL has been passed to the mempool
creation routines), is there any weakening of the sanity checking or
assertions, relative to pre-patch?


(2)
--- include/valgrind.h	(revision 15958)
+++ include/valgrind.h	(working copy)
+#define MEMPOOL_AUTO_FREE  1
+#define MEMPOOL_METAPOOL 2

(2a) Please call these VALGRIND_MEMPOOL_{AUTO_FREE,METAPOOL} so as
     to be consistent with the rest of the file and so as to reduce
     the chances of weird namespace clashes, given that this file is
     #include-d into arbitrary end-user code.  And update the doc diff
     accordingly.

(2b) Please mention, in the comment, that the flags are intended to be
     ORd together (viz, it's not an enum).  This wasn't obvious to me
     on first reading.


(3)
===================================================================
--- memcheck/mc_include.h	(revision 15958)
+++ memcheck/mc_include.h	(working copy)
@@ -67,6 +67,7 @@
       Addr         data;            // Address of the actual block.
       SizeT        szB : (sizeof(SizeT)*8)-2; // Size requested; 30 or 62 bits.
       MC_AllocKind allockind : 2;   // Which operation did the allocation.
+      Bool         is_mempool_block;
       ExeContext*  where[0];
       /* Variable-length array. The size depends on MC_(clo_keep_stacktraces).
          This array optionally stores the alloc and/or free stack trace. */

I am concerned about the addition of a Bool to struct _MC_Chunk.
There can be hundreds of thousands of these.  Given that an extra Bool
will add another word to the structure, the new Bool could increase
memory use by several megabytes.  There have been significant efforts
made in the past to keep these structures small, and I don't want to
undo them.

Is it absolutely necessary to add this new Bool?  Is there a way to
avoid it?
Comment 10 Ivo Raisr 2016-09-20 14:32:39 UTC
Created attachment 101201 [details]
proposed patch v3
Comment 11 Ruurd Beerstra 2016-09-20 14:55:12 UTC
Hi,

Answers between the text below...

>--- Comment #9 from Julian Seward <jseward@acm.org> --- Ivo, thank you for reviewing this; Ruurd,
>thank you for addressing Ivo's comments.
>
>I looked at the revised patch.  I am generally a bit nervous about mempool changes given that they
>are not much used and I am not sure any of the Valgrind developers really understands that code
>any more.
>But given that it looks plausible and it has some tests, I am OK with it, provided the issues below
>can be cleared up.
>
>I have some comments/questions:
>
>
>(1)
>When the new functionality is not in use (that is, neither MEMPOOL_AUTO_FREE nor MEMPOOL_METAPOOL
>has been passed to the mempool creation routines), is there any weakening of the sanity checking
>or assertions, relative to pre-patch?

AFAIK: No. I tried my best to make sure I did nothing to break any old behavior.
When both flags are off, you get a Plain Old Memory pool.
The regression tests that deal with pools all still succeed.

>(2)
>--- include/valgrind.h    (revision 15958)
>+++ include/valgrind.h    (working copy)
>+#define MEMPOOL_AUTO_FREE  1
>+#define MEMPOOL_METAPOOL 2
>
>(2a) Please call these VALGRIND_MEMPOOL_{AUTO_FREE,METAPOOL} so as
>     to be consistent with the rest of the file and so as to reduce
>     the chances of weird namespace clashes, given that this file is
>     #include-d into arbitrary end-user code.  And update the doc diff
>     accordingly.

Done. All occurrences in all files updated.

>(2b) Please mention, in the comment, that the flags are intended to be
>     ORd together (viz, it's not an enum).  This wasn't obvious to me
>     on first reading.

Done.

>(3)
>===================================================================
>--- memcheck/mc_include.h    (revision 15958)
>+++ memcheck/mc_include.h    (working copy)
>@@ -67,6 +67,7 @@
>       Addr         data;            // Address of the actual block.
>       SizeT        szB : (sizeof(SizeT)*8)-2; // Size requested; 30 or 62
>bits.
>       MC_AllocKind allockind : 2;   // Which operation did the allocation.
>+      Bool         is_mempool_block;
>       ExeContext*  where[0];
>       /* Variable-length array. The size depends on MC_(clo_keep_stacktraces).
>          This array optionally stores the alloc and/or free stack trace. */
>
>I am concerned about the addition of a Bool to struct _MC_Chunk.
>There can be hundreds of thousands of these.  Given that an extra Bool will add another word to
>the structure, the new Bool could increase memory use by several megabytes.  There have been significant
>efforts made in the past to keep these structures small, and I don't want to undo them.
>
>Is it absolutely necessary to add this new Bool?  Is there a way to avoid it?

I see only one way: every time the "is_mempool_block" value is used, determine the status of the chunk dynamically instead.
That involves scanning the chunk lists (mp->chunks) of all existing memory pools for the current chunk.
Hmm. That was actually quite easy.
I've dropped the is_mempool_block Bool, just now, and created a:

Bool MC_(is_mempool_block)( MC_Chunk* mc_search );

function. I think it is a reasonably efficient tradeoff.
It may have to search all chunks of all memory pools to find that the chunk is (or is not) part of a memory pool, but that happens only in the describe_addr and detect_memory_leaks functions, and both are only used when a problem has been detected (I think).
If you don't use custom allocators, and/or those custom allocators do not use memory pools, nothing extra happens.
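A sketch of the scan, assuming the VG_(HT_ResetIter)/VG_(HT_Next) iteration and the MC_(mempool_list) table used elsewhere in memcheck; the fast-path flag is only the idea mentioned below, not existing code:

Bool MC_(is_mempool_block)( MC_Chunk* mc_search )
{
   MC_Mempool* mp;

   if (!any_metapool_exists)          /* hypothetical fast-path flag */
      return False;

   /* Scan every chunk list of every mempool for this chunk. */
   VG_(HT_ResetIter)( MC_(mempool_list) );
   while ( (mp = VG_(HT_Next)( MC_(mempool_list) )) ) {
      MC_Chunk* mc;
      VG_(HT_ResetIter)( mp->chunks );
      while ( (mc = VG_(HT_Next)( mp->chunks )) ) {
         if (mc == mc_search)
            return True;
      }
   }
   return False;
}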

I've just compiled that, and it seems to work as before (regression tests all OK).
Before bothering you with this updated patch I want to run a few more tests, but the day is ending here and I've gotta run.
If it works I'll submit the patched patch :-)

Idea: it may be a good idea to keep track globally of whether a METAPOOL memory pool exists, to avoid the scans, because the interesting things they have to scan for only happen when such a pool currently exists. I'll sleep on that one.

How much time have I got before the window closes?

	Thank you for your remarks,
	Ruurd
Comment 12 Ivo Raisr 2016-09-20 15:26:59 UTC
Thank you for your review, Julian.

1) I will let Ruurd comment on this.

2) a+b is addressed in the new patch v3 (attached).

3) I was not able to come up with an efficient way to express this without
introducing the is_mempool_block flag there. Perhaps Ruurd has an idea?
In the meantime, I reduced szB by one bit and lent it to is_mempool_block.
Note: the current layout of MC_Chunk goes back as far as SVN r5791.
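In other words, a sketch of the borrowed-bit layout (illustrative; field widths per the description, not necessarily the committed code):

   /* Sketch only: is_mempool_block borrows one bit from szB,
      keeping MC_Chunk the same size. */
   typedef struct _MC_Chunk {
      Addr         data;                      // Address of the actual block.
      SizeT        szB : (sizeof(SizeT)*8)-3; // Size requested; one bit fewer.
      MC_AllocKind allockind : 2;             // Which operation did the allocation.
      SizeT        is_mempool_block : 1;      // Allocated within a mempool?
      ExeContext*  where[0];                  // Variable-length array.
   } MC_Chunk;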
Comment 13 Ruurd Beerstra 2016-09-21 12:32:19 UTC
Created attachment 101208 [details]
metamempoolv2.patch

Hello again,

Following up on yesterday's discussion, I've created the attached metamempoolv2.patch.
It addresses the remarks given by Julian.

I've managed to remove the is_mempool_block Bool from the chunk struct.
It now dynamically determines that a chunk is a mempool block by scanning all mempool blocks in all mempools.
Since there won't be an enormous number of pools, and chunks in pools are typically pretty big so there won't be a huge number of those, and the scan only happens in describe_addr and detect_memory_leaks, this has no measurable impact on performance (at least, in my tests).

I did tests today with this version of valgrind: no problems detected. All regression tests run OK as before.
So I am happy with this, hope you are too.

	Best regards,
	Ruurd


Comment 14 Philippe Waroquiers 2016-09-21 20:47:14 UTC
I quickly read the last version of the patch; sorry for entering the game so late.

Some comments:

* Typo in the xml documentation:  alocator

* lines like the ones below: the opening brace should be on the same line
+         if (MC_(is_mempool_block)(ch1))
+         {

* for detecting/reporting/asserting the overlap condition
   in case ch1_is_meta != ch2_is_meta, I am wondering if we should not check
   that the non-meta block is (fully) inside the meta block; see the sketch after this list.
   It looks to be an error if the non-meta block is not fully inside the meta block.

*  free_mallocs_in_mempool_block : this looks to be an algorithm that will be
    O(n * m), where n is the nr of malloc-ed blocks and m is the nr of blocks covered
    by the Start/End addresses. That might be very slow for big applications that allocate millions
    of blocks; e.g. 1 million normal blocks and one million blocks inside meta blocks
    will take a lot of time to clean up?
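For the overlap point above, a sketch of the suggested containment check (names follow the earlier hunks; illustrative only):

   if (ch1_is_meta != ch2_is_meta) {
      MC_Chunk* meta  = ch1_is_meta ? ch1 : ch2;
      MC_Chunk* inner = ch1_is_meta ? ch2 : ch1;

      /* Sketch: a non-meta block overlapping a meta block must lie
         fully inside it; anything else is an error in the allocator. */
      if (! (inner->data >= meta->data
             && inner->data + inner->szB <= meta->data + meta->szB)) {
         /* report a bogus overlap */
      }
   }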

Thanks
Comment 15 Ruurd Beerstra 2016-09-22 13:37:50 UTC
Created attachment 101231 [details]
metamempoolv3.patch

Hi,

Thank you for reviewing the patch.

>--- Comment #14 from Philippe Waroquiers <philippe.waroquiers@skynet.be> --- Quickly read the last
>version of the patch, sorry for entering in the game so late
>
>Some comments:
>
>* Typo in the xml documentation:  alocator

Oops. Fixed. 

>* lines like below:  opening brace should be on the same line
>+         if (MC_(is_mempool_block)(ch1))
>+         {

I've written code in the OTB (One True Brace) style for so many years that it is hard to be forced to stop doing that :-)
Done, though.

>* for detecting/reporting/asserting the overlap condition
>   in case ch1_is_meta != ch2_is_meta, I am wondering if we should not check
>   that the non meta block is (fully) inside the meta block.
>   It looks to be an error if the non meta block is not fully inside the meta block.

Yes, that would be a serious error in the custom allocator.
Of course our allocator does not do that, so I didn't think of that :-)
I've added an extra check for that. Ran all the regression tests, all is well.

>*  free_mallocs_in_mempool_block : this looks to be an algorithm that will be
>    O (n * m)   when n is the nr of malloc-ed blocks, and m is the nr of blocks
>covered
>    by Start/End address. That might be very slow for big applications, that allocates millions
>    of blocks, e.g. 1 million normal block, and one million blocks in meta blocks
>    will take a lot of time to cleanup ?

Short answer: Yes.
Long answer:
Part of the inefficiency is that it has to restart the scan after modifying the list. I can't help that.
Also, I can't find any other way in valgrind to find the chunks in a particular address range than a brute-force scan.
But if the big application you describe were not using auto-free pools, and it wanted to prevent memory leaks, it would have to explicitly free all those items, which takes a similar amount of time plus the extra overhead of passing those calls to valgrind. I can't see any way around that, either.
The overhead is only incurred by custom allocators that use the auto-free feature, not by any existing applications or allocators.
Also, if you memcheck an application that does many millions of allocs/frees, you're willing to wait a while anyway.

Our custom allocator has a clever feature where it doles out a chunk of a meta block to the application without keeping track of it.
It simply advances a "used" pointer in the pool block.
Those chunks are non-freeable, and the application knows this, of course.
It is a very efficient way to, for example, store a temporary XML tree in a separate pool.
When the XML tree is discarded, the auto-free pool is destroyed and the application does not have to traverse the tree to free it.
Our allocator simply marks all the pool blocks as free for reuse.
The problem was that valgrind would not allow that (when a reuse happened it would see that as an internal error, because the MALLOCLIKE blocks had never been freed as far as valgrind was concerned, and handing out the same address twice is a Bad Thing).
This patch of mine makes valgrind usable for our environment.
I now use the modified valgrind in our regression test environment and we're very happy with it.
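For concreteness, the pattern valgrind used to reject looks roughly like this (a sketch; dole_out, reset_used_pointer and the other names are invented):

   p = dole_out(pool_block, 100);         /* chunk inside a meta block */
   VALGRIND_MEMPOOL_ALLOC(pool, p, 100);
   /* ... later, the whole pool is discarded at once ... */
   VALGRIND_DESTROY_MEMPOOL(pool);        /* chunks never freed singly */
   reset_used_pointer(pool_block);        /* block recycled for reuse  */
   q = dole_out(pool_block, 100);         /* same address as p again,  */
   VALGRIND_MEMPOOL_ALLOC(pool2, q, 100); /* pre-patch: fatal error    */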

Does that answer the questions?

	Attached is a revised version of the patch,
	Regards,
	Ruurd Beerstra.
Comment 16 Julian Seward 2016-09-23 09:02:05 UTC
Philippe, thank you for looking at this.  And Ruurd, for your patience.

> The overhead is only incurred by custom allocators using the auto-free feature,
> not by any existing applications or allocators.

In that case, then I am happy to go with it.  Philippe, what do you think?
Comment 17 Ivo Raisr 2016-09-23 09:36:54 UTC
I will take care of the integration if Philippe is ok with it.
Things which need to be done:
- update NEWS
- mark the filter script as executable for svn
- mark the compiled binary as ignored for svn
Comment 18 Philippe Waroquiers 2016-09-24 06:23:34 UTC
(In reply to Ruurd Beerstra from comment #15)
Thanks for all your work on this, I think this is useful 
(I have not yet looked in depth, but I think this might be used
for the 'self-hosting' of valgrind, as valgrind uses pools).

> Part of the inefficiency is that it has to restart the scan after modifying
> the list. Can't help that.
> Also, I can't find any other way in valgrind to find the chunks with a
> particular address-range other than a brute-force scan.
It is effectively the scan restart that makes it quadratic.
The brute-force scan is reasonable: that will be O(n), and
avoiding this linear scan would imply overhead for the non-mempool
uses.

Such mempool functionality will often be used by applications
that use a lot of (small) blocks, so it would be better to avoid this quadratic
aspect.
Various techniques can be used for that; I think the best/easiest
is to add a function
     void* VG_(HT_remove_at_Iter) ( VgHashTable *table )
which removes the item at the current position of the iterator,
ensuring that after the removal the iterator is positioned such that VG_(HT_Next)
will return the element following the one just removed (or NULL, if the removed
element was the last one).
This removal will not cause problems (no hash table resize, and proper
maintenance of the iterator and chains).

Philippe

NB: more generally, as a hash table can only have one single iterator, it would be
possible to allow calls to the other removal functions, but I think it is better to
have a special function for that.
Handling additions during iteration is more problematic, due to possible hash table resize.
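As a usage sketch, free_mallocs_in_mempool_block could then delete while iterating, without restarting the scan (the helper is the proposal above, not yet existing code):

   MC_Chunk* mc;
   VG_(HT_ResetIter)( MC_(malloc_list) );
   while ( (mc = VG_(HT_Next)( MC_(malloc_list) )) ) {
      if (mc->data >= StartAddr && mc->data + mc->szB <= EndAddr) {
         /* Remove in place; iteration continues with the next element,
            so the pass stays linear instead of quadratic. */
         VG_(HT_remove_at_Iter)( MC_(malloc_list) );  /* proposed helper */
         /* ... release mc's metadata ... */
      }
   }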
Comment 19 Philippe Waroquiers 2016-09-24 06:27:41 UTC
(In reply to Ivo Raisr from comment #17)
> I will take care of the integration if Philippe is ok with it.
As indicated in comment 18, I think we can avoid the quadratic
aspect relatively easily.
Otherwise, if that turns out not to be easy, let's integrate the
patch in any case.

Thanks
Comment 20 Ivo Raisr 2016-09-24 21:18:01 UTC
Fixed by SVN r15984.

I committed slightly adjusted changes from metamempoolv3.patch at this point.
Once VG_(HT_remove_at_Iter)() is available, we can easily adopt it all over memcheck.
Comment 21 Ivo Raisr 2016-09-24 21:18:54 UTC
Ruurd, would you like to provide an implementation of VG_(HT_remove_at_Iter)()?
If yes, please create a new bug for tracking it.
Comment 22 Ivo Raisr 2016-09-25 19:21:13 UTC
Follow up fix SVN r15985.
Comment 23 Philippe Waroquiers 2017-01-22 17:20:59 UTC
See some follow up in bug 375415