Bug 375415 - free list of blocks, mempool blocks and describe addr do not work properly together
Summary: free list of blocks, mempool blocks and describe addr do not work properly to...
Status: REPORTED
Alias: None
Product: valgrind
Classification: Developer tools
Component: memcheck (show other bugs)
Version: 3.12.0
Platform: Other Linux
: NOR minor
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-01-22 17:19 UTC by Philippe Waroquiers
Modified: 2017-05-20 16:14 UTC (History)
3 users (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Philippe Waroquiers 2017-01-22 17:19:49 UTC
There are several strange things in the interaction between free list
of blocks, mempool blocks and describe addr.

* MC_(get_freed_block_bracketting) always uses the redzone sizeMC_(Malloc_Redzone_SzB).
  For a mempool block, the redzone size to use should be the one from the mempool.
  This can cause either False positive or False negative matching between an addr and
  a mempool block.

* For an unclear reason, the introduction of the meta pool functionality has changed
   the below code by adding the && !MC_(is_mempool_block)(mc) condition. It is unclear
   why a freed mempool block would not be a good candidate to describe an (erroneous)
   address.
   /* -- Search for a recently freed block which might bracket it. -- */
   mc = MC_(get_freed_block_bracketting)( a );
   if (mc && !MC_(is_mempool_block)(mc)) {

* It is also not very clear in which order the address description should be done.
  The code searches in meta mempool last. But it the meta mempool has been created from
  a big malloc superblock, then it looks to me that it would be better to first search
  in the allocated 2nd level blocks (as this is a real application memory), before finding
  the meta pool block.

Probably we need a somewhat heavy rework of the way addresses are described.
Maybe we should always search all types of blocks (allocated or freed, classical or mempool),
and then decide a priority order to display the info.

See also bug 322245
Comment 1 Ruurd Beerstra 2017-01-24 09:54:12 UTC
Quick note: I think I can clarify the "For an unclear reason, the introduction of the meta pool functionality has changed" below.
Without that fix, our allocator that uses the metapool feature (allocate large blocks from which smaller blocks are allocated) would always describe the address of the large outer block, and that is totally uninteresting (triggered by the need for a new block).
The description of the smaller, inner block is the one to show.
So I did not *remove* the search for a mempool block, I moved it to the end of the function, with a comment (see below) that attempted to describe the why...  This way, the description prefers the inner block over the mempool block.

Hope that helps - Ruurd.

memcheck/mc_errors.c:1106:

   /* -- Perhaps it's in a meta mempool block? -- */
   /* This test is done last, because metapool blocks overlap with blocks
      handed out to the application. That makes every heap address part of
      a metapool block, so the interesting cases are handled first.
      This final search is a last-ditch attempt. When found, it is probably
      an error in the custom allocator itself. */
   if (mempool_block_maybe_describe( a, /*is_metapool*/ True, ai )) {
      return;
   }
Comment 2 Philippe Waroquiers 2017-01-24 18:55:43 UTC
(In reply to Ruurd Beerstra from comment #1)
First, thanks for providing feedback: providing further feedback/follow-up
after commit of a provided patch is well appreciated.

> Quick note: I think I can clarify the "For an unclear reason, the
> introduction of the meta pool functionality has changed" below.
> Without that fix, our allocator that uses the metapool feature (allocate
> large blocks from which smaller blocks are allocated) would always describe
> the address of the large outer block, and that is totally uninteresting
> (triggered by the need for a new block).
> The description of the smaller, inner block is the one to show.
Yes, the reason why we have 2 scans of the mempools blocks:
   one for the non meta pool first
   one for the meta pool last
is very clear/understood.

Rather, what I do not understand is why we need the new sub-condition:
"&& !MC_(is_mempool_block)(mc)"
in the below:
   mc = MC_(get_freed_block_bracketting)( a );
   if (mc && !MC_(is_mempool_block)(mc)) {
With this new sub-condition, unless I missed something, a (small inner) mempool
block freed will never be used to describe an error.

> So I did not *remove* the search for a mempool block,
True. But as I understand; what you have removed is that a mempool (inner for
meta pool) block is never landing in an error address description.
And *that* is not clear to me.

> I moved it to the end
> of the function, with a comment (see below) that attempted to describe the
> why...  This way, the description prefers the inner block over the mempool
> block.
> 
> Hope that helps - Ruurd.
> 
> memcheck/mc_errors.c:1106:
> 
>    /* -- Perhaps it's in a meta mempool block? -- */
>    /* This test is done last, because metapool blocks overlap with blocks
>       handed out to the application. That makes every heap address part of
>       a metapool block, so the interesting cases are handled first.
>       This final search is a last-ditch attempt. When found, it is probably
>       an error in the custom allocator itself. */
>    if (mempool_block_maybe_describe( a, /*is_metapool*/ True, ai )) {
>       return;
>    }
Comment 3 Ivo Raisr 2017-05-05 16:12:43 UTC
Ruurd, can you please comment?
Comment 4 Ruurd Beerstra 2017-05-09 08:19:30 UTC
Sorry for the delay, I'm doing lots of stuff on other projects and valgrind is on the back burner for me. It "just works" as part of our regression framework, doing several thousand runs every night and catches newly introduced memory errors every so often.
So I hope I can still make sense here :-)

Point 1 (bad redzone): My bad, caused by being all new to valgrind when I took a stab at modifying it. Our code doesn't use redzones, so both are zero and I didn't look closely enough. Needs a fix.

Point 2: The not-is-mempool test: Unless I'm completely wrong here: If I take that test out, many application-malloced blocks that lie within a mempool block will be described as lying within a mempool block, with the allocation stack being the one that triggered the allocation of that mempool block (some random moment when the pool is exhausted).
That is not what I want: I want the description of the malloc that was done *from* the mempool block, with the proper allocation stack.
This is because our allocator uses itself to allocate the mempool blocks, so both mempool-block allocs and plain application allocs are on the same malloc list. The test for a mempool block at the end was added to catch the case where an application-malloc is NOT within a mempool, which should not happen in our custom-allocator case, but the case needs handling.
Your remark: "what you have removed is that a mempool (inner for
meta pool) block is never landing in an error address description" is correct: I consider that address description an internal and uninteresting affair of our custom allocator and there should always be a more interesting, application malloc stack that I want to show.
Does that make sense?

Point 3: About the order: I think you sort of echo there what I try to describe above. This is hard stuff, trying to get the best address description in all cases. All I can say: The current version works for me, and I gather there are few custom allocators these days that use the mempool features. But if my amateur changes require rework: feel free to improve upon it. I'll (test) run it in our test environment.
Comment 5 Philippe Waroquiers 2017-05-15 20:27:23 UTC
(In reply to Ruurd Beerstra from comment #4)
Thanks for the feedback, a first positive result of this feedback is
that a close examination of the test mempool2.c revealed the test
was not properly testing what it was supposed to test
(see revision 16369).


> 
> Point 1 (bad redzone): My bad, caused by being all new to valgrind when I
> took a stab at modifying it. Our code doesn't use redzones, so both are zero
> and I didn't look closely enough. Needs a fix.
Yes, it needs a fix. But do not worry, you did not create this bug.
As far as I can see, this bug was already there since a very very long time
(even before I implemented --redzone-size).
It is however not easy to fix, at least not without having a list of freed
blocks per memory pool, which then implies probably to heavily rework
the current implementation of free blocks.

> 
> Point 2: The not-is-mempool test: Unless I'm completely wrong here:
I believe you are completely wrong here :).
As I understand, in the below test, no way mc can even be a mempool block
   mc = MC_(get_freed_block_bracketting)( a );
   if (mc && !MC_(is_mempool_block)(mc)) {
The reasoning: is_mempool_block checks that mc is a mempool block by
scanning the chunks in all mempools.
But if a block has been freed, then it is removed from the mp->chunks.
So, as I understand, this test is useless : if mc is found, it will never
be a free mempool block.

> If I
> take that test out, many application-malloced blocks that lie within a
> mempool block will be described as lying within a mempool block, with the
> allocation stack being the one that triggered the allocation of that mempool
> block (some random moment when the pool is exhausted).
I think that there is confusion between 2 different conditions:
   * the condition that I am discussing just above, which is useless as
     I understand (at least, removing it makes no difference in the
     regression tests, and reading the code, as explained above, it looks
     useless).
   * the other condition is what you have added in the calls to
     mempool_block_maybe_describe, which is to check for metapool False,
     and then True. Yes, these 2 calls are necessary, otherwise, effectively,
     a freed inner block will always be described by the first call
     to mempool_block_maybe_describe.

So, I am tempted to just remove this useless condition.
It would be nice however if you could just first check that effectively
in your application, it also makes no difference.
(see patch below)

> That is not what I want: I want the description of the malloc that was done
> *from* the mempool block, with the proper allocation stack.
> This is because our allocator uses itself to allocate the mempool blocks, so
> both mempool-block allocs and plain application allocs are on the same
> malloc list. The test for a mempool block at the end was added to catch the
> case where an application-malloc is NOT within a mempool, which should not
> happen in our custom-allocator case, but the case needs handling.
> Your remark: "what you have removed is that a mempool (inner for
> meta pool) block is never landing in an error address description" is
> correct: I consider that address description an internal and uninteresting
> affair of our custom allocator and there should always be a more
> interesting, application malloc stack that I want to show.
> Does that make sense?
> 
> Point 3: About the order: I think you sort of echo there what I try to
> describe above. This is hard stuff, trying to get the best address
> description in all cases. 
Yes, we need a rework of this. E.g. the msg
"recently re-allocated block" is sometimes plain wrong, when a outer block
was split in inner blocks, and an inner block was freed. The outer block
is described as recently reallocated, which is not true.

> All I can say: The current version works for me,
> and I gather there are few custom allocators these days that use the mempool
> features. But if my amateur changes require rework: feel free to improve
> upon it. I'll (test) run it in our test environment.
If you could check the below patch, that would be nice.
Thanks

Index: memcheck/mc_errors.c
===================================================================
--- memcheck/mc_errors.c	(revision 16378)
+++ memcheck/mc_errors.c	(working copy)
@@ -1098,7 +1098,7 @@
    }
    /* -- Search for a recently freed block which might bracket it. -- */
    mc = MC_(get_freed_block_bracketting)( a );
-   if (mc && !MC_(is_mempool_block)(mc)) {
+   if (mc) {
       ai->tag = Addr_Block;
       ai->Addr.Block.block_kind = Block_Freed;
       ai->Addr.Block.block_desc = "block";
Comment 6 Ruurd Beerstra 2017-05-16 08:15:45 UTC
Hi,

I was completely wrong :-)
I took out the extra !MC_(is_mempool_block)(mc) and it still reports the proper, usable locations. So it can be removed.

	Thanks,
	Ruurd

-----Original Message-----
From: Philippe Waroquiers [mailto:bugzilla_noreply@kde.org] 
Sent: Monday, May 15, 2017 22:27
To: Ruurd Beerstra <Ruurd.Beerstra@infor.com>
Subject: [valgrind] [Bug 375415] free list of blocks, mempool blocks and describe addr do not work properly together

https://bugs.kde.org/show_bug.cgi?id=375415

--- Comment #5 from Philippe Waroquiers <philippe.waroquiers@skynet.be> --- (In reply to Ruurd Beerstra from comment #4) Thanks for the feedback, a first positive result of this feedback is that a close examination of the test mempool2.c revealed the test was not properly testing what it was supposed to test (see revision 16369).


> 
> Point 1 (bad redzone): My bad, caused by being all new to valgrind 
> when I took a stab at modifying it. Our code doesn't use redzones, so 
> both are zero and I didn't look closely enough. Needs a fix.
Yes, it needs a fix. But do not worry, you did not create this bug.
As far as I can see, this bug was already there since a very very long time (even before I implemented --redzone-size).
It is however not easy to fix, at least not without having a list of freed blocks per memory pool, which then implies probably to heavily rework the current implementation of free blocks.

> 
> Point 2: The not-is-mempool test: Unless I'm completely wrong here:
I believe you are completely wrong here :).
As I understand, in the below test, no way mc can even be a mempool block
   mc = MC_(get_freed_block_bracketting)( a );
   if (mc && !MC_(is_mempool_block)(mc)) { The reasoning: is_mempool_block checks that mc is a mempool block by scanning the chunks in all mempools.
But if a block has been freed, then it is removed from the mp->chunks.
So, as I understand, this test is useless : if mc is found, it will never be a free mempool block.

> If I
> take that test out, many application-malloced blocks that lie within a 
> mempool block will be described as lying within a mempool block, with 
> the allocation stack being the one that triggered the allocation of 
> that mempool block (some random moment when the pool is exhausted).
I think that there is confusion between 2 different conditions:
   * the condition that I am discussing just above, which is useless as
     I understand (at least, removing it makes no difference in the
     regression tests, and reading the code, as explained above, it looks
     useless).
   * the other condition is what you have added in the calls to
     mempool_block_maybe_describe, which is to check for metapool False,
     and then True. Yes, these 2 calls are necessary, otherwise, effectively,
     a freed inner block will always be described by the first call
     to mempool_block_maybe_describe.

So, I am tempted to just remove this useless condition.
It would be nice however if you could just first check that effectively in your application, it also makes no difference.
(see patch below)

> That is not what I want: I want the description of the malloc that was 
> done
> *from* the mempool block, with the proper allocation stack.
> This is because our allocator uses itself to allocate the mempool 
> blocks, so both mempool-block allocs and plain application allocs are 
> on the same malloc list. The test for a mempool block at the end was 
> added to catch the case where an application-malloc is NOT within a 
> mempool, which should not happen in our custom-allocator case, but the case needs handling.
> Your remark: "what you have removed is that a mempool (inner for meta 
> pool) block is never landing in an error address description" is
> correct: I consider that address description an internal and 
> uninteresting affair of our custom allocator and there should always 
> be a more interesting, application malloc stack that I want to show.
> Does that make sense?
> 
> Point 3: About the order: I think you sort of echo there what I try to 
> describe above. This is hard stuff, trying to get the best address 
> description in all cases.
Yes, we need a rework of this. E.g. the msg "recently re-allocated block" is sometimes plain wrong, when a outer block was split in inner blocks, and an inner block was freed. The outer block is described as recently reallocated, which is not true.

> All I can say: The current version works for me, and I gather there 
> are few custom allocators these days that use the mempool features. 
> But if my amateur changes require rework: feel free to improve upon 
> it. I'll (test) run it in our test environment.
If you could check the below patch, that would be nice.
Thanks

Index: memcheck/mc_errors.c
===================================================================
--- memcheck/mc_errors.c        (revision 16378)
+++ memcheck/mc_errors.c        (working copy)
@@ -1098,7 +1098,7 @@
    }
    /* -- Search for a recently freed block which might bracket it. -- */
    mc = MC_(get_freed_block_bracketting)( a );
-   if (mc && !MC_(is_mempool_block)(mc)) {
+   if (mc) {
       ai->tag = Addr_Block;
       ai->Addr.Block.block_kind = Block_Freed;
       ai->Addr.Block.block_desc = "block";

--
You are receiving this mail because:
You are on the CC list for the bug.
Comment 7 Philippe Waroquiers 2017-05-20 16:14:07 UTC
The useless condition was removed in revision 16402.