Bug 458118

Summary: Track deletions of objects from unloaded shared libraries
Product: [Developer tools] valgrind Reporter: Michael Barth <Spirrwell>
Component: memcheck Assignee: Julian Seward <jseward>
Status: REPORTED ---    
Severity: wishlist CC: philippe.waroquiers, pjfloyd
Priority: NOR    
Version First Reported In: unspecified   
Target Milestone: ---   
Platform: Other   
OS: Linux   
Latest Commit: Version Fixed/Implemented In:
Sentry Crash Report:

Description Michael Barth 2022-08-20 23:23:20 UTC
I don't know if this is possible. Consider the following: you create an object in one shared library and delete it in another. You unload the library that did the allocation, and then proceed to delete the object. If that object is not trivial and has destructors that need to be called, you will end up with a segmentation fault.

Valgrind simply reports the address as "is not stack'd, malloc'd or (recently) free'd". This led me to believe I had some kind of heap corruption bug, with an overwrite or underwrite somewhere altering the pointer to point at something that did not exist. But it was a lot simpler than that: the shared library that created the object had simply been unloaded.

If these kinds of bugs could be found by Valgrind, that would be extremely helpful.
Comment 1 Paul Floyd 2022-08-21 16:57:09 UTC
Does --keep-debuginfo=yes make any difference?

Also can you provide a small example that reproduces the issue?
Comment 2 Michael Barth 2022-08-21 19:15:03 UTC
(In reply to Paul Floyd from comment #1)
> Does --keep-debuginfo=yes make any difference?
> 
> Also can you provide a small example that reproduces the issue?

This does not make any difference for me. Here is an example repo: https://github.com/Spirrwell/so_unload_delete_demo

It is worth noting that with --leak-check=full, valgrind does pick up on the fact that there's a leak relating to dlclose in this example. But I can only see the relation between the two because they sit side by side in such a small example.

While debugging this big old application, I had a giant wall of leaks and I was trying to debug what I THOUGHT was heap corruption, but was simply this shared library unloading issue. So having ANY little extra hint that this kind of problem occurred would be quite useful.
Comment 3 Philippe Waroquiers 2022-08-22 18:38:46 UTC
Valgrind keeps recently freed blocks in a list, which allows it to report where a block was allocated. If this list (whose size is controlled by the --freelist-vol parameter) is big enough and you use --keep-debuginfo=yes, then I think valgrind should be able to tell you the stack trace that allocated the referenced freed block.

Now, if the segmentation violation happens because the destructor code has been unloaded, so that the pointer in the dispatch table no longer leads to it, then valgrind cannot help: it does not track executable code or dispatch tables.
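
To illustrate the dispatch-table point, here is a minimal single-file sketch of how deleting through a base pointer reaches the destructor. All type and function names (Base, Derived, make_derived, destroy, g_log) are hypothetical and not taken from the reporter's demo repo. Nothing is unloaded here; the sketch only shows that the destructor is looked up through the object at run time. If Derived lived in a library that was unloaded with dlclose, that lookup would presumably land in unmapped memory, which would match the fault described above.

```cpp
#include <string>
#include <vector>

// Hypothetical log used only to make the destructor order observable.
static std::vector<std::string> g_log;

struct Base {
    // Virtual destructor: the object carries a pointer to a dispatch table
    // (vtable), and destruction through a Base* goes through that table.
    virtual ~Base() { g_log.push_back("~Base"); }
};

struct Derived : Base {
    ~Derived() override { g_log.push_back("~Derived"); }
};

// Stands in for the factory that a shared library would export.
Base* make_derived() { return new Derived; }

// Stands in for the code that deletes the object elsewhere. The static type
// is Base, so the destructor to run is found via the object's dispatch
// table at run time, not resolved at this call site.
void destroy(Base* p) { delete p; }
```

Calling destroy(make_derived()) runs ~Derived and then ~Base, even though destroy only knows about Base. The allocation itself is ordinary heap memory that memcheck tracks; the dispatch table and destructor code are not.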
Comment 4 Michael Barth 2022-08-22 19:29:47 UTC
I do believe it is a result of the destructor being called and not specifically delete or free. Using placement new with malloc'd memory and calling the destructor manually yields the same kind of segmentation fault.

In which case, that kinda stinks. Oh well, thank you anyway!
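
The placement new variant mentioned in comment 4 can be sketched in a single file as follows. This is a rough illustration under assumptions, not the code from the demo repo: Resource and placement_new_roundtrip are hypothetical names. It shows that the destructor call is a separate step from releasing the memory, so the same crash can occur without operator delete being involved at all.

```cpp
#include <cstdlib>
#include <new>

// Hypothetical non-trivial type; the destructor bumps a counter so the
// call can be observed.
struct Resource {
    explicit Resource(int* counter) : counter_(counter) {}
    virtual ~Resource() { ++*counter_; }
    int* counter_;
};

// Returns how many destructor calls were observed, or -1 if malloc failed.
int placement_new_roundtrip() {
    int destroyed = 0;

    // Plain malloc'd storage; malloc returns memory suitably aligned
    // for Resource.
    void* raw = std::malloc(sizeof(Resource));
    if (raw == nullptr) {
        return -1;
    }

    // Construct the object in the malloc'd block.
    Resource* r = new (raw) Resource(&destroyed);

    // Explicit destructor call. This is a virtual call, so it goes through
    // the object's dispatch table unless the compiler devirtualizes it.
    // If the type's code had been unloaded, this is the step that would fault.
    r->~Resource();

    // Releasing the memory is independent of running the destructor.
    std::free(raw);
    return destroyed;
}
```

In this sketch the malloc and free calls are the only parts memcheck intercepts; the explicit destructor call in between is ordinary code execution.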