Bug 199468 - Suppressions: stack size limited to 25 while --num-callers allows more frames
Summary: Suppressions: stack size limited to 25 while --num-callers allows more frames
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: memcheck (show other bugs)
Version: unspecified
Platform: RedHat Enterprise Linux Linux
: NOR normal
Target Milestone: wanted3.6.0
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-07-08 21:35 UTC by kitty__wu
Modified: 2016-09-07 20:20 UTC (History)
3 users (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description kitty__wu 2009-07-08 21:35:54 UTC
Version:            (using KDE 4.2.4)
Compiler:          gcc (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42) 
OS:                Linux
Installed from:    RedHat RPMs


This sounds inconsistent for me.

Running memcheck on a binary handled by our team
currently generates tons of errors (more than 500...),
mainly originating from some 3rd-party shared libraries.

In order to analyse them, I have tried to generate the
related suppressions; I have noticed that most of the
corresponding errors still appear due to
the limitation mentioned into the description.

This limitation (0 <= frame <= 24) is hardcoded and defined
in the ./coregrind/m_errormgr.c file:
  #define VG_MAX_SUPP_CALLERS  24

I have recompiled valgrind (3.4.1) by applying the expected change:

diff ./coregrind/m_errormgr.c.ORIG ./coregrind/m_errormgr.c
180c180
< #define VG_MAX_SUPP_CALLERS  24
---
> #define VG_MAX_SUPP_CALLERS  49   /* 0 <= callers <= 49 */

and it appears that it fixes the problem without causing any
additional trouble.

Is this a suitable fix?
If it is the case, would you mind integrate it into the next
version please ?

Cheers,
Christophe


Options used:
  --tool=memcheck           \
  --leak-check=summary      \
  --show-reachable=yes      \
  --leak-resolution=high    \
  --gen-suppressions=all    \
  --num-callers=40          \
  -v                        \
  --log-file=${VG_LOG_FILE}
Comment 1 Robin Kuzmin 2016-08-03 18:07:35 UTC
I have the same issue.
I have a suppression:
{
   google::protobuf::Message::PrintDebugString() const (text_format.cc:110)
   Memcheck:Leak
   match-leak-kinds: reachable
   fun:_Znwm
   ...
   fun:_ZNK6google8protobuf7Message11DebugStringEv
   fun:_ZNK6google8protobuf7Message16PrintDebugStringEv
   ...
}
This suppression is ignored for "--num-callers=20", but works fine (suppresses) for "--num-callers=40". 
I didn't expect the suppressions to depend on "--num-callers".
Comment 2 Philippe Waroquiers 2016-08-10 08:48:49 UTC
(In reply to Robin Kuzmin from comment #1)
> I have the same issue.
> I have a suppression:
> {
>    google::protobuf::Message::PrintDebugString() const (text_format.cc:110)
>    Memcheck:Leak
>    match-leak-kinds: reachable
>    fun:_Znwm
>    ...
>    fun:_ZNK6google8protobuf7Message11DebugStringEv
>    fun:_ZNK6google8protobuf7Message16PrintDebugStringEv
>    ...
> }
> This suppression is ignored for "--num-callers=20", but works fine
> (suppresses) for "--num-callers=40". 
> I didn't expect the suppressions to depend on "--num-callers".

I think that we have 2 different problems:
The bug speaks about the fact that a suppression is limited to 24 entries, while the --num-callers can now (in valgrind 3.11) go up to 500.

Your problem is that a suppression works with 40 num callers, but does not suppress
for 20 num callers. This is not abnormal: if for example  fun:_ZNK6google8protobuf7Message11DebugStringEv is at depth 30, when you use --num-callers=20, then this function will not be recorded in the error stacktrace and the suppression will not match.
This is because the logic is:
  * first an error is made, with a stacktrace recorded as specified with --num-callers
  * after that, the suppression entries are tried.

If suppressions would have to be done without taking --num-callers into account, then it
means that all errors stacktraces would have to be first recorded without limit, and then after
(unsuccesful) suppression matching, the frames exceeding --num-callers would have to be dropped.

The original problem (suppressions limited to 24 entries while errors can record up to 500 frames) is somewhat strange. I could not retrieve any comment or svn log describing why
suppression entries are limited to 24 entries. --num-callers used to be limited to 50.

If we increase the nr of entries in suppressions to 500 (the max value for --num-callers),
this will however has as a side effect that --gen-suppressions will produce bigger entries
while they were limited to 24 entries. I guess this is not a problem, as in any case, for many suppression entries, 24 was already a lot (e.g. for suppression entries intended to suppress errors in libraries)
Comment 3 Robin Kuzmin 2016-08-10 17:07:35 UTC
You wrote: If suppressions would have to be done without taking --num-callers into account, then it means that all errors stacktraces would have to be first recorded without limit, and then after (unsuccesful) suppression matching, the frames exceeding --num-callers would have to be dropped.

That closely describes my intuitive expectation about how Valgrind should work (at least based on what I read for  --num-callers at http://valgrind.org/docs/manual/manual-core.html#manual-core.options). I expected the suppressions to work always, regardless of --num-callers. And I expected the --num-callers to be considered ONLY WHEN PRINTING the stack trace to console. And I expected that internally Valgrind saves the entire stack trace for every error.

Now I understand that the implementation that corresponds to my expectation would work much slower, require much more resources (for stack traces), and will hardly work ok for the endless recursion (endless stack trace until the stack overflow).

Thus I recommend at least 2 settings for the length of the stack trace:
+ the first setting specifies how many stack frames valgrind saves internally, the same setting specifies how many stack frames are used for the suppressions (currently this is done using the  --num-callers). By default this setting should be relatively high (e.g. 50+);
+ the second setting specifies how many stack frames valgrind prints to the console when reporting the errors. By default this setting should be relatively low (e.g. 12-).

And the documentation should clearly describe which setting determines what, and from the documentation it should be clear that the suppression might not suppress for short "first setting".
Comment 4 Robin Kuzmin 2016-08-10 18:46:27 UTC
I have run through this discussion more thoroughly. My understanding how valgrind works (or should work) has changed. Here is my vision.

In order to avoid multiple reports of the same error, valgrind uses top 4 stack frames to common up the errors (see description of "--num-callers" at http://valgrind.org/docs/manual/manual-core.html#manual-core.options). 

Valgrind (memcheck) deals with (at least) 2 types of the stack traces:

1. The Short-term Stack Traces. They are reported immediately upon error, e.g. when the analyzed program accesses outside the heap-allocated block. At this stage the entire stack trace can be used for suppressions (both for applying the suppressions and for generating the suppression with --gen-suppressions). After suppressions, if/when reporting these stack traces to the console the top --num-callers frames are (or can be) printed. Then these stack traces are (or can be) forgotten by valgrind (and only the top 4 stack frames are (or can be) saved to common up the errors). Thus for the Short-term Stack Traces there is no need for the suppressions to depend on the --num-callers. Implementing this mechanism will be relatively cheap (the price will be an INsignificant lower down in performance and INsignificant (and short-term) increase in resource requirements).

2. The Long-term Stack Traces. These stack traces valgrind has to store in memory for a long period of time, and potentially till the end of the analyzed program. E.g. when the analyzed program makes an allocation, the allocation stack trace has to be kept until the CORRECT deallocation (if the CORRECT deallocation Ever happens) or until the termination (if the CORRECT deallocation Never happens). For the Long-term Stack Traces valgrind can try to APPLY suppressions immediately (upon allocation) to the entire stack trace. If any of the suppressions is applicable then the top 4 stack frames are stored to common up the errors (and the stack trace can be marked as "to be suppressed by suppression N"), otherwise (none of the suppressions apply) the number of stack frames to be saved depends on the --gen-suppressions. If "no" then the --num-callers stack frames are stored, otherwise the entire stack trace is stored (and the entire stack trace is used for generating the suppression (if the CORRECT deallocation never happens), and only the --num-callers stack frames are used for reporting the error to the console).

Thus the suppressions might be independent of the --num-callers.
Comment 5 Philippe Waroquiers 2016-08-11 22:18:16 UTC
(In reply to Robin Kuzmin from comment #4)
> I have run through this discussion more thoroughly. My understanding how
> valgrind works (or should work) has changed. Here is my vision.
> 
> In order to avoid multiple reports of the same error, valgrind uses top 4
> stack frames to common up the errors (see description of "--num-callers" at
> http://valgrind.org/docs/manual/manual-core.html#manual-core.options). 
> 
> Valgrind (memcheck) deals with (at least) 2 types of the stack traces:
> 
> 1. The Short-term Stack Traces. They are reported immediately upon error,
> e.g. when the analyzed program accesses outside the heap-allocated block. At
> this stage the entire stack trace can be used for suppressions (both for
> applying the suppressions and for generating the suppression with
> --gen-suppressions). After suppressions, if/when reporting these stack
> traces to the console the top --num-callers frames are (or can be) printed.
> Then these stack traces are (or can be) forgotten by valgrind (and only the
> top 4 stack frames are (or can be) saved to common up the errors). Thus for
> the Short-term Stack Traces there is no need for the suppressions to depend
> on the --num-callers. Implementing this mechanism will be relatively cheap
> (the price will be an INsignificant lower down in performance and
> INsignificant (and short-term) increase in resource requirements).
> 
> 2. The Long-term Stack Traces. These stack traces valgrind has to store in
> memory for a long period of time, and potentially till the end of the
> analyzed program. E.g. when the analyzed program makes an allocation, the
> allocation stack trace has to be kept until the CORRECT deallocation (if the
> CORRECT deallocation Ever happens) or until the termination (if the CORRECT
> deallocation Never happens). For the Long-term Stack Traces valgrind can try
> to APPLY suppressions immediately (upon allocation) to the entire stack
> trace. If any of the suppressions is applicable then the top 4 stack frames
> are stored to common up the errors (and the stack trace can be marked as "to
> be suppressed by suppression N"), otherwise (none of the suppressions apply)
> the number of stack frames to be saved depends on the --gen-suppressions. If
> "no" then the --num-callers stack frames are stored, otherwise the entire
> stack trace is stored (and the entire stack trace is used for generating the
> suppression (if the CORRECT deallocation never happens), and only the
> --num-callers stack frames are used for reporting the error to the console).
> 
> Thus the suppressions might be independent of the --num-callers.

For what you call 'long term stacktraces" : capturing a big stacktrace is costly by itself.
A lot of effort was spent to optimise this, but still this is costly. So, always capturing
an (unlimited) stack trace is not desirable.
But applying the suppression entries is even (a lot) more costly. E.g. it implies to translate
an IP address into a function name.
A lot of effort was spent already to make these suppression matching faster (e.g. using lazy completers, see m_errormgr.c is_suppressible_error function).
But such suppression matching can clearly not be done all the time (e.g. for all allocations;
in memcheck or in some tools such as helgrind, even for all memory accesses).

So, what you suggest cannot be implemented without impacting significantly the
performance.

With that in mind, I think the best approach for you is just to use suppressions that works
properly with the --num-callers you are using.

For what concerns the original problem (suppression entries limited to 24 entries):
I guess we could/should use the same max sze as the max value fo r--num-callers
Comment 6 Robin Kuzmin 2016-08-15 21:41:06 UTC
Makes sense for me.
Comment 7 Philippe Waroquiers 2016-09-07 20:20:11 UTC
Committed revision 15945.
With this commit, a suppression entry can have a nr of callers up to the maximum of
--num-callers (500). Generated suppressions are generated with up to --num-callers entries.