Bug 474443 - Running valgrind with or without --leak-check=full and --show-leak-kinds=all influences the error summary.
Summary: run valgrind with --leak-check=full and --show-leak-kinds=all or without the ...
Status: RESOLVED NOT A BUG
Alias: None
Product: valgrind
Classification: Developer tools
Component: memcheck
Version: 3.21.0
Platform: Arch Linux Linux
Importance: NOR critical
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-09-12 09:07 UTC by Jane
Modified: 2023-09-20 10:09 UTC

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
The 5 valgrind logs with different valgrind options and the same program ./simv (898.60 KB, application/x-zip-compressed)
2023-09-16 09:42 UTC, Jane

Description Jane 2023-09-12 09:07:31 UTC
SUMMARY
My program uses the system malloc/calloc/realloc and runs on an amd64 machine with CentOS 7.9.2009. It is compiled and linked either as a dynamic library that is invoked from a binary, or as a standalone binary.
For the dynamic library build: I ran Valgrind (which identifies itself as "Valgrind-3.21.0-d97fed7c3e-20230428 and LibVEX") as "valgrind --tool=memcheck -v --trace-children=yes --track-origins=yes --log-file=val.log --error-limit=no <my run script, which calls a binary that links to my dynamic lib>". The report's ERROR SUMMARY showed 6 errors from 4 contexts, and the log file contains detailed call stacks for those 4 contexts. But when I ran "valgrind --tool=memcheck -v --trace-children=yes --track-origins=yes --log-file=val.log --error-limit=no --leak-check=full --show-leak-kinds=all <my run script>", the ERROR SUMMARY showed 378 errors from 376 contexts, yet the log file again contains detailed call stacks for only 4 contexts!
For the standalone binary build: I ran it in the same ways. For some test cases the ERROR SUMMARY even reported 0 errors without "--leak-check=full" and "--show-leak-kinds=all", but a nonzero number of errors with them, and without any details about the error contexts. Many other test cases do not show this issue.


Comment 1 Paul Floyd 2023-09-13 05:30:42 UTC
This looks more like user error than a critical defect to me.

When you are using --log-file and --trace-children=yes you should really use %p in the log file name. Otherwise the children will overwrite the log file.

Try --log-file=val.%p.log

If that doesn't solve your problem then we need more info. Add an attachment with your log files in a tarball so that we can see more clearly what is happening.

I'll think about detecting --trace-children and --log-file without a %p and issuing a warning.
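
A minimal sketch of the suggestion above, assuming a POSIX shell. The helper names run_traced and summarize_logs are made up for illustration; only the Valgrind options already mentioned in this report are used. Valgrind expands %p in --log-file to the process ID, so each traced process writes its own log and each log carries its own ERROR SUMMARY line.

```shell
# Run a command under memcheck, writing one log file per traced process.
# run_traced is a hypothetical helper; "$@" is the command to trace.
run_traced() {
    valgrind --tool=memcheck --trace-children=yes \
        --log-file=val.%p.log "$@"
}

# Print the ERROR SUMMARY line(s) of every per-process log in a directory
# (default: current directory), each prefixed with the log file name.
summarize_logs() {
    dir="${1:-.}"
    for f in "$dir"/val.*.log; do
        # Skip the unexpanded pattern when no log file matches.
        [ -f "$f" ] || continue
        # Memcheck writes lines of the form:
        #   ==PID== ERROR SUMMARY: N errors from M contexts (suppressed: ...)
        # The extra /dev/null operand makes grep print the file name.
        grep 'ERROR SUMMARY' "$f" /dev/null
    done
}
```

For example, "run_traced ./simv" followed by "summarize_logs" would list the summary of each process separately, which makes it easier to see which process the larger count belongs to.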
Comment 2 Jane 2023-09-14 02:24:57 UTC
I've tried what the email suggested, but it doesn't solve my problem. The only difference is that the log file name changed from val.log to val.<pid>.log. What I have seen includes an unexpected "Command" shown in the valgrind log file, more than one PID in the valgrind log file, and so on.

Comment 3 Paul Floyd 2023-09-14 10:07:13 UTC
I really can't tell any more without the logs and details of exactly which processes are being run.
Comment 4 Jane 2023-09-15 08:43:39 UTC
(In reply to Paul Floyd from comment #3)
> I really can't tell any more without the logs and details of exactly
> which processes are being run.

Can I attach the log files here, then?
Comment 5 Paul Floyd 2023-09-15 10:31:49 UTC
(In reply to Jane from comment #4)
> (In reply to Paul Floyd from comment #3)
> > I really can't tell any more without the logs and details of exactly
> > which processes are being run.
> 
> Can I attach the log files here, then?

There is an "add an attachment" link on the web page.
Comment 6 Jane 2023-09-16 09:42:13 UTC
Created attachment 161656 [details]
The 5 valgrind logs with different valgrind options and the same program ./simv

The valgrind command line is like: valgrind --tool=memcheck --trace-children=yes ... --log-file=...  ./simv
With different valgrind options I got different valgrind reports; the main difference is an error summary with or without error context details. BTW, simv is the third step in running a digital simulation with the tool "vcs", and my code is compiled as a dynamic library loaded by the simv binary.
Comment 7 Paul Floyd 2023-09-16 13:54:00 UTC
Not directly relevant, I work for Siemens EDA and I used to work for Synopsys.

So my facetious answer is - use QuestaSim :-)

Just some comments on the logs to start with.

Lots of the errors refer to obfuscated names used by VCS, so it's hard to tell where the problem is.

Something is trying to close files (and failing):

==98899==    at 0xC2317A0: __close_nocancel (in /usr/lib64/libpthread-2.17.so)
==98899==    by 0x7FBC153: SNPSle_417a3676075ec189542ebd57460c18ac (in /datacenter/tools/synopsys/vcs-mx/O-2018.09-SP2/linux64/lib/libvcsnew.so)

Errors like

==98899== Syscall param write(buf) points to unaddressable byte(s)
==98899== Syscall param msync(start) points to uninitialised byte(s)

could be harmless. Valgrind can't tell when data structures have holes or padding.

Do you get the same result if you use --errors-for-leak-kinds=none?

My suspicion is that when you have leak detection on, the leak count up to the execve() is getting added to the error count.
Comment 8 Jane 2023-09-18 05:07:01 UTC
I do not care about the detailed error contexts in the vcs code; those contexts are actually visible, although they only point into the .so files. What I do care about is that more errors are counted than appear in the detailed error contexts. And again, I RAN valgrind the same way, on the same program and the same test case, but with or without the leak options I got different memory error summaries!
Comment 9 Paul Floyd 2023-09-18 06:06:13 UTC
Repeating the end of my previous comment:

Do you get the same result if you use --errors-for-leak-kinds=none?
Comment 10 Jane 2023-09-20 08:21:19 UTC
(In reply to Paul Floyd from comment #9)
> Repeating the end of my previous comment:
> 
> Do you get the same result if you use --errors-for-leak-kinds=none?

Without the setting "--errors-for-leak-kinds=none", what is the default for that option, then? I was told the memory error report is about memory problems other than memory leaks. Anyway, I will try it.
Comment 11 Jane 2023-09-20 08:43:42 UTC
"My suspicion is that when you have leak detection on, the leak count up to the execve() is getting added to the error count."
My suspicion is that valgrind has a bug in this case. It seems the default for --errors-for-leak-kinds is [definite, possible], right? But in my experience, those two kinds of leak are never counted in the error summary. In this case, if I do what you said, i.e. run with "--leak-check=full --show-leak-kinds=all --errors-for-leak-kinds=none" or without the three options, I now get the same error summary.
Comment 12 Paul Floyd 2023-09-20 10:02:31 UTC
The count for "ERROR SUMMARY with 6 errors" is maintained in a global variable, "n_errs_found" which gets modified by two functions, VG_(unique_error) and VG_(maybe_record_error).

VG_(unique_error) is used for recording leaks. So that looks like the one related to this.

The conditions for counting leaks as errors are
1. mode is full
2. there are 1 or more leaks
3. the kind of the leak matches one of the --errors-for-leak-kinds kinds

So leaks only get added to the ERROR SUMMARY with --leak-check=full or --leak-check=yes.
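
The three conditions above can be written out as a small predicate. This is only a sketch of the rule as described in this comment, assuming a POSIX shell; leak_counts_as_error and its arguments are made-up names and not Valgrind code, since the real check lives in memcheck's C sources.

```shell
# leak_counts_as_error MODE NLEAKS KIND [ERR_KINDS]
#   MODE      value of --leak-check (no, summary, yes, full)
#   NLEAKS    number of leaks of this kind
#   KIND      leak kind: definite, indirect, possible or reachable
#   ERR_KINDS value of --errors-for-leak-kinds: a comma-separated set,
#             "all" or "none" (default: definite,possible)
# Returns 0 (true) if such a leak would be added to the ERROR SUMMARY.
leak_counts_as_error() {
    mode=$1 nleaks=$2 kind=$3 err_kinds=${4:-definite,possible}

    # 1. mode is full ("yes" is accepted as a synonym for "full")
    case "$mode" in
        full|yes) ;;
        *) return 1 ;;
    esac

    # 2. there are 1 or more leaks
    [ "$nleaks" -ge 1 ] || return 1

    # 3. the kind matches one of the --errors-for-leak-kinds kinds
    case "$err_kinds" in
        none) return 1 ;;
        all)  return 0 ;;
    esac
    case ",$err_kinds," in
        *",$kind,"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

With the defaults, "leak_counts_as_error full 3 definite" succeeds, while "leak_counts_as_error summary 3 definite" and "leak_counts_as_error full 3 definite none" do not.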
Comment 13 Paul Floyd 2023-09-20 10:09:13 UTC
And this is in the manual:

https://valgrind.org/docs/manual/mc-manual.html

The answer to this question affects the numbers printed in the ERROR SUMMARY line, and also the effect of the --error-exitcode option. First, a leak is only counted as a true "error" if --leak-check=full is specified. Then, the option --errors-for-leak-kinds=<set> controls the set of leak kinds to consider as errors. The default value is --errors-for-leak-kinds=definite,possible
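
To compare error counts between runs as done in this report, the leak options can be kept while leaks are excluded from the ERROR SUMMARY. A minimal sketch, assuming a POSIX shell; memcheck_opts is a hypothetical helper, and it only assembles options that are quoted in this report.

```shell
# memcheck_opts COUNT_LEAKS
#   yes : leaks of the default kinds (definite, possible) count as errors
#   no  : leaks are still reported in full but are not counted as errors
# Prints the option string on stdout; it does not run valgrind itself.
memcheck_opts() {
    opts="--tool=memcheck --leak-check=full --show-leak-kinds=all"
    case "$1" in
        yes) opts="$opts --errors-for-leak-kinds=definite,possible" ;;
        *)   opts="$opts --errors-for-leak-kinds=none" ;;
    esac
    printf '%s\n' "$opts"
}
```

A run such as "valgrind $(memcheck_opts no) --log-file=val.%p.log ./simv" should then report the same ERROR SUMMARY count as a run without the leak options, as the reporter observed in comment 11.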