492389 – Found nondeterminism in Valgrind's log for Memcheck


Status:	REPORTED

Alias:	None

Product:	valgrind
Classification:	Developer tools
Component:	memcheck (other bugs)
Version First Reported In:	3.18.1
Platform:	Ubuntu Linux

Importance:	NOR normal
Target Milestone:	---
Assignee:	Julian Seward

Reported:	2024-08-30 04:31 UTC by jo.alen1@outlook.com
Modified:	2024-09-07 11:04 UTC
CC List:	1 user

See Also:	492388 492343 492382 492348 492386

Attachments
A one-page PDF resembling the file difference in Valgrind's log for Memcheck (46.53 KB, application/pdf) 2024-08-30 04:31 UTC, jo.alen1@outlook.com

Description jo.alen1@outlook.com 2024-08-30 04:31:57 UTC

Created attachment 173114 [details]
A one-page PDF resembling the file difference in Valgrind's log for Memcheck

Hi Valgrind Maintainers and Team!

I have been using Valgrind to check for nondeterministic behavior, and I would like to ask about the Memcheck tool, specifically about the reports it generates for each run of a binary. The nondeterminism concerns the Valgrind-generated report itself: of the 41 repositories we tested with Memcheck, 25 produced reports that differed between runs of the same binary.

Could the maintainers take a look at this issue and suggest possible explanations? Thank you very much!

STEPS TO REPRODUCE
1. Use a GitHub Actions runner with Ubuntu 22.04 and Valgrind 3.18.1 installed from the Ubuntu package manager
2. Run your executable under Memcheck at least twice
3. Compare the Valgrind reports from the runs

OBSERVED RESULT
Across the 25 repositories whose Valgrind logs were nondeterministic, we saw five distinct kinds of variation:
  - differences in the byte-related metrics
  - differences in the "still-reachable allocation" byte counts
  - differences in the memory and resource usage metrics (such as heap statistics)
  - differences in the number of errors reported
  - differences in the performance metrics (such as instruction counts)

In the attachment, the left column is Run #1 and the right column is Run #2. Run #1 reported two errors relating to uninitialized value(s), while Run #2 reported 0 errors.

ADDITIONAL INFORMATION
See the attached PDF for the difference. Red and green highlight the differences between the two runs.

Comment 1 Paul Floyd 2024-08-31 04:54:44 UTC

We need a test case to reproduce the problem.

If there is nondeterminism in the test case then that is likely to be reflected in the Memcheck results. The chances that Valgrind itself adds more nondeterminism are low.
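
As an illustration of that point, here is a minimal sketch of a program whose Memcheck error count could change with its random input. It is a hypothetical example, not the reporter's code; the name run_once and the four-slot buffer are assumptions made for the illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical example, not taken from the attached test case.
   Fills a seed-dependent number of slots, then branches on the last slot.
   Returns how many slots were filled, or -1 if allocation failed. */
int run_once(unsigned int seed)
{
    int *slots = malloc(4 * sizeof *slots);
    if (slots == NULL)
        return -1;

    srand(seed);
    int filled = 1 + rand() % 4;      /* 1..4, depends on the seed */
    for (int i = 0; i < filled; i++)
        slots[i] = i + 1;

    /* When filled < 4, slots[3] was never written, so Memcheck would be
       expected to flag this branch ("Conditional jump or move depends on
       uninitialised value(s)"). When filled == 4 the same line is clean. */
    if (slots[3] > 0)
        puts("last slot is positive");

    free(slots);
    return filled;
}
```

If such a function were seeded from the clock, two consecutive runs of the same binary could report different error counts without any nondeterminism in Valgrind.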

Comment 2 Paul Floyd 2024-08-31 05:21:34 UTC

And when I say test case I mean some simple to build source code that I can run from a terminal. Preferably on Fedora or FreeBSD. I don’t want to have to set up a build framework.

Comment 3 jo.alen1@outlook.com 2024-08-31 20:00:55 UTC

(In reply to Paul Floyd from comment #2)
> And when I say test case I mean some simple to build source code that I can
> run from a terminal. Preferably on Fedora or FreeBSD. I don’t want to have
> to set up a build framework.

I can give you the source code file that I ran in this example. Download it through this link and make sure you have the words.txt file (it's needed for the game): https://drive.google.com/drive/folders/1Zs6EBnAJqE7NdaLX1SAJGJ0-v0cMVOXz?usp=sharing

Comment 4 jo.alen1@outlook.com 2024-09-05 18:27:37 UTC

Basically, with the code I provided, one run reported two errors relating to uninitialized value(s) while the next consecutive run reported 0 errors. Would you be able to check again?

Here is the source code: https://drive.google.com/drive/folders/1Zs6EBnAJqE7NdaLX1SAJGJ0-v0cMVOXz?usp=sharing 

@Paul Floyd

Comment 5 Paul Floyd 2024-09-07 11:04:32 UTC

(In reply to jo.alen1@outlook.com from comment #4)
> Basically, the given code I provided shows that in one run experienced two
> reported errors relating to uninitialized value(s) while the other
> consecutive run experienced no reported errors and is 0. Would you be able
> to check again? 
> 
> Here is the source code:
> https://drive.google.com/drive/folders/1Zs6EBnAJqE7NdaLX1SAJGJ0-v0cMVOXz?usp=sharing
> 
> @Paul Floyd

I couldn't reproduce any issues.

Can you reproduce with a debug build?
Can you also reproduce with the latest version of Valgrind?

I get quite a few compiler warnings (clang 18 on FreeBSD), nothing too serious.

The test case seeds srand() from time() and then uses rand(). Do you have a mechanism to set a known seed, so that the exe is deterministic for your tests?
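
One possible shape for such a mechanism, as a minimal sketch: take the seed from an environment variable when it is set, and fall back to the clock otherwise. The names TEST_SEED, choose_seed and seed_rng are hypothetical and do not come from the attached test case.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: parse `text` as a decimal unsigned integer.
   Returns `fallback` if text is NULL, empty, or not fully numeric. */
unsigned int choose_seed(const char *text, unsigned int fallback)
{
    if (text == NULL || *text == '\0')
        return fallback;

    char *end = NULL;
    unsigned long value = strtoul(text, &end, 10);
    if (end == text || *end != '\0')
        return fallback;

    return (unsigned int)value;
}

/* Seed rand() from TEST_SEED (an assumed variable name) if present,
   otherwise from the clock, and log the seed so a run can be replayed. */
void seed_rng(void)
{
    unsigned int seed = choose_seed(getenv("TEST_SEED"),
                                    (unsigned int)time(NULL));
    srand(seed);
    fprintf(stderr, "seed=%u\n", seed);
}
```

Calling seed_rng() once at startup in place of a direct srand(time(NULL)) call, with TEST_SEED set to the same value for each run, should make rand() produce the same sequence on a given C library, so that any remaining differences between Memcheck logs come from elsewhere.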

Also, how do you drive the exe for your tests?