142706 – massif numbers don't seem to add up

Status:	RESOLVED FIXED

Alias:	None

Product:	valgrind
Classification:	Developer tools
Component:	massif
Version:	3.2.3
Platform:	Ubuntu Linux

Importance:	NOR normal
Target Milestone:	---
Assignee:	Nicholas Nethercote

Reported:	2007-03-08 22:35 UTC by Zooko O'Whielacronx
Modified:	2007-11-26 23:51 UTC
CC List:	0 users

Description Zooko O'Whielacronx 2007-03-08 22:35:13 UTC

Dear makers of valgrind and massif:

Thank you so much for making two excellent tools!

I'm having trouble understanding the meaning of the spacetime total percentages,
because they don't seem to match what the graph shows.  For example, in
this graph:

http://zooko.com/heap/30654/massif.30654.ps

and this chart:

http://zooko.com/heap/30654/massif.30654.html

The chart says that PyString_FromString is the largest single user of spacetime
with 9.3%, but the graph shows three other functions using substantially more
than PyString_FromString uses.

Likewise in this, our most recent graph:

http://zooko.com/heap/30878/massif.30878.ps

http://zooko.com/heap/30878/massif.30878.html

The chart claims that PyCode_New is the 5th-largest consumer of spacetime, at
6.1% of the total, but the graph doesn't show PyCode_New at all.

Please help me figure this stuff out!

Thank you,

Zooko

Comment 1 Nicholas Nethercote 2007-03-08 23:13:38 UTC

Hmm, the .ps and the .html don't appear to match.  It's not a problem with your understanding... it may well be a bug in Massif.

Comment 2 Zooko O'Whielacronx 2007-07-17 06:21:35 UTC

The Allmydata-Tahoe project is interested in the memory usage of an app, but this bug is a deterrent to trusting massif's output.  My hypothesis is that massif went wrong because the running time was several orders of magnitude longer than what I guess is a typical massif task, and the aggregate memory usage was an order of magnitude or two larger than what I guess is typical.

So, basically, I hazard a guess that there is some inaccuracy in massif's aggregation of samples that doesn't really show up in smaller tasks but becomes significant for larger ones.
 
http://allmydata.org/pipermail/tahoe-dev/2007-July/000050.html

Comment 3 Nicholas Nethercote 2007-11-26 23:51:39 UTC

[A general message:]

Massif has recently been completely overhauled.  Instead of recording
space-time usage, it now records space usage at various points during
execution, including the peak allocation point.  The output format has also
changed, and presents more information, more compactly.  It's also more
robust than the old version.
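
[An illustrative aside, not Massif code:]

To make the difference between the two measures concrete, here is a
minimal Python sketch.  It assumes the heap size is sampled as
(time, bytes) pairs and that space-time is approximated as the area under
that curve using the trapezoidal rule.  The sample values and the function
names space_time and peak_snapshot are hypothetical; how Massif itself
aggregates its samples is not described in this report.

```python
# Hypothetical heap samples as (time, bytes in use); not real Massif output.
samples = [(0, 0), (10, 400), (20, 1000), (30, 200), (40, 200)]


def space_time(samples):
    """Approximate the area under the heap-size curve (byte-time units).

    Uses the trapezoidal rule between consecutive samples. This is an
    assumed approximation, not necessarily what Massif did internally.
    """
    total = 0.0
    for (t0, b0), (t1, b1) in zip(samples, samples[1:]):
        total += (b0 + b1) / 2 * (t1 - t0)
    return total


def peak_snapshot(samples):
    """Return the single sample with the largest heap size."""
    return max(samples, key=lambda s: s[1])


print(space_time(samples))     # one aggregate number for the whole run
print(peak_snapshot(samples))  # one point in time, directly inspectable
```

A space-time total folds the whole run into one number per function, so a
short, tall spike and a long, low plateau can contribute the same amount.
A snapshot at the peak reports what was allocated at one moment, which is
easier to check against a graph.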

The new version will be Valgrind 3.3.0, which should be released in the next
week or two.  In the meantime, if you want to try it, please check out the
code from the SVN repository (see
http://www.valgrind.org/downloads/repository.html for how).

[A specific message about this bug:]

The better internal sanity-checking in the new version will hopefully mean
that this problem is fixed.  So I'm closing this bug.  Please reopen it if
you still have problems with the new version.