Bug 371916

Summary:	execution tree xtree concept
Product:	[Developer tools] valgrind	Reporter:	Philippe Waroquiers <philippe.waroquiers>
Component:	general	Assignee:	Julian Seward <jseward>
Status:	RESOLVED FIXED
Severity:	wishlist	CC:	ivosh, josef.weidendorfer, philippe.waroquiers
Priority:	NOR
Version First Reported In:	3.12 SVN
Target Milestone:	---
Platform:	Other
OS:	Linux
Latest Commit:		Version Fixed In:
Sentry Crash Report:
Attachments:	patch that implements the xtree concept new image file, to save as docs/images/kcachegrind_xtree.png to produce the doc patch that implements the xtree concept V2 xtree concept version 3, handling Ivo's comments

Description Philippe Waroquiers 2016-10-31 21:48:38 UTC

Summary:
###########################
This patch implements a new "xtree" module in pub_tool_xtree.h and m_xtree.c.
This module generalises the data structure used by massif to
visualise the heap usage.

An xtree can be used to visualise any kind of resource or event associated
to stack traces.
For example, we could use an xtree to show some counts/events
related to system calls, e.g. the nr of bytes read/written by read/write
syscalls, ...

In the current patch, it is however only used for representing heap usage.
Tools that are replacing malloc/free/... can now very easily also
produce a heap memory representation.

massif internals has been changed to be based on this new xtree module.

memcheck, helgrind and massign are now supporting a new option
  --xtree-memory=none|allocs|full
to produce a heap usage report at the end of execution.
These tools also now provides the 'xtmemory [filename]' gdbserver
monitor command to produce an xtree heap usage report on demand.

An xtree can be saved in 2 formats:
  * massif format
  * callgrind format


Performance:
##############################
Depending on the application, the massif tool is typically a few percents
to 2 times faster than the 3.12 massif (e.g. perf/heap is about twice faster
now).
Massif also typically now uses less memory (e.g. perf/many-xpts uses now
125Mb of arena core memory instead of 205Mb).


Some more detailed changes
##########################
pub_tool_xtree.h, coregrind/m_xtree.h, coregrind/pub_core_xtree.h:
* implementation of the generalised xtree.
  (this is the bulk of the change)

docs/xml/manual-core.xml, docs/xml/manual-core-adv.xml:
* document the new options and Execution Tree concept.

coregrind/m_debuginfo/debuginfo.c:
* Addition of a symbol name cache, maintained/cleared similarly to
  the current cfsi_m_cache.

coregrind/m_deduppoolalloc.c, pub_tool_deduppoolalloc.h:
* function VG_(allocStrDedupPA) specialised to create a de-duplicating
  pool for strings.

coregrind/m_main.c, coregrind/m_options.c,
coregrind/m_replacemalloc/replacemalloc_core.c, coregrind/pub_core_options.h:
* addition of the new options --xtree-memory=none|allocs|full
  and --xtree-memory-file=filename

include/pub_tool_xarray.h, coregrind/m_xarray.c:
* new function VG_(xa_str_is_member), to search a string in an
  xarray of strings.

massif/ms_main.c:
* change internal implementation so as to use the new xtree module.

helgrind/hg_main.c,
memcheck/mc_include.h, memcheck/mc_main.c, memcheck/mc_malloc_wrappers.c,
massif/ms_main.c:
* support for option --xtree-memory
* support for gdbserver command xtmemory [filename]

Various small changes to massif test .exp files (mostly because the
--stats output has changed, due to the changed implementation).

Comment 1 Philippe Waroquiers 2016-10-31 21:49:32 UTC

Created attachment 101932 [details]
patch that implements the xtree concept

Comment 2 Philippe Waroquiers 2016-10-31 22:03:04 UTC

Created attachment 101933 [details]
new image file, to save as  docs/images/kcachegrind_xtree.png to produce the doc

The doc will not build without this file, and svn diff does not show
binary files, so attaching it separately.

Comment 3 Ivo Raisr 2016-11-01 21:28:53 UTC

Thank you for this patch, Philippe.
I was concentrating on the interface pub_tool_xtree.h in the first run and here are my questions and comments:

1. void*(*alloc_fn)(const HChar*, SizeT) in VG_(newXT)().
What is 'const HChar *' used for? I can imagine 'SizeT' is for the size. Perhaps if you can name the arguments, it would be clearer? Or you can give a hint that VG_(malloc)() and VG_(free)() are typically used for alloc_fn and free_fn?

2. What is 'ec' and 'IP' as referred to in VG_(newXT)()'? It is not explained in the file header.

3. What is argument 'cc' used for in VG_(newXT)()?

4. Functions in this header file use a mixture of camelCase and all_lower_case which is inconsistent and  disturbs reader's eyes. Can you do something with it?

5. I wonder why VG_(init_XtAllocs)(), VG_(add_XtAllocs)(), VG_(sub_XtAllocs)() and VG_(img_XtAllocs)() do not use structure XtAllocs instead of 'void *'? Am I missing something?

6. XtAllocsEvents does not need to be exported.

7. For VG_(XtMemoryFull_free)(), does providing ec_alloc brings an unnecessary burden on the tool? Perhaps m_xtree.c could cache it?

8. VG_(XtMemory_report)() talks about xtree-memory=full, --xtree-memory=allocs and --xtree-memory=none. How one does set these? 

9. "typedef void MsFile;" Surely we can do better with implementation hiding. Several lines above "typedef struct _XTree  XTree;" was used, so what about:
    typedef struct _MsFile MsFile;

10. I also wonder why callgrind and massif specific functionality is contained in generic module xtree? Perhaps it could be located in the tools themselves?

Comment 4 Ivo Raisr 2016-11-01 21:39:01 UTC

The idea of associating some data to execution contexts reminds me of an aggregation and quantization capabilities of DTrace in Solaris/illumos when applied to stack traces... Oh, the world is so small!

Comment 5 Philippe Waroquiers 2016-11-01 22:29:37 UTC

Thanks for the comments, here is some feedback.
I will attach a new patch version, with an updated pub_tool_xtree.h.

(In reply to Ivo Raisr from comment #3)
> Thank you for this patch, Philippe.
> I was concentrating on the interface pub_tool_xtree.h in the first run and
> here are my questions and comments:
> 
> 1. void*(*alloc_fn)(const HChar*, SizeT) in VG_(newXT)().
> What is 'const HChar *' used for? I can imagine 'SizeT' is for the size.
> Perhaps if you can name the arguments, it would be clearer? Or you can give
> a hint that VG_(malloc)() and VG_(free)() are typically used for alloc_fn
> and free_fn?
This uses the classical alloc_fn and free_fn convention, used also
in other modules. See e.g. pub_tool_xarray.h or pub_tool_rangemap.h.
I agree that it would be more readable to have parameter names or
possibly/preferably we should define function types e.g. in
pub_tool_basics.h.
I have added this to my TODO list, to rework all occurences of alloc_fn/free_fn

> 
> 2. What is 'ec' and 'IP' as referred to in VG_(newXT)()'? It is not
> explained in the file header.
IP is Instruction Pointer, ec is exe context.
I have added some more explanation in the PURPOSE comment at the top. 

> 
> 3. What is argument 'cc' used for in VG_(newXT)()?
Classical 'cost centre' for modules that are allocating memory.
I have added a comment line to clarify cc is allocation cost centre.

> 
> 4. Functions in this header file use a mixture of camelCase and
> all_lower_case which is inconsistent and  disturbs reader's eyes. Can you do
> something with it?
Humph, that is a very valuable comment, but I am not sure what style
to use. Most of the current code is itself a mixture of upper
case/lower case, VG_(...), ...
For example, in pub_tool_hashtable.h, we find a.o.
 VgHashNode** VG_(HT_to_array)
 void VG_(HT_ResetIter) ( VgHashTable *table );
So, it seems that the convention is that there is no convention, and this
new module follows the 'no convention' convention.
But if we agree on a convention, I can rework this.
(my preference would be to Words_Separated_By_Underscore, this is longer
but IMO more readable and can be supported automatically by re-formatter
tools).

> 
> 5. I wonder why VG_(init_XtAllocs)(), VG_(add_XtAllocs)(),
> VG_(sub_XtAllocs)() and VG_(img_XtAllocs)() do not use structure XtAllocs
> instead of 'void *'? Am I missing something?
These are functions to be used as init_data_fn/add_data_fn/sub_data_fn,
so they match the argument profiles as found in VG_(newXT)

> 
> 6. XtAllocsEvents does not need to be exported.
Fixed.

> 
> 7. For VG_(XtMemoryFull_free)(), does providing ec_alloc brings an
> unnecessary burden on the tool? Perhaps m_xtree.c could cache it?
In fact, the idea is that the tool is itself maintaining the alloc ec,
and so the cpu/memory cost is then neglectible. If m_xtree.c would itself
either maintain or (re-)compute the ec_alloc, then that would bring cpu
and/or memory overhead, and the xtree interface would need to receive
the allocated or block address.

> 
> 8. VG_(XtMemory_report)() talks about xtree-memory=full,
> --xtree-memory=allocs and --xtree-memory=none. How one does set these? 
These are the 3 values for the new command line option --xtree-memory.
I have clarified the comment.

> 
> 9. "typedef void MsFile;" Surely we can do better with implementation
> hiding. Several lines above "typedef struct _XTree  XTree;" was used, so
> what about:
>     typedef struct _MsFile MsFile;
In reality, MsFile is a VgFile, but I do
not want to expose this, so using typedef void MsFile; was hiding it
without introducing an artificial struct _MsFile.
As the user cannot do much with a void, this looks to hide what it is.
Not sure about what to do here.

> 
> 10. I also wonder why callgrind and massif specific functionality is
> contained in generic module xtree? Perhaps it could be located in the tools
> themselves?
These are implementing core functionality to output an xtree in a callgrind
format or a massif format. memcheck/helgrind and massif can now each output
an xtree in massif or callgrind format.
So, this code cannot go into the respective tools, it must be in core.

Comment 6 Philippe Waroquiers 2016-11-01 22:32:12 UTC

Created attachment 101950 [details]
patch that implements the xtree concept V2

Handled Ivo's comments

Comment 7 Philippe Waroquiers 2016-11-01 23:13:56 UTC

(In reply to Ivo Raisr from comment #3)

> 5. I wonder why VG_(init_XtAllocs)(), VG_(add_XtAllocs)(),
> VG_(sub_XtAllocs)() and VG_(img_XtAllocs)() do not use structure XtAllocs
> instead of 'void *'? Am I missing something?
Re-reading the code, I see that these functions also do not need to
be exported. I have also fixed this (but not in patch V2).

> 
> 6. XtAllocsEvents does not need to be exported.

Comment 8 Ivo Raisr 2016-11-03 23:24:27 UTC

Thank you, Philippe, for answering my questions.
I have an additional one and then some docs stuff below.


>> 4. Functions in this header file use a mixture of camelCase and
>> all_lower_case which is inconsistent and  disturbs reader's eyes. Can you do
>> something with it?

> Humph, that is a very valuable comment, but I am not sure what style
> to use.

Alright. But you can unify the function names so their names are consistent among themselves. Consider this random selection:
VG_(newXT)()
VG_(xt_filter_maybe_below_main)()
VG_(xt_offset_main_or_below_main)()
VG_(init_XtAllocs)()
VG_(XtMemoryFull_init)()
VG_(XtMemoryFull_resizeInPlace)()
VG_(XtMemory_report)()
VG_(snapshotXT)()
VG_(subFromXT)()
VG_(n_ips_sel)()
VG_(XtPrint_callgrind)()
VG_(XtClose_massif)()

Sometimes the function name follows pattern: "XTthing_operation",
sometimes "operationXT" and sometimes something completely different.

Comment 9 Ivo Raisr 2016-11-03 23:38:43 UTC

As for the documentation, the concept is explained very well. Thank you for this!
I found just few typos:

       <para> When set to <varname>none</varname>, no memory execution
-      tree is be produced.</para>
+      tree is produced.</para>

       <para> When set to <varname>allocs</varname>, the memory execution
-      tree gives the current number of allocated bytes and current number of
+      tree gives the current number of allocated bytes and the current number of
       allocated blocks. </para>

-      bytes and blocks, the total number of freed bypes and
+      bytes and blocks, the total number of freed bytes and

-  an heap usage xtree graphical representation produced by
+  a heap usage xtree graphical representation produced by

-    <para><option>curB</option> current number of bytes allocated. The
+    <para><option>curB</option> current number of Bytes allocated. The

-    <para><option>curBk</option> current number of blocks allocated,
+    <para><option>curBk</option> current number of Blocks allocated,

-    <para>An xtree file in a Callgrind Format contains a single callgraph,
+    <para>An xtree file in the Callgrind Format contains a single callgraph,

-    <para>An xtree file in a Massif Format contains one detailed tree
+    <para>An xtree file in the Massif Format contains one detailed tree

-  visualiser are more versatile that the Massif Format
+  visualisers are more versatile than the Massif Format


I also wonder if you can add a paragraph or two describing how to apply xtree analysis when profiling large programs, if possible?

Comment 10 Philippe Waroquiers 2016-11-04 22:34:50 UTC

(In reply to Ivo Raisr from comment #8)
> Sometimes the function name follows pattern: "XTthing_operation",
> sometimes "operationXT" and sometimes something completely different.
Fair enough. I will do a pass of pub_too_xtree.h and somewhat reduce
my name creation creativity :)

Comment 11 Philippe Waroquiers 2016-11-05 13:20:32 UTC

Created attachment 102047 [details]
xtree concept version 3, handling Ivo's comments

Comment 12 Philippe Waroquiers 2016-11-05 13:24:46 UTC

Thanks for the comments, should be handled in attached v3 patch.
For what concerns function names etc, I have (tried to more) systematically
use a prefix (such as XT_, or XT_Memory or XT_massif_... followed by
operation in lower case. This style is somewhat inspired from pub_tool_hashtable.h
and pub_tool_oset.h (even if there, the style was not systematically applied).
Documentation has also been updated.

Comment 13 Ivo Raisr 2016-11-05 20:18:49 UTC

Thanks for addressing my comments. Especially for providing hints on Xtree usage!
I am happy with the documentation and pub_tool_xtree.h interface.
However note I have not reviewed any implementation files.

The last remark I have: have you considered naming the xtree output file based on the executed program name, such as <program>.kcg? Currently the default xtmemory.kcg will get overwritten every time, even for different profiled programs.

Comment 14 Philippe Waroquiers 2016-11-05 20:34:32 UTC

(In reply to Ivo Raisr from comment #13)
> Thanks for addressing my comments. Especially for providing hints on Xtree
> usage!
> I am happy with the documentation and pub_tool_xtree.h interface.
> However note I have not reviewed any implementation files.
> 
> The last remark I have: have you considered naming the xtree output file
> based on the executed program name, such as <program>.kcg? Currently the
> default xtmemory.kcg will get overwritten every time, even for different
> profiled programs.
For the xtmemory gdbserver command: the user can give a different name if
desired, but the default is to overwrite. This is also the case
for other gdbserver commands producing files, e.g. massif snapshots.

For xtree memory files produced due to command line, the default is similar
to e.g. log files or callgrind output files, or ...:
filename includes %p expanded as pid, to avoid overwriting.
(this %p is needed in particular for forking programs).

Comment 15 Josef Weidendorfer 2016-11-08 23:53:09 UTC

Hi Philippe, nice patch!

I think it is good to move functionalty from tools to the core, if it
can make other tools better or easier to write. Attaching data to call
trees built from stack traces should fit here.
However, I did not check the large rewrite of massif.

General comments (I may be missing something here):
 
* is this bound to memory allocation, or is memory allocation just
  an example use case of the API? It would be nice if tools could
  attach arbitrary data to xtrees. Perhaps separate the memory allocation
  stuff into different files (e.g. xtree_memalloc.h).

* behavior of some functions in the XTree API depend on command line
  options. Wouldn't it be better for a tool using this API to be in
  full control here, ie. the tool parses command line
  options and pass them as flags?  About the changes in the malloc
  wrappers, I think it would be better if the tool can register handlers,
  and these handlers then call e.g. VG_(XTMemory_Full_alloc).

About the XTree API, why do you need these add/sub variants?
Why not just have a function to get a pointer to the value to be updated?
Any reduction operation may be useful, such as min/max.

Comment 16 Philippe Waroquiers 2016-11-09 20:26:44 UTC

(In reply to Josef Weidendorfer from comment #15)
> Hi Philippe, nice patch!
> 
> I think it is good to move functionalty from tools to the core, if it
> can make other tools better or easier to write. Attaching data to call
> trees built from stack traces should fit here.
> However, I did not check the large rewrite of massif.
> 
> General comments (I may be missing something here):
>  
> * is this bound to memory allocation, or is memory allocation just
>   an example use case of the API? It would be nice if tools could
>   attach arbitrary data to xtrees. Perhaps separate the memory allocation
>   stuff into different files (e.g. xtree_memalloc.h).
The API is done so that arbitrary data can be attached.
Of course, to be output as a callgrind or massif file, the data
must be translatable in (positive) integers values.

For what concerns separating the memory allocation stuff in another file:
that is a good suggestion, I will take a look at this. I think this
should be no problem.

> 
> * behavior of some functions in the XTree API depend on command line
>   options. Wouldn't it be better for a tool using this API to be in
>   full control here, ie. the tool parses command line
>   options and pass them as flags?
Basically, that is what the patch does:
command line options are parsed (specifically --xtree-memory=none|allocs|full)
and then the tool calls the memory support functions according to
the command line option (e.g. calling VG_(XTMemory_Full_alloc)).
The tool is in full control, e.g. it might implement 
   --xtree-memory=allocs
completely differently, if needed.

>  About the changes in the malloc
>   wrappers, I think it would be better if the tool can register handlers,
>   and these handlers then call e.g. VG_(XTMemory_Full_alloc).
I guess this could be done, but I think this would be less practical
than the current approach:  the malloc replacement is today
calling a tool replacement function.
So, these replacement functions are doing the job of e.g. recording
the allocated blocks in a tool specific way. In this replacement function,
the tool can (if desired) call e.g. VG_(XTMemory_Full_alloc)).
IMO, having another handler separated from the current handler function
would complexity things, as e.g. the tool would have to associate the
work/data done/prepared in the 'first handler' with the 'following' handler
calls related to xtree. So, in summary, there is now a handler that tells
the tool to e.g. execute a malloc, and this tool function can do the
needed work in one single invocation (e.g. re-using the execontext
done for tool purpose as data input for the xtree).


> 
> About the XTree API, why do you need these add/sub variants?
> Why not just have a function to get a pointer to the value to be updated?
> Any reduction operation may be useful, such as min/max.
Yes, any operation can be done in the add/sub : these functions
get a pointer to the value being updated.
It might be possible to use maybe a more generic name than add/sub
(maybe  Operate(...) ?), but then we will need more arguments for Operate
to indicate what this has to really do, and m_xtree.c itself need to do some
additions to e.g. produce some totals. So, one way or another, there is
an (somewhat flexible) notion of addition which is needed.
The current add/sub_data_fn are matching the current usage (xtree memory).
As a follow-up, I intend to work on a 'syscall xtree' which e.g. might
capture min/max (or maybe an histogram) of values. If needed, we might
revisit the interface of the *_data_fn (e.g. I suspect we might need
a destroy_data_fn if we want a more flexible/variable data set).

What I can do in any case is to add some comments
to indicate that the semantic of 'add' is quite flexible, and that e.g.
a 'max' or 'min' can be computed as part of the an 'add'.

Comment 17 Josef Weidendorfer 2016-11-10 17:38:38 UTC

(In reply to Philippe Waroquiers from comment #16)
> >  About the changes in the malloc
> >   wrappers, I think it would be better if the tool can register handlers,
> >   and these handlers then call e.g. VG_(XTMemory_Full_alloc).
> I guess this could be done, but I think this would be less practical
> than the current approach:  the malloc replacement is today
> calling a tool replacement function.

Ah, I see. Then that's fine with me.

> > About the XTree API, why do you need these add/sub variants?
> > Why not just have a function to get a pointer to the value to be updated?
> > Any reduction operation may be useful, such as min/max.
> Yes, any operation can be done in the add/sub : these functions
> get a pointer to the value being updated.

I just wondered if the xtree API could work without having to pass
add/sub_data_fn handlers to make it simpler. I see that the add
handler is used for calculating the totals in the callgrind output.
What about "reduce" instead of "add"? It's a reduction operation to
be used for propagating the value up the call chains. You could do
this before the callgrind output to get smaller files.
If callgrind_annotate/kcachegrind will be able at some point
to consume call chains, it has to be another format anyway.


> As a follow-up, I intend to work on a 'syscall xtree' which e.g. might
> capture min/max (or maybe an histogram) of values.

Yes, statistics are interesting: min/max/variance/average and histograms.
I always wanted to do that in callgrind...

Comment 18 Julian Seward 2016-11-10 18:22:45 UTC

Some comments on the v3 patch:

Mostly it looks fine.  Below [1] some small comments.

Larger comments:

* I assume there are no regressions with correctness or performance
  with this, yes?  Could you do a self-host Memcheck run at some point?

* The large size of the patch concerns me a bit.  I would be happier
  if it could be split into two parts:
  
  (1) that creates m_xtree.c and refactors Massif to use it, but
      does not change the user-visible functionality at all.  So
      it is an implementation-only change, and
      
  (2) a patch that builds on (1), that supplies the new functionality.

  Is that possible, if it is not a lot of work?


[1] Small comments:

----------------

+typedef
+   struct { Addr a; const HChar* sym_name; PtrdiffT offset; }
+   Sym_Name_CacheEnt;

I prefer to have something more descriptive than just "a" for the
first field.  Since it is debug-info related, it would also be good to
clarify whether this is an SVMA or an AVMA or something else, per
comment at the top of m_debuginfo/debuginfo.c.  I suspect it's a SVMA,
in which case a good name would by "sym_svma".

----------------

+"    --xtree-memory=none|allocs|full   profile heap memory in an xtree [none]\n"
+"                              and produces a report at the end of the execution\n"
+"                              none: no profiling, allocs: current allocated\n"
+"                              size/blocks, full: profile current and cumulative\n"
+"                              allocated size/blocks and freed size/blocks.\n"
+"    --xtree-memory-file=<file>   xtree memory report file [xtmemory.kcg.%%p]\n"

This flag only has effect for tools that replace malloc, correct?  Is
it listed in the section "user options for Valgrind tools that replace
malloc:" ?

----------------

+   This file is part of Valgrind, a dynamic binary instrumentation
+   framework.
+
+   Copyright (C) 2016-2016 Philippe Waroquiers

For m_xtree.c, if there is a lot of code in there which has been moved
from massif and/or callgrind, and is not much changed, I think it
would be diplomatic to add a line or two explaining that the original
authors were Nick and/or Josef.  See the top of coregrind/m_wordfm.c for
an example.

----------------

 // growing such a block, but for consistency (it also simplifies things) we
 // ignore such reallocs as well.
+// XTREE??? why can't we just consider that a realloc of an ignored
+// alloc is just a new alloc (i.e. do not remove the old sz from the stats).

and again later.  The "XTREE???" is confusing -- I don't know what it
signifies.  Can you maybe write instead something like "PW Nov 2016,
xtree work:"?  I do that in comments from time to time.

----------------

Per previous comments in the bug re CamelCase vs snake_case, I really
have no problem mixing them.  I like to use camelcase, with a capital
first letter for type names, and snake case when function names get
long.  But no fixed rules.

Comment 19 Josef Weidendorfer 2016-11-10 20:55:24 UTC

(In reply to Julian Seward from comment #18)
> +   Copyright (C) 2016-2016 Philippe Waroquiers
> 
> For m_xtree.c, if there is a lot of code in there which has been moved
> from massif and/or callgrind, and is not much changed, I think it
> would be diplomatic to add a line or two explaining that the original
> authors were Nick and/or Josef.

I assume the callgrind output code used the format spec, and no other code.
So no point to add me :-)

Comment 20 Philippe Waroquiers 2016-11-11 13:06:33 UTC

(In reply to Josef Weidendorfer from comment #17)
> I just wondered if the xtree API could work without having to pass
> add/sub_data_fn handlers to make it simpler. I see that the add
> handler is used for calculating the totals in the callgrind output.
> What about "reduce" instead of "add"? It's a reduction operation to
> be used for propagating the value up the call chains.
Yes, any kind of operation can in fact be done by add (or sub) functions,
and so they effectively play the role of Reduce functions. For the moment,
I have however kept the current add/sub approach, as at least currenly
it is assumed the module will be mostly used to track consumption of some
resources, which I think is more clearly modelled with add/sub.
I have added some comments to explain that these add/sub have no specific
constraints and so are equivalent to Reduce functions. The interface might
be changed if we see many uses where a Reduce based terminology is more
appropriate. 

Regarding comment 15 "(Perhaps separate the memory allocation stuff into different file)" : I have now separated this from xtree, in 
pub_tool_xtmemory.h/m_xtmemory.c.

Comment 21 Philippe Waroquiers 2016-11-11 13:18:58 UTC

(In reply to Julian Seward from comment #18)
> * I assume there are no regressions with correctness or performance
>   with this, yes?  Could you do a self-host Memcheck run at some point?
Massif functional behaviour is unchanged (with the exception of some inversion
when 2 stacktraces have allocated the same amount: the old and new implementation
do not output such 'equal consumers' in the same order.
Massif performance is improved (in CPU and memory).
For other tools (memcheck, helgrind) : if the xtree feature is not used,
the impact is one additional 'if' condition
in the malloc/free interception code.

> 
> * The large size of the patch concerns me a bit.  I would be happier
>   if it could be split into two parts:
>   
>   (1) that creates m_xtree.c and refactors Massif to use it, but
>       does not change the user-visible functionality at all.  So
>       it is an implementation-only change, and
>       
>   (2) a patch that builds on (1), that supplies the new functionality.
> 
>   Is that possible, if it is not a lot of work?
Ok, I will (try to) commit in some (smaller) pieces (which will I hope give
a buildable V at each commit).

> 
> 
> [1] Small comments:
> 
> ----------------
> 
> +typedef
> +   struct { Addr a; const HChar* sym_name; PtrdiffT offset; }
> +   Sym_Name_CacheEnt;
> 
> I prefer to have something more descriptive than just "a" for the
> first field. 
Renamed to sym_avma, as suggested/

> 
> ----------------
> 
> +"    --xtree-memory=none|allocs|full   profile heap memory in an xtree
> [none]\n"
> +"                              and produces a report at the end of the
> execution\n"
> +"                              none: no profiling, allocs: current
> allocated\n"
> +"                              size/blocks, full: profile current and
> cumulative\n"
> +"                              allocated size/blocks and freed
> size/blocks.\n"
> +"    --xtree-memory-file=<file>   xtree memory report file
> [xtmemory.kcg.%%p]\n"
> 
> This flag only has effect for tools that replace malloc, correct?  Is
> it listed in the section "user options for Valgrind tools that replace
> malloc:" ?
Yes, it is listed there in -help output, and documented in the user manual
around malloc/free related arguments.

> 
> ----------------
> 
> +   This file is part of Valgrind, a dynamic binary instrumentation
> +   framework.
> +
> +   Copyright (C) 2016-2016 Philippe Waroquiers
> 
> For m_xtree.c, if there is a lot of code in there which has been moved
> from massif and/or callgrind, and is not much changed, I think it
> would be diplomatic to add a line or two explaining that the original
> authors were Nick and/or Josef.  See the top of coregrind/m_wordfm.c for
> an example.
m_xtree.c is a completely new implementation of the Xtree (I have recuperated
a few lines for the production of the massif header).
In any case, I have added a paragraph to mention that the xtree initial idea
was in massif developed by Nick.

> 
> ----------------
> 
>  // growing such a block, but for consistency (it also simplifies things) we
>  // ignore such reallocs as well.
> +// XTREE??? why can't we just consider that a realloc of an ignored
> +// alloc is just a new alloc (i.e. do not remove the old sz from the stats).
> 
> and again later.  The "XTREE???" is confusing -- I don't know what it
> signifies.  Can you maybe write instead something like "PW Nov 2016,
> xtree work:"?  I do that in comments from time to time.
Updated as suggested.

> 
> ----------------
> 
> Per previous comments in the bug re CamelCase vs snake_case, I really
> have no problem mixing them.  I like to use camelcase, with a capital
> first letter for type names, and snake case when function names get
> long.  But no fixed rules.
Yes, there is no fixed rule.
The comment of Ivo was in any case handled, as there was really too much
name variation in the API.

Comment 22 Julian Seward 2017-03-06 14:16:54 UTC

Can this be closed now?  In the NEWS file it is listed as being closed.