Summary: | execution tree xtree concept | ||
---|---|---|---|
Product: | [Developer tools] valgrind | Reporter: | Philippe Waroquiers <philippe.waroquiers> |
Component: | general | Assignee: | Julian Seward <jseward> |
Status: | RESOLVED FIXED | ||
Severity: | wishlist | CC: | ivosh, josef.weidendorfer, philippe.waroquiers |
Priority: | NOR | ||
Version First Reported In: | 3.12 SVN | ||
Target Milestone: | --- | ||
Platform: | Other | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: | |||
Attachments: |
patch that implements the xtree concept
new image file, to save as docs/images/kcachegrind_xtree.png to produce the doc patch that implements the xtree concept V2 xtree concept version 3, handling Ivo's comments |
Description
Philippe Waroquiers
2016-10-31 21:48:38 UTC
Created attachment 101932 [details]
patch that implements the xtree concept
Created attachment 101933 [details]
new image file, to save as docs/images/kcachegrind_xtree.png to produce the doc
The doc will not build without this file, and svn diff does not show
binary files, so attaching it separately.
Thank you for this patch, Philippe. I was concentrating on the interface pub_tool_xtree.h in the first run and here are my questions and comments: 1. void*(*alloc_fn)(const HChar*, SizeT) in VG_(newXT)(). What is 'const HChar *' used for? I can imagine 'SizeT' is for the size. Perhaps if you can name the arguments, it would be clearer? Or you can give a hint that VG_(malloc)() and VG_(free)() are typically used for alloc_fn and free_fn? 2. What is 'ec' and 'IP' as referred to in VG_(newXT)()'? It is not explained in the file header. 3. What is argument 'cc' used for in VG_(newXT)()? 4. Functions in this header file use a mixture of camelCase and all_lower_case which is inconsistent and disturbs reader's eyes. Can you do something with it? 5. I wonder why VG_(init_XtAllocs)(), VG_(add_XtAllocs)(), VG_(sub_XtAllocs)() and VG_(img_XtAllocs)() do not use structure XtAllocs instead of 'void *'? Am I missing something? 6. XtAllocsEvents does not need to be exported. 7. For VG_(XtMemoryFull_free)(), does providing ec_alloc brings an unnecessary burden on the tool? Perhaps m_xtree.c could cache it? 8. VG_(XtMemory_report)() talks about xtree-memory=full, --xtree-memory=allocs and --xtree-memory=none. How one does set these? 9. "typedef void MsFile;" Surely we can do better with implementation hiding. Several lines above "typedef struct _XTree XTree;" was used, so what about: typedef struct _MsFile MsFile; 10. I also wonder why callgrind and massif specific functionality is contained in generic module xtree? Perhaps it could be located in the tools themselves? The idea of associating some data to execution contexts reminds me of an aggregation and quantization capabilities of DTrace in Solaris/illumos when applied to stack traces... Oh, the world is so small! Thanks for the comments, here is some feedback. I will attach a new patch version, with an updated pub_tool_xtree.h. (In reply to Ivo Raisr from comment #3) > Thank you for this patch, Philippe. > I was concentrating on the interface pub_tool_xtree.h in the first run and > here are my questions and comments: > > 1. void*(*alloc_fn)(const HChar*, SizeT) in VG_(newXT)(). > What is 'const HChar *' used for? I can imagine 'SizeT' is for the size. > Perhaps if you can name the arguments, it would be clearer? Or you can give > a hint that VG_(malloc)() and VG_(free)() are typically used for alloc_fn > and free_fn? This uses the classical alloc_fn and free_fn convention, used also in other modules. See e.g. pub_tool_xarray.h or pub_tool_rangemap.h. I agree that it would be more readable to have parameter names or possibly/preferably we should define function types e.g. in pub_tool_basics.h. I have added this to my TODO list, to rework all occurences of alloc_fn/free_fn > > 2. What is 'ec' and 'IP' as referred to in VG_(newXT)()'? It is not > explained in the file header. IP is Instruction Pointer, ec is exe context. I have added some more explanation in the PURPOSE comment at the top. > > 3. What is argument 'cc' used for in VG_(newXT)()? Classical 'cost centre' for modules that are allocating memory. I have added a comment line to clarify cc is allocation cost centre. > > 4. Functions in this header file use a mixture of camelCase and > all_lower_case which is inconsistent and disturbs reader's eyes. Can you do > something with it? Humph, that is a very valuable comment, but I am not sure what style to use. Most of the current code is itself a mixture of upper case/lower case, VG_(...), ... For example, in pub_tool_hashtable.h, we find a.o. VgHashNode** VG_(HT_to_array) void VG_(HT_ResetIter) ( VgHashTable *table ); So, it seems that the convention is that there is no convention, and this new module follows the 'no convention' convention. But if we agree on a convention, I can rework this. (my preference would be to Words_Separated_By_Underscore, this is longer but IMO more readable and can be supported automatically by re-formatter tools). > > 5. I wonder why VG_(init_XtAllocs)(), VG_(add_XtAllocs)(), > VG_(sub_XtAllocs)() and VG_(img_XtAllocs)() do not use structure XtAllocs > instead of 'void *'? Am I missing something? These are functions to be used as init_data_fn/add_data_fn/sub_data_fn, so they match the argument profiles as found in VG_(newXT) > > 6. XtAllocsEvents does not need to be exported. Fixed. > > 7. For VG_(XtMemoryFull_free)(), does providing ec_alloc brings an > unnecessary burden on the tool? Perhaps m_xtree.c could cache it? In fact, the idea is that the tool is itself maintaining the alloc ec, and so the cpu/memory cost is then neglectible. If m_xtree.c would itself either maintain or (re-)compute the ec_alloc, then that would bring cpu and/or memory overhead, and the xtree interface would need to receive the allocated or block address. > > 8. VG_(XtMemory_report)() talks about xtree-memory=full, > --xtree-memory=allocs and --xtree-memory=none. How one does set these? These are the 3 values for the new command line option --xtree-memory. I have clarified the comment. > > 9. "typedef void MsFile;" Surely we can do better with implementation > hiding. Several lines above "typedef struct _XTree XTree;" was used, so > what about: > typedef struct _MsFile MsFile; In reality, MsFile is a VgFile, but I do not want to expose this, so using typedef void MsFile; was hiding it without introducing an artificial struct _MsFile. As the user cannot do much with a void, this looks to hide what it is. Not sure about what to do here. > > 10. I also wonder why callgrind and massif specific functionality is > contained in generic module xtree? Perhaps it could be located in the tools > themselves? These are implementing core functionality to output an xtree in a callgrind format or a massif format. memcheck/helgrind and massif can now each output an xtree in massif or callgrind format. So, this code cannot go into the respective tools, it must be in core. Created attachment 101950 [details]
patch that implements the xtree concept V2
Handled Ivo's comments
(In reply to Ivo Raisr from comment #3) > 5. I wonder why VG_(init_XtAllocs)(), VG_(add_XtAllocs)(), > VG_(sub_XtAllocs)() and VG_(img_XtAllocs)() do not use structure XtAllocs > instead of 'void *'? Am I missing something? Re-reading the code, I see that these functions also do not need to be exported. I have also fixed this (but not in patch V2). > > 6. XtAllocsEvents does not need to be exported. Thank you, Philippe, for answering my questions. I have an additional one and then some docs stuff below. >> 4. Functions in this header file use a mixture of camelCase and >> all_lower_case which is inconsistent and disturbs reader's eyes. Can you do >> something with it? > Humph, that is a very valuable comment, but I am not sure what style > to use. Alright. But you can unify the function names so their names are consistent among themselves. Consider this random selection: VG_(newXT)() VG_(xt_filter_maybe_below_main)() VG_(xt_offset_main_or_below_main)() VG_(init_XtAllocs)() VG_(XtMemoryFull_init)() VG_(XtMemoryFull_resizeInPlace)() VG_(XtMemory_report)() VG_(snapshotXT)() VG_(subFromXT)() VG_(n_ips_sel)() VG_(XtPrint_callgrind)() VG_(XtClose_massif)() Sometimes the function name follows pattern: "XTthing_operation", sometimes "operationXT" and sometimes something completely different. As for the documentation, the concept is explained very well. Thank you for this! I found just few typos: <para> When set to <varname>none</varname>, no memory execution - tree is be produced.</para> + tree is produced.</para> <para> When set to <varname>allocs</varname>, the memory execution - tree gives the current number of allocated bytes and current number of + tree gives the current number of allocated bytes and the current number of allocated blocks. </para> - bytes and blocks, the total number of freed bypes and + bytes and blocks, the total number of freed bytes and - an heap usage xtree graphical representation produced by + a heap usage xtree graphical representation produced by - <para><option>curB</option> current number of bytes allocated. The + <para><option>curB</option> current number of Bytes allocated. The - <para><option>curBk</option> current number of blocks allocated, + <para><option>curBk</option> current number of Blocks allocated, - <para>An xtree file in a Callgrind Format contains a single callgraph, + <para>An xtree file in the Callgrind Format contains a single callgraph, - <para>An xtree file in a Massif Format contains one detailed tree + <para>An xtree file in the Massif Format contains one detailed tree - visualiser are more versatile that the Massif Format + visualisers are more versatile than the Massif Format I also wonder if you can add a paragraph or two describing how to apply xtree analysis when profiling large programs, if possible? (In reply to Ivo Raisr from comment #8) > Sometimes the function name follows pattern: "XTthing_operation", > sometimes "operationXT" and sometimes something completely different. Fair enough. I will do a pass of pub_too_xtree.h and somewhat reduce my name creation creativity :) Created attachment 102047 [details]
xtree concept version 3, handling Ivo's comments
Thanks for the comments, should be handled in attached v3 patch. For what concerns function names etc, I have (tried to more) systematically use a prefix (such as XT_, or XT_Memory or XT_massif_... followed by operation in lower case. This style is somewhat inspired from pub_tool_hashtable.h and pub_tool_oset.h (even if there, the style was not systematically applied). Documentation has also been updated. Thanks for addressing my comments. Especially for providing hints on Xtree usage! I am happy with the documentation and pub_tool_xtree.h interface. However note I have not reviewed any implementation files. The last remark I have: have you considered naming the xtree output file based on the executed program name, such as <program>.kcg? Currently the default xtmemory.kcg will get overwritten every time, even for different profiled programs. (In reply to Ivo Raisr from comment #13) > Thanks for addressing my comments. Especially for providing hints on Xtree > usage! > I am happy with the documentation and pub_tool_xtree.h interface. > However note I have not reviewed any implementation files. > > The last remark I have: have you considered naming the xtree output file > based on the executed program name, such as <program>.kcg? Currently the > default xtmemory.kcg will get overwritten every time, even for different > profiled programs. For the xtmemory gdbserver command: the user can give a different name if desired, but the default is to overwrite. This is also the case for other gdbserver commands producing files, e.g. massif snapshots. For xtree memory files produced due to command line, the default is similar to e.g. log files or callgrind output files, or ...: filename includes %p expanded as pid, to avoid overwriting. (this %p is needed in particular for forking programs). Hi Philippe, nice patch! I think it is good to move functionalty from tools to the core, if it can make other tools better or easier to write. Attaching data to call trees built from stack traces should fit here. However, I did not check the large rewrite of massif. General comments (I may be missing something here): * is this bound to memory allocation, or is memory allocation just an example use case of the API? It would be nice if tools could attach arbitrary data to xtrees. Perhaps separate the memory allocation stuff into different files (e.g. xtree_memalloc.h). * behavior of some functions in the XTree API depend on command line options. Wouldn't it be better for a tool using this API to be in full control here, ie. the tool parses command line options and pass them as flags? About the changes in the malloc wrappers, I think it would be better if the tool can register handlers, and these handlers then call e.g. VG_(XTMemory_Full_alloc). About the XTree API, why do you need these add/sub variants? Why not just have a function to get a pointer to the value to be updated? Any reduction operation may be useful, such as min/max. (In reply to Josef Weidendorfer from comment #15) > Hi Philippe, nice patch! > > I think it is good to move functionalty from tools to the core, if it > can make other tools better or easier to write. Attaching data to call > trees built from stack traces should fit here. > However, I did not check the large rewrite of massif. > > General comments (I may be missing something here): > > * is this bound to memory allocation, or is memory allocation just > an example use case of the API? It would be nice if tools could > attach arbitrary data to xtrees. Perhaps separate the memory allocation > stuff into different files (e.g. xtree_memalloc.h). The API is done so that arbitrary data can be attached. Of course, to be output as a callgrind or massif file, the data must be translatable in (positive) integers values. For what concerns separating the memory allocation stuff in another file: that is a good suggestion, I will take a look at this. I think this should be no problem. > > * behavior of some functions in the XTree API depend on command line > options. Wouldn't it be better for a tool using this API to be in > full control here, ie. the tool parses command line > options and pass them as flags? Basically, that is what the patch does: command line options are parsed (specifically --xtree-memory=none|allocs|full) and then the tool calls the memory support functions according to the command line option (e.g. calling VG_(XTMemory_Full_alloc)). The tool is in full control, e.g. it might implement --xtree-memory=allocs completely differently, if needed. > About the changes in the malloc > wrappers, I think it would be better if the tool can register handlers, > and these handlers then call e.g. VG_(XTMemory_Full_alloc). I guess this could be done, but I think this would be less practical than the current approach: the malloc replacement is today calling a tool replacement function. So, these replacement functions are doing the job of e.g. recording the allocated blocks in a tool specific way. In this replacement function, the tool can (if desired) call e.g. VG_(XTMemory_Full_alloc)). IMO, having another handler separated from the current handler function would complexity things, as e.g. the tool would have to associate the work/data done/prepared in the 'first handler' with the 'following' handler calls related to xtree. So, in summary, there is now a handler that tells the tool to e.g. execute a malloc, and this tool function can do the needed work in one single invocation (e.g. re-using the execontext done for tool purpose as data input for the xtree). > > About the XTree API, why do you need these add/sub variants? > Why not just have a function to get a pointer to the value to be updated? > Any reduction operation may be useful, such as min/max. Yes, any operation can be done in the add/sub : these functions get a pointer to the value being updated. It might be possible to use maybe a more generic name than add/sub (maybe Operate(...) ?), but then we will need more arguments for Operate to indicate what this has to really do, and m_xtree.c itself need to do some additions to e.g. produce some totals. So, one way or another, there is an (somewhat flexible) notion of addition which is needed. The current add/sub_data_fn are matching the current usage (xtree memory). As a follow-up, I intend to work on a 'syscall xtree' which e.g. might capture min/max (or maybe an histogram) of values. If needed, we might revisit the interface of the *_data_fn (e.g. I suspect we might need a destroy_data_fn if we want a more flexible/variable data set). What I can do in any case is to add some comments to indicate that the semantic of 'add' is quite flexible, and that e.g. a 'max' or 'min' can be computed as part of the an 'add'. (In reply to Philippe Waroquiers from comment #16) > > About the changes in the malloc > > wrappers, I think it would be better if the tool can register handlers, > > and these handlers then call e.g. VG_(XTMemory_Full_alloc). > I guess this could be done, but I think this would be less practical > than the current approach: the malloc replacement is today > calling a tool replacement function. Ah, I see. Then that's fine with me. > > About the XTree API, why do you need these add/sub variants? > > Why not just have a function to get a pointer to the value to be updated? > > Any reduction operation may be useful, such as min/max. > Yes, any operation can be done in the add/sub : these functions > get a pointer to the value being updated. I just wondered if the xtree API could work without having to pass add/sub_data_fn handlers to make it simpler. I see that the add handler is used for calculating the totals in the callgrind output. What about "reduce" instead of "add"? It's a reduction operation to be used for propagating the value up the call chains. You could do this before the callgrind output to get smaller files. If callgrind_annotate/kcachegrind will be able at some point to consume call chains, it has to be another format anyway. > As a follow-up, I intend to work on a 'syscall xtree' which e.g. might > capture min/max (or maybe an histogram) of values. Yes, statistics are interesting: min/max/variance/average and histograms. I always wanted to do that in callgrind... Some comments on the v3 patch: Mostly it looks fine. Below [1] some small comments. Larger comments: * I assume there are no regressions with correctness or performance with this, yes? Could you do a self-host Memcheck run at some point? * The large size of the patch concerns me a bit. I would be happier if it could be split into two parts: (1) that creates m_xtree.c and refactors Massif to use it, but does not change the user-visible functionality at all. So it is an implementation-only change, and (2) a patch that builds on (1), that supplies the new functionality. Is that possible, if it is not a lot of work? [1] Small comments: ---------------- +typedef + struct { Addr a; const HChar* sym_name; PtrdiffT offset; } + Sym_Name_CacheEnt; I prefer to have something more descriptive than just "a" for the first field. Since it is debug-info related, it would also be good to clarify whether this is an SVMA or an AVMA or something else, per comment at the top of m_debuginfo/debuginfo.c. I suspect it's a SVMA, in which case a good name would by "sym_svma". ---------------- +" --xtree-memory=none|allocs|full profile heap memory in an xtree [none]\n" +" and produces a report at the end of the execution\n" +" none: no profiling, allocs: current allocated\n" +" size/blocks, full: profile current and cumulative\n" +" allocated size/blocks and freed size/blocks.\n" +" --xtree-memory-file=<file> xtree memory report file [xtmemory.kcg.%%p]\n" This flag only has effect for tools that replace malloc, correct? Is it listed in the section "user options for Valgrind tools that replace malloc:" ? ---------------- + This file is part of Valgrind, a dynamic binary instrumentation + framework. + + Copyright (C) 2016-2016 Philippe Waroquiers For m_xtree.c, if there is a lot of code in there which has been moved from massif and/or callgrind, and is not much changed, I think it would be diplomatic to add a line or two explaining that the original authors were Nick and/or Josef. See the top of coregrind/m_wordfm.c for an example. ---------------- // growing such a block, but for consistency (it also simplifies things) we // ignore such reallocs as well. +// XTREE??? why can't we just consider that a realloc of an ignored +// alloc is just a new alloc (i.e. do not remove the old sz from the stats). and again later. The "XTREE???" is confusing -- I don't know what it signifies. Can you maybe write instead something like "PW Nov 2016, xtree work:"? I do that in comments from time to time. ---------------- Per previous comments in the bug re CamelCase vs snake_case, I really have no problem mixing them. I like to use camelcase, with a capital first letter for type names, and snake case when function names get long. But no fixed rules. (In reply to Julian Seward from comment #18) > + Copyright (C) 2016-2016 Philippe Waroquiers > > For m_xtree.c, if there is a lot of code in there which has been moved > from massif and/or callgrind, and is not much changed, I think it > would be diplomatic to add a line or two explaining that the original > authors were Nick and/or Josef. I assume the callgrind output code used the format spec, and no other code. So no point to add me :-) (In reply to Josef Weidendorfer from comment #17) > I just wondered if the xtree API could work without having to pass > add/sub_data_fn handlers to make it simpler. I see that the add > handler is used for calculating the totals in the callgrind output. > What about "reduce" instead of "add"? It's a reduction operation to > be used for propagating the value up the call chains. Yes, any kind of operation can in fact be done by add (or sub) functions, and so they effectively play the role of Reduce functions. For the moment, I have however kept the current add/sub approach, as at least currenly it is assumed the module will be mostly used to track consumption of some resources, which I think is more clearly modelled with add/sub. I have added some comments to explain that these add/sub have no specific constraints and so are equivalent to Reduce functions. The interface might be changed if we see many uses where a Reduce based terminology is more appropriate. Regarding comment 15 "(Perhaps separate the memory allocation stuff into different file)" : I have now separated this from xtree, in pub_tool_xtmemory.h/m_xtmemory.c. (In reply to Julian Seward from comment #18) > * I assume there are no regressions with correctness or performance > with this, yes? Could you do a self-host Memcheck run at some point? Massif functional behaviour is unchanged (with the exception of some inversion when 2 stacktraces have allocated the same amount: the old and new implementation do not output such 'equal consumers' in the same order. Massif performance is improved (in CPU and memory). For other tools (memcheck, helgrind) : if the xtree feature is not used, the impact is one additional 'if' condition in the malloc/free interception code. > > * The large size of the patch concerns me a bit. I would be happier > if it could be split into two parts: > > (1) that creates m_xtree.c and refactors Massif to use it, but > does not change the user-visible functionality at all. So > it is an implementation-only change, and > > (2) a patch that builds on (1), that supplies the new functionality. > > Is that possible, if it is not a lot of work? Ok, I will (try to) commit in some (smaller) pieces (which will I hope give a buildable V at each commit). > > > [1] Small comments: > > ---------------- > > +typedef > + struct { Addr a; const HChar* sym_name; PtrdiffT offset; } > + Sym_Name_CacheEnt; > > I prefer to have something more descriptive than just "a" for the > first field. Renamed to sym_avma, as suggested/ > > ---------------- > > +" --xtree-memory=none|allocs|full profile heap memory in an xtree > [none]\n" > +" and produces a report at the end of the > execution\n" > +" none: no profiling, allocs: current > allocated\n" > +" size/blocks, full: profile current and > cumulative\n" > +" allocated size/blocks and freed > size/blocks.\n" > +" --xtree-memory-file=<file> xtree memory report file > [xtmemory.kcg.%%p]\n" > > This flag only has effect for tools that replace malloc, correct? Is > it listed in the section "user options for Valgrind tools that replace > malloc:" ? Yes, it is listed there in -help output, and documented in the user manual around malloc/free related arguments. > > ---------------- > > + This file is part of Valgrind, a dynamic binary instrumentation > + framework. > + > + Copyright (C) 2016-2016 Philippe Waroquiers > > For m_xtree.c, if there is a lot of code in there which has been moved > from massif and/or callgrind, and is not much changed, I think it > would be diplomatic to add a line or two explaining that the original > authors were Nick and/or Josef. See the top of coregrind/m_wordfm.c for > an example. m_xtree.c is a completely new implementation of the Xtree (I have recuperated a few lines for the production of the massif header). In any case, I have added a paragraph to mention that the xtree initial idea was in massif developed by Nick. > > ---------------- > > // growing such a block, but for consistency (it also simplifies things) we > // ignore such reallocs as well. > +// XTREE??? why can't we just consider that a realloc of an ignored > +// alloc is just a new alloc (i.e. do not remove the old sz from the stats). > > and again later. The "XTREE???" is confusing -- I don't know what it > signifies. Can you maybe write instead something like "PW Nov 2016, > xtree work:"? I do that in comments from time to time. Updated as suggested. > > ---------------- > > Per previous comments in the bug re CamelCase vs snake_case, I really > have no problem mixing them. I like to use camelcase, with a capital > first letter for type names, and snake case when function names get > long. But no fixed rules. Yes, there is no fixed rule. The comment of Ivo was in any case handled, as there was really too much name variation in the API. Can this be closed now? In the NEWS file it is listed as being closed. |