Summary: | --xml=yes should obey --gen-suppressions=all | ||
---|---|---|---|
Product: | [Developer tools] valgrind | Reporter: | Dan Kegel <dank> |
Component: | general | Assignee: | Julian Seward <jseward> |
Status: | RESOLVED FIXED | ||
Severity: | wishlist | CC: | ebaron, jruderman, njn, timurrrr, tom |
Priority: | NOR | ||
Version: | unspecified | ||
Target Milestone: | blocking3.5.0 | ||
Platform: | Compiled Sources | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: |
Description
Dan Kegel
2009-05-01 02:33:00 UTC
We got a bug report (#191067) only two days ago complaining that error messages produced by the suppression file reader were not in XML when --xml=yes is specified, and that this could crash an XML parser. Ie. that XML and plaintext were being intermingled. And now you want us to intermingle XML and plaintext when --gen-suppressions=all is specified. Maybe we could solve bot problems by adding a new tag to the Valgrind XML format that says "what follows is plaintext best consumed by a human". With that in place, maybe we could do XML output in a less ad hoc manner; currently it's very ad hoc, there's lots of "if (VG_(clo_xml)) ..." tests sprinkled throughout. Maybe every call to VG_(message) and similar functions should be required to have a normal form and an XML form. But then the parsing of the xml file would be ad hoc. No need to stray from xml orthodoxy; just create a <suppression> record whose content is a multiline next field containing the suppression record. er, multiline *text* field. I agree with Nick (comment #1) that we can't just let any old bits of text be emitted into the XML, even if wrapped up in tags. This would make it nigh impossible to nail down the grammar, since it would surely allow arbitrary bits of text (in tags) to appear anywhere in the output. A more general solution is now available in branches/MESSAGING_TIDYUP (r9829 and later). This causes XML to be emitted on its own fd independent of the other text, does not inhibit any of the normal text output when --xml=yes is specified, and allows independent specification of the destination for the normal text and XML output streams. For example, I can do ./vg-in-place -q --gen-suppressions=all \ --log-file=output.txt \ --xml-file=output.xml \ --xml=yes memcheck/tests/origin5-bz2 with the result that the suppressions end up in output.txt and the XML in output.xml. *** Bug 191067 has been marked as a duplicate of this bug. *** I was hoping that my web front end for viewing valgrind results could have a checkbox for 'show suppressions'. The change Julian offers does not make that easy or even possible. I'm getting the feeling you folks are losing touch with your users when it comes to the xml file... To be more specific, I wanted to have an icon next to each error; clicking on the icon would show the suppression for that error. This would be easy to implement if the suppression was made part of the XML file, in the same record as the error itself. I think you're being a bit unfair Dan. What Julian said was that he didn't think it was reasonable to let arbitrary text through into the XML output, even if surrounded by some sort of <extraoutput> type tag. He didn't say anything about whether specific things like suppressions should be included as either structured or unstructured XML of some sort. Perhaps. I am awful cranky today. In any case, a workaround for this problem might be to use --demangle=no, generate suppressions manually, then use c++filt on each error to demangle the stack manually. That lets you get both mangled suppressions and unmangled errors in the same record parsed out of XML without waiting for Valgrind to enhance its XML format to encode suppressions. Basically, Dan is asking for suppressions to be emitted in XML form. This wasn't very clear from the original report; at least, I didn't understand it, as comments #1 and #2 show. Julian, in comment #4, wasn't addressing the original (unclear) request, but the comment I made in #1. I think marking #191067 as a duplicate of this bug was a mistake. Julian's change on the branch is addressing #191067 and my comment #1, but not the original request of this bug. Yes, apologies for being unclear. I agree, it's probably not a dup. Re #4, this change was done precisely to address complaints from users who read the XML, who are fed up -- and rightly so -- with their parsers breaking every time V unexpectedly emits some random-ass bit of text into the XML stream. Anyway. Having the suppressions emitted in XML form along with the errors seems reasonable enough (although, in general, copying the generated suppressions verbatim into .supp files isn't a great idea, as they are quite often too specific, and need generalising by hand.) We'll have to bump the protocol version number (to 4) so that existing parsers know the format has changed. That's no big deal. I'll hack something up later today. Re printing suppressions in XML form, try branches/MESSAGING_TIDYUP r9837 (but note this is still a work in progress). I want the suppressions to be in the XML file (e.g. a <suppression> inside each <error>) so I don't have to worry about which suppression goes with which error. But I don't want them "in XML form" like <stack> is, because the suppressions file isn't in that form. Instead, the suppression text should be a (properly-escaped) large text node in a <suppression> element. It's probably best to use a <![CDATA[ ... ]]> section, since that requires less escaping. That way, if someone cuts and pastes from the XML file rather than using an XML parser, they won't have to deal with unescaping too much. I don't understand the concern about "nailing down the grammar". As long as it's valid XML, users should be using an XML parser and ignoring the contents of elements they don't care about. I finally took a look at the MESSAGING_TIDYUP branch. It outputs suppressions in the xml file in the format <suppression> <sname>insert_a_suppression_name_here</sname> <skind>Memcheck:Leak</skind> <sframe> <fun>malloc</fun> </sframe> <sframe> <fun>nss_parse_service_list</fun> </sframe> <sframe> <fun>__nss_database_lookup</fun> </sframe> <sframe> <obj>*</obj> </sframe> <sframe> <obj>*</obj> </sframe> <sframe> <fun>getpwuid_r@@GLIBC_2.2.5</fun> </sframe> <sframe> <fun>getpwuid</fun> </sframe> <sframe> <obj>/usr/bin/vim.tiny</obj> </sframe> <sframe> <obj>/usr/bin/vim.tiny</obj> </sframe> <sframe> <obj>/usr/bin/vim.tiny</obj> </sframe> <sframe> <obj>/usr/bin/vim.tiny</obj> </sframe> <sframe> <obj>/usr/bin/vim.tiny</obj> </sframe> </suppression> That seems awfully complicated to me, especially seeing as what we do with it is just strip off the XML so we can use it. Like Jesse, I would prefer a CDATA block containing exactly what is supposed to go in the suppressions file. Or are you going to switch that to XML too? (I hope not.) (In reply to comment #15) > > That seems awfully complicated to me, especially seeing as what > we do with it is just strip off the XML so we can use it. > > Like Jesse, I would prefer a CDATA block containing exactly what is supposed > to go in the suppressions file. Or are you going to switch that to XML > too? (I hope not.) I agree that is probably more useful. I suspect Julian did the full XML thing because (again) what you wanted wasn't clear. (Seems to be an ongoing problem with this bug :) Julian, what's the status of this now that the messaging branch has been merged with the trunk? (In reply to comments #14, #15) > It's probably best to use a <![CDATA[ ... ]]> section, since that requires Right. I've got a patch which produces the output shown below. Doing this was a massive pita because it involved largely refactoring the code for generating suppressions, but anyway, it's done. <suppression> <sname>insert_a_suppression_name_here</sname> <skind>Memcheck:Addr1</skind> <sframe> <fun>ddd</fun> </sframe> <sframe> <fun>ccc</fun> </sframe> <sframe> <fun>bbb</fun> </sframe> <sframe> <fun>aaa</fun> </sframe> <sframe> <fun>main</fun> </sframe> <rawtext> <![CDATA[ { <insert a suppression name here> Memcheck:Addr1 fun:ddd fun:ccc fun:bbb fun:aaa fun:main } ]]> </rawtext> </suppression> Just what the doctor ordered. Looking forward to seeing that patch; hopefully I can throw away the python code I was using to simulate this. (In reply to comment #18) > Right. I've got a patch which produces the output shown below. Committed, r10822, but note the following: This still isn't 100% right in the sense that the CDATA block data needs to be split across multiple blocks if it should ever contain the CDATA end mark "]]>". The Protocol 4 spec has this right even though the implementation currently doesn't. 10822 contains stupid bug; you need >= 10825. Closing -- reopen ASAP if obvious borkage appears. |