Bug 191189

Summary: --xml=yes should obey --gen-suppressions=all
Product: [Developer tools] valgrind Reporter: Dan Kegel <dank>
Component: generalAssignee: Julian Seward <jseward>
Status: RESOLVED FIXED    
Severity: wishlist CC: ebaron, jruderman, njn, timurrrr, tom
Priority: NOR    
Version: unspecified   
Target Milestone: blocking3.5.0   
Platform: Compiled Sources   
OS: Linux   
Latest Commit: Version Fixed In:
Sentry Crash Report:

Description Dan Kegel 2009-05-01 02:33:00 UTC
Version:           svn (using Devel)
OS:                Linux
Installed from:    Compiled sources

Valgrinding the chromium test suite takes a long time,
and it would be nice if we got suppressions out
of each run without having to rerun part of it.
We use the xml output, which AFAIK doesn't contain
generated suppressions, even when the --gen-suppressions=all
flag is given.
Comment 1 Nicholas Nethercote 2009-05-01 02:54:39 UTC
We got a bug report (#191067) only two days ago complaining that error messages produced by the suppression file reader were not in XML when --xml=yes is specified, and that this could crash an XML parser.  Ie. that XML and plaintext were being intermingled.  And now you want us to intermingle XML and plaintext when --gen-suppressions=all is specified.

Maybe we could solve bot problems by adding a new tag to the Valgrind XML format that says "what follows is plaintext best consumed by a human".

With that in place, maybe we could do XML output in a less ad hoc manner;  currently it's very ad hoc, there's lots of "if (VG_(clo_xml)) ..." tests sprinkled throughout.  Maybe every call to VG_(message) and similar functions should be required to have a normal form and an XML form.
Comment 2 Dan Kegel 2009-05-01 03:17:10 UTC
But then the parsing of the xml file would be ad hoc. 
No need to stray from xml orthodoxy; just create a 
<suppression> record whose content is a multiline
next field containing the suppression record.
Comment 3 Dan Kegel 2009-05-01 03:17:48 UTC
er, multiline *text* field.
Comment 4 Julian Seward 2009-05-11 00:35:03 UTC
I agree with Nick (comment #1) that we can't just let any old bits
of text be emitted into the XML, even if wrapped up in tags.  This
would make it nigh impossible to nail down the grammar, since it
would surely allow arbitrary bits of text (in tags) to appear anywhere
in the output.

A more general solution is now available in branches/MESSAGING_TIDYUP
(r9829 and later).  This causes XML to be emitted on its own fd
independent of the other text, does not inhibit any of the normal
text output when --xml=yes is specified, and allows independent 
specification of the destination for the normal text and XML output
streams.

For example, I can do

./vg-in-place -q --gen-suppressions=all \
              --log-file=output.txt \
              --xml-file=output.xml \
              --xml=yes memcheck/tests/origin5-bz2

with the result that the suppressions end up in output.txt and the
XML in output.xml.
Comment 5 Julian Seward 2009-05-11 00:40:25 UTC
*** Bug 191067 has been marked as a duplicate of this bug. ***
Comment 6 Dan Kegel 2009-05-11 01:08:15 UTC
I was hoping that my web front end for viewing valgrind results
could have a checkbox for 'show suppressions'.
The change Julian offers does not make that easy or even possible.

I'm getting the feeling you folks are losing touch with your users
when it comes to the xml file...
Comment 7 Dan Kegel 2009-05-11 01:10:46 UTC
To be more specific, I wanted to have an icon next to each error;
clicking on the icon would show the suppression for that error.
This would be easy to implement if the suppression was made
part of the XML file, in the same record as the error itself.
Comment 8 Tom Hughes 2009-05-11 01:27:11 UTC
I think you're being a bit unfair Dan. What Julian said was that he didn't think it was reasonable to let arbitrary text through into the XML output, even if surrounded by some sort of <extraoutput> type tag.

He didn't say anything about whether specific things like suppressions should be included as either structured or unstructured XML of some sort.
Comment 9 Dan Kegel 2009-05-11 01:31:01 UTC
Perhaps.  I am awful cranky today.

In any case, a workaround for this problem might
be to use --demangle=no, generate suppressions
manually, then use c++filt on each error to 
demangle the stack manually.  
That lets you get both mangled suppressions and 
unmangled errors in the same record parsed out of XML
without waiting for Valgrind to enhance its XML format
to encode suppressions.
Comment 10 Nicholas Nethercote 2009-05-11 01:46:34 UTC
Basically, Dan is asking for suppressions to be emitted in XML form.  This wasn't very clear from the original report;  at least, I didn't understand it, as comments #1 and #2 show.

Julian, in comment #4, wasn't addressing the original (unclear) request, but the comment I made in #1.

I think marking #191067 as a duplicate of this bug was a mistake.  Julian's change on the branch is addressing #191067 and my comment #1, but not the original request of this bug.
Comment 11 Dan Kegel 2009-05-11 02:35:27 UTC
Yes, apologies for being unclear.  I agree, it's probably not a dup.
Comment 12 Julian Seward 2009-05-11 09:36:50 UTC
Re #4, this change was done precisely to address complaints from
users who read the XML, who are fed up -- and rightly so -- with 
their parsers breaking every time V unexpectedly emits some 
random-ass bit of text into the XML stream.

Anyway.  Having the suppressions emitted in XML form along with
the errors seems reasonable enough (although, in general, copying
the generated suppressions verbatim into .supp files isn't a great
idea, as they are quite often too specific, and need generalising 
by hand.)

We'll have to bump the protocol version number (to 4) so that existing
parsers know the format has changed.  That's no big deal.  I'll hack
something up later today.
Comment 13 Julian Seward 2009-05-13 10:33:26 UTC
Re printing suppressions in XML form, try branches/MESSAGING_TIDYUP
r9837 (but note this is still a work in progress).
Comment 14 Jesse Ruderman 2009-06-25 03:59:49 UTC
I want the suppressions to be in the XML file (e.g. a <suppression> inside each <error>) so I don't have to worry about which suppression goes with which error.  But I don't want them "in XML form" like <stack> is, because the suppressions file isn't in that form.  Instead, the suppression text should be a (properly-escaped) large text node in a <suppression> element.

It's probably best to use a <![CDATA[ ... ]]> section, since that requires less escaping.  That way, if someone cuts and pastes from the XML file rather than using an XML parser, they won't have to deal with unescaping too much.

I don't understand the concern about "nailing down the grammar".  As long as it's valid XML, users should be using an XML parser and ignoring the contents of elements they don't care about.
Comment 15 Dan Kegel 2009-06-25 04:41:25 UTC
I finally took a look at the MESSAGING_TIDYUP branch. It outputs suppressions
in the xml file in the format

  <suppression>
    <sname>insert_a_suppression_name_here</sname>
    <skind>Memcheck:Leak</skind>
    <sframe> <fun>malloc</fun> </sframe>
    <sframe> <fun>nss_parse_service_list</fun> </sframe>
    <sframe> <fun>__nss_database_lookup</fun> </sframe>
    <sframe> <obj>*</obj> </sframe>
    <sframe> <obj>*</obj> </sframe>
    <sframe> <fun>getpwuid_r@@GLIBC_2.2.5</fun> </sframe>
    <sframe> <fun>getpwuid</fun> </sframe>
    <sframe> <obj>/usr/bin/vim.tiny</obj> </sframe>
    <sframe> <obj>/usr/bin/vim.tiny</obj> </sframe>
    <sframe> <obj>/usr/bin/vim.tiny</obj> </sframe>
    <sframe> <obj>/usr/bin/vim.tiny</obj> </sframe>
    <sframe> <obj>/usr/bin/vim.tiny</obj> </sframe>
  </suppression>

That seems awfully complicated to me, especially seeing as what
we do with it is just strip off the XML so we can use it.

Like Jesse, I would prefer a CDATA block containing exactly what is supposed
to go in the suppressions file.   Or are you going to switch that to XML
too?  (I hope not.)
Comment 16 Nicholas Nethercote 2009-06-25 05:04:41 UTC
(In reply to comment #15)
>
> That seems awfully complicated to me, especially seeing as what
> we do with it is just strip off the XML so we can use it.
> 
> Like Jesse, I would prefer a CDATA block containing exactly what is supposed
> to go in the suppressions file.   Or are you going to switch that to XML
> too?  (I hope not.)

I agree that is probably more useful.  I suspect Julian did the full XML thing because (again) what you wanted wasn't clear.  (Seems to be an ongoing problem with this bug :)
Comment 17 Nicholas Nethercote 2009-07-21 06:58:21 UTC
Julian, what's the status of this now that the messaging branch has been merged with the trunk?
Comment 18 Julian Seward 2009-08-15 23:36:26 UTC
(In reply to comments #14, #15)
> It's probably best to use a <![CDATA[ ... ]]> section, since that requires

Right.  I've got a patch which produces the output shown below.  Doing
this was a massive pita because it involved largely refactoring the code
for generating suppressions, but anyway, it's done.

  <suppression>
    <sname>insert_a_suppression_name_here</sname>
    <skind>Memcheck:Addr1</skind>
    <sframe> <fun>ddd</fun> </sframe>
    <sframe> <fun>ccc</fun> </sframe>
    <sframe> <fun>bbb</fun> </sframe>
    <sframe> <fun>aaa</fun> </sframe>
    <sframe> <fun>main</fun> </sframe>
    <rawtext>
<![CDATA[
{
   <insert a suppression name here>
   Memcheck:Addr1
   fun:ddd
   fun:ccc
   fun:bbb
   fun:aaa
   fun:main
}
]]>
    </rawtext>
  </suppression>
Comment 19 Dan Kegel 2009-08-15 23:39:18 UTC
Just what the doctor ordered.   Looking forward to seeing
that patch; hopefully I can throw away the python code
I was using to simulate this.
Comment 20 Julian Seward 2009-08-16 00:43:23 UTC
(In reply to comment #18)
> Right.  I've got a patch which produces the output shown below.

Committed, r10822, but note the following:

This still isn't 100% right in the sense that the CDATA block data
needs to be split across multiple blocks if it should ever contain the
CDATA end mark "]]>".  The Protocol 4 spec has this right even though
the implementation currently doesn't.
Comment 21 Julian Seward 2009-08-16 02:49:31 UTC
10822 contains stupid bug; you need >= 10825.

Closing -- reopen ASAP if obvious borkage appears.