Bug 309425 - Provide a --sigill-diagnostics flag to suppress illegal instruction reporting
Summary: Provide a --sigill-diagnostics flag to suppress illegal instruction reporting
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general (show other bugs)
Version: 3.9.0.SVN
Platform: unspecified Linux
: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-11-02 14:39 UTC by Mark Wielaard
Modified: 2012-12-06 18:09 UTC (History)
1 user (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments
VEX part of sigill_diag patch (9.45 KB, patch)
2012-11-02 14:41 UTC, Mark Wielaard
Details
coregrind and doc part of sigill_diag patch (5.69 KB, patch)
2012-11-02 14:43 UTC, Mark Wielaard
Details
VEX part of sigill_diag patch (24.05 KB, patch)
2012-11-04 16:28 UTC, Mark Wielaard
Details
coregrind, doc and help part of sigill_diag patch (6.08 KB, patch)
2012-11-04 16:30 UTC, Mark Wielaard
Details
coregrind, doc, help and test part of sigill_diag patch (7.31 KB, patch)
2012-11-04 16:42 UTC, Mark Wielaard
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Mark Wielaard 2012-11-02 14:39:29 UTC
This is a followup to https://bugs.kde.org/show_bug.cgi?id=308573 where some illegal (ppc) instructions were recognized/decoded, but not translated (because they are invalid in some modes). The original trigger for that bug report was that there are programs which do explicitly try some instructions to figure out cpu features (they install a SIGILL handler). There was a proposal in that patch to make "decoded/known, but not translated instructions, less verbose (just generate the SIGILL signal and be done with it). But after some discussion on irc this seemed not ideal. Neither is detecting whether a program installs a SIGILL handler, since there are programs that do that for crash detection, but don't actually "gracefully" handle the illegal instruction so would like to know about it (it could also be valgrind really not properly handling a valid instruction).

So since there doesn't seem to be a very good default policy/trick a new flag seems most appropriate. The patches to VEX and coregrind implement this:

       --sigill-diagnostics=<yes|no> [default: yes]
           Enable/disable printing of illegal instruction diagnostics. Enabled
           by default, but defaults to disabled when --quiet is given. The
           default can always be explicitly overridden by given this option.

           When enabled a warning message will be printed with some
           diagnostics whenever some instruction is encountered that valgrind
           cannot decode or translate before the program is given a SIGILL
           signal. Often an illegal instruction indicates a bug in the program
           or missing support for the particular instruction in Valgrind. But
           some programs do deliberately try to execute an instruction that
           might be missing and trap the SIGILL signal to detect processor
           features.

Reproducible: Always




Note that even with --sigill-diagnostics=no a program that doesn't install its own SIGILL handler will still make valgrind produce:

==30229== Process terminating with default action of signal 4 (SIGILL)
==30229==  Illegal opcode at address 0x4006A1
==30229==    at 0x4006A1: xgetbv (ill.c:55)
==30229==    by 0x400AE3: main (ill.c:181)

Even in --quiet mode. It just won't print the whole VEX/coregrind banner:

vex amd64->IR: unhandled instruction bytes: 0xF 0x1 0xD0 0x48 0x89 0xC3 0x48 0x89
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F
vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
==30347== valgrind: Unrecognised instruction at address 0x4006a1.
==30347==    at 0x4006A1: xgetbv (ill.c:55)
==30347==    by 0x400AE3: main (ill.c:181)
==30347== Your program just tried to execute an instruction that Valgrind
==30347== did not recognise.  There are two possible reasons for this.
==30347== 1. Your program has a bug and erroneously jumped to a non-code
==30347==    location.  If you are running Memcheck and you just saw a
==30347==    warning about a bad jump, it's probably your program's fault.
==30347== 2. The instruction is legitimate but Valgrind doesn't handle it,
==30347==    i.e. it's Valgrind's fault.  If you think this is the case or
==30347==    you are not sure, please let us know and we'll try to fix it.
==30347== Either way, Valgrind will now raise a SIGILL signal which will
==30347== probably kill your program.
Comment 1 Mark Wielaard 2012-11-02 14:41:48 UTC
Created attachment 74934 [details]
VEX part of sigill_diag patch
Comment 2 Mark Wielaard 2012-11-02 14:43:20 UTC
Created attachment 74935 [details]
coregrind and doc part of sigill_diag patch
Comment 3 Mark Wielaard 2012-11-02 15:37:52 UTC
Quick review on irc by sewardj, about the global variable .. Bool vex_sigill_diag = True;
- global vars .. if we ever go MT one day, they will become a major pira
- basically this extra bool needs to passed from the central coordination department (main_main.c) to the front ends (xx_toIR.c)
- this bool you could get it passed through to the front ends without a global var, at the cost of tiresome plumbing to do more parameter passing through bb_to_IR (or whatever its called)

Will adapt the patch to do that.
Comment 4 Philippe Waroquiers 2012-11-02 17:16:14 UTC
(In reply to comment #3)
Also, would be nice to add the new option in the --help output
(probably in the uncommon user options)
Comment 5 Mark Wielaard 2012-11-04 16:28:59 UTC
Created attachment 75002 [details]
VEX part of sigill_diag patch

Updated version that passes through the sigill_diag Bool to the front ends without using a global var as suggested in comment #3.
Comment 6 Mark Wielaard 2012-11-04 16:30:28 UTC
Created attachment 75003 [details]
coregrind, doc and help part of sigill_diag patch

Updated patch that includes help text as suggested in comment #4.
Comment 7 Mark Wielaard 2012-11-04 16:42:49 UTC
Created attachment 75004 [details]
coregrind, doc, help and test part of sigill_diag patch

Even more updated patch that adds the correct exp for cmdline1 and cmdline2 tests for the new --help text.
Comment 8 Julian Seward 2012-12-06 18:09:42 UTC
Committed, r13164, r2582.  Thanks.