Bug 495816 - s390x: disassembler segfault for C[G]RT and CL[G]RT
Summary: s390x: disassembler segfault for C[G]RT and CL[G]RT
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: vex (show other bugs)
Version: unspecified
Platform: Other Linux
: NOR crash
Target Milestone: ---
Assignee: Florian Krohm
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-11-05 07:55 UTC by Florian Krohm
Modified: 2025-03-05 22:38 UTC (History)
0 users

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
patch as described in comment #1 (11.55 KB, patch)
2024-11-13 17:13 UTC, Florian Krohm
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Florian Krohm 2024-11-05 07:55:51 UTC
For example:

==3731031==    at 0x8000FF0A8: vex_strlen (main_util.c:268)
==3731031==    by 0x8001A68F9: cab_operand (s390_disasm.c:121)
==3731031==    by 0x8001A78B7: s390_disasm (s390_disasm.c:416)
==3731031==    by 0x800120EBD: s390_format_RRF_U0RR (guest_s390_toIR.c:3112)
==3731031==    by 0x800172A7B: s390_decode_4byte_and_irgen (guest_s390_toIR.c:20649)
==3731031==    by 0x800176BCD: s390_decode_and_irgen (guest_s390_toIR.c:22944)
==3731031==    by 0x80017E5FD: disInstr_S390 (guest_s390_toIR.c:23047)
Comment 1 Florian Krohm 2024-11-13 17:12:11 UTC
This is the story:

The IR for C[G]RT and CL[G]RT is constructed with the help of function
s390_format_RRF_U0RR passing S390_XMNM_CAB to the xmnm_kind parameter.

The disassembly command is encoded like so:

     if (UNLIKELY(vex_traceflags & VEX_TRACE_FE))
        s390_disasm(ENC3(XMNM, GPR, GPR), xmnm_kind, m3, r1, r2);

When decoding that command in function s390_disasm and upon encountering an
S390_XMNM_CAB argument this happens:

     case S390_XMNM_CAB:
        mnm  = va_arg(args, HChar *);
        mask = va_arg(args, UInt);
        p  += vex_sprintf(p, "%s", mnemonic(cab_operand(mnm, mask)));

But the expected string is not provided. Instead m3 is interpreted as a
string which then causes the segfault.

Now, we cannot simply fix this by passing the mnemonic to s390_disasm
because s390_format_RRF_U0RR is also used to build IR for LOCR, LOCGR,
and LOCFHR passing e.g. S390_XMNM_LOCR to xmnm_kind. That would break:

     case S390_XMNM_LOCR:
        mask = va_arg(args, UInt);
        mnm  = cls_operand(kind, mask);
        p  += vex_sprintf(p, "%s", mnemonic(mnm));

It would also be wrong to split s390_format_RRF_U0RR into two functions by
introducing, say, s390_format_RRF_U0RR_cat to handle the "compare and trap"
insns and s390_format_RRF_U0RR_loc to handle the various "load on condition"
insns. The reason is that we want the opcode format names in
s390_format_XXXXX to match those specified in <binutils>/opcodes/s390-opc.txt

Looking at those snippets.... Why are they not symmetric? Meaning, why isn't
there a mnemonic passed for the "load on condition" opcodes? After all both
"compare and trap" and "load on condition" are opcodes with extended
mnemonics and need a base mnemonic to build the extended ones from.
The answer is that function cls_operand provides those base mnemonics.

     static const HChar *
     cls_operand(Int kind, UInt mask)
     {
        const HChar *prefix;

        switch (kind) {
        case S390_XMNM_LOCR:   prefix = "locr";  break;
        case S390_XMNM_LOCGR:  prefix = "locgr"; break;
        ......

That is conceptually wrong. Mnemonics should be passed down to s390_disasm
which (ideally)  should have no knowledge of them. This is mostly so with
exceptions:

     case 0xb9f2: s390_format_RRF_U0RR(s390_irgen_LOCR, RRF3_r3(ovl),
                                       RRF3_r1(ovl), RRF3_r2(ovl), S390_XMNM_LOCR);  goto ok;

     static const HChar *
     s390_irgen_LOCR(UChar m3, UChar r1, UChar r2)
     {
        next_insn_if(binop(Iop_CmpEQ32, s390_call_calculate_cond(m3), mkU32(0)));
        put_gpr_w1(r1, get_gpr_w1(r2));

        return "locr";       //  base mnemonic returned (as it should be)
     }

     static void
     s390_format_RRF_U0RR(const HChar *(*irgen)(UChar m3, UChar r1, UChar r2),
                           UChar m3, UChar r1, UChar r2, Int xmnm_kind)
     {
        irgen(m3, r1, r2);  // base mnemonic IGNORED

        if (UNLIKELY(vex_traceflags & VEX_TRACE_FE))
           s390_disasm(ENC3(XMNM, GPR, GPR), xmnm_kind, m3, r1, r2);
     }

So the proper fix is to 
(1) introduce S390_XMNM_CLS,
(2) pass down the mnemonic to s390_disasm, and
(3) change function cls_operand to be symmetric with cab_operand by also taking in the base mnemonic.
Comment 2 Florian Krohm 2024-11-13 17:13:53 UTC
Created attachment 175780 [details]
patch as described in comment #1
Comment 3 Florian Krohm 2025-03-05 22:38:13 UTC
This bug was fixed in 1e694434a5cd2a0352e97f872ebd6922129c0282.