For example: ==3731031== at 0x8000FF0A8: vex_strlen (main_util.c:268) ==3731031== by 0x8001A68F9: cab_operand (s390_disasm.c:121) ==3731031== by 0x8001A78B7: s390_disasm (s390_disasm.c:416) ==3731031== by 0x800120EBD: s390_format_RRF_U0RR (guest_s390_toIR.c:3112) ==3731031== by 0x800172A7B: s390_decode_4byte_and_irgen (guest_s390_toIR.c:20649) ==3731031== by 0x800176BCD: s390_decode_and_irgen (guest_s390_toIR.c:22944) ==3731031== by 0x80017E5FD: disInstr_S390 (guest_s390_toIR.c:23047)
This is the story: The IR for C[G]RT and CL[G]RT is constructed with the help of function s390_format_RRF_U0RR passing S390_XMNM_CAB to the xmnm_kind parameter. The disassembly command is encoded like so: if (UNLIKELY(vex_traceflags & VEX_TRACE_FE)) s390_disasm(ENC3(XMNM, GPR, GPR), xmnm_kind, m3, r1, r2); When decoding that command in function s390_disasm and upon encountering an S390_XMNM_CAB argument this happens: case S390_XMNM_CAB: mnm = va_arg(args, HChar *); mask = va_arg(args, UInt); p += vex_sprintf(p, "%s", mnemonic(cab_operand(mnm, mask))); But the expected string is not provided. Instead m3 is interpreted as a string which then causes the segfault. Now, we cannot simply fix this by passing the mnemonic to s390_disasm because s390_format_RRF_U0RR is also used to build IR for LOCR, LOCGR, and LOCFHR passing e.g. S390_XMNM_LOCR to xmnm_kind. That would break: case S390_XMNM_LOCR: mask = va_arg(args, UInt); mnm = cls_operand(kind, mask); p += vex_sprintf(p, "%s", mnemonic(mnm)); It would also be wrong to split s390_format_RRF_U0RR into two functions by introducing, say, s390_format_RRF_U0RR_cat to handle the "compare and trap" insns and s390_format_RRF_U0RR_loc to handle the various "load on condition" insns. The reason is that we want the opcode format names in s390_format_XXXXX to match those specified in <binutils>/opcodes/s390-opc.txt Looking at those snippets.... Why are they not symmetric? Meaning, why isn't there a mnemonic passed for the "load on condition" opcodes? After all both "compare and trap" and "load on condition" are opcodes with extended mnemonics and need a base mnemonic to build the extended ones from. The answer is that function cls_operand provides those base mnemonics. static const HChar * cls_operand(Int kind, UInt mask) { const HChar *prefix; switch (kind) { case S390_XMNM_LOCR: prefix = "locr"; break; case S390_XMNM_LOCGR: prefix = "locgr"; break; ...... That is conceptually wrong. Mnemonics should be passed down to s390_disasm which (ideally) should have no knowledge of them. This is mostly so with exceptions: case 0xb9f2: s390_format_RRF_U0RR(s390_irgen_LOCR, RRF3_r3(ovl), RRF3_r1(ovl), RRF3_r2(ovl), S390_XMNM_LOCR); goto ok; static const HChar * s390_irgen_LOCR(UChar m3, UChar r1, UChar r2) { next_insn_if(binop(Iop_CmpEQ32, s390_call_calculate_cond(m3), mkU32(0))); put_gpr_w1(r1, get_gpr_w1(r2)); return "locr"; // base mnemonic returned (as it should be) } static void s390_format_RRF_U0RR(const HChar *(*irgen)(UChar m3, UChar r1, UChar r2), UChar m3, UChar r1, UChar r2, Int xmnm_kind) { irgen(m3, r1, r2); // base mnemonic IGNORED if (UNLIKELY(vex_traceflags & VEX_TRACE_FE)) s390_disasm(ENC3(XMNM, GPR, GPR), xmnm_kind, m3, r1, r2); } So the proper fix is to (1) introduce S390_XMNM_CLS, (2) pass down the mnemonic to s390_disasm, and (3) change function cls_operand to be symmetric with cab_operand by also taking in the base mnemonic.
Created attachment 175780 [details] patch as described in comment #1
This bug was fixed in 1e694434a5cd2a0352e97f872ebd6922129c0282.