valgrind --tool=none /usr/bin/python3 test.py
==27866== Nulgrind, the minimal Valgrind tool
==27866== Copyright (C) 2002-2017, and GNU GPL'd, by Nicholas Nethercote.
==27866== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==27866== Command: /usr/bin/python3 test.py
==27866==

vex: the `impossible' happened:
   s390_insn_store_emit: unknown dst->tag for HRcVec128
vex storage: T total 555804192 bytes allocated
vex storage: P total 0 bytes allocated

valgrind: the 'impossible' happened:
   LibVEX called failure_exit().

host stacktrace:
==27866==    at 0x800032C8C: show_sched_status_wrk (m_libcassert.c:388)
==27866==    by 0x800032E75: report_and_quit (m_libcassert.c:459)
==27866==    by 0x800033077: vgPlain_core_panic_at (m_libcassert.c:535)
==27866==    by 0x80003309D: vgPlain_core_panic (m_libcassert.c:545)
==27866==    by 0x80004C495: failure_exit (m_translate.c:751)
==27866==    by 0x80010BF79: vpanic (main_util.c:255)
==27866==    by 0x80018817B: s390_insn_store_emit (host_s390_defs.c:8388)
==27866==    by 0x800199483: emit_S390Instr (host_s390_defs.c:11340)
==27866==    by 0x800108F93: LibVEX_Translate (main_main.c:1125)
==27866==    by 0x80004EDC5: vgPlain_translate (m_translate.c:1813)
==27866==    by 0x80001A57D: handle_chain_me (scheduler.c:1167)
==27866==    by 0x80001D515: vgPlain_scheduler (scheduler.c:1516)
==27866==    by 0x800091105: run_a_thread_NORETURN (syswrap-linux.c:103)
==27866==    by 0xFFFFFFFFFFFFFFFF: ???
sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable (lwpid 27866)
==27866==    at 0x4B713F2: PyParser_New (in /usr/lib64/libpython3.6m.so.1.0)
==27866==    by 0x4B5D6F3: PyParser_ParseFileObject (in /usr/lib64/libpython3.6m.so.1.0)
==27866==    by 0x4B5F471: PyParser_ASTFromFileObject (in /usr/lib64/libpython3.6m.so.1.0)
==27866==    by 0x4B5FC1D: PyRun_FileExFlags (in /usr/lib64/libpython3.6m.so.1.0)
==27866==    by 0x4B630EF: PyRun_SimpleFileExFlags (in /usr/lib64/libpython3.6m.so.1.0)
==27866==    by 0x4D0B409: Py_Main (in /usr/lib64/libpython3.6m.so.1.0)
==27866==    by 0x108CBB: main (in /usr/libexec/platform-python3.6)
client stack range: [0x1FFEFEC000 0x1FFF000FFF] client SP: 0x1FFEFFF700
valgrind stack range: [0x1002D52000 0x1002E51FFF] top usage: 12912 of 1048576

This is in host_s390_defs.c (s390_insn_store_emit):

   if (hregClass(insn->variant.store.src) == HRcVec128) {
      vassert(insn->size == 16);
      switch (dst->tag) {
      case S390_AMODE_B12:
      case S390_AMODE_BX12:
         return s390_emit_VST(buf, r, x, b, d);

      default:
         vpanic("s390_insn_store_emit: unknown dst->tag for HRcVec128");
      }
   }

With some debugging we see that this is S390_AMODE_B20:

insn: v-store %v31,-24(%r4)
16 bytes
dst: -24(%r4)
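To make the mismatch concrete, here is a small standalone sketch of why a displacement of -24 ends up in a 20-bit addressing mode that the vector store path does not accept. It assumes the usual z/Architecture ranges (12-bit displacements are unsigned, 0..4095; 20-bit displacements are signed, -524288..524287). The enum and helper names are hypothetical stand-ins for illustration only and are not the VEX definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the s390 amode tags; NOT the real VEX types. */
typedef enum { AMODE_B12, AMODE_B20, AMODE_BX12, AMODE_BX20 } amode_tag;

/* A 12-bit displacement field is unsigned: 0..4095. */
static bool fits_disp12(int64_t d) { return d >= 0 && d <= 4095; }

/* A 20-bit (long) displacement field is signed: -524288..524287. */
static bool fits_disp20(int64_t d) { return d >= -524288 && d <= 524287; }

/* Pick the smallest addressing mode that can hold displacement d.
   Assumption: this mirrors how a generic amode selector would behave. */
static amode_tag pick_tag(int64_t d, bool has_index)
{
    assert(fits_disp20(d));
    if (fits_disp12(d))
        return has_index ? AMODE_BX12 : AMODE_B12;
    return has_index ? AMODE_BX20 : AMODE_B20;
}

/* The switch quoted in this report only handles the 12-bit forms
   for 128-bit vector stores. */
static bool vst_can_encode(amode_tag tag)
{
    return tag == AMODE_B12 || tag == AMODE_BX12;
}
```

With these helpers, `pick_tag(-24, false)` yields `AMODE_B20`, and `vst_can_encode` rejects it, which corresponds to the `default:` branch that panics.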
Note that test.py doesn't have to be anything fancy. In this case it is simply:

print ("hello world")
BTW. When the vpanic happens this is the function it claims the inferior is in:

Dump of assembler code for function PyCFunction_NewEx:
   0x0000000004b8f858 <+0>:     stmg    %r9,%r15,72(%r15)
   0x0000000004b8f85e <+6>:     larl    %r1,0x4e35320
   0x0000000004b8f864 <+12>:    lgr     %r9,%r2
   0x0000000004b8f868 <+16>:    lay     %r15,-160(%r15)
   0x0000000004b8f86e <+22>:    lgr     %r11,%r3
   0x0000000004b8f872 <+26>:    lgr     %r10,%r4
   0x0000000004b8f876 <+30>:    ltg     %r2,0(%r1)
   0x0000000004b8f87c <+36>:    je      0x4b8f93c <PyCFunction_NewEx+228>
   0x0000000004b8f880 <+40>:    stg     %r9,16(%r2)
   0x0000000004b8f886 <+46>:    lrl     %r4,0x4e35328
   0x0000000004b8f88c <+52>:    lg      %r0,24(%r2)
   0x0000000004b8f892 <+58>:    lgrl    %r3,0x4daaa38
   0x0000000004b8f898 <+64>:    stg     %r3,8(%r2)
   0x0000000004b8f89e <+70>:    ahi     %r4,-1
   0x0000000004b8f8a2 <+74>:    stgrl   %r0,0x4e35320
   0x0000000004b8f8a8 <+80>:    mvghi   0(%r2),1
   0x0000000004b8f8ae <+86>:    strl    %r4,0x4e35328
   0x0000000004b8f8b4 <+92>:    mvghi   40(%r2),0
   0x0000000004b8f8ba <+98>:    cgije   %r11,0,0x4b8f966 <PyCFunction_NewEx+270>
   0x0000000004b8f8c0 <+104>:   agsi    0(%r11),1
   0x0000000004b8f8c6 <+110>:   stg     %r11,24(%r2)
   0x0000000004b8f8cc <+116>:   cgijne  %r10,0,0x4b8f9de <PyCFunction_NewEx+390>
   0x0000000004b8f8d2 <+122>:   mvghi   32(%r2),0
   0x0000000004b8f8d8 <+128>:   lg      %r5,-8(%r2)
   0x0000000004b8f8de <+134>:   lay     %r14,-24(%r2)
   0x0000000004b8f8e4 <+140>:   srag    %r9,%r5,1
   0x0000000004b8f8ea <+146>:   cgijne  %r9,-2,0x4b8f9e8 <PyCFunction_NewEx+400>
   0x0000000004b8f8f0 <+152>:   risbg   %r11,%r5,63,191,0
   0x0000000004b8f8f6 <+158>:   lghi    %r1,-6
   0x0000000004b8f8fa <+162>:   ogr     %r11,%r1
   0x0000000004b8f8fe <+166>:   stg     %r11,-8(%r2)
   0x0000000004b8f904 <+172>:   lgrl    %r10,0x4daaf28
   0x0000000004b8f90a <+178>:   lg      %r3,0(%r10)
   0x0000000004b8f910 <+184>:   lg      %r4,8(%r3)
   0x0000000004b8f916 <+190>:   vlvgp   %v0,%r3,%r4
   0x0000000004b8f91c <+196>:   vst     %v0,0(%r14)
   0x0000000004b8f922 <+202>:   stg     %r14,0(%r4)
   0x0000000004b8f928 <+208>:   lg      %r9,0(%r10)
   0x0000000004b8f92e <+214>:   stg     %r14,8(%r9)
   0x0000000004b8f934 <+220>:   lmg     %r9,%r15,232(%r15)
   0x0000000004b8f93a <+226>:   br      %r14
   0x0000000004b8f93c <+228>:   lgrl    %r2,0x4daaa38
   0x0000000004b8f942 <+234>:   brasl   %r14,0x4b1e968 <_PyObject_GC_New@plt>
   0x0000000004b8f948 <+240>:   cgije   %r2,0,0x4b8f934 <PyCFunction_NewEx+220>
   0x0000000004b8f94e <+246>:   stg     %r9,16(%r2)
   0x0000000004b8f954 <+252>:   mvghi   40(%r2),0
   0x0000000004b8f95a <+258>:   cgije   %r11,0,0x4b8f966 <PyCFunction_NewEx+270>
   0x0000000004b8f960 <+264>:   agsi    0(%r11),1
   0x0000000004b8f966 <+270>:   stg     %r11,24(%r2)
   0x0000000004b8f96c <+276>:   cgijne  %r10,0,0x4b8f9de <PyCFunction_NewEx+390>
   0x0000000004b8f972 <+282>:   stg     %r10,32(%r2)
   0x0000000004b8f978 <+288>:   lg      %r0,-8(%r2)
   0x0000000004b8f97e <+294>:   lay     %r14,-24(%r2)
   0x0000000004b8f984 <+300>:   srag    %r11,%r0,1
   0x0000000004b8f98a <+306>:   cgijne  %r11,-2,0x4b8f9e8 <PyCFunction_NewEx+400>
   0x0000000004b8f990 <+312>:   risbg   %r10,%r0,63,191,0
   0x0000000004b8f996 <+318>:   lghi    %r1,-6
   0x0000000004b8f99a <+322>:   ogr     %r10,%r1
   0x0000000004b8f99e <+326>:   stg     %r10,-8(%r2)
   0x0000000004b8f9a4 <+332>:   lgrl    %r3,0x4daaf28
   0x0000000004b8f9aa <+338>:   lay     %r9,-24(%r2)
   0x0000000004b8f9b0 <+344>:   lg      %r4,0(%r3)
   0x0000000004b8f9b6 <+350>:   lg      %r5,8(%r4)
   0x0000000004b8f9bc <+356>:   vlvgp   %v2,%r4,%r5
   0x0000000004b8f9c2 <+362>:   vst     %v2,0(%r9)
   0x0000000004b8f9c8 <+368>:   stg     %r14,0(%r5)
   0x0000000004b8f9ce <+374>:   lg      %r11,0(%r3)
   0x0000000004b8f9d4 <+380>:   stg     %r14,8(%r11)
   0x0000000004b8f9da <+386>:   j       0x4b8f934 <PyCFunction_NewEx+220>
   0x0000000004b8f9de <+390>:   agsi    0(%r10),1
   0x0000000004b8f9e4 <+396>:   j       0x4b8f972 <PyCFunction_NewEx+282>
   0x0000000004b8f9e8 <+400>:   larl    %r2,0x4d0cf90
   0x0000000004b8f9ee <+406>:   brasl   %r14,0x4b1fbc8 <Py_FatalError@plt>
End of assembler dump.
Created attachment 125898 [details]
First attempt at a fix

This tries to prevent the bad addressing mode for vector store operations. So far I can't reproduce the problem, so I can't test whether it helps.
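For readers following along, one generic way to avoid an unencodable displacement on a vector store is to compute the effective address into a scratch register first and then store with a zero displacement. The guest code in the PyCFunction_NewEx disassembly has the same shape (a `lay` followed by a `vst` with displacement 0). The sketch below only models that decision in plain C. It is not the attached patch, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a base+displacement address; NOT a VEX type. */
typedef struct {
    int     base_reg;
    int64_t disp;
} amode;

/* Result of lowering: optionally an address computation into tmp_reg
   (tmp_reg = base_reg + add_imm), then the amode the store should use. */
typedef struct {
    bool    needs_add;
    int     tmp_reg;
    int64_t add_imm;
    amode   final;
} lowered_store;

/* Assumption: vector stores only accept an unsigned 12-bit displacement. */
static bool fits_disp12(int64_t d) { return d >= 0 && d <= 4095; }

static lowered_store lower_vec_store(amode dst, int tmp_reg)
{
    lowered_store out = { false, -1, 0, dst };

    assert(tmp_reg != dst.base_reg);   /* scratch must not clobber the base */

    if (!fits_disp12(dst.disp)) {
        /* Fold the displacement into the scratch register and
           store through it with displacement 0. */
        out.needs_add      = true;
        out.tmp_reg        = tmp_reg;
        out.add_imm        = dst.disp;
        out.final.base_reg = tmp_reg;
        out.final.disp     = 0;
    }
    return out;
}
```

For the failing case, `lower_vec_store((amode){4, -24}, 1)` requests an address computation into register 1 and a store through `0(%r1)`, while an in-range address such as `16(%r4)` is passed through unchanged.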
(In reply to Andreas Arnez from comment #3)
> Created attachment 125898 [details]
> First attempt at a fix
>
> This tries to prevent the bad addressing mode for vector store operations.
> So far I can't reproduce the problem, so I can't test whether it helps.

This fixes the original issue for me and make nonexp-regcheck produces:

== 731 tests, 4 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==
memcheck/tests/memcmptest                (stderr)
memcheck/tests/vbit-test/vbit-test       (stderr)
drd/tests/bar_bad                        (stderr)
drd/tests/tc04_free_lock                 (stderr)

I believe all 4 failures are pre-existing and not related to this patch.
And after installing some more things and doing a full make regcheck I get even more tests passing and only 3 failures:

== 756 tests, 3 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==
memcheck/tests/memcmptest                (stderr)
memcheck/tests/vbit-test/vbit-test       (stderr)
drd/tests/tc04_free_lock                 (stderr)

(I think the bar/bad one fails non-deterministically)
(In reply to Mark Wielaard from comment #5)
[...]
> memcheck/tests/memcmptest                (stderr)

This is a known one on my to-do list (low-prio). It requires a fix in the implementation of the CLC instruction.

> memcheck/tests/vbit-test/vbit-test       (stderr)

This is also known. The test uses And1 and Or1, which are not implemented on s390x yet. (Unfortunately the test still fails even after implementing them, that's another issue.)

> drd/tests/tc04_free_lock                 (stderr)

Fails for me as well, with a small diff in the backtrace. Haven't investigated this too much, maybe a problem in the test case?

> (I think the bar/bad one fails non-deterministically)

I don't see this failing. How does the failure look like in your case?
(In reply to Andreas Arnez from comment #6)
> (In reply to Mark Wielaard from comment #5)
> > (I think the bar/bad one fails non-deterministically)
> I don't see this failing. How does the failure look like in your case?

Of course now I cannot get it to fail. I don't think it has anything to do with this bug/patch though.

I have also tested the patch against Fedora s390x and things look fine there too.
(In reply to Mark Wielaard from comment #7)
> I have also tested the patch against Fedora s390x and things look fine there
> too.

Good! Unless there are any further comments, I'll push this early next week.
OK, pushed as git commit f27fe920cd321ca3cf4bc03a72879fd18bf2736f.