Each of the "vector multiply" instructions VME, VMO, VMLE, VMLO, as well as the "vector multiply and add" instructions VMAE, VMAO, VMALE, VMALO, multiplies the corresponding even or odd lanes of two input vectors and generates a double-wide result.

There is confusion in Valgrind about what "even" and "odd" mean in this context. While the architecture specifies big-endian lane numbering (highest lane=zero), Valgrind IROps such as Iop_MullEven32Ux4 are supposed to behave exactly the other way around. (See the comment in libvex_ir.h, which states "lowest lane=zero".) However, the s390x-specific IR translation and the code generation both ignore this and behave as if Iop_MullEven32Ux4 operated on the higher lanes.

The calculated result is not affected by this discrepancy, but memcheck's tracking of undefined bits is. Thus, if VME is invoked on two vectors whose lower lanes are uninitialized and the result vector is used, memcheck complains:

==57866== Use of uninitialised value of size 8
==57866==    at 0x10005F0: depend_on (vme.c:15)
==57866==    by 0x1000977: test_vme_2 (vme.c:62)
==57866==    by 0x1001055: do_valid (vme.c:76)
==57866==    by 0x10011CF: main (vme.c:99)
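To illustrate the numbering mismatch, here is a minimal C sketch. It is not Valgrind code; it assumes four 32-bit lanes, and the names be_to_low_index and mull_low_parity are hypothetical helpers made up for this illustration. It shows that a lane that is "even" under big-endian numbering is "odd" when lanes are counted from the low end, whenever the lane count is even.

```c
#include <assert.h>
#include <stdint.h>

enum { LANES = 4 };  /* assumption: four 32-bit lanes per vector */

/* Convert a big-endian lane index (0 = highest lane) into a
   low-to-high lane index (0 = lowest lane). Hypothetical helper. */
static unsigned be_to_low_index(unsigned be_lane)
{
    assert(be_lane < LANES);
    return LANES - 1 - be_lane;
}

/* Widening multiply of every lane whose low-to-high index has the
   given parity (0 = even, 1 = odd). The arrays are indexed
   low-to-high, i.e. a[0] is the lowest lane. Hypothetical helper. */
static void mull_low_parity(const uint32_t a[LANES],
                            const uint32_t b[LANES],
                            unsigned parity,
                            uint64_t out[LANES / 2])
{
    assert(parity <= 1);
    for (unsigned i = 0; i < LANES / 2; i++)
        out[i] = (uint64_t)a[2 * i + parity] * b[2 * i + parity];
}
```

With this model, the big-endian even lanes 0 and 2 map to low-to-high indices 3 and 1, so an operation on big-endian even lanes reads exactly the lanes that mull_low_parity selects with parity 1, not parity 0. A definedness tracker that assumes the wrong parity would look at the inputs of the other half of the lanes.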
I'll apply a fix that sticks to the definition of "even" lanes in libvex_ir.h, where the lanes are numbered from low to high. Note that I would not generally use such a numbering scheme on big-endian platforms. In my view, "multiply even" could also be viewed as a naming accident; if it were called something like "multiply from low", there would be no need to consider lane numbering, and the confusion could be avoided.
(In reply to Andreas Arnez from comment #1)
> In my view, "multiply even" could also be viewed as a naming
> accident; if it were called something like "multiply from low", there was
> no need to consider lane numbering. In this way the confusion could be
> avoided.

I like the idea of renaming the IROp if that makes things clearer.
(In reply to Florian Krohm from comment #2)
> I like the idea of renaming the IROp if that makes things clearer.

I tend to agree. However, I'm hesitant to do that as part of the fix, since it affects many architectures. Thus I'll apply the fix first; then we can discuss the renaming patch independently.
Pushed a fix and a test case:

5deca19cc Bug 509517 - s390x: Add even/odd-lane memcheck test for VME etc.
6ac493c0e Bug 509517 - s390x: Fix even/odd lane confusion for VME etc.