vex: priv/guest_arm64_toIR.c:4166 (dis_ARM64_load_store): Assertion `0' failed.
vex storage: T total 196295680 bytes allocated
vex storage: P total 0 bytes allocated

valgrind: the 'impossible' happened:
   LibVEX called failure_exit().

The code which triggers this is apparently:

Thread 1: status = VgTs_Runnable
==3062==    at 0x4AF9444: _gcry_kdf_pkdf2 (kdf.c:196)
==3062==    by 0x4AF95CF: _gcry_kdf_derive (kdf.c:279)
==3062==    by 0x4AE54EF: gcry_kdf_derive (visibility.c:1262)
==3062==    by 0x487A51B: crypt_pbkdf (in /home/rjones/d/fedora/cryptsetup/master/cryptsetup-1.6.5/lib/.libs/libcryptsetup.so.4.6.0)
==3062==    by 0x487CBA3: LUKS_open_key (keymanage.c:935)
==3062==    by 0x487CE63: LUKS_open_key_with_hdr (keymanage.c:991)
==3062==    by 0x4869ECB: volume_key_by_terminal_passphrase (in /home/rjones/d/fedora/cryptsetup/master/cryptsetup-1.6.5/lib/.libs/libcryptsetup.so.4.6.0)
==3062==    by 0x486DE53: crypt_activate_by_passphrase (in /home/rjones/d/fedora/cryptsetup/master/cryptsetup-1.6.5/lib/.libs/libcryptsetup.so.4.6.0)
==3062==    by 0x407E87: action_open_luks (in /home/rjones/d/fedora/cryptsetup/master/cryptsetup-1.6.5/src/.libs/lt-cryptsetup)
==3062==    by 0x40955B: action_open (in /home/rjones/d/fedora/cryptsetup/master/cryptsetup-1.6.5/src/.libs/lt-cryptsetup)
==3062==    by 0x409B37: run_action (in /home/rjones/d/fedora/cryptsetup/master/cryptsetup-1.6.5/src/.libs/lt-cryptsetup)
==3062==    by 0x40AA9B: main (in /home/rjones/d/fedora/cryptsetup/master/cryptsetup-1.6.5/src/.libs/lt-cryptsetup)

It seems to be one of the following instructions, most likely ubfx:

              sbuf[saltlen] = (lidx >> 24);
 1c4:   1e27000a        fmov    s10, w0
/home/rjones/d/fedora/libgcrypt/master/libgcrypt-1.6.1/cipher/kdf.c:196
              sbuf[saltlen + 1] = (lidx >> 16);
 1c8:   d3505ee0        ubfx    x0, x23, #16, #8
 1cc:   f9005ba0        str     x0, [x29,#176]
/home/rjones/d/fedora/libgcrypt/master/libgcrypt-1.6.1/cipher/kdf.c:197
              sbuf[saltlen + 2] = (lidx >> 8);
 1d0:   d3483ee0        ubfx    x0, x23, #8, #8
 1d4:   f90053a0        str     x0, [x29,#160]
Looking at the valgrind code, it's most probably the store instruction:

 1cc:   f9005ba0        str     x0, [x29,#176]

I removed the failing assert (see attached patch) and reran valgrind. However, the program's behaviour then differs from a run without valgrind. This leads me to think the code which the assert is protecting is actually wrong.
Created attachment 87931
INCORRECT patch which removes the assertion.

This is the patch I tested. Note it is probably incorrect, so do not apply it.
Looking much more closely at the valgrind VEX code, I believe the affected instruction is not the str mentioned above, but this str:

/usr/src/debug/libgcrypt-1.6.1/cipher/kdf.c:195
      for (iter = 0; iter < iterations; iter++)
        {
          _gcry_md_reset (md);
          if (!iter) /* Compute U_1: */
            {
              sbuf[saltlen] = (lidx >> 24);
   1e458:       3c3c680a        str     b10, [x0,x28]
(In reply to Richard Jones from comment #3)
>    1e458:       3c3c680a        str     b10, [x0,x28]

This is correct -- the assert you removed pertains to

   00 111100 001 Rm option S 10 Rn Rt
   STR Bt, [Xn|SP, R<m>{ext/sh}]

I'm a bit surprised the behaviour changed, though. I don't see anything obviously wrong with that case. Does it change with --tool=none?

Is this some crypto code that is trying to generate entropy by reading junk on the stack (etc) and hence you expect to get different behaviour? I ask because of the references to salting.
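
For anyone who wants to check the instruction word against the bit pattern quoted in this comment, here is a minimal standalone sketch in C. It is not taken from VEX; the names strb_reg_fields, bits and decode_str_b_reg are made up for illustration, and the only inputs are the pattern "00 111100 001 Rm option S 10 Rn Rt" and the word 3c3c680a from the disassembly.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper types and names, for illustration only (not VEX code).
   Variable fields of the pattern  00 111100 001 Rm option S 10 Rn Rt,
   read from bit 31 down to bit 0. */
typedef struct {
    unsigned rm;      /* bits 20:16, index register R<m> */
    unsigned option;  /* bits 15:13, extend/shift selector */
    unsigned s;       /* bit 12, shift flag */
    unsigned rn;      /* bits 9:5, base register Xn|SP */
    unsigned rt;      /* bits 4:0, register Bt being stored */
} strb_reg_fields;

/* Extract bits [hi:lo] of a 32-bit instruction word (width must be < 32). */
static uint32_t bits(uint32_t insn, unsigned hi, unsigned lo)
{
    return (insn >> lo) & ((1u << (hi - lo + 1)) - 1u);
}

/* Returns 1 and fills *out if insn matches the pattern, otherwise 0. */
int decode_str_b_reg(uint32_t insn, strb_reg_fields *out)
{
    assert(out != NULL);
    if (bits(insn, 31, 21) != 0x1E1u)   /* fixed prefix 00 111100 001 */
        return 0;
    if (bits(insn, 11, 10) != 0x2u)     /* fixed bits "10" */
        return 0;
    out->rm     = bits(insn, 20, 16);
    out->option = bits(insn, 15, 13);
    out->s      = bits(insn, 12, 12);
    out->rn     = bits(insn, 9, 5);
    out->rt     = bits(insn, 4, 0);
    return 1;
}
```

Calling decode_str_b_reg(0x3c3c680a, &f) matches the pattern and gives rm = 28, rn = 0, rt = 10, which agrees with the disassembly "str b10, [x0,x28]". The word f9005ba0 (the "str x0, [x29,#176]" suspected earlier) does not match, since its top bits are not 00 111100 001.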
> I'm a bit surprised the behaviour changed, though. I don't see
> anything obviously wrong with that case. Does it change with
> --tool=none?

With the assert removed and using --tool=none, the behaviour is the same as in the non-valgrind case.

The default tool is memcheck. Could using memcheck change the behaviour, while at the same time not printing any kind of warning/error message?

> Is this some crypto code that is trying to generate
> entropy by reading junk on the stack (etc) and hence you expect to get
> different behaviour? I ask because of the references to salting.

Well it's libgcrypt & cryptsetup. I would *hope* that it wouldn't try to read uninitialized data.

It's aarch64 so the tools may be broken or immature. The reason I'm here in the first place is because I'm chasing down what looks like it might be a code gen / alignment / uninitialized data problem. But I haven't worked out exactly what. Was hoping valgrind would help me :-)
(In reply to Richard Jones from comment #5)
> With the assert removed and using --tool=none, the behaviour is
> the same as in the non-valgrind case.

Different behaviour with none and memcheck is -- at least for the mature V ports -- generally a sign of the application under test being broken in some subtle way.

> The default tool is memcheck. Could using memcheck change the
> behaviour, while at the same time not printing any kind of warning/
> error message?

Are there any threads involved? Both tools will change the thread interleaving in a big way compared to native execution. Have you tried with --fair-sched=yes?
cryptsetup doesn't appear to link to pthread or use threads. Yes, I know cryptsetup (or more likely, gcc) is broken. Just not sure exactly how :-(
Fixed, r2943, r14458. I also enabled the analogous 16-bit store case while I was at it.