Bug 337762 - vex: priv/guest_arm64_toIR.c:4166 (dis_ARM64_load_store): Assertion `0' failed.
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: vex
Version: 3.9.0
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
 
Reported: 2014-07-24 10:24 UTC by Richard Jones
Modified: 2014-09-04 11:46 UTC


Attachments
INCORRECT patch which removes the assertion. (558 bytes, patch)
2014-07-24 11:30 UTC, Richard Jones

Description Richard Jones 2014-07-24 10:24:33 UTC
vex: priv/guest_arm64_toIR.c:4166 (dis_ARM64_load_store): Assertion `0' failed.
vex storage: T total 196295680 bytes allocated
vex storage: P total 0 bytes allocated

valgrind: the 'impossible' happened:
   LibVEX called failure_exit().

The code which triggers this is apparently:

Thread 1: status = VgTs_Runnable
==3062==    at 0x4AF9444: _gcry_kdf_pkdf2 (kdf.c:196)
==3062==    by 0x4AF95CF: _gcry_kdf_derive (kdf.c:279)
==3062==    by 0x4AE54EF: gcry_kdf_derive (visibility.c:1262)
==3062==    by 0x487A51B: crypt_pbkdf (in /home/rjones/d/fedora/cryptsetup/master/cryptsetup-1.6.5/lib/.libs/libcryptsetup.so.4.6.0)
==3062==    by 0x487CBA3: LUKS_open_key (keymanage.c:935)
==3062==    by 0x487CE63: LUKS_open_key_with_hdr (keymanage.c:991)
==3062==    by 0x4869ECB: volume_key_by_terminal_passphrase (in /home/rjones/d/fedora/cryptsetup/master/cryptsetup-1.6.5/lib/.libs/libcryptsetup.so.4.6.0)
==3062==    by 0x486DE53: crypt_activate_by_passphrase (in /home/rjones/d/fedora/cryptsetup/master/cryptsetup-1.6.5/lib/.libs/libcryptsetup.so.4.6.0)
==3062==    by 0x407E87: action_open_luks (in /home/rjones/d/fedora/cryptsetup/master/cryptsetup-1.6.5/src/.libs/lt-cryptsetup)
==3062==    by 0x40955B: action_open (in /home/rjones/d/fedora/cryptsetup/master/cryptsetup-1.6.5/src/.libs/lt-cryptsetup)
==3062==    by 0x409B37: run_action (in /home/rjones/d/fedora/cryptsetup/master/cryptsetup-1.6.5/src/.libs/lt-cryptsetup)
==3062==    by 0x40AA9B: main (in /home/rjones/d/fedora/cryptsetup/master/cryptsetup-1.6.5/src/.libs/lt-cryptsetup)

It seems to be one of the following instructions, most likely ubfx:

              sbuf[saltlen]     = (lidx >> 24);
 1c4:   1e27000a        fmov    s10, w0
/home/rjones/d/fedora/libgcrypt/master/libgcrypt-1.6.1/cipher/kdf.c:196
              sbuf[saltlen + 1] = (lidx >> 16);
 1c8:   d3505ee0        ubfx    x0, x23, #16, #8
 1cc:   f9005ba0        str     x0, [x29,#176]
/home/rjones/d/fedora/libgcrypt/master/libgcrypt-1.6.1/cipher/kdf.c:197
              sbuf[saltlen + 2] = (lidx >> 8);
 1d0:   d3483ee0        ubfx    x0, x23, #8, #8
 1d4:   f90053a0        str     x0, [x29,#160]
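
For readers less familiar with the bitfield extract, here is a minimal Python sketch of what `ubfx x0, x23, #16, #8` computes, and why it lines up with the C expression once the result is stored into a byte. The helper name `ubfx` and the sample value of `lidx` are illustrative assumptions, and the sketch assumes `sbuf` is a byte buffer; none of this is taken from libgcrypt or valgrind.

```python
def ubfx(value: int, lsb: int, width: int) -> int:
    """Model AArch64 UBFX: extract `width` bits starting at bit `lsb`, zero-extended."""
    return (value >> lsb) & ((1 << width) - 1)


lidx = 0x01020304  # hypothetical block index, chosen so each byte is distinct

# A byte store keeps only the low 8 bits of the shifted value,
# which is exactly the 8-bit field that UBFX extracts.
byte1 = ubfx(lidx, 16, 8)  # corresponds to sbuf[saltlen + 1] = (lidx >> 16)
byte2 = ubfx(lidx, 8, 8)   # corresponds to sbuf[saltlen + 2] = (lidx >> 8)

assert byte1 == (lidx >> 16) & 0xFF
assert byte2 == (lidx >> 8) & 0xFF
```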
Comment 1 Richard Jones 2014-07-24 11:29:03 UTC
Looking at the valgrind code, it's most probably the store instruction:

 1cc:   f9005ba0        str     x0, [x29,#176]

I removed the failing assert (see attached patch) and reran valgrind.

However, the program then behaves differently under valgrind than it
does natively.  This leads me to think the code which the assert is
protecting is actually wrong.
Comment 2 Richard Jones 2014-07-24 11:30:08 UTC
Created attachment 87931
INCORRECT patch which removes the assertion.

This is the patch I tested.  Note it is probably incorrect, so do
not apply it.
Comment 3 Richard Jones 2014-07-24 12:13:49 UTC
Looking much more closely at the valgrind VEX code, I believe the
affected instruction is not the str mentioned above, but this str:

/usr/src/debug/libgcrypt-1.6.1/cipher/kdf.c:195
      for (iter = 0; iter < iterations; iter++)
        {
          _gcry_md_reset (md);
          if (!iter) /* Compute U_1:  */
            {
              sbuf[saltlen]     = (lidx >> 24);
   1e458:       3c3c680a        str     b10, [x0,x28]
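
As a rough model of what this store is expected to do, the Python sketch below treats `str b10, [x0,x28]` as writing only bits 7:0 of the vector register to memory. It assumes the register was loaded by an `fmov s10, w0` like the one in the original description (the two listings come from different builds, so that pairing is a guess). The buffer size, `saltlen` and `lidx` values are hypothetical.

```python
def store_byte_from_vreg(mem: bytearray, addr: int, vreg: int) -> None:
    """Model `str bN, [base, index]`: write only bits 7:0 of the register."""
    mem[addr] = vreg & 0xFF


lidx = 0x01020304                # hypothetical block index
w0 = (lidx >> 24) & 0xFFFFFFFF   # value the compiler is assumed to hold in w0
s10 = w0                         # fmov s10, w0 copies the 32 bits unchanged

sbuf = bytearray(8)              # hypothetical buffer
saltlen = 4                      # hypothetical salt length
store_byte_from_vreg(sbuf, saltlen, s10)  # sbuf[saltlen] = (lidx >> 24)
```

If a translation of this instruction stored more than one byte, or took the byte from the wrong part of the register, `sbuf[saltlen]` or its neighbours would differ from the native run.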
Comment 4 Julian Seward 2014-07-24 12:58:25 UTC
(In reply to Richard Jones from comment #3)
>    1e458:       3c3c680a        str     b10, [x0,x28]

This is correct -- the assert you removed pertains to
00 111100 001 Rm option S  10 Rn Rt  STR Bt, [Xn|SP, R<m>{ext/sh}]

I'm a bit surprised the behaviour changed, though.  I don't see
anything obviously wrong with that case.  Does it change with
--tool=none?  Is this some crypto code that is trying to generate
entropy by reading junk on the stack (etc) and hence you expect to get
different behaviour?  I ask because of the references to salting.
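
To check the instruction word against the pattern quoted above, here is a small Python sketch that splits `3c3c680a` into the named fields. The function name and the returned dictionary layout are illustrative assumptions; only the bit layout comes from the pattern in this comment.

```python
def decode_str_b_regoffset(insn: int) -> dict:
    """Split a 32-bit word per: 00 111100 001 Rm option S 10 Rn Rt."""
    fixed_ok = (
        ((insn >> 30) & 0x3) == 0b00
        and ((insn >> 24) & 0x3F) == 0b111100
        and ((insn >> 21) & 0x7) == 0b001
        and ((insn >> 10) & 0x3) == 0b10
    )
    if not fixed_ok:
        raise ValueError(f"{insn:#010x} does not match the STR Bt pattern")
    return {
        "rm": (insn >> 16) & 0x1F,     # index register
        "option": (insn >> 13) & 0x7,  # extend/shift selector
        "s": (insn >> 12) & 0x1,       # shift-amount flag
        "rn": (insn >> 5) & 0x1F,      # base register
        "rt": insn & 0x1F,             # register being stored
    }


fields = decode_str_b_regoffset(0x3C3C680A)
```

For this word the fields come out as Rm = 28, Rn = 0, Rt = 10, which agrees with the disassembly `str b10, [x0,x28]`. The earlier candidate `f9005ba0` is rejected by the fixed-bit check.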
Comment 5 Richard Jones 2014-07-24 13:18:12 UTC
> I'm a bit surprised the behaviour changed, though.  I don't see
> anything obviously wrong with that case.  Does it change with
> --tool=none?

With the assert removed and using --tool=none, the behaviour is
the same as in the non-valgrind case.

The default tool is memcheck.  Could using memcheck change the
behaviour, while at the same time not printing any kind of warning/
error message?

> Is this some crypto code that is trying to generate
> entropy by reading junk on the stack (etc) and hence you expect to get
> different behaviour?  I ask because of the references to salting.

Well it's libgcrypt & cryptsetup.  I would *hope* that it wouldn't
try to read uninitialized data.  It's aarch64 so the tools may be
broken or immature.

The reason I'm here in the first place is because I'm chasing
down what looks like it might be a code gen / alignment / uninitialized
data problem.  But I haven't worked out exactly what.  Was hoping
valgrind would help me :-)
Comment 6 Julian Seward 2014-07-24 13:27:50 UTC
(In reply to Richard Jones from comment #5)
> With the assert removed and using --tool=none, the behaviour is
> the same as in the non-valgrind case.

Different behaviour with none and memcheck is -- at least for the
mature V ports -- generally a sign of the application under test being
broken in some subtle way.

> The default tool is memcheck.  Could using memcheck change the
> behaviour, while at the same time not printing any kind of warning/
> error message?

Are there any threads involved?  Both tools will change the thread
interleaving in a big way compared to native execution.  Have you
tried with --fair-sched=yes?
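
The none-versus-memcheck comparison discussed here can be scripted. The following Python sketch is one way to do it, assuming `valgrind` is on the PATH; the helper names and the program path in the usage note are hypothetical. Only the `--tool=` and `--fair-sched=yes` options mentioned in this thread are used.

```python
import subprocess


def valgrind_cmd(tool, program_argv, fair_sched=False):
    """Build a valgrind command line for the given tool."""
    cmd = ["valgrind", f"--tool={tool}"]
    if fair_sched:
        cmd.append("--fair-sched=yes")
    return cmd + list(program_argv)


def compare_tools(program_argv, fair_sched=False):
    """Run the program under none and memcheck; return the exit code per tool."""
    results = {}
    for tool in ("none", "memcheck"):
        proc = subprocess.run(
            valgrind_cmd(tool, program_argv, fair_sched),
            capture_output=True,
        )
        results[tool] = proc.returncode
    return results
```

Calling `compare_tools(["./my-program", "some-arg"])` (a placeholder command) would return a dictionary of exit codes. Differing values between the two tools are a hint, not proof, that the program depends on something memcheck perturbs.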
Comment 7 Richard Jones 2014-07-24 13:42:55 UTC
cryptsetup doesn't appear to link to pthread or use threads.

Yes, I know cryptsetup (or more likely, gcc) is broken.  Just not
sure exactly how :-(
Comment 8 Julian Seward 2014-09-04 11:46:31 UTC
Fixed, r2943, r14458.  I also enabled the analogous 16-bit store
case while I was at it.