Bug 381556 - arm64: Handle feature registers access on 4.11 Linux kernel or later
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: vex
Version: 3.13 SVN
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
Reported: 2017-06-23 03:44 UTC by Siddhesh Poyarekar
Modified: 2021-07-30 03:46 UTC


Attachments
mask arm64 hwcaps (2.99 KB, patch)
2018-06-19 12:11 UTC, Mark Wielaard

Description Siddhesh Poyarekar 2017-06-23 03:44:35 UTC
Since Linux 4.11, the arm64 kernel emulates the mrs instruction for userspace and exposes some feature registers, namely:

- ID_AA64ISAR0_EL1
- ID_AA64PFR0_EL1
- MIDR_EL1

glibc 2.26 (releasing in August) uses MIDR_EL1 to select its multiarch routines, so binaries running under valgrind on a 4.11 arm64 kernel will fail with an unhandled instruction error.  This was reported by Florian Weimer of Red Hat on Fedora rawhide:

ARM64 front end: branch_etc
disInstr(arm64): unhandled instruction 0xD5380000
disInstr(arm64): 1101'0101 0011'1000 0000'0000 0000'0000
==924== valgrind: Unrecognised instruction at address 0x11f548.
==924==    at 0x11F548: init_cpu_features (cpu-features.c:32)
==924==    by 0x11F548: dl_platform_init (dl-machine.h:241)
==924==    by 0x11F548: _dl_sysdep_start (dl-sysdep.c:231)
==924==    by 0x10981B: _dl_start_final (rtld.c:412)
==924==    by 0x109AAB: _dl_start (rtld.c:520)
==924==    by 0x108F47: ??? (in
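
For reference, the instruction word in the trace can be decoded by hand. Below is a minimal sketch in C, assuming the standard A64 MRS encoding from the Arm ARM. The struct and function names are made up for illustration and are not Valgrind or glibc code.

```c
#include <stdint.h>
#include <stdbool.h>

/* Fields of an A64 system-register read (MRS). Names are illustrative. */
typedef struct {
    unsigned op0, op1, crn, crm, op2, rt;
} SysRegRead;

/* Returns true and fills *out if insn is an MRS; false otherwise.
   MRS layout: 1101 0101 0011 | o0 | op1(3) | CRn(4) | CRm(4) | op2(3) | Rt(5),
   where op0 = 2 + o0. */
static bool decode_mrs(uint32_t insn, SysRegRead *out)
{
    if ((insn & 0xFFF00000u) != 0xD5300000u)
        return false;
    out->op0 = 2u + ((insn >> 19) & 0x1u);
    out->op1 = (insn >> 16) & 0x7u;
    out->crn = (insn >> 12) & 0xFu;
    out->crm = (insn >> 8) & 0xFu;
    out->op2 = (insn >> 5) & 0x7u;
    out->rt  = insn & 0x1Fu;
    return true;
}

/* MIDR_EL1 is the system register S3_0_C0_C0_0. */
static bool is_midr_el1(const SysRegRead *r)
{
    return r->op0 == 3 && r->op1 == 0 && r->crn == 0 &&
           r->crm == 0 && r->op2 == 0;
}
```

Feeding it 0xD5380000 gives op0=3, op1=0, CRn=0, CRm=0, op2=0, Rt=0, i.e. "mrs x0, midr_el1", which matches the MIDR_EL1 read in init_cpu_features.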
Comment 1 Mark Wielaard 2017-06-23 10:57:09 UTC
See also this Fedora bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1464211
valgrind: Mask CPUID support in HWCAP on aarch64

This is what I do as a workaround for now in the Fedora valgrind package:

diff --git a/coregrind/m_initimg/initimg-linux.c b/coregrind/m_initimg/initimg-linux.c
index 30e1f85..387beae 100644
--- a/coregrind/m_initimg/initimg-linux.c
+++ b/coregrind/m_initimg/initimg-linux.c
@@ -703,6 +703,12 @@ Addr setup_client_stack( void*  init_sp,
                   (and anything above) are not supported by Valgrind. */
                auxv->u.a_val &= VKI_HWCAP_S390_TE - 1;
             }
+#           elif defined(VGP_arm64_linux)
+            {
+               /* Linux 4.11 started populating this for arm64, but we
+                  currently don't support any. */
+               auxv->u.a_val = 0;
+            }
 #           endif
             break;
 #        if defined(VGP_ppc64be_linux) || defined(VGP_ppc64le_linux)
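
Outside of Valgrind, what the workaround amounts to can be sketched as a walk over the auxiliary vector. This is only an illustration: the AuxEntry type and the clear_hwcap function are hypothetical stand-ins, not the code in initimg-linux.c. The values 0 and 16 are the usual Linux AT_NULL and AT_HWCAP numbers.

```c
#include <stddef.h>

/* Hypothetical stand-in for an auxv entry (type/value pair). */
typedef struct {
    unsigned long a_type;
    unsigned long a_val;
} AuxEntry;

#define SKETCH_AT_NULL  0UL   /* terminator entry */
#define SKETCH_AT_HWCAP 16UL  /* hardware capability bits */

/* Zero the AT_HWCAP value so the client sees no optional CPU features.
   Returns the number of entries changed. */
static size_t clear_hwcap(AuxEntry *auxv)
{
    size_t changed = 0;
    for (; auxv->a_type != SKETCH_AT_NULL; auxv++) {
        if (auxv->a_type == SKETCH_AT_HWCAP) {
            auxv->a_val = 0;
            changed++;
        }
    }
    return changed;
}
```

Clearing the whole word also hides HWCAP_CPUID, which is why a client that checks that bit before using mrs no longer reaches the unhandled instruction.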
Comment 2 Mark Wielaard 2017-07-06 09:02:54 UTC
I cannot find any information on the arm64 HWCAP. There is just this list of constants in the kernel sources:

#define HWCAP_FP                (1 << 0)
#define HWCAP_ASIMD             (1 << 1)
#define HWCAP_EVTSTRM           (1 << 2)
#define HWCAP_AES               (1 << 3)
#define HWCAP_PMULL             (1 << 4)
#define HWCAP_SHA1              (1 << 5)
#define HWCAP_SHA2              (1 << 6)
#define HWCAP_CRC32             (1 << 7)
#define HWCAP_ATOMICS           (1 << 8)
#define HWCAP_FPHP              (1 << 9)
#define HWCAP_ASIMDHP           (1 << 10)
#define HWCAP_CPUID             (1 << 11)
#define HWCAP_ASIMDRDM          (1 << 12)
#define HWCAP_JSCVT             (1 << 13)
#define HWCAP_FCMA              (1 << 14)
#define HWCAP_LRCPC             (1 << 15)

Without knowing more about what these stand for it seems best to just mask them all out as done in comment #1.

Does someone know which instruction sets the HWCAP bits stand for, and which ones are currently (fully) implemented in valgrind? Then we could enable some of the bits more selectively.
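
To read the list against a live value, a small helper can turn an AT_HWCAP word into bit names. This is a sketch that uses only the constants quoted above; hwcap_names is a made-up helper, not a kernel or Valgrind function.

```c
#include <stdio.h>
#include <string.h>

/* Bit names in bit order 0..15, as listed in the kernel header above. */
static const char *const hwcap_bit_names[16] = {
    "FP", "ASIMD", "EVTSTRM", "AES", "PMULL", "SHA1", "SHA2", "CRC32",
    "ATOMICS", "FPHP", "ASIMDHP", "CPUID", "ASIMDRDM", "JSCVT", "FCMA", "LRCPC"
};

/* Write a space-separated list of the set bits (0..15) into buf. */
static void hwcap_names(unsigned long hwcap, char *buf, size_t len)
{
    if (len == 0)
        return;
    buf[0] = '\0';
    for (int bit = 0; bit < 16; bit++) {
        if (hwcap & (1UL << bit)) {
            size_t used = strlen(buf);
            /* snprintf truncates safely if buf is too small */
            snprintf(buf + used, len - used, "%s%s",
                     used ? " " : "", hwcap_bit_names[bit]);
        }
    }
}
```

For example, the value 0x803 decodes to "FP ASIMD CPUID".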
Comment 3 Peter Maydell 2017-07-06 09:39:22 UTC
I believe there's a plan in the works to add a docs patch to the kernel for the hwcap bit meanings, but in the meantime the kernel code for them is here:
http://elixir.free-electrons.com/linux/latest/source/arch/arm64/kernel/cpufeature.c#L886

and it basically defines each HWCAP in terms of what the values in the relevant fields of the architectural ID registers (ID_AA64*) need to be -- you can look those up in the ARM ARM to find their meanings. The only one not in that table I think is HWCAP_CPUID, which is the "support emulation of access to CPU ID feature registers" bit.
Comment 4 Mark Wielaard 2018-06-18 13:14:40 UTC
I pushed the workaround mentioned in comment #1:

commit ad4481d23aa54ad947f7dcd194f1233e0b99c70f
Author: Mark Wielaard <mark@klomp.org>
Date:   Mon Jun 18 15:07:27 2018 +0200

    Add workaround for arm64 AT_HWCAP on newer kernels. Bug KDE#381556.
    
    Starting with linux 4.11 the kernel started to populate the AT_HWCAPS
    auxv entry. And glibc 2.26 now uses this to see whether it can use the
    mrs instruction and certain feature registers on arm64. Since these
    are not supported under valgrind this causes an unhandled instruction
    error. Work around this for now by just clearing the AT_HWCAPS on arm64.
    
    This should be fixed properly by someone with knowledge of what each
    of the arm64 HWCAPS bits mean and which bits correspond to instructions
    and registers supported by VEX or not.
    https://bugs.kde.org/show_bug.cgi?id=381556

Keeping this bug open, so this can be fixed properly.
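
The dependency described in the commit message can be sketched as follows. This is not glibc's code: read_midr_if_allowed is a hypothetical helper, and it returns 0 when the CPUID bit is clear or when the code is not built for arm64.

```c
#include <stdint.h>

#define SKETCH_HWCAP_CPUID (1UL << 11)  /* value from comment #2 */

/* Read MIDR_EL1 only if the kernel advertises mrs emulation via
   HWCAP_CPUID; otherwise return 0 so the caller uses generic code. */
static uint64_t read_midr_if_allowed(unsigned long hwcap)
{
    if (!(hwcap & SKETCH_HWCAP_CPUID))
        return 0;
#if defined(__aarch64__)
    uint64_t midr;
    __asm__ volatile("mrs %0, midr_el1" : "=r"(midr));
    return midr;
#else
    return 0;  /* not an arm64 build: nothing to read */
#endif
}
```

With AT_HWCAP cleared, a caller of this kind never reaches the mrs, so the unhandled instruction error does not occur.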
Comment 5 Peter Maydell 2018-06-18 13:54:04 UTC
That workaround change looks wrong to me. Surely Valgrind supports at least FP and Neon?
Comment 6 Peter Maydell 2018-06-18 13:55:15 UTC
Looking back in the history, my comment #3 should have enough information for somebody who knows what instructions Valgrind implements to be able to set the hwcaps appropriately.
Comment 7 Mark Wielaard 2018-06-18 14:41:15 UTC
(In reply to Peter Maydell from comment #5)
> That workaround change looks wrong to me. Surely Valgrind supports at least
> FP and Neon ?

I think so, but nobody seems to test the AT_HWCAP bits for them.
Comment 8 Mark Wielaard 2018-06-18 14:46:02 UTC
(In reply to Peter Maydell from comment #6)
> Looking back in the history, my comment #3 should have enough information
> for somebody who knows what instructions Valgrind implements to be able to
> set the hwcaps appropriately.

Hopefully. It wasn't immediately clear to me which part of the code your comment refers to (the link goes to the middle of that file). And I am not sure what ID_AA64* precisely is, or what "ARM ARM" refers to. So I cannot find the table you refer to in order to look things up.

I am sure it is all pretty obvious and clear to someone who does know more about arm64 cpu instructions.
Comment 9 Tom Hughes 2018-06-18 14:51:33 UTC
ARM ARM is the ARM Architecture Reference Manual, aka the official documentation of the instruction set.
Comment 10 Mark Wielaard 2018-06-18 14:57:27 UTC
(In reply to Tom Hughes from comment #9)
> ARM ARM is the ARM Architecture Reference Manual, aka the official
> documentation of the instruction set.

Thanks. Is that document publicly available anywhere? The only thing I could find on the arm website was unavailable because access was restricted to registered ARM customers. Could someone post the table from that document if they have it?
Comment 11 Peter Maydell 2018-06-18 15:18:48 UTC
The A-profile Arm ARM can be downloaded from https://developer.arm.com/products/architecture/a-profile/docs without requiring a login/clickthrough/etc.
Comment 12 Peter Maydell 2018-06-18 15:26:43 UTC
Sorry about the bogus kernel-source link -- I made the mistake of linking to the 'latest' version, which of course is a moving target so the line number reference gets out of date. Here's the link to a specific kernel version:
https://elixir.bootlin.com/linux/v4.17.2/source/arch/arm64/kernel/cpufeature.c#L1221

which has an array with an entry for each hwcap, so eg
   HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_AES_SHIFT, FTR_UNSIGNED, 2, CAP_HWCAP, HWCAP_PMULL),

says that the PMULL hwcap bit is set if the AES field in the system ID_AA64ISAR0_EL1 ID register is at least 2. You can then look that register up in the Arm ARM to see that it means the PMULL hwcap should be set if we have the PMULL/PMULL2 insns (as well as the AESE/AESD/AESMC/AESIMC insns that we have for a 1 value in that ID register and which the kernel reports via the AES hwcap).
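
Read this way, each table entry is a 4-bit field extract plus a minimum-value comparison. Here is a sketch of that rule for the two AES-related bits, assuming the AES field occupies bits [7:4] of ID_AA64ISAR0_EL1 (shift 4) as documented in the Arm ARM. The function names are made up for illustration.

```c
#include <stdint.h>

#define SKETCH_HWCAP_AES    (1UL << 3)
#define SKETCH_HWCAP_PMULL  (1UL << 4)
#define SKETCH_ISAR0_AES_SHIFT 4   /* assumed: AES field is bits [7:4] */

/* Extract an unsigned 4-bit ID register field. */
static unsigned id_field(uint64_t reg, unsigned shift)
{
    return (unsigned)((reg >> shift) & 0xF);
}

/* Derive the AES and PMULL hwcap bits from an ID_AA64ISAR0_EL1 value. */
static unsigned long aes_hwcaps(uint64_t isar0)
{
    unsigned aes = id_field(isar0, SKETCH_ISAR0_AES_SHIFT);
    unsigned long caps = 0;
    if (aes >= 1)
        caps |= SKETCH_HWCAP_AES;    /* AESE/AESD/AESMC/AESIMC */
    if (aes >= 2)
        caps |= SKETCH_HWCAP_PMULL;  /* plus PMULL/PMULL2 */
    return caps;
}
```

A field value of 1 yields only the AES bit, and a value of 2 yields both AES and PMULL.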
Comment 13 Peter Maydell 2018-06-18 15:46:12 UTC
Based on a quick grep of guest_arm64_toIR.c to see what insns it has, I think Valgrind should be setting the hwcap bits AES, PMULL, SHA1, SHA2, CRC32, FP, ASIMD and making the rest zero.
Comment 14 Mark Wielaard 2018-06-19 12:11:19 UTC
Created attachment 113427 [details]
mask arm64 hwcaps

This patch should do the right thing.
It simply adds the VKI_HWCAPs to vki/vki-arm64-linux.h and masks AT_HWCAP with (VKI_HWCAP_AES | VKI_HWCAP_PMULL | VKI_HWCAP_SHA1 | VKI_HWCAP_SHA2 | VKI_HWCAP_CRC32 | VKI_HWCAP_FP | VKI_HWCAP_ASIMD).

The setup that I have doesn't have a new enough glibc to test this against, but I can see that AT_HWCAP is limited to the above set. It seems glibc checks for HWCAP_CPUID, which this machine doesn't have anyway. This should work on more modern setups.
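
The effect of the mask can be checked in isolation. This sketch uses the bit values from comment #2. The names are prefixed SKETCH_ so they are not mistaken for the actual VKI_ definitions in the patch, and mask_hwcap is a hypothetical helper.

```c
#define SKETCH_HWCAP_FP     (1UL << 0)
#define SKETCH_HWCAP_ASIMD  (1UL << 1)
#define SKETCH_HWCAP_AES    (1UL << 3)
#define SKETCH_HWCAP_PMULL  (1UL << 4)
#define SKETCH_HWCAP_SHA1   (1UL << 5)
#define SKETCH_HWCAP_SHA2   (1UL << 6)
#define SKETCH_HWCAP_CRC32  (1UL << 7)
#define SKETCH_HWCAP_CPUID  (1UL << 11)

/* The set of features the client is allowed to see. */
#define SKETCH_SUPPORTED_HWCAPS                                   \
    (SKETCH_HWCAP_AES | SKETCH_HWCAP_PMULL | SKETCH_HWCAP_SHA1 |  \
     SKETCH_HWCAP_SHA2 | SKETCH_HWCAP_CRC32 | SKETCH_HWCAP_FP |   \
     SKETCH_HWCAP_ASIMD)

/* Keep only the supported bits of the kernel-provided AT_HWCAP value. */
static unsigned long mask_hwcap(unsigned long kernel_hwcap)
{
    return kernel_hwcap & SKETCH_SUPPORTED_HWCAPS;
}
```

The supported set works out to 0xfb, so EVTSTRM, ATOMICS, CPUID and every higher bit are hidden from the client.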
Comment 15 Mark Wielaard 2018-06-21 06:27:01 UTC
commit fbbb696c5d1e93d4ac6cb548c68bb3f443ceef42
Author: Mark Wielaard <mark@klomp.org>
Date:   Tue Jun 19 18:00:45 2018 +0200

    Mask AT_HWCAPS on arm64 to those instructions VEX implements.
    
    This patch makes sure that the process running under valgrind only sees
    the AES, PMULL, SHA1, SHA2, CRC32, FP, and ASIMD features in auxv AT_HWCAPS.
    
    https://bugs.kde.org/show_bug.cgi?id=381556
Comment 16 kevinz 2021-07-30 03:46:24 UTC
Is this mask still needed now?

We are running the PMDK (https://github.com/pmem/pmdk) test suite with Valgrind on aarch64.
We find that under Valgrind some CPU features (in our case dcpop) are masked because of this patch, so the code can never take the CPU cache flushing path: https://github.com/pmem/pmdk/blob/master/src/libpmem2/aarch64/arm_cacheops.h#L61
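
The effect follows directly from the mask. Below is a sketch, assuming HWCAP_DCPOP is bit 16 as in the kernel's arm64 hwcap header (that value is not quoted in this thread). The pick_flush selector and the FlushKind names are made up and only stand in for the PMDK logic.

```c
#define SKETCH_HWCAP_DCPOP   (1UL << 16)  /* assumed value, not from this thread */
#define SKETCH_VALGRIND_MASK 0xFBUL       /* FP|ASIMD|AES|PMULL|SHA1|SHA2|CRC32 */

typedef enum { FLUSH_GENERIC, FLUSH_DCPOP } FlushKind;

/* Choose a flush strategy from the AT_HWCAP value the process sees. */
static FlushKind pick_flush(unsigned long hwcap)
{
    return (hwcap & SKETCH_HWCAP_DCPOP) ? FLUSH_DCPOP : FLUSH_GENERIC;
}
```

On hardware that reports dcpop, the unmasked value selects FLUSH_DCPOP, but the same value ANDed with the mask always selects FLUSH_GENERIC.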