Bug 470978

Summary: s390x: Valgrind cannot start qemu-kvm when "sysctl vm.allocate_pgste=0"
Product: [Developer tools] valgrind
Reporter: Andreas Arnez <arnez>
Component: general
Assignee: Andreas Arnez <arnez>
Status: RESOLVED FIXED
Severity: normal
CC: mark, tom
Priority: NOR
Version First Reported In: unspecified
Target Milestone: ---
Platform: Other
OS: Linux
Latest Commit:
Version Fixed/Implemented In:
Sentry Crash Report:
Attachments: Build with -Wl,--s390-pgste if the linker supports it

Description Andreas Arnez 2023-06-13 13:22:22 UTC
qemu-kvm needs the PGSTE mode to be enabled by the kernel. The kernel activates this mode upon exec() when it recognizes the s390-specific ELF program header PT_S390_PGSTE. Another option is to activate the mode for the whole system (with a performance penalty) with the sysctl setting `vm.allocate_pgste'.
But when the system-wide setting is not enabled and qemu-kvm is run under Valgrind, the PGSTE mode will not be enabled, leading to a failure like this:

  ioctl(KVM_CREATE_VM) failed: 22 Invalid argument
  Host kernel setup problem detected. Please verify:
  - for kernels supporting the switch_amode or user_mode parameters, whether
  user space is running in primary address space
  - for kernels supporting the vm.allocate_pgste sysctl, whether it is enabled
  qemu-kvm: failed to initialize kvm: Invalid argument
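
To see which binaries carry the marker, one can scan the ELF program headers for the PT_S390_PGSTE type. The following is only a minimal sketch, not part of the bug report or of Valgrind: the helper name has_pgste_phdr is made up for illustration, it handles ELF64 images only, and the value 0x70000000 is the PT_S390_PGSTE constant as I know it from the s390 ELF headers.

```python
import struct

# Assumed value of PT_S390_PGSTE, taken from the s390 ELF headers.
PT_S390_PGSTE = 0x70000000


def has_pgste_phdr(data: bytes) -> bool:
    """Return True if the ELF64 image in `data` has a PT_S390_PGSTE phdr.

    Hypothetical helper for illustration; 32-bit ELF files are rejected.
    """
    # Need a full ELF64 header, the ELF magic, and EI_CLASS == ELFCLASS64.
    if len(data) < 64 or data[:4] != b"\x7fELF" or data[4] != 2:
        return False
    # EI_DATA: 2 means big-endian (as on s390x), anything else little-endian.
    endian = ">" if data[5] == 2 else "<"
    (e_phoff,) = struct.unpack_from(endian + "Q", data, 0x20)
    e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x36)
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        if off + 4 > len(data):
            break  # truncated program header table
        (p_type,) = struct.unpack_from(endian + "I", data, off)
        if p_type == PT_S390_PGSTE:
            return True
    return False
```

Calling it on the bytes of the qemu-kvm binary and of a Valgrind tool executable (paths depend on the installation) should show whether each one requests PGSTE mode at exec() time.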
Comment 1 Tom Hughes 2023-06-13 15:05:34 UTC
Does the kernel provide an API to allow user space to activate this?

If it doesn't then there isn't much valgrind can do, other than provide an option to add that ELF section to the valgrind binary, which would then mean all valgrind use would have it activated.
Comment 2 Mark Wielaard 2023-06-13 15:26:09 UTC
(In reply to Tom Hughes from comment #1)
> Does the kernel provide an API to allow user space to activate this?

The only APIs are running with sysctl vm.allocate_pgste=1 systemwide, so all processes run with 4K page tables,
or adding the (empty) PT_S390_PGSTE phdr segment to the process.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=23fefe119ceb5fb0c7d3321010620010a4eddb18
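
For the system-wide option, the current setting can be read back through procfs. A minimal sketch, assuming the sysctl is exposed at the usual /proc/sys path for vm.* settings; the function name pgste_sysctl_enabled is hypothetical:

```python
def pgste_sysctl_enabled(path="/proc/sys/vm/allocate_pgste"):
    """Return True/False for the sysctl value, or None if it cannot be read.

    None covers kernels or architectures that do not provide the sysctl,
    as well as unexpected file contents.
    """
    try:
        with open(path) as f:
            return int(f.read().strip()) != 0
    except (OSError, ValueError):
        return None
```

A result of False on an s390x host means that only binaries carrying the PT_S390_PGSTE phdr get PGSTE mode.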

> If it doesn't then there isn't much valgrind can do, other than provide an
> option to add that ELF section to the valgrind binary, which would then mean
> all valgrind use would have it activated.

Yeah. Although slightly wasteful, just adding PT_S390_PGSTE to valgrind seems the simplest workaround (it just means that 4K page tables are always used).

This is the configure check that qemu used to add it:
https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg04363.html
Comment 3 Andreas Arnez 2023-06-15 16:18:44 UTC
Created attachment 159694 [details]
Build with -Wl,--s390-pgste if the linker supports it

This patch should enable building with -Wl,--s390-pgste. I've tested that the Valgrind tools are actually built with that flag on a system where the linker supports this. Note that I have *not* tested running qemu-kvm yet. Also, I'd appreciate it if someone with more autoconf knowledge could review this.
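
The idea behind such a linker check can be sketched outside autoconf: try to link a trivial program with the flag and see whether the compiler driver succeeds. This is not the code from the attachment, only an illustration; the function name linker_accepts and the default compiler name cc are assumptions.

```python
import os
import subprocess
import tempfile


def linker_accepts(flag, cc="cc"):
    """Return True if `cc` links a trivial program when given `flag`.

    Hypothetical stand-in for a configure-time link test.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.c")
        with open(src, "w") as f:
            f.write("int main(void) { return 0; }\n")
        try:
            result = subprocess.run(
                [cc, flag, src, "-o", os.path.join(tmp, "probe")],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except OSError:
            return False  # compiler driver missing or not executable
        return result.returncode == 0
```

For example, linker_accepts("-Wl,--s390-pgste") would be expected to fail on linkers that do not know the option, in which case a build should leave the flag out.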
Comment 4 Mark Wielaard 2023-06-16 11:38:38 UTC
The new configure check and tool ldflags addition look good to me.
Comment 5 Andreas Arnez 2023-06-28 14:22:23 UTC
(In reply to Mark Wielaard from comment #4)
> The new configure check and tool ldflags addition look good to me.
Thanks for checking! I pushed this now.