Bug 398883

Summary: valgrind incorrectly assumes ABI on PowerPC based on endianness
Product: valgrind Reporter: A. Wilcox (awilfox) <awilfox>
Component: generalAssignee: Julian Seward <jseward>
Status: REPORTED ---    
Severity: crash CC: cel, kde
Priority: NOR    
Version: 3.13.0   
Target Milestone: ---   
Platform: Other   
OS: Linux   
Latest Commit: Version Fixed In:
Attachments: Ensure ELFv2 is supported on PPC64

Description A. Wilcox (awilfox) 2018-09-20 18:10:28 UTC
valgrind (mainly coregrind, though there are a few misc bits in memcheck it looks like) assumes ELFv1 ABI on big endian and ELFv2 ABI on little endian.

Neither of these assumptions are correct; the musl libc (used by Void, Adélie, Alpine, Sabotage, and other Linux distros) uses ELFv2 ABI on big endian, and older releases of OpenSuSE use ELFv1 ABI on little endian.

I plan on submitting a patch, either here or on Differential, in the coming weeks.  My change plan is:


* add definition, VGP_ppc64_linux_abi1 and VGP_ppc64_linux_abi2

* abi1 is defined if (defined(VGP_ppc64be_linux) && (!defined(_CALL_ELF) || _CALL_ELF == 1) || (defined(VGP_ppc64le_linux) && defined(_CALL_ELF) && _CALL_ELF == 1)

* abi2 is defined if (defined(VGP_ppc64le_linux) && (!defined(_CALL_ELF) || _CALL_ELF == 2) || (defined(VGP_ppc64be_linux) && defined(_CALL_ELF) && _CALL_ELF == 2)

* then, in the C code, defined(VGP_ppc64le_linux) becomes defined(VGP_ppc64_linux_abi2)

* and defined(VGP_ppc64be_linux) becomes defined(VGP_ppc64_linux_abi1)
Comment 1 A. Wilcox (awilfox) 2018-09-25 21:27:04 UTC
Okay, it doesn't have to be so hard.  There is already VG_PLAT_USES_PPCTOC; so this change just needs to actually use that define instead of just endianness to determine what ABI to use.
Comment 2 A. Wilcox (awilfox) 2018-09-27 22:27:12 UTC
Created attachment 115278 [details]
Ensure ELFv2 is supported on PPC64

This patch has been tested on PPC64 BE, both ELFv1 (Debian) and ELFv2 (Adélie).  I don't have any LE hardware to test on, but there should be no functional change.

I've tested memcheck and callgrind on both ELF ABIs on a POWER9 host running big-endian and found no issues.
Comment 3 Julian Seward 2019-03-10 07:30:47 UTC
Carl, do you have any opinion on this?
Comment 4 Carl Love 2019-03-11 16:00:41 UTC
IBM only supports ELFv1 ABI on Big endian machines and ELFv2 ABI on Little endian machines.  I checked a little on the statements about the ELF usages.  From what I was told OpenSuSE uses ELFv2 ABI on little endian.  So, I would be curious where to get the info that OpenSuSE is using ELFv1 ABI on Little Endian.  Similarly, with the musl libc.  From my discussion with the GCC team, they said that musl has been known to play around with ELFv2 ABI on Big Endian but they are off on their own in this regard as it isn't supported, recommended or tested by IBM.  The key thing is the OS, library, Valgrind etc.  all need to either use ELFv1 ABI or ELFv2 ABI.  You can't mix these.

Given that IBM does not recommend using ELFv1 on Little endian and ELFv2 on Big endian, that is not to say it can't be done.  The proposed patch is not necessarily invalid but IBM will not support these but will also not stand in the way of someone else trying it but they are on their own for support.
Comment 5 A. Wilcox (awilfox) 2019-03-11 17:17:18 UTC
I noted that it was specifically older versions of OpenSuSE.  Specifically, the very first release.  I believe it is long deprecated; it still, however, does exist.

> Similarly, with the musl libc.  From my discussion with the GCC team, they said that musl has been known to play around with ELFv2 ABI on Big Endian but they are off on their own in this regard as it isn't supported, recommended or tested by IBM.

I fear "play around" may be a wee dismissive.  It's a serious port and it has serious consumers, including Adélie Linux and Void Linux.  Additionally, FreeBSD is planning to move to ELFv2 on Big Endian, which means eventually FreeBSD will need to undefine VG_PLAT_USES_PPCTOC as well. :)
Comment 6 Brandon Bergren 2019-03-11 17:32:41 UTC
While I agree that using ELFv1 on LE is a completely unsupported configuration and need not be supported, the OpenPOWER ELF V2 ABI is fully defined and intended for use in both little- and big-endian environments.

Due to the improved toolchain support (specifically, lld's dropping of ELFv1 support entirely, and ELFv2 being a lot less "weird" of an ABI meaning it tends to have less esoteric issues and can more easily leverage mainstream optimizations) I see the future of BE on POWER to be ELFv2 as well.

And yes, FreeBSD is moving towards ELFv2 at a rapid pace. (I personally am running ELFv2 userlands exclusively in FreeBSD on my POWER9 machines.)
Comment 7 Brandon Bergren 2019-05-14 23:21:29 UTC
Ahh, I think we were using different meanings for the word "supported". I was speaking in terms of "what can the hardware do, is specified in the ABI, and is currently in use in the wild", not "what you can actually get official support for".