Bug 411189

Summary: Valgrind does not support POWER9 "DARN" RNG instructions
Product: [Developer tools] valgrind
Reporter: Jack Lloyd <lloyd>
Component: general
Assignee: Julian Seward <jseward>
Status: CLOSED FIXED
Severity: normal
CC: cel, mark, noloader, will_schmidt
Priority: NOR
Version: 3.15 SVN
Target Milestone: ---
Platform: Other
OS: Linux
Latest Commit:
Version Fixed In:
Sentry Crash Report:

Description Jack Lloyd 2019-08-22 20:26:22 UTC
SUMMARY

Valgrind does not currently support the POWER9 DARN RNG instructions. These are similar to x86 RDRAND/RDSEED.

STEPS TO REPRODUCE

$ cat min_darn.c
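#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
  uint64_t darn = __builtin_darn();
  /* PRIX64 matches uint64_t, which is unsigned long on 64-bit Linux */
  printf("%016" PRIX64 "\n", darn);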
  return 0;
}
$ powerpc64le-unknown-linux-gnu-gcc -mcpu=power9 -O min_darn.c -o min_darn
$ ./min_darn
51302492414386A8 # this will differ each time the program is run
$ valgrind ./min_darn

OBSERVED RESULT

==129755== Memcheck, a memory error detector
==129755== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==129755== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==129755== Command: ./min_darn
==129755== 
disInstr(ppc): unhandled instruction: 0x7C8105E6
                 primary 31(0x1F), secondary 1510(0x5E6)
==129755== valgrind: Unrecognised instruction at address 0x100005d0.
==129755==    at 0x100005D0: main (in /home/lloyd/min_darn)
==129755== Your program just tried to execute an instruction that Valgrind
==129755== did not recognise.  There are two possible reasons for this.
==129755== 1. Your program has a bug and erroneously jumped to a non-code
==129755==    location.  If you are running Memcheck and you just saw a
==129755==    warning about a bad jump, it's probably your program's fault.
==129755== 2. The instruction is legitimate but Valgrind doesn't handle it,
==129755==    i.e. it's Valgrind's fault.  If you think this is the case or
==129755==    you are not sure, please let us know and we'll try to fix it.
==129755== Either way, Valgrind will now raise a SIGILL signal which will
==129755== probably kill your program.
==129755== 
==129755== Process terminating with default action of signal 4 (SIGILL)
==129755==  Illegal opcode at address 0x100005D0
==129755==    at 0x100005D0: main (in /home/lloyd/min_darn)
==129755== 
==129755== HEAP SUMMARY:
==129755==     in use at exit: 0 bytes in 0 blocks
==129755==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==129755== 
==129755== All heap blocks were freed -- no leaks are possible
==129755== 
==129755== For lists of detected and suppressed errors, rerun with: -s
==129755== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Illegal instruction


EXPECTED RESULT

It should run ;)

SOFTWARE/OS VERSIONS
Valgrind 3.15.0 release
GCC 9.2.0 release
POWER9 running CentOS 7, kernel 4.14.0

ADDITIONAL INFORMATION
A POWER9 machine is available on the GCC compile farm (gcc135).
Comment 1 Jeffrey Walton 2020-05-15 19:06:34 UTC
Adding a "me too". Crypto++ is dying during analysis.

The issue should also affect OpenSSL and GnuTLS. Maybe even GnuPG.

=====

The Crypto++ use case is a bit different. Compiler support for DARN is lacking among GCC, Clang and XLC, so Crypto++ just emits the raw bytes of the instruction (https://github.com/weidai11/cryptopp/blob/master/darn.cpp#L43):

    do
    {
        __asm__ __volatile__ (
            #if (CRYPTOPP_BIG_ENDIAN)
            ".byte 0x7c, 0x60, 0x05, 0xe6  \n\t"  // r3 = darn 3, 0
            "mr %0, 3                      \n\t"  // val = r3
            #else
            ".byte 0xe6, 0x05, 0x60, 0x7c  \n\t"  // r3 = darn 3, 0
            "mr %0, 3                      \n\t"  // val = r3
            #endif
            : "=r" (*ptr) : : "r3"
        );
    } while (*ptr == 0xFFFFFFFFu);
Comment 2 Jeffrey Walton 2020-05-15 19:20:42 UTC
Here is OpenSSL's use of the DARN generator. OpenSSL is also emitting the raw instruction word (https://github.com/openssl/openssl/blob/master/crypto/perlasm/ppc-xlate.pl#L279):

# PowerISA 3.0 stuff
my $maddhdu	= sub { vfour(@_,49); };
my $maddld	= sub { vfour(@_,51); };
my $darn = sub {
    my ($f, $rt, $l) = @_;
    "	.long	".sprintf "0x%X",(31<<26)|($rt<<21)|($l<<16)|(755<<1);
};
Comment 3 Mark Wielaard 2021-02-23 15:26:15 UTC
This doesn't solve this bug: DARN is still not implemented. But valgrind will no longer advertise DARN as available, so programs that check HWCAP2 should not use it when running under valgrind.

commit ea98cccb4d50a8740708507c4c72cfb1e6c88e38
Author: Mark Wielaard <mark@klomp.org>
Date:   Tue Feb 23 16:19:26 2021 +0100

    Filter out unsupported instructions from HWCAP2 on powerpc.
    
    Valgrind currently doesn't support the DARN random number instruction
    and the SCV syscall instruction. Filter them out of HWCAP2 so glibc
    and applications don't try to use them when running under valgrind.
    
    Also suppress printing a log message for scv instructions in the
    instruction stream.
    
    Reported by: Florian Weimer <fweimer@redhat.com>
    
    DARN bug: https://bugs.kde.org/show_bug.cgi?id=411189
    SCV bug: https://bugs.kde.org/show_bug.cgi?id=431157
Comment 4 Carl Love 2021-09-16 15:14:54 UTC
Darn instruction support committed.

commit 8afb49abe04a341d60b441c1f09a956aeccf0bbb
Author: Carl Love <cel@us.ibm.com>
Date:   Mon Mar 22 17:55:05 2021 -0500

    PPC64: Add support for the darn instruction
Comment 5 Carl Love 2021-10-01 21:28:09 UTC
Tested the test program from the description on a POWER9 box with the latest Valgrind source tree.

 valgrind ./min_darn
==2840566== Memcheck, a memory error detector
==2840566== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==2840566== Using Valgrind-3.18.0.GIT and LibVEX; rerun with -h for copyright info
==2840566== Command: ./min_darn
==2840566== 
13F736A2E6879909
==2840566== 
==2840566== HEAP SUMMARY:
==2840566==     in use at exit: 0 bytes in 0 blocks
==2840566==   total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==2840566== 
==2840566== All heap blocks were freed -- no leaks are possible
==2840566== 
==2840566== For lists of detected and suppressed errors, rerun with: -s
==2840566== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)


The test program seems to work fine.
Comment 6 Carl Love 2021-10-01 21:29:35 UTC
Jack Lloyd, please verify the fix works for you using the current upstream Valgrind repository.  If the issue is fixed please close this issue.  Otherwise let me know and I will look at it again.  Thanks.
Comment 7 Carl Love 2021-10-14 16:41:27 UTC
No response from Jack.  The bug has been fixed.  Closing.