Bug 402519

Summary: POWER 3.0 addex instruction incorrectly implemented
Product: [Developer tools] valgrind Reporter: Julian Seward <jseward>
Component: vexAssignee: Mark Wielaard <mark>
Status: RESOLVED FIXED    
Severity: normal CC: cel, mark
Priority: NOR    
Version First Reported In: 3.15 SVN   
Target Milestone: ---   
Platform: Other   
OS: Linux   
Latest Commit: Version Fixed In:
Sentry Crash Report:
Attachments: Fix addex instruction implementation

Description Julian Seward 2018-12-24 09:41:49 UTC
SUMMARY

(from private email)

instruction addex [..] is fundamentally broken and does not generate and/or consume OV correctly.  Tested with 3.13 and 3.14.

STEPS TO REPRODUCE
1. 
2. 
3. 

OBSERVED RESULT


EXPECTED RESULT


SOFTWARE/OS VERSIONS
Windows: 
MacOS: 
Linux/KDE Plasma: 
(available in About System)
KDE Plasma Version: 
KDE Frameworks Version: 
Qt Version: 

ADDITIONAL INFORMATION
Comment 1 Mark Wielaard 2018-12-31 21:04:48 UTC
Some more details from private email(s).

This was originally reported as an issue with GMP on power9.

Here is some example code:

File t.c:

#include <stdio.h>
unsigned long f();
int
main ()
{
  unsigned long foo = f();
  for (int i = 56; i >= 0; i -= 8)
    printf ("%c", (char) (foo >> i));
  printf ("\n");
  return 0;
}

File f.s:

        .globl  f
        .type   f, @function
f:      li      5, 0x6E63
        addis   5, 5, 0x7269
        li      6, 0x6C67
        addis   6, 6, 0x5661
        insrdi  5, 6, 32, 0

        li      3, -1
        subfo   0, 0, 0         # OV <- 0
        addex   4, 3, 3, 0      # OV <- 1
        addex   3, 3, 3, 0      # r3 <- 0xff..ff + 0xff..ff + 1 = 0xff..fe = -2

        sub     3, 5, 3
        blr

$ gcc -std=c99 t.c f.s
$ valgrind ./a.out

Correct output is "Valgrind".
Output under Valgrind: "Valgrine".

There is actually a testcase for addex in valgrind
none/tests/ppc64/test_isa_3_0.c. But it is commented out.
And indeed, it fails when enabled.

The issue seems to be that the XER OV flag is set wrongly.
The code in calculate_XER_OV_64 () does:

       /* (argL^argR^-1) & (argL^res) & (1<<63)  ? 1:0 */
       // i.e. ((both_same_sign) & (sign_changed) & (sign_mask))

Which checks for two's complements overflow.

But addex is akin to adde except that addex inputs carry from OV and outputs
carry to OV, while adde of course uses CA.

The meaning of OV for other instructions is signed overflow.
Comment 2 Mark Wielaard 2018-12-31 21:45:43 UTC
Created attachment 117217 [details]
Fix addex instruction implementation

https://code.wildebeest.org/git/user/mjw/valgrind/commit/?h=power9-addex
Comment 3 Mark Wielaard 2019-01-04 12:39:18 UTC
Carl reviewed and double checked API test results.
Pushed to master.

commit 2c1f016e634bf79faf45e81c14c955c711bc202f
Author: Mark Wielaard <mark@klomp.org>
Date:   Mon Dec 31 22:26:31 2018 +0100

    Bug 402519 - POWER 3.0 addex instruction incorrectly implemented
    
    addex uses OV as carry in and carry out. For all other instructions
    OV is the signed overflow flag. And instructions like adde use CA
    as carry.
    
    Replace set_XER_OV_OV32 with set_XER_OV_OV32_ADDEX, which will
    call calculate_XER_CA_64 and calculate_XER_CA_32, but with OV
    as input, and sets OV and OV32.
    
    Enable test_addex in none/tests/ppc64/test_isa_3_0.c and update
    the expected output. test_addex would fail to match the expected
    output before this patch.