Bug 402519

Summary:	POWER 3.0 addex instruction incorrectly implemented
Product:	[Developer tools] valgrind	Reporter:	Julian Seward <jseward>
Component:	vex	Assignee:	Mark Wielaard <mark>
Status:	RESOLVED FIXED
Severity:	normal	CC:	cel, mark
Priority:	NOR
Version First Reported In:	3.15 SVN
Target Milestone:	---
Platform:	Other
OS:	Linux
Latest Commit:		Version Fixed/Implemented In:
Sentry Crash Report:
Attachments:	Fix addex instruction implementation

Description Julian Seward 2018-12-24 09:41:49 UTC

SUMMARY

(from private email)

instruction addex [..] is fundamentally broken and does not generate and/or consume OV correctly.  Tested with 3.13 and 3.14.

STEPS TO REPRODUCE
1. 
2. 
3. 

OBSERVED RESULT


EXPECTED RESULT


SOFTWARE/OS VERSIONS
Windows: 
MacOS: 
Linux/KDE Plasma: 
(available in About System)
KDE Plasma Version: 
KDE Frameworks Version: 
Qt Version: 

ADDITIONAL INFORMATION

Comment 1 Mark Wielaard 2018-12-31 21:04:48 UTC

Some more details from private email(s).

This was originally reported as an issue with GMP on power9.

Here is some example code:

File t.c:

#include <stdio.h>
unsigned long f();
int
main ()
{
  unsigned long foo = f();
  for (int i = 56; i >= 0; i -= 8)
    printf ("%c", (char) (foo >> i));
  printf ("\n");
  return 0;
}

File f.s:

        .globl  f
        .type   f, @function
f:      li      5, 0x6E63
        addis   5, 5, 0x7269
        li      6, 0x6C67
        addis   6, 6, 0x5661
        insrdi  5, 6, 32, 0

        li      3, -1
        subfo   0, 0, 0         # OV <- 0
        addex   4, 3, 3, 0      # OV <- 1
        addex   3, 3, 3, 0      # r3 <- 0xff..ff + 0xff..ff + 1 = 0xff..fe = -2

        sub     3, 5, 3
        blr

$ gcc -std=c99 t.c f.s
$ valgrind ./a.out

Correct output is "Valgrind".
Output under Valgrind: "Valgrine".

There is actually a testcase for addex in valgrind
none/tests/ppc64/test_isa_3_0.c. But it is commented out.
And indeed, it fails when enabled.

The issue seems to be that the XER OV flag is set wrongly.
The code in calculate_XER_OV_64 () does:

       /* (argL^argR^-1) & (argL^res) & (1<<63)  ? 1:0 */
       // i.e. ((both_same_sign) & (sign_changed) & (sign_mask))

Which checks for two's complements overflow.

But addex is akin to adde except that addex inputs carry from OV and outputs
carry to OV, while adde of course uses CA.

The meaning of OV for other instructions is signed overflow.

Comment 2 Mark Wielaard 2018-12-31 21:45:43 UTC

Created attachment 117217 [details]
Fix addex instruction implementation

https://code.wildebeest.org/git/user/mjw/valgrind/commit/?h=power9-addex

Comment 3 Mark Wielaard 2019-01-04 12:39:18 UTC

Carl reviewed and double checked API test results.
Pushed to master.

commit 2c1f016e634bf79faf45e81c14c955c711bc202f
Author: Mark Wielaard <mark@klomp.org>
Date:   Mon Dec 31 22:26:31 2018 +0100

    Bug 402519 - POWER 3.0 addex instruction incorrectly implemented
    
    addex uses OV as carry in and carry out. For all other instructions
    OV is the signed overflow flag. And instructions like adde use CA
    as carry.
    
    Replace set_XER_OV_OV32 with set_XER_OV_OV32_ADDEX, which will
    call calculate_XER_CA_64 and calculate_XER_CA_32, but with OV
    as input, and sets OV and OV32.
    
    Enable test_addex in none/tests/ppc64/test_isa_3_0.c and update
    the expected output. test_addex would fail to match the expected
    output before this patch.