Bug 433801

Summary:	PPC ISA 3.1 support is missing, part 10
Product:	[Developer tools] valgrind	Reporter:	Carl Love <cel>
Component:	sgcheck	Assignee:	Julian Seward <jseward>
Status:	CLOSED FIXED
Severity:	normal	CC:	will_schmidt
Priority:	NOR
Version:	unspecified
Target Milestone:	---
Platform:	Other
OS:	Linux
Latest Commit:		Version Fixed In:
Attachments:	Reduced-Precision - bfloat16 Outer Product & Format Conversion Operations
	functional tests for Reduced-Precision - bfloat16 Outer Product & Format Conversion Operations
	PPC64: Reduced-Precision: Missing Integer-based Outer Product Operations
	functional tests for Reduced-Precision: Missing Integer-based Outer Product Operations
	functional support ISA 3.1 for reduced precision outer product operations
	functional support ISA 3.1 for reduced precision outer product operations
	functional support for ISA 3.1 Reduced precision missing integer-based outer product operations

Description Carl Love 2021-03-01 17:33:52 UTC

This bugzilla is for the final set of patches to add support for the Power ISA 3.1 instructions. 

The previous set of patches is in bugzilla 429375.

Comment 1 Carl Love 2021-03-01 17:39:31 UTC

Created attachment 136291 [details]
Reduced-Precision - bfloat16 Outer Product & Format Conversion Operations

Add support for:

pmxvbf16ger2 Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update)
pmxvbf16ger2pp Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Positive
  multiply, Positive accumulate
pmxvbf16ger2pn Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Positive
  multiply, Negative accumulate
pmxvbf16ger2np Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Negative
  multiply, Positive accumulate
pmxvbf16ger2nn Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Negative
  multiply, Negative accumulate
xvbf16ger2 VSX Vector bfloat16 GER (Rank-2 Update)
xvbf16ger2pp VSX Vector bfloat16 GER (Rank-2 Update) Positive multiply, Positive
  accumulate
xvbf16ger2pn VSX Vector bfloat16 GER (Rank-2 Update) Positive multiply, Negative
  accumulate
xvbf16ger2np VSX Vector bfloat16 GER (Rank-2 Update) Negative multiply, Positive
  accumulate
xvbf16ger2nn VSX Vector bfloat16 GER (Rank-2 Update) Negative multiply, Negative
  accumulate
xvcvbf16sp VSX Vector Convert bfloat16 to Single-Precision format
xvcvspbf16 VSX Vector Convert with round Single-Precision to bfloat16 format
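
For readers unfamiliar with the narrowing direction (xvcvspbf16), here is a minimal sketch of a single-precision to bfloat16 conversion. It is not the code from the attached patch. The function name f32_to_bf16_rne is hypothetical, and the sketch assumes round-to-nearest-even, whereas the instruction itself rounds according to the mode in effect.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not taken from the patch.
   Narrow an IEEE-754 binary32 value to bfloat16 (the upper 16 bits),
   assuming round-to-nearest-even. */
static uint16_t f32_to_bf16_rne(float in)
{
    uint32_t bits;
    memcpy(&bits, &in, sizeof bits);   /* reinterpret without aliasing issues */

    /* NaN: exponent all ones and a nonzero fraction. Set a fraction bit that
       survives truncation so the result cannot collapse to infinity. */
    if ((bits & 0x7F800000u) == 0x7F800000u && (bits & 0x007FFFFFu) != 0) {
        return (uint16_t)((bits >> 16) | 0x0040u);
    }

    /* Add 0x7FFF plus the lowest kept bit: ties go to the even result. */
    uint32_t lsb = (bits >> 16) & 1u;
    bits += 0x7FFFu + lsb;
    return (uint16_t)(bits >> 16);
}
```

The discarded 16 fraction bits decide the rounding. A value exactly halfway between two bfloat16 neighbours goes to the one whose lowest fraction bit is zero.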

Comment 2 Carl Love 2021-03-01 17:40:25 UTC

Created attachment 136292 [details]
functional tests for Reduced-Precision - bfloat16 Outer Product & Format Conversion Operations

Functional tests for the Reduced-Precision - bfloat16 Outer Product &
Format Conversion Operations instructions.

Comment 3 Carl Love 2021-03-01 17:41:14 UTC

Created attachment 136293 [details]
PPC64: Reduced-Precision: Missing Integer-based Outer Product Operations

Add support for:

pmxvi16ger2 VSX Vector 16-bit Signed Integer GER (rank-2 update), Prefixed
   Masked
pmxvi16ger2pp VSX Vector 16-bit Signed Integer GER (rank-2 update) (Positive
   multiply, Positive accumulate), Prefixed Masked
pmxvi8ger4spp VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) with
   Saturation (Positive multiply, Positive accumulate), Prefixed Masked
xvi16ger2 VSX Vector 16-bit Signed Integer GER (rank-2 update)
xvi16ger2pp VSX Vector 16-bit Signed Integer GER (rank-2 update) (Positive
   multiply, Positive accumulate)
xvi8ger4spp VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) with
   Saturation (Positive multiply, Positive accumulate)
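
As a rough illustration of what a "rank-2 update" means for the 16-bit integer forms listed above, here is a scalar sketch. It is not Valgrind code. The types vec_i16x8 and acc_4x4 and the function i16_ger2 are hypothetical names. The sketch assumes each source vector holds four words of two signed 16-bit elements and that the non-saturating forms wrap modulo 2^32. It ignores the prefixed mask fields.

```c
#include <stdint.h>

/* Hypothetical model types, not taken from the patch. */
typedef struct { int16_t hw[4][2]; } vec_i16x8;  /* 4 words x 2 halfwords */
typedef struct { int32_t w[4][4]; } acc_4x4;     /* 4x4 accumulator */

/* Rank-2 update: cell (i,j) receives the sum of two 16-bit products.
   accumulate != 0 models the "pp" form (add to the existing cell);
   accumulate == 0 overwrites the cell. */
static void i16_ger2(acc_4x4 *acc, const vec_i16x8 *x,
                     const vec_i16x8 *y, int accumulate)
{
    for (int i = 0; i < 4; i++) {
        for (int j = 0; j < 4; j++) {
            /* 64-bit intermediate avoids signed overflow in C. */
            int64_t sum = (int64_t)x->hw[i][0] * y->hw[j][0]
                        + (int64_t)x->hw[i][1] * y->hw[j][1];
            uint32_t base = accumulate ? (uint32_t)acc->w[i][j] : 0u;
            /* Assumed wraparound behaviour for the non-saturating forms. */
            acc->w[i][j] = (int32_t)(base + (uint32_t)sum);
        }
    }
}
```

The rank-4 8-bit forms follow the same pattern with four products per cell instead of two.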

Comment 4 Carl Love 2021-03-01 17:42:02 UTC

Created attachment 136294 [details]
functional tests for Reduced-Precision: Missing Integer-based Outer Product Operations

Functional tests for the Reduced-Precision: Missing Integer-based Outer
Product Operations instructions.

Comment 5 Julian Seward 2021-03-29 10:42:40 UTC

(In reply to Carl Love from comment #1)
> Created attachment 136291 [details]
> Reduced-Precision - bfloat16 Outer Product & Format Conversion Operations

+static Float conv_bf16_to_float( UInt input )
+{
..
+     output is 64-bit float.
+     bias +127, exponent 8-bits, fraction 22-bits

Is this comment correct?  1 sign bit + 8 exponent bits + 22 mantissa bits
looks much more like a 32-bit float than a 64-bit float.
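
For reference, a minimal sketch of the widening conversion under discussion. It is not the patch's conv_bf16_to_float, and the name bf16_to_f32 is hypothetical. bfloat16 has 1 sign bit, 8 exponent bits (bias 127) and 7 fraction bits, which is the upper half of a 32-bit IEEE-754 float (1 + 8 + 23 bits), so widening is an exact 16-bit left shift.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not taken from the patch.
   Widen bfloat16 to binary32: the 16 input bits become the high half,
   and the 16 extra fraction bits are zero-filled. */
static float bf16_to_f32(uint16_t bf16)
{
    uint32_t bits = (uint32_t)bf16 << 16;
    float out;
    memcpy(&out, &bits, sizeof out);   /* reinterpret the bit pattern */
    return out;
}
```

Because the exponent width and bias match, no rebiasing or rounding is needed in this direction.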

--

Is there an inconsistency in naming these functions?  It appears that in
some places, a 32-bit float is called `_float`  in the name, but in others
it is called `_f32`.  Eg.

+static Float conv_bf16_to_float( UInt input )
vs
+static UInt conv_f32_to_bf16( UInt input )

Can you either fix the inconsistencies (if they exist) and/or also add a
comment at the top explaining the naming?

---

+ULong convert_from_f32tobf16_helper( ULong src ) {

In this file, either mark functions as 'static' or add a comment saying they
are called from generated code.

Comment 6 Julian Seward 2021-03-29 10:45:19 UTC

(In reply to Carl Love from comment #3)
> Created attachment 136293 [details]
> PPC64: Reduced-Precision: Missing Integer-based Outer Product Operations

-static UInt exts8( UInt src)
+static ULong exts8( UInt src)

-static UInt extz8( UInt src)
+static ULong extz8( UInt src)

Mark these as 'static'.

Otherwise, OK to land.
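
For context on the two helpers quoted above, here is a minimal sketch of 8-bit sign and zero extension to 64 bits. The bodies are assumptions, not the patch's exts8 and extz8, and the names sign_extend8 and zero_extend8 are hypothetical.

```c
#include <stdint.h>

/* Hypothetical stand-ins, not taken from the patch. */

/* Sign-extend the low 8 bits of src to 64 bits. XOR with the sign bit and
   subtract it again, which avoids implementation-defined narrowing casts. */
static uint64_t sign_extend8(uint32_t src)
{
    uint64_t v = src & 0xFFu;
    return (v ^ 0x80u) - 0x80u;
}

/* Zero-extend the low 8 bits of src to 64 bits. */
static uint64_t zero_extend8(uint32_t src)
{
    return (uint64_t)(src & 0xFFu);
}
```

Returning a 64-bit value, as the patch's change from UInt to ULong does, means callers can accumulate products without widening at each call site.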

Comment 7 Julian Seward 2021-03-29 10:47:40 UTC

(In reply to Carl Love from comment #2)
> Created attachment 136292 [details]
> functional tests for Reduced-Precision - bfloat16 Outer Product & Format
> Conversion Operations

OK to land.  Please make sure though that all the new files get included
in the tarball.

Comment 8 Julian Seward 2021-03-29 10:48:23 UTC

(In reply to Carl Love from comment #4)
> Created attachment 136294 [details]
> functional tests for Reduced-Precision: Missing Integer-based Outer
> Product Operations

OK to land.  Again, please ensure any new files end up in the tarball.

Comment 9 Carl Love 2021-03-31 16:26:14 UTC

Created attachment 137201 [details]
functional support ISA 3.1 for reduced precision outer product operations

Updating patch with requested changes

Comment 10 Carl Love 2021-03-31 16:48:56 UTC

Created attachment 137202 [details]
functional support ISA 3.1 for reduced precision outer product operations

Uploaded the wrong file last time.

Comment 11 Carl Love 2021-03-31 16:53:16 UTC

Created attachment 137203 [details]
functional support for ISA 3.1 Reduced precision missing integer-based outer product operations

Update the functional support for the missing integer-based outer product operations

Comment 12 Carl Love 2021-03-31 16:57:50 UTC

Patches committed

commit c589b652939655090c005a982a71f50c489fb5ce (HEAD -> master, origin/master, origin/HEAD)
Author: root <root@ltcden3-lp13.aus.stglabs.ibm.com>
Date:   Fri Feb 12 16:00:53 2021 -0500

    Reduced precision Missing Integer based outer tests

commit e09fdaf569b975717465ed8043820d0198d4d47d
Author: Carl Love <cel@us.ibm.com>
Date:   Fri Feb 26 16:05:12 2021 -0600

    PPC64: Reduced-Precision: Missing Integer-based Outer Product Operations
    
    Add support for:
    
    pmxvi16ger2 VSX Vector 16-bit Signed Integer GER (rank-2 update), Prefixed
       Masked
    pmxvi16ger2pp VSX Vector 16-bit Signed Integer GER (rank-2 update) (Positive
       multiply, Positive accumulate), Prefixed Masked
    pmxvi8ger4spp VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) with
       Saturation (Positive multiply, Positive accumulate), Prefixed Masked
    xvi16ger2 VSX Vector 16-bit Signed Integer GER (rank-2 update)
    xvi16ger2pp VSX Vector 16-bit Signed Integer GER (rank-2 update) (Positive
       multiply, Positive accumulate)
    xvi8ger4spp VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) with
       Saturation (Positive multiply, Positive accumulate)

commit c8fa838be405d7ac43035dcf675bf490800c26ec
Author: root <root@ltcden3-lp13.aus.stglabs.ibm.com>
Date:   Fri Feb 12 15:59:32 2021 -0500

    Reduced Precision bfloat16 outer product tests

commit 078f89e99b6f62e043f6138c6a7ae238befc1f2a
Author: Carl Love <cel@us.ibm.com>
Date:   Fri Feb 26 15:46:55 2021 -0600

    PPC64: Reduced-Precision - bfloat16 Outer Product & Format Conversion Operations
    
    Add support for:
    
    pmxvbf16ger2 Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update)
    pmxvbf16ger2pp Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Positive
      multiply, Positive accumulate
    pmxvbf16ger2pn Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Positive
      multiply, Negative accumulate
    pmxvbf16ger2np Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Negative
      multiply, Positive accumulate
    pmxvbf16ger2nn Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Negative
      multiply, Negative accumulate
    xvbf16ger2VSX Vector bfloat16 GER (Rank-2 Update)
    xvbf16ger2pp VSX Vector bfloat16 GER (Rank-2 Update) Positive multiply, Positive
      accumulate
    xvbf16ger2pn VSX Vector bfloat16 GER (Rank-2 Update) Positive multiply, Negative
      accumulate
    xvbf16ger2np VSX Vector bfloat16 GER (Rank-2 Update) Negative multiply, Positive
      accumulate
    xvbf16ger2nn VSX Vector bfloat16 GER (Rank-2 Update) Negative multiply, Negative
      accumulate
    xvcvbf16sp VSX Vector Convert bfloat16 to Single-Precision format
    xvcvspbf16 VSX Vector Convert with round Single-Precision to bfloat16 format

closing bug

Comment 13 Carl Love 2021-03-31 16:58:27 UTC

Closing the bugzilla

ISA 3.1 support is now complete