Bug 433801 - PPC ISA 3.1 support is missing, part 10
Summary: PPC ISA 3.1 support is missing, part 10
Status: CLOSED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: sgcheck
Version: unspecified
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-03-01 17:33 UTC by Carl Love
Modified: 2021-03-31 16:58 UTC (History)

See Also:
Latest Commit:
Version Fixed In:


Attachments
Reduced-Precision - bfloat16 Outer Product & Format Conversion Operations (21.74 KB, patch)
2021-03-01 17:39 UTC, Carl Love
functional tests for Reduced-Precision - bfloat16 Outer Product & Format Conversion Operations (175.15 KB, patch)
2021-03-01 17:40 UTC, Carl Love
PPC64: Reduced-Precision: Missing Integer-based Outer Product Operations (8.41 KB, patch)
2021-03-01 17:41 UTC, Carl Love
functional tests for Reduced-Precision: Missing Integer-based Outer Product Operations (78.74 KB, patch)
2021-03-01 17:42 UTC, Carl Love
functional support ISA 3.1 for reduced precision outer product operations (79.41 KB, patch)
2021-03-31 16:26 UTC, Carl Love
functional support ISA 3.1 for reduced precision outer product operations (21.92 KB, patch)
2021-03-31 16:48 UTC, Carl Love
functional support for ISA 3.1 Reduced precision missing integer-based outer product operations (8.41 KB, patch)
2021-03-31 16:53 UTC, Carl Love

Description Carl Love 2021-03-01 17:33:52 UTC
This bugzilla is for the final set of patches to add support for the Power ISA 3.1 instructions. 

The previous set of patches is in bugzilla 429375.
Comment 1 Carl Love 2021-03-01 17:39:31 UTC
Created attachment 136291 [details]
Reduced-Precision - bfloat16 Outer Product & Format Conversion Operations

Add support for:

pmxvbf16ger2 Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update)
pmxvbf16ger2pp Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Positive
  multiply, Positive accumulate
pmxvbf16ger2pn Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Positive
  multiply, Negative accumulate
pmxvbf16ger2np Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Negative
  multiply, Positive accumulate
pmxvbf16ger2nn Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Negative
  multiply, Negative accumulate
xvbf16ger2 VSX Vector bfloat16 GER (Rank-2 Update)
xvbf16ger2pp VSX Vector bfloat16 GER (Rank-2 Update) Positive multiply, Positive
  accumulate
xvbf16ger2pn VSX Vector bfloat16 GER (Rank-2 Update) Positive multiply, Negative
  accumulate
xvbf16ger2np VSX Vector bfloat16 GER (Rank-2 Update) Negative multiply, Positive
  accumulate
xvbf16ger2nn VSX Vector bfloat16 GER (Rank-2 Update) Negative multiply, Negative
  accumulate
xvcvbf16sp VSX Vector Convert bfloat16 to Single-Precision format
xvcvspbf16 VSX Vector Convert with round Single-Precision to bfloat16 format
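
For readers unfamiliar with the GER (outer product) naming, the following is a minimal C sketch of what a rank-2 bfloat16 update computes, not the code in the attached patch. It assumes each source vector holds eight bfloat16 elements paired as (0,1), (2,3), ..., that the accumulator is a 4x4 array of 32-bit floats, and that no prefix masks are applied. All names (bf16_to_f32, bf16_ger2_sketch, enum ger_mode) are hypothetical, and rounding and exception behaviour of the real instructions are ignored.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen a bfloat16 bit pattern to a 32-bit float. */
static float bf16_to_f32(uint16_t h)
{
    uint32_t bits = (uint32_t)h << 16;   /* bf16 is the top half of binary32 */
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical names for the plain form and the pp/pn/np/nn suffixes. */
enum ger_mode { GER_PLAIN, GER_PP, GER_PN, GER_NP, GER_NN };

/* Assumed semantics: prod = a[2i]*b[2j] + a[2i+1]*b[2j+1] for each of the
   4x4 accumulator elements; the suffix selects the sign of the product
   (first letter) and of the old accumulator value (second letter). */
static void bf16_ger2_sketch(const uint16_t a[8], const uint16_t b[8],
                             float acc[4][4], enum ger_mode mode)
{
    for (int i = 0; i < 4; i++) {
        for (int j = 0; j < 4; j++) {
            float prod = bf16_to_f32(a[2 * i])     * bf16_to_f32(b[2 * j])
                       + bf16_to_f32(a[2 * i + 1]) * bf16_to_f32(b[2 * j + 1]);
            switch (mode) {
            case GER_PLAIN: acc[i][j] =  prod;             break;
            case GER_PP:    acc[i][j] =  prod + acc[i][j]; break;
            case GER_PN:    acc[i][j] =  prod - acc[i][j]; break;
            case GER_NP:    acc[i][j] = -prod + acc[i][j]; break;
            case GER_NN:    acc[i][j] = -prod - acc[i][j]; break;
            }
        }
    }
}
```

The prefixed masked (pmxv...) forms would additionally skip rows, columns and products according to their mask fields, which this sketch leaves out.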
Comment 2 Carl Love 2021-03-01 17:40:25 UTC
Created attachment 136292 [details]
functional tests for Reduced-Precision - bfloat16 Outer Product & Format Conversion Operations

Functional tests for the Reduced-Precision - bfloat16 Outer Product & Format Conversion Operations instructions.
Comment 3 Carl Love 2021-03-01 17:41:14 UTC
Created attachment 136293 [details]
PPC64: Reduced-Precision: Missing Integer-based Outer Product Operations

Add support for:

pmxvi16ger2 VSX Vector 16-bit Signed Integer GER (rank-2 update), Prefixed
   Masked
pmxvi16ger2pp VSX Vector 16-bit Signed Integer GER (rank-2 update) (Positive
   multiply, Positive accumulate), Prefixed Masked
pmxvi8ger4spp VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) with
   Saturation (Positive multiply, Positive accumulate), Prefixed Masked
xvi16ger2 VSX Vector 16-bit Signed Integer GER (rank-2 update)
xvi16ger2pp VSX Vector 16-bit Signed Integer GER (rank-2 update) (Positive
   multiply, Positive accumulate)
xvi8ger4spp VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) with
   Saturation (Positive multiply, Positive accumulate)
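
As an illustration of the 16-bit integer rank-2 update, here is a rough C sketch, not the patch's implementation. It assumes eight signed 16-bit elements per source vector, a 4x4 accumulator of 32-bit integers, no prefix masks, and wrap-around (non-saturating) results for these two forms. The function name i16_ger2_sketch and its accumulate flag are hypothetical.

```c
#include <stdint.h>

/* Assumed semantics: acc[i][j] = a[2i]*b[2j] + a[2i+1]*b[2j+1], optionally
   added to the previous accumulator value (the "pp" form). */
static void i16_ger2_sketch(const int16_t a[8], const int16_t b[8],
                            int32_t acc[4][4], int accumulate)
{
    for (int i = 0; i < 4; i++) {
        for (int j = 0; j < 4; j++) {
            /* Do the arithmetic in 64 bits so nothing overflows here. */
            int64_t sum = (int64_t)a[2 * i]     * b[2 * j]
                        + (int64_t)a[2 * i + 1] * b[2 * j + 1];
            if (accumulate)
                sum += acc[i][j];
            /* Keep the low 32 bits; the final cast is implementation-defined
               in C but wraps on the usual two's-complement targets. */
            acc[i][j] = (int32_t)(uint32_t)sum;
        }
    }
}
```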
Comment 4 Carl Love 2021-03-01 17:42:02 UTC
Created attachment 136294 [details]
functional tests for Reduced-Precision: Missing Integer-based Outer Product Operations

Functional tests for the Reduced-Precision: Missing Integer-based Outer Product Operations instructions.
Comment 5 Julian Seward 2021-03-29 10:42:40 UTC
(In reply to Carl Love from comment #1)
> Created attachment 136291 [details]
> Reduced-Precision - bfloat16 Outer Product &   Format Conversion Operations

+static Float conv_bf16_to_float( UInt input )
+{
..
+     output is 64-bit float.
+     bias +127, exponent 8-bits, fraction 22-bits

Is this comment correct?  1 sign bit + 8 exponent bits + 22 mantissa bits
looks much more like a 32-bit float than a 64-bit float.
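
For context on the layout in question: bfloat16 has 1 sign bit, 8 exponent bits (bias 127) and 7 fraction bits, which is exactly the upper half of an IEEE-754 32-bit float (1 + 8 + 23 bits). A minimal sketch of the widening conversion under that assumption follows. The name bf16_to_f32_sketch is hypothetical and this is not the patch's conv_bf16_to_float(); any special handling of signalling NaNs by the real instruction is ignored.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen one bfloat16 value, held in the low 16 bits
   of `input`, to a 32-bit float. */
static float bf16_to_f32_sketch(uint32_t input)
{
    /* Place the 16 bits in the top half of a binary32 pattern; the low
       16 fraction bits of the result are zero. */
    uint32_t bits = (input & 0xFFFFu) << 16;
    float output;
    memcpy(&output, &bits, sizeof output);   /* bit copy, no aliasing issues */
    return output;
}
```

Because the exponent width and bias are shared with the 32-bit format, no exponent adjustment is needed, which is why the result is naturally a 32-bit rather than a 64-bit float.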

--

Is there an inconsistency in naming these functions?  It appears that in
some places, a 32-bit float is called `_float`  in the name, but in others
it is called `_f32`.  Eg.

+static Float conv_bf16_to_float( UInt input )
vs
+static UInt conv_f32_to_bf16( UInt input )

Can you either fix the inconsistencies (if they exist) and/or also add a
comment at the top explaining the naming?
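
To make the two directions concrete, the narrowing conversion could be sketched as below, with the 32-bit float passed as a raw bit pattern in a UInt-like integer (as the conv_f32_to_bf16 signature suggests). This is a sketch only: the name f32_bits_to_bf16_sketch is hypothetical, and it assumes round-to-nearest-even, whereas the real instruction may follow whatever rounding mode is in effect.

```c
#include <stdint.h>

/* Hypothetical helper: narrow a binary32 bit pattern to bfloat16,
   returned in the low 16 bits of the result. */
static uint32_t f32_bits_to_bf16_sketch(uint32_t f32_bits)
{
    /* NaN: truncate and force the quiet bit so the result stays a NaN. */
    if ((f32_bits & 0x7FFFFFFFu) > 0x7F800000u)
        return (f32_bits >> 16) | 0x0040u;

    /* Round to nearest, ties to even: add 0x7FFF plus the current lowest
       kept bit, then drop the low 16 bits. */
    uint32_t lsb = (f32_bits >> 16) & 1u;
    return (f32_bits + 0x7FFFu + lsb) >> 16;
}
```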

---

+ULong convert_from_f32tobf16_helper( ULong src ) {

In this file, either mark functions as 'static' or add a comment saying they
are called from generated code.
Comment 6 Julian Seward 2021-03-29 10:45:19 UTC
(In reply to Carl Love from comment #3)
> Created attachment 136293 [details]
> PPC64: Reduced-Precision: Missing Integer-based Outer   Product Operations

-static UInt exts8( UInt src)
+static ULong exts8( UInt src)

-static UInt extz8( UInt src)
+static ULong extz8( UInt src)

Mark these as 'static'.

Otherwise, OK to land.
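
For reference, 8-bit sign- and zero-extension helpers of this kind, and their use in one accumulator element of a rank-4 saturating update, might look like the sketch below. It is not the patch's code: the _sketch names are hypothetical, the helpers return a signed 64-bit value for clarity rather than ULong, and the pairing of a signed first operand with an unsigned second operand is an assumption based on the "Signed/Unsigned" wording in the instruction names.

```c
#include <stdint.h>

/* Sign-extend the low 8 bits of src. */
static int64_t exts8_sketch(uint32_t src)
{
    int64_t v = src & 0xFFu;
    return (v & 0x80) ? v - 0x100 : v;
}

/* Zero-extend the low 8 bits of src. */
static int64_t extz8_sketch(uint32_t src)
{
    return (int64_t)(src & 0xFFu);
}

/* One accumulator element of an assumed "spp" update: four signed-by-unsigned
   byte products are summed, added to the old accumulator value, and the
   result is clamped to the signed 32-bit range. */
static int32_t i8_ger4spp_elem_sketch(const uint8_t a[4], const uint8_t b[4],
                                      int32_t acc)
{
    int64_t sum = acc;
    for (int k = 0; k < 4; k++)
        sum += exts8_sketch(a[k]) * extz8_sketch(b[k]);
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}
```

Widening to 64 bits before multiplying and accumulating keeps the intermediate sum exact, so the saturation check can be done once at the end.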
Comment 7 Julian Seward 2021-03-29 10:47:40 UTC
(In reply to Carl Love from comment #2)
> Created attachment 136292 [details]
> functional tests for Reduced-Precision - bfloat16 Outer Product &   Format
> Conversion Operations

OK to land.  Please make sure though that all the new files get included
in the tarball.
Comment 8 Julian Seward 2021-03-29 10:48:23 UTC
(In reply to Carl Love from comment #4)
> Created attachment 136294 [details]
> functional tests for  Reduced-Precision: Missing Integer-based Outer  
> Product Operations

OK to land.  Again, please ensure any new files end up in the tarball.
Comment 9 Carl Love 2021-03-31 16:26:14 UTC
Created attachment 137201 [details]
functional support ISA 3.1 for reduced precision outer product operations

Updated the patch with the requested changes.
Comment 10 Carl Love 2021-03-31 16:48:56 UTC
Created attachment 137202 [details]
functional support ISA 3.1 for reduced precision outer product operations

Uploaded the wrong file last time; this is the correct patch.
Comment 11 Carl Love 2021-03-31 16:53:16 UTC
Created attachment 137203 [details]
functional support for ISA 3.1 Reduced precision missing integer-based outer product operations

Updated the functional support for the missing integer-based outer product operations.
Comment 12 Carl Love 2021-03-31 16:57:50 UTC
Patches committed

commit c589b652939655090c005a982a71f50c489fb5ce (HEAD -> master, origin/master, origin/HEAD)
Author: root <root@ltcden3-lp13.aus.stglabs.ibm.com>
Date:   Fri Feb 12 16:00:53 2021 -0500

    Reduced precision Missing Integer based outer tests

commit e09fdaf569b975717465ed8043820d0198d4d47d
Author: Carl Love <cel@us.ibm.com>
Date:   Fri Feb 26 16:05:12 2021 -0600

    PPC64: Reduced-Precision: Missing Integer-based Outer Product Operations
    
    Add support for:
    
    pmxvi16ger2 VSX Vector 16-bit Signed Integer GER (rank-2 update), Prefixed
       Masked
    pmxvi16ger2pp VSX Vector 16-bit Signed Integer GER (rank-2 update) (Positive
       multiply, Positive accumulate), Prefixed Masked
    pmxvi8ger4spp VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) with
       Saturation (Positive multiply, Positive accumulate), Prefixed Masked
    xvi16ger2 VSX Vector 16-bit Signed Integer GER (rank-2 update)
    xvi16ger2pp VSX Vector 16-bit Signed Integer GER (rank-2 update) (Positive
       multiply, Positive accumulate)
    xvi8ger4spp VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) with
       Saturation (Positive multiply, Positive accumulate)

commit c8fa838be405d7ac43035dcf675bf490800c26ec
Author: root <root@ltcden3-lp13.aus.stglabs.ibm.com>
Date:   Fri Feb 12 15:59:32 2021 -0500

    Reduced Precision bfloat16 outer product tests

commit 078f89e99b6f62e043f6138c6a7ae238befc1f2a
Author: Carl Love <cel@us.ibm.com>
Date:   Fri Feb 26 15:46:55 2021 -0600

    PPC64: Reduced-Precision - bfloat16 Outer Product & Format Conversion Operations
    
    Add support for:
    
    pmxvbf16ger2 Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update)
    pmxvbf16ger2pp Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Positive
      multiply, Positive accumulate
    pmxvbf16ger2pn Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Positive
      multiply, Negative accumulate
    pmxvbf16ger2np Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Negative
      multiply, Positive accumulate
    pmxvbf16ger2nn Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Negative
      multiply, Negative accumulate
    xvbf16ger2 VSX Vector bfloat16 GER (Rank-2 Update)
    xvbf16ger2pp VSX Vector bfloat16 GER (Rank-2 Update) Positive multiply, Positive
      accumulate
    xvbf16ger2pn VSX Vector bfloat16 GER (Rank-2 Update) Positive multiply, Negative
      accumulate
    xvbf16ger2np VSX Vector bfloat16 GER (Rank-2 Update) Negative multiply, Positive
      accumulate
    xvbf16ger2nn VSX Vector bfloat16 GER (Rank-2 Update) Negative multiply, Negative
      accumulate
    xvcvbf16sp VSX Vector Convert bfloat16 to Single-Precision format
    xvcvspbf16 VSX Vector Convert with round Single-Precision to bfloat16 format

Closing bug.
Comment 13 Carl Love 2021-03-31 16:58:27 UTC
Closing the bugzilla

ISA 3.1 support is now complete