This bugzilla is for the final set of patches to add support for the Power ISA 3.1 instructions. The previous set of patches is in bugzilla 429375.
Created attachment 136291 [details]
Reduced-Precision - bfloat16 Outer Product & Format Conversion Operations

Add support for:

  pmxvbf16ger2 Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update)
  pmxvbf16ger2pp Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Positive multiply, Positive accumulate
  pmxvbf16ger2pn Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Positive multiply, Negative accumulate
  pmxvbf16ger2np Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Negative multiply, Positive accumulate
  pmxvbf16ger2nn Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Negative multiply, Negative accumulate
  xvbf16ger2 VSX Vector bfloat16 GER (Rank-2 Update)
  xvbf16ger2pp VSX Vector bfloat16 GER (Rank-2 Update) Positive multiply, Positive accumulate
  xvbf16ger2pn VSX Vector bfloat16 GER (Rank-2 Update) Positive multiply, Negative accumulate
  xvbf16ger2np VSX Vector bfloat16 GER (Rank-2 Update) Negative multiply, Positive accumulate
  xvbf16ger2nn VSX Vector bfloat16 GER (Rank-2 Update) Negative multiply, Negative accumulate
  xvcvbf16sp VSX Vector Convert bfloat16 to Single-Precision format
  xvcvspbf16 VSX Vector Convert with round Single-Precision to bfloat16 format
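As a reading aid for the pp/pn/np/nn suffixes listed above, here is a minimal scalar sketch of how one accumulator element might be updated by a rank-2 operation. It assumes the usual reading of the suffixes (first letter: sign applied to the sum of products, second letter: sign applied to the existing accumulator value), works on already-widened float inputs, and ignores the masks of the prefixed forms as well as any rounding or exception behaviour. The names ger_kind and ger2_element_sketch are hypothetical and are not taken from the patch.

```c
/* Which variant of the rank-2 update is being modelled (hypothetical enum). */
typedef enum { GER_PLAIN, GER_PP, GER_PN, GER_NP, GER_NN } ger_kind;

/* Sketch of a single accumulator element update.
 * x[0..1] and y[0..1] are the two bfloat16 pairs for this element,
 * assumed to be already converted to float. */
static float ger2_element_sketch(ger_kind kind, float acc,
                                 const float x[2], const float y[2])
{
    /* Rank-2: two products are summed per element. */
    float prod = x[0] * y[0] + x[1] * y[1];

    switch (kind) {
    case GER_PP: return  prod + acc;  /* positive multiply, positive accumulate */
    case GER_PN: return  prod - acc;  /* positive multiply, negative accumulate */
    case GER_NP: return -prod + acc;  /* negative multiply, positive accumulate */
    case GER_NN: return -prod - acc;  /* negative multiply, negative accumulate */
    default:     return  prod;        /* plain form: accumulator is overwritten */
    }
}
```

The full instructions apply this to every element of the accumulator; the Power ISA 3.1 document remains the authority for the exact semantics.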
Created attachment 136292 [details] functional tests for Reduced-Precision - bfloat16 Outer Product & Format Conversion Operations Functional tests for the Reduced-Precision - bfloat16 Outer Product & Format Conversion Operations instructions
Created attachment 136293 [details]
PPC64: Reduced-Precision: Missing Integer-based Outer Product Operations

Add support for:

  pmxvi16ger2 VSX Vector 16-bit Signed Integer GER (rank-2 update), Prefixed Masked
  pmxvi16ger2pp VSX Vector 16-bit Signed Integer GER (rank-2 update) (Positive multiply, Positive accumulate), Prefixed Masked
  pmxvi8ger4spp VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) with Saturation (Positive multiply, Positive accumulate), Prefixed Masked
  xvi16ger2 VSX Vector 16-bit Signed Integer GER (rank-2 update)
  xvi16ger2pp VSX Vector 16-bit Signed Integer GER (rank-2 update) (Positive multiply, Positive accumulate)
  xvi8ger4spp VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) with Saturation (Positive multiply, Positive accumulate)
Created attachment 136294 [details] functional tests for Reduced-Precision: Missing Integer-based Outer Product Operations Functional tests for the Reduced-Precision: Missing Integer-based Outer Product Operations instructions
(In reply to Carl Love from comment #1)
> Created attachment 136291 [details]
> Reduced-Precision - bfloat16 Outer Product & Format Conversion Operations

+static Float conv_bf16_to_float( UInt input )
+{
..
+  output is 64-bit float.
+  bias +127, exponent 8-bits, fraction 22-bits

Is this comment correct? 1 sign bit + 8 exponent bits + 22 mantissa bits
looks much more like a 32-bit float than a 64-bit float.

--

Is there an inconsistency in naming these functions? It appears that in
some places, a 32-bit float is called `_float` in the name, but in others
it is called `_f32`. E.g.

+static Float conv_bf16_to_float( UInt input )
vs
+static UInt conv_f32_to_bf16( UInt input )

Can you either fix the inconsistencies (if they exist) and/or also add a
comment at the top explaining the naming?

---

+ULong convert_from_f32tobf16_helper( ULong src ) {

In this file, either mark functions as 'static' or add a comment saying
they are called from generated code.
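For reference on the bit widths in question: IEEE 754 binary32 has 1 sign bit, 8 exponent bits (bias 127) and 23 fraction bits, and bfloat16 keeps the same sign and exponent fields with a 7-bit fraction. Below is a minimal sketch of the widening conversion under that layout. The function names are hypothetical, this is not the code from the attachment, and it ignores any special NaN handling the instruction may require.

```c
#include <stdint.h>
#include <string.h>

/* Sketch: widen a bfloat16 bit pattern to a 32-bit float.
 * bfloat16 is laid out like the upper 16 bits of binary32, so the
 * value is placed in the high half and the low 16 fraction bits are 0. */
static float bf16_to_f32_sketch(uint16_t in)
{
    uint32_t bits = (uint32_t)in << 16;
    float out;
    memcpy(&out, &bits, sizeof out);  /* reinterpret bits without aliasing issues */
    return out;
}

/* Sketch: extract the 8-bit biased exponent field of a bfloat16 value.
 * The fraction occupies bits 0..6, the exponent bits 7..14, the sign bit 15. */
static unsigned bf16_exponent_field_sketch(uint16_t in)
{
    return (in >> 7) & 0xFFu;
}
```

With this layout, the bfloat16 pattern 0x3F80 widens to 1.0f and carries a biased exponent of 127.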
(In reply to Carl Love from comment #3)
> Created attachment 136293 [details]
> PPC64: Reduced-Precision: Missing Integer-based Outer Product Operations

-static UInt exts8( UInt src)
+static ULong exts8( UInt src)

-static UInt extz8( UInt src)
+static ULong extz8( UInt src)

Mark these as 'static'.

Otherwise, OK to land.
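To illustrate what widening the return type of these helpers implies, here is a minimal sketch of 8-bit sign and zero extension to 64 bits. It assumes that exts8 sign-extends and extz8 zero-extends the low byte of the argument, which is inferred from the names only. The _sketch functions are hypothetical and use standard C types rather than the UInt/ULong types from the patch.

```c
#include <stdint.h>

/* Sketch: sign-extend the low 8 bits of src to 64 bits.
 * The XOR/subtract form avoids implementation-defined signed conversions:
 * flipping bit 7 and subtracting 0x80 propagates the sign bit upward
 * through unsigned wraparound. */
static uint64_t exts8_sketch(uint32_t src)
{
    uint64_t b = src & 0xFFu;
    return (b ^ 0x80u) - 0x80u;
}

/* Sketch: zero-extend the low 8 bits of src to 64 bits. */
static uint64_t extz8_sketch(uint32_t src)
{
    return (uint64_t)(src & 0xFFu);
}
```

With a 32-bit return type the sign-extended upper half would be lost, which is presumably why the diff changes the return type to a 64-bit one.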
(In reply to Carl Love from comment #2)
> Created attachment 136292 [details]
> functional tests for Reduced-Precision - bfloat16 Outer Product & Format
> Conversion Operations

OK to land. Please make sure though that all the new files get included in
the tarball.
(In reply to Carl Love from comment #4)
> Created attachment 136294 [details]
> functional tests for Reduced-Precision: Missing Integer-based Outer
> Product Operations

OK to land. Again, please ensure any new files end up in the tarball.
Created attachment 137201 [details] functional support ISA 3.1 for reduced precision outer product operations Updating patch with requested changes
Created attachment 137202 [details] functional support ISA 3.1 for reduced precision outer product operations uploaded the wrong file last time
Created attachment 137203 [details] functional support for ISA 3.1 Reduced precision missing integer-based outer product operations Update the functional support for the missing integer-based outer product operations
Patches committed

commit c589b652939655090c005a982a71f50c489fb5ce (HEAD -> master, origin/master, origin/HEAD)
Author: root <root@ltcden3-lp13.aus.stglabs.ibm.com>
Date:   Fri Feb 12 16:00:53 2021 -0500

    Reduced precision Missing Integer based outer tests

commit e09fdaf569b975717465ed8043820d0198d4d47d
Author: Carl Love <cel@us.ibm.com>
Date:   Fri Feb 26 16:05:12 2021 -0600

    PPC64: Reduced-Precision: Missing Integer-based Outer Product Operations

    Add support for:

    pmxvi16ger2 VSX Vector 16-bit Signed Integer GER (rank-2 update), Prefixed Masked
    pmxvi16ger2pp VSX Vector 16-bit Signed Integer GER (rank-2 update) (Positive multiply, Positive accumulate), Prefixed Masked
    pmxvi8ger4spp VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) with Saturation (Positive multiply, Positive accumulate), Prefixed Masked
    xvi16ger2 VSX Vector 16-bit Signed Integer GER (rank-2 update)
    xvi16ger2pp VSX Vector 16-bit Signed Integer GER (rank-2 update) (Positive multiply, Positive accumulate)
    xvi8ger4spp VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) with Saturation (Positive multiply, Positive accumulate)

commit c8fa838be405d7ac43035dcf675bf490800c26ec
Author: root <root@ltcden3-lp13.aus.stglabs.ibm.com>
Date:   Fri Feb 12 15:59:32 2021 -0500

    Reduced Precision bfloat16 outer product tests

commit 078f89e99b6f62e043f6138c6a7ae238befc1f2a
Author: Carl Love <cel@us.ibm.com>
Date:   Fri Feb 26 15:46:55 2021 -0600

    PPC64: Reduced-Precision - bfloat16 Outer Product & Format Conversion Operations

    Add support for:

    pmxvbf16ger2 Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update)
    pmxvbf16ger2pp Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Positive multiply, Positive accumulate
    pmxvbf16ger2pn Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Positive multiply, Negative accumulate
    pmxvbf16ger2np Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Negative multiply, Positive accumulate
    pmxvbf16ger2nn Prefixed Masked VSX Vector bfloat16 GER (Rank-2 Update) Negative multiply, Negative accumulate
    xvbf16ger2 VSX Vector bfloat16 GER (Rank-2 Update)
    xvbf16ger2pp VSX Vector bfloat16 GER (Rank-2 Update) Positive multiply, Positive accumulate
    xvbf16ger2pn VSX Vector bfloat16 GER (Rank-2 Update) Positive multiply, Negative accumulate
    xvbf16ger2np VSX Vector bfloat16 GER (Rank-2 Update) Negative multiply, Positive accumulate
    xvbf16ger2nn VSX Vector bfloat16 GER (Rank-2 Update) Negative multiply, Negative accumulate
    xvcvbf16sp VSX Vector Convert bfloat16 to Single-Precision format
    xvcvspbf16 VSX Vector Convert with round Single-Precision to bfloat16 format

closing bug
Closing the bugzilla. ISA 3.1 support is now complete.