Completing vghsh.vs/vgmul.vs descriptions
nibrunieAtSi5 committed Aug 14, 2023
1 parent a1bfcfc commit bc7f527
Showing 2 changed files with 48 additions and 26 deletions.
21 changes: 12 additions & 9 deletions doc/vector/insns/vghsh.adoc
@@ -6,7 +6,7 @@ Vector Add-Multiply over GHASH Galois-Field

Mnemonic::
vghsh.vv vd, vs2, vs1 +
vghsh.vs vd, vs2, vs1
vghsh.vs vd, rs2, vs1

Encoding (Vector-Vector)::
[wavedrom, , svg]
@@ -40,6 +40,7 @@ Encoding (Vector-Scalar)::

Reserved Encodings::
* `SEW` is any value other than 32
* `vghsh.vs` encoding (unless `Zvkgb` is enabled)

Arguments::

@@ -62,7 +63,15 @@ Arguments::
Description::
A single "iteration" of the GHASH~H~ algorithm is performed.

This instruction treats all of the inputs and outputs as 128-bit polynomials and

The previous partial hashes are read as 4-element groups from `vd`,
the cipher texts are read as 4-element groups from `vs1`,
and the hash subkeys are read from either the corresponding 4-element group
in `vs2` (vector-vector form) or the scalar element group in `vs2`
(vector-scalar form, `Zvkgb` only). The resulting partial hashes are written as 4-element groups into `vd`.


This instruction treats all of the input and output element groups as 128-bit polynomials and
performs operations over GF[2].
It produces the next partial hash (Y~i+1~) by adding the current partial
hash (Y~i~) to the cipher text block (X~i~) and then multiplying (over GF(2^128^))
@@ -92,17 +101,11 @@ with the NIST specification. These reversals are inexpensive to implement as the
swap bit positions and therefore do not require any logic.
====
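
The add-multiply step described above can be sketched in Python for a single 128-bit element group. This is an illustrative model under stated assumptions, not the specification's reference code: the loop body of the pseudocode is not visible in this diff, so the shift-and-add multiply and the reduction by x^128 + x^7 + x^2 + x + 1 (constant `0x87`) follow the conventional GHASH definition. The name `brev8` comes from the pseudocode; `gf128_mul`, `ghash_step`, and `MASK128` are hypothetical helper names.

```python
MASK128 = (1 << 128) - 1  # hypothetical helper constant: keep values to 128 bits


def brev8(x: int) -> int:
    """Reverse the bit order inside each of the 16 bytes of a 128-bit value."""
    out = 0
    for byte_index in range(16):
        byte = (x >> (8 * byte_index)) & 0xFF
        reversed_byte = int(f"{byte:08b}"[::-1], 2)
        out |= reversed_byte << (8 * byte_index)
    return out


def gf128_mul(s: int, h: int) -> int:
    """Shift-and-add multiply in GF(2^128).

    Assumption: bit i holds the coefficient of x^i and the reduction
    polynomial is x^128 + x^7 + x^2 + x + 1 (hence the constant 0x87).
    """
    z = 0
    for bit in range(128):
        if (s >> bit) & 1:
            z ^= h                 # add h * x^bit into the product
        reduce = (h >> 127) & 1    # will the shift overflow x^127?
        h = (h << 1) & MASK128     # multiply h by x
        if reduce:
            h ^= 0x87              # fold x^128 back into the field
    return z


def ghash_step(y: int, x: int, h: int) -> int:
    """One iteration: Y_next = (Y xor X) * H.

    Operands and result use the bit-reversed-per-byte layout, so brev8
    is applied on the way in and on the way out.
    """
    s = brev8(y ^ x)
    return brev8(gf128_mul(s, brev8(h)))
```

In this model, the polynomial 1 is stored as `0x80` in the lowest byte because of the per-byte bit reversal, so `ghash_step(0, 0x80, h)` returns `h` unchanged, and `ghash_step(y, y, h)` returns 0 because the addition cancels before the multiply.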

[NOTE]
====
Since the same hash subkey `H` will typically be used repeatedly on a given message,
a future extension might define a vector-scalar version of this instruction where
`vs2` is the scalar element group. This would help reduce register pressure when `LMUL` > 1.
====

Operation::
[source,pseudocode]
--
function clause execute (VGHSH(vs2, vs1, vd)) = {
function clause execute (VGHSH(vs2, vs1, vd, suffix)) = {
// operands are input with bits reversed in each byte
if(LMUL*VLEN < EGW) then {
handle_illegal(); // illegal instruction exception
Expand Down
53 changes: 36 additions & 17 deletions doc/vector/insns/vgmul.adoc
@@ -7,7 +7,7 @@ Vector Multiply over GHASH Galois-Field
Mnemonic::
vgmul.vv vd, vs2

Encoding::
Encoding (Vector-Vector)::
[wavedrom, , svg]
....
{reg:[
@@ -20,8 +20,25 @@ Encoding::
{bits: 6, name: '101000'},
]}
....


Encoding (Vector-Scalar)::
[wavedrom, , svg]
....
{reg:[
{bits: 7, name: 'OP-P'},
{bits: 5, name: 'vd'},
{bits: 3, name: 'OPMVV'},
{bits: 5, name: '10001'},
{bits: 5, name: 'vs2'},
{bits: 1, name: '1'},
{bits: 6, name: '101001'},
]}
....

Reserved Encodings::
* `SEW` is any value other than 32
* `SEW` is any value other than 32
* `vgmul.vs` encoding (unless `Zvkgb` is enabled)

Arguments::

@@ -40,9 +57,14 @@ Arguments::
| Vd | output | 128 | 4 | 32 | Product
|===

Description::
Description::
A GHASH~H~ multiply is performed.

The multipliers are read as 4-element groups from `vd`,
and the multiplicands are read from either the corresponding 4-element group
in `vs2` (vector-vector form) or the scalar element group in `vs2`
(vector-scalar form, `Zvkgb` only). The resulting products are written as 4-element groups into `vd`.

This instruction treats all of the inputs and outputs as 128-bit polynomials and
performs operations over GF[2].
It produces the product over GF(2^128^) of the two 128-bit inputs.
@@ -67,27 +89,23 @@ with the NIST specification. These reversals are inexpensive to implement as the
swap bit positions and therefore do not require any logic.
====

[NOTE]
====
Since the same multiplicand will typically be used repeatedly on a given message,
a future extension might define a vector-scalar version of this instruction where
`vs2` is the scalar element group. This would help reduce register pressure when `LMUL` > 1.
====

[NOTE]
====
This instruction is identical to `vghsh.vv` with vs1=0.
The instruction `vgmul.vv` is identical to `vghsh.vv` with vs1=0.
This instruction is often used in GHASH code. In some cases it is followed
by an XOR to perform a multiply-add. Implementations may choose to fuse these
two instructions to improve performance on GHASH code that
doesn't use the add-multiply form of the `vghsh.vv` instruction.
two instructions to improve performance on GHASH code that
doesn't use the add-multiply form of the `vghsh.vv` instruction.
Similarly, the instruction `vgmul.vs` is identical to `vghsh.vs` with vs1=0.
====


Operation::
[source,pseudocode]
--
function clause execute (VGMUL(vs2, vs1, vd)) = {
function clause execute (VGMUL(vs2, vs1, vd, suffix)) = {
// operands are input with bits reversed in each byte
if(LMUL*VLEN < EGW) then {
handle_illegal(); // illegal instruction exception
@@ -96,10 +114,11 @@ function clause execute (VGMUL(vs2, vs1, vd)) = {

eg_len = (vl/EGS)
eg_start = (vstart/EGS)

foreach (i from eg_start to eg_len-1) {
let helem = if suffix == "vv" then i else 0;
let Y = brev8(get_velem(vd,EGW=128,i)); // Multiplier
let H = brev8(get_velem(vs2,EGW=128,i)); // Multiplicand
let H = brev8(get_velem(vs2,EGW=128, helem)); // Multiplicand
let Z : bits(128) = 0;

for (int bit = 0; bit < 128; bit++) {
@@ -113,7 +132,7 @@ function clause execute (VGMUL(vs2, vs1, vd)) = {
}


let result = brev8(Z);
let result = brev8(Z);
set_velem(vd, EGW=128, i, result);
}
RETIRE_SUCCESS
@@ -122,4 +141,4 @@ function clause execute (VGMUL(vs2, vs1, vd)) = {
--
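
The `helem` selection in the pseudocode above is the only difference between the vector-vector and vector-scalar forms. The following Python sketch models just that operand selection, treating each register group as a list of 128-bit element groups. The function name `apply_per_element_group` and its parameters are hypothetical, and the per-group operation is passed in as a callback so that the sketch stays independent of the field arithmetic.

```python
from typing import Callable, List


def apply_per_element_group(
    vd: List[int],
    vs2: List[int],
    suffix: str,
    op: Callable[[int, int], int],
    eg_start: int = 0,
) -> List[int]:
    """Apply op(vd[i], vs2[helem]) to each element group from eg_start on.

    suffix "vv": helem = i, each group uses its own vs2 group.
    suffix "vs": helem = 0, every group reuses element group 0 of vs2.
    Groups below eg_start are left unchanged, mirroring the vstart handling.
    """
    if suffix not in ("vv", "vs"):
        raise ValueError("suffix must be 'vv' or 'vs'")
    result = list(vd)
    for i in range(eg_start, len(vd)):
        helem = i if suffix == "vv" else 0
        result[i] = op(vd[i], vs2[helem])
    return result


# Stand-in operation for illustration only; the instruction itself
# performs a GF(2^128) multiply here, not an integer multiply.
def int_mul(y: int, h: int) -> int:
    return y * h


vv_result = apply_per_element_group([1, 2, 3], [10, 20, 30], "vv", int_mul)
vs_result = apply_per_element_group([1, 2, 3], [10, 20, 30], "vs", int_mul)
```

With these inputs the vector-vector form pairs each group with its counterpart, while the vector-scalar form multiplies every group by the first `vs2` group, which is why a single subkey register can serve an entire register group.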

Included in::
<<zvkg>>, <<zvkng>>, <<zvksg>>
<<zvkg>>, <<zvkgb>>, <<zvkng>>, <<zvksg>>
