[all]: Introduce basic support for SVE
Scalable Vector Extension (SVE) is Vector Length Agnostic (VLA):
- Vector Length (VL) is a hardware implementation choice from 128 up
  to 2048 bits.
- New programming model allows software to scale dynamically to the
  available vector length.
- No need to define a new ISA, rewrite or recompile for new vector
  lengths.

Scalable vector registers:
- Z0-Z31 extending NEON’s 128-bit V0-V31
- Packed DP, SP & HP floating-point elements
- Packed 64, 32, 16 & 8-bit integer elements

Scalable predicate registers:
- P0-P7 governing predicates for load/store/arithmetic
- P8-P15 additional predicates for loop management
- FFR first fault register for software speculation

Implementation choices and known limitations:
- SVE memory model is not addressed
- `herd7` implements a 128-bit vector length (on top of the existing
  Neon infrastructure)
- `litmus7` uses ARM C Language Extensions (ACLE) for SVE
  + Building SVE tests requires `-ccopts "-march=armv8-a+sve -O2"`
  + Although `Z` registers overlap with `V` registers, mixing them in
    a litmus test would likely lead to compilation failure (due to
    the difference in ACLE types)
  + However, a `V` register that overlaps a `Z` register is supported
    in the `final` clause (this way we can inspect the content of the
    `Z` register)
- The following SVE instructions are implemented:
  + PTRUE (predicate)
  + MOV (immediate, unpredicated)
  + DUP (scalar)
  + ADD (vectors, unpredicated)
  + INDEX (immediate, scalar)
  + INDEX (immediates)
  + INDEX (scalar, immediate)
  + INDEX (scalars)
  + WHILELE (predicate)
  + WHILELT (predicate)
  + WHILELO (predicate)
  + WHILELS (predicate)
  + LD1B (scalar plus immediate, single register)
  + LD1H (scalar plus immediate, single register)
  + LD1W (scalar plus immediate, single register)
  + LD1D (scalar plus immediate, single register)
  + LD1D (scalar plus scalar, single register)
  + LD1B (scalar plus scalar, single register)
  + LD1H (scalar plus scalar, single register)
  + LD1W (scalar plus scalar, single register)
  + LD1B (scalar plus vector)
  + LD1H (scalar plus vector)
  + LD1W (scalar plus vector)
  + LD1D (scalar plus vector)
  + LD2B (scalar plus immediate)
  + LD2H (scalar plus immediate)
  + LD2W (scalar plus immediate)
  + LD2D (scalar plus immediate)
  + LD2B (scalar plus scalar)
  + LD2H (scalar plus scalar)
  + LD2W (scalar plus scalar)
  + LD2D (scalar plus scalar)
  + LD3B (scalar plus immediate)
  + LD3H (scalar plus immediate)
  + LD3W (scalar plus immediate)
  + LD3D (scalar plus immediate)
  + LD3B (scalar plus scalar)
  + LD3H (scalar plus scalar)
  + LD3W (scalar plus scalar)
  + LD3D (scalar plus scalar)
  + LD4B (scalar plus immediate)
  + LD4H (scalar plus immediate)
  + LD4W (scalar plus immediate)
  + LD4D (scalar plus immediate)
  + LD4B (scalar plus scalar)
  + LD4H (scalar plus scalar)
  + LD4W (scalar plus scalar)
  + LD4D (scalar plus scalar)
  + ST1B (scalar plus immediate, single register)
  + ST1H (scalar plus immediate, single register)
  + ST1W (scalar plus immediate, single register)
  + ST1D (scalar plus immediate, single register)
  + ST1B (scalar plus scalar, single register)
  + ST1H (scalar plus scalar, single register)
  + ST1W (scalar plus scalar, single register)
  + ST1D (scalar plus scalar, single register)
  + ST1B (scalar plus vector)
  + ST1H (scalar plus vector)
  + ST1W (scalar plus vector)
  + ST1D (scalar plus vector)
  + ST2B (scalar plus immediate)
  + ST2H (scalar plus immediate)
  + ST2W (scalar plus immediate)
  + ST2D (scalar plus immediate)
  + ST2B (scalar plus scalar)
  + ST2H (scalar plus scalar)
  + ST2W (scalar plus scalar)
  + ST2D (scalar plus scalar)
  + ST3B (scalar plus immediate)
  + ST3H (scalar plus immediate)
  + ST3W (scalar plus immediate)
  + ST3D (scalar plus immediate)
  + ST3B (scalar plus scalar)
  + ST3H (scalar plus scalar)
  + ST3W (scalar plus scalar)
  + ST3D (scalar plus scalar)
  + ST4B (scalar plus immediate)
  + ST4H (scalar plus immediate)
  + ST4W (scalar plus immediate)
  + ST4D (scalar plus immediate)
  + ST4B (scalar plus scalar)
  + ST4H (scalar plus scalar)
  + ST4W (scalar plus scalar)
  + ST4D (scalar plus scalar)
  + SVE aliases for condition codes

Signed-off-by: Vladimir Murzin <[email protected]>
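
For readers new to the vector-length-agnostic model, here is a minimal sketch in portable C of how one loop can run unchanged at different vector lengths when a loop-management predicate masks off the tail. It is not code from `herd7` or `litmus7` and does not use ACLE intrinsics. The names `whilelo_model`, `vla_add`, `MAX_LANES` and the `vl_bits` parameter are hypothetical and exist only for illustration. The predicate helper models only the basic unsigned "lower than" lane test in the spirit of WHILELO, ignoring condition flags and wrap-around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Largest lane count modelled here: 2048-bit vector of 32-bit elements. */
#define MAX_LANES 64

/* Simplified model of a WHILELO-style predicate (illustrative only):
 * lane k is active while base + k is below limit (unsigned compare). */
void whilelo_model(bool *pred, size_t lanes, size_t base, size_t limit)
{
    for (size_t k = 0; k < lanes; k++)
        pred[k] = (base + k) < limit;
}

/* Vector-length-agnostic element-wise add: dst[i] = a[i] + b[i], i < n.
 * vl_bits stands in for the hardware vector length (an assumption of
 * this sketch); the loop body does not change when it changes. */
void vla_add(int32_t *dst, const int32_t *a, const int32_t *b,
             size_t n, unsigned vl_bits)
{
    size_t lanes = vl_bits / 32;      /* 32-bit elements per vector */
    bool pred[MAX_LANES];

    assert(lanes >= 1 && lanes <= MAX_LANES);

    for (size_t i = 0; i < n; i += lanes) {
        /* The predicate covers the final partial vector, so no
         * separate scalar tail loop is needed. */
        whilelo_model(pred, lanes, i, n);
        for (size_t k = 0; k < lanes; k++) {
            if (pred[k])              /* inactive lanes touch no memory */
                dst[i + k] = a[i + k] + b[i + k];
        }
    }
}
```

Calling `vla_add` on the same arrays with `vl_bits` set to 128 and then to 512 should give identical results; only the number of loop iterations differs.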