This repository contains a BLAS implementation for the AMD AI Engine.
Note that the library is mostly a proof of concept and supports only the following routines.
Level 1:
- ASUM
- AXPY
- DOT
- IAMAX
- NRM2
- ROT
- SCAL
Level 2:
- GEMV
Supported devices:
- AMD Versal VCK5000
To compile the code generator, run `./configure.sh && cmake --build build` in the folder `aieblas/`.
To run the benchmarks, first compile the code generator, and then build the benchmarks by running `./build-all.sh` in the folder `benchmark/util`.
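
The two build steps can be chained in one small script. The following is a minimal sketch, not part of the repository: the function names `run_in` and `build_all` and the variables `REPO_ROOT` and `DRY_RUN` are hypothetical, and it assumes that both `aieblas/` and `benchmark/util` sit directly under the repository root.

```shell
#!/usr/bin/env bash
# Sketch only: build the code generator first, then the benchmarks.
# Assumption: REPO_ROOT (hypothetical) is the repository root; defaults to the current directory.
# Assumption: DRY_RUN=1 (hypothetical) prints the commands instead of running them.
set -euo pipefail

# Run a command inside a directory, or only print it in dry-run mode.
run_in() {
    local dir="$1"
    shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "(cd $dir && $*)"
    else
        (cd "$dir" && "$@")
    fi
}

# Execute the build steps in the order described above.
build_all() {
    local root="${REPO_ROOT:-$PWD}"
    # Step 1: configure and compile the code generator.
    run_in "$root/aieblas" ./configure.sh
    run_in "$root/aieblas" cmake --build build
    # Step 2: build the benchmarks, which need the code generator.
    run_in "$root/benchmark/util" ./build-all.sh
}
```

After sourcing the script, call `build_all` from the repository root. Setting `DRY_RUN=1` beforehand shows the commands without running them, which is a cheap way to check the paths before a long build.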