Allow bfloat16 computations on compatible CPUs with Intel Extension for PyTorch #3649

Open
wants to merge 1 commit into base: master
Commits on Aug 6, 2024

  1. Allow bf16 computations on CPUs with BF16 support

    Modern CPUs have native AVX512 BF16 instructions, which significantly speed up
    matmul and conv2d operations.

    With bfloat16 instructions, UNet steps are 40-50% faster on both AMD and Intel CPUs.
    There are minor visible changes with bf16, but no avalanche effects, so this feature
    is enabled by default via the new `--use-cpu-bf16=auto` option.
    It can be disabled with `--use-cpu-bf16=no`.

    Signed-off-by: Sv. Lockal <[email protected]>
    AngryLoki committed Aug 6, 2024
    Commit 88f3f92
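
As context for the technique this commit describes, here is a minimal sketch of bf16 inference on CPU using PyTorch's CPU autocast together with Intel Extension for PyTorch. This is not the PR's actual option-handling code: the toy model, tensor shapes, and variable names below are hypothetical stand-ins for the real UNet path.

```python
import torch
import intel_extension_for_pytorch as ipex  # optional; stock autocast also works

# Report the vector ISA PyTorch detected (e.g. "AVX2", "AVX512").
# Note: this reports the ISA level, not AVX512-BF16 support specifically.
print(torch.backends.cpu.get_cpu_capability())

# Hypothetical stand-in model; the real target is the UNet.
model = torch.nn.Sequential(
    torch.nn.Conv2d(4, 8, 3, padding=1),
    torch.nn.SiLU(),
    torch.nn.Conv2d(8, 4, 3, padding=1),
).eval()

# Let IPEX repack weights and select bf16-friendly kernels.
model = ipex.optimize(model, dtype=torch.bfloat16)

# Run matmul/conv in bfloat16 via CPU autocast; results stay close to fp32,
# matching the "minor visible changes, no avalanche effects" observation above.
with torch.no_grad(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    out = model(torch.randn(1, 4, 64, 64))
print(out.dtype)  # torch.bfloat16 under autocast
```

On CPUs without native BF16 instructions the same code still runs, just without the speedup, which is presumably why an `auto` mode that probes hardware support is a reasonable default.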