Releases: tracel-ai/burn
v0.9.0
Burn v0.9.0 sees the addition of the Burn Book, a new model repository, and many new operations and optimizations.
Burn Book
The Burn Book is available at https://burn-rs.github.io/book/
- Burn Book setup and plan @nathanielsimard @wdoppenberg @antimora
- Motivation & Getting started @louisfd @nathanielsimard
- Basic Workflow: from training to inference @nathanielsimard @louisfd
- Building blocks @nathanielsimard
- ONNX models @antimora
- Advanced sections @nathanielsimard
Model repository
The Model repository is available at https://github.com/burn-rs/models
- Setup @nathanielsimard
- Add SqueezeNet @antimora
- Multiple models made with Burn @Gadersd
- Llama 2
- Whisper
- Stable Diffusion v1.4
Changes to Burn
Neural networks
- Three new optimizers (see the sketch after this list)
- AdamW @wdoppenberg
- AdaGrad @CohenAriel
- RMSProp @AuruTus
- Custom initializer for transformer-related modules @wbrickner
- Cross Entropy with label smoothing and weights @ArvidHammarlund
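As a rough sketch of how the new optimizers plug into a training step (not taken from the release notes: the model, loss, and learning rate are assumed to exist, and the generic bounds follow this era's `ADBackend`/`ADModule` naming, which may differ slightly):

```rust
use burn::module::ADModule;
use burn::optim::{AdamWConfig, GradientsParams, Optimizer};
use burn::tensor::{backend::ADBackend, Tensor};

// A sketch only: one AdamW update step for an arbitrary module `M`,
// with the loss already computed by the caller.
fn adamw_step<B: ADBackend, M: ADModule<B>>(model: M, loss: Tensor<B, 1>, lr: f64) -> M {
    // Build the optimizer from its config, like the existing Adam/SGD.
    let mut optim = AdamWConfig::new().init();
    // Convert the raw gradients into per-parameter gradients for `M`.
    let grads = GradientsParams::from_grads(loss.backward(), &model);
    optim.step(lr, model, grads)
}
```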
Tensors
- Many new operators
- cast @trfdeer @nathanielsimard
- clamp, clamp_min, clamp_max @antimora (see the sketch after this list)
- abs @mmalczak
- max_pool1d, max_pool with dilation @caiopiccirillo
- adaptive_avg_pool 1d and 2d @nathanielsimard
- conv_transpose 1d and 2d, with backward @nathanielsimard
- Not operator @louisfd
- Dim iterator @ArvidHammarlund
- More tests for basic tensor ops @louisfd
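A minimal sketch of the new clamp family (assuming the ndarray backend and the `from_floats` constructor):

```rust
use burn::tensor::Tensor;
use burn_ndarray::NdArrayBackend;

// A sketch of the new clamp operations; `NdArrayBackend` is just one
// possible backend, and `from_floats` is assumed to be available.
fn main() {
    type B = NdArrayBackend<f32>;
    let x = Tensor::<B, 1>::from_floats([-2.0, 0.5, 3.0]);
    println!("{}", x.clone().clamp(-1.0, 1.0)); // both bounds: [-1.0, 0.5, 1.0]
    println!("{}", x.clone().clamp_min(0.0));   // lower bound only: [0.0, 0.5, 3.0]
    println!("{}", x.clamp_max(1.0));           // upper bound only: [-2.0, 0.5, 1.0]
}
```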
Training
- New training metrics @Elazrod56
- CPU temperature and usage
- GPU temperature
- Memory use
- Custom training and validation metric loggers @nathanielsimard
- Migration from log4rs to tracing for better integration in GUI apps @dae
- Training interruption @dae
- New custom optimize method @nathanielsimard
Backends
- WGPU backend
- Autotune @louisfd @nathanielsimard
- Cache optimization @agelas
- Pseudo-random number generator @louisfd
- Fix configs @nathanielsimard
- Matmul optimization @louisfd
- ndarray backend
- Candle backend @louisfd
- Support for all basic operations
- Work in progress
Dataset
- Option for with or without replacement in dataset sampler @nathanielsimard
Import & ONNX
- Refactoring, performance improvements, tests and fixes @antimora @Luni-4 @nathanielsimard @Gadersd
- New operators @Luni-4 @antimora @AuruTus
- Reshape
- Transpose
- Binary operators
- Concat
- Dropout
- Avg pool
- Softmax
- Conv1d, Conv2d
- Scalar and constants
- tanh
- clip
Fix
- Hugging Face downloader Windows support @Macil
- Fix grad replace and autodiff backward broadcast @nathanielsimard
- Fix processed count at learning completion @dae
- Adjust some flaky tests @dae
- Ability to disable experiment logging @dae
Configuration
- Rewrite publish and checks scripts in Rust, with cargo-xtask @Luni-4 @DrChat
- Add Typos verification to checks @caiopiccirillo @antimora
- Checks for Python and venv environment @mashirooooo
- Feature flags for crates in different scenarios @dae
Documentation
- Configuration doc for VS Code environment setup @caiopiccirillo
- Jupyter notebook examples @antimora
- Readme updated @louisfd
Thanks
Thanks to all aforementioned contributors and to our sponsors @smallstepman and @premAI-io.
v0.8.0
In this release, our main focus was on creating a new backend using wgpu.
We greatly appreciate the meaningful contributions made by the community across the project.
As usual, we have expanded the number of supported operations.
Changes
Tensor
- Added Max/Minimum operation @nathanielsimard
- Added average pooling 1D operation @nathanielsimard
- Added Gather/Scatter operations @nathanielsimard
- Added Mask Where operation @nathanielsimard
- Refactor index-related operations (see the sketch after this list) @nathanielsimard
  - `index`, `index_assign` => `slice`, `slice_assign`
  - `index_select`, `index_select_assign` => `select`, `select_assign`
- New syntax sugar for transpose @wbrickner
- Added SiLU activation function @Poxxy
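A small sketch of the rename above (the tensor and index values are placeholders):

```rust
use burn::tensor::{backend::Backend, Int, Tensor};

// Sketch of the renamed operations: `slice` replaces `index`, and
// `select` replaces `index_select`; the old names are in the comments.
fn demo<B: Backend>(x: Tensor<B, 2>, indices: Tensor<B, 1, Int>) -> (Tensor<B, 2>, Tensor<B, 2>) {
    let sliced = x.clone().slice([0..2, 0..3]); // was: x.index([0..2, 0..3])
    let selected = x.select(0, indices);        // was: x.index_select(0, indices)
    (sliced, selected)
}
```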
Dataset
- Added a dataset backed by SQLite storage, now used to store Hugging Face datasets. @antimora
- New speech commands audio dataset. @antimora
- Create a Python virtual environment for Hugging Face dependencies. @dengelt
Burn-Import
- Big refactor to make it easier to support new operations. @nathanielsimard
- Support bool element type. @maekawatoshiki
- Added Add operator. @Luni-4
- Added MaxPool2d operator. @Luni-4
- Parse convolution 2D config. @Luni-4
- Added sigmoid operation. @Luni-4
Backend
- New burn-wgpu backend 🔥! @nathanielsimard @louisfd (see the sketch after this list)
- Tile 2D matrix multiplication
- All operations are supported
- Improve performance of repeat with the tch backend. @nathanielsimard
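A minimal sketch of selecting the new wgpu backend (the three type parameters, which pick the graphics API and the float/int element types, are an assumption based on this release):

```rust
use burn::tensor::Tensor;
use burn_wgpu::{AutoGraphicsApi, WgpuBackend};

// A sketch of running an operation on the new wgpu backend.
fn main() {
    type B = WgpuBackend<AutoGraphicsApi, f32, i32>;
    let a = Tensor::<B, 2>::ones([64, 64]);
    let b = Tensor::<B, 2>::ones([64, 64]);
    println!("{}", a.matmul(b)); // runs the tiled matmul on the GPU
}
```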
Neural Networks
- Added LSTM module (see the sketch after this list). @agelas
- Added GRU module. @agelas
- Better weight initialization with added support for Xavier (Glorot) initialization. @louisfd
- Added MSE loss. @bioinformatist
- Cleanup padding for convolution and pooling modules. @Luni-4
- Added sinusoidal positional embedding module. @antimora
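A sketch of constructing the new LSTM module; the config arguments (input size, hidden size, bias) follow Burn's usual config pattern and are assumptions, not a transcript of the exact signature:

```rust
use burn::nn::{Lstm, LstmConfig};
use burn::tensor::backend::Backend;

// A sketch only: an LSTM with 32 input features, 64 hidden units and
// a bias term; the argument order is an assumption.
fn build_lstm<B: Backend>() -> Lstm<B> {
    LstmConfig::new(32, 64, true).init()
}
```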
Fix
- Deserialization of constant arrays. @nathanielsimard
- Concat backward with only one dim. @nathanielsimard
- Conv1d stride hardcoded to 1. @antimora
- Fix arange with the tch backend. @nathanielsimard
Documentation
- Improve documentation across the whole project ♥! @antimora
Thanks
Thanks to all contributors and to the sponsor @smallstepman.
v0.7.0
Serialization
Serialization has been completely revamped since the last release. Modules, optimizers, and learning rate schedulers now have an associated type, allowing them to determine the type used for serializing and deserializing their state. The solution is documented in the new architecture doc.
State can be saved with any precision, regardless of the backend in use. Precision conversion is performed during serialization and deserialization, ensuring high memory efficiency since the model is not stored twice in memory with different precisions.
All saved states can be loaded from any backend. The precision of the serialized state must be set correctly, but the element types of the backend can be anything.
Multiple (de)serialization recorders are provided:
- Default (gzip-compressed named MessagePack format)
- Bincode
- Gzip-compressed bincode
- Pretty JSON
Users can extend the current recorder using any serde implementation.
Multiple precision settings are available:
- Half (f16, i16)
- Full (f32, i32)
- Double (f64, i64)
Users can extend the current settings using any supported number type.
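As a sketch of the resulting flow: the recorder and settings type names below (`NamedMpkGzFileRecorder`, `HalfPrecisionSettings`) are assumptions patterned on the formats and precisions listed above; check the burn::record module for the exact names.

```rust
use burn::module::Module;
use burn::record::{HalfPrecisionSettings, NamedMpkGzFileRecorder, Recorder};
use burn::tensor::backend::Backend;

// Hypothetical type names for the "default" recorder (gzip + named
// MessagePack) with half-precision settings; only the flow is the point.
fn save_model<B: Backend, M: Module<B>>(model: M) {
    // Extract the state as a record, then serialize it. The precision
    // conversion to f16/i16 happens during serialization, so the model
    // is never held in memory twice at different precisions.
    let record = model.into_record();
    NamedMpkGzFileRecorder::<HalfPrecisionSettings>::default()
        .record(record, "model".into())
        .expect("failed to save the model record");
}
```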
Optimizer
The optimizer API has undergone a complete overhaul. It now supports the new serialization paradigm with a simplified trait definition. The learning rate is now passed as a parameter to the step method, making it easier to integrate the new learning rate scheduler. The learning rate configuration is now a part of the learner API. For more information, please refer to the documentation.
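A simplified sketch of the resulting trait shape (an illustration, not the exact definition; bound and type names may differ):

```rust
use burn::module::ADModule;
use burn::optim::GradientsParams;
use burn::record::Record;
use burn::tensor::backend::ADBackend;

// A simplified illustration of the overhauled optimizer contract: the
// state lives in an associated `Record` type, and the learning rate is
// passed to `step` so a scheduler can supply it on every iteration.
pub trait OptimizerSketch<M: ADModule<B>, B: ADBackend> {
    type Record: Record;

    /// Performs one update of the module's parameters.
    fn step(&mut self, lr: f64, module: M, grads: GradientsParams) -> M;
}
```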
Gradient Clipping
You can now clip gradients by norm or by value. Gradient clipping is integrated with the optimizers, and it can be configured from the optimizer configs (Adam & SGD).
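A sketch of enabling clipping from an optimizer config (the module path and enum variants are assumptions based on the description above):

```rust
use burn::grad_clipping::GradientClippingConfig;
use burn::optim::AdamConfig;

// Assumed shapes: a `Norm`/`Value` clipping config and the generated
// `with_grad_clipping` builder; check the optimizer docs for the exact API.
fn adam_with_clipping() -> AdamConfig {
    AdamConfig::new().with_grad_clipping(Some(GradientClippingConfig::Norm(1.0)))
}
```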
Learning Rate Scheduler
A new trait has been introduced for creating learning rate schedulers. This trait follows a similar pattern to the Module and Optimizer APIs, utilizing an associated type that implements the Record trait for state (de)serialization.
The following learning rate schedulers are now available:
- Noam learning rate scheduler
- Constant learning rate scheduler
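A sketch of building one of them; the config name and builder methods are assumptions patterned on Burn's config convention:

```rust
use burn::lr_scheduler::{noam::NoamLRSchedulerConfig, LRScheduler};

// Assumed names: a Noam schedule with a peak factor and warmup steps;
// the release notes only state that the scheduler exists.
fn noam_schedule() -> impl LRScheduler {
    NoamLRSchedulerConfig::new(1.0) // initial learning rate factor
        .with_warmup_steps(4000)    // linear warmup before the decay phase
        .init()
}
```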
Module
The module API has undergone changes. There is no longer a need to wrap inner modules with the Param struct; only parameter tensors still require it, since they need a parameter ID.
All modules can now be created with their configuration and state, eliminating the unnecessary tensor initializations during model deployment for inference.
Convolution
Significant improvements have been made to support all convolution configurations. The stride, dilation, and groups can now be set, with full support for both inference and training.
Transposed convolutions are available in the backend API but do not currently support the backward pass. Once they are fully supported for both training and inference, they will be exposed as modules.
Pooling
The implementation of the average pooling module is now available.
Transformer
The transformer decoder has been implemented, offering support for efficient inference and autoregressive decoding by leveraging layer norms, position-wise feed forward, self-attention, and cross-attention caching.
Tensor
The developer experience of the Tensor API has been improved, providing more consistent error messages across different backends for common operations. The Tensor struct now implements Display, allowing values, shape, backend information, and other useful details to be displayed in an easily readable format.
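For example, since `Tensor` now implements `Display`, a tensor can be printed directly (the backend choice here is arbitrary):

```rust
use burn::tensor::Tensor;
use burn_ndarray::NdArrayBackend;

// `Display` prints the values along with shape and backend details.
fn main() {
    let t = Tensor::<NdArrayBackend<f32>, 2>::ones([2, 3]);
    println!("{}", t);
}
```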
New operations
- The flatten operation
- The mask scatter operation
Torch Backend
The Torch backend now supports bf16.
ONNX
The `burn-import` project now has the capability to generate the required Burn code and model state from an ONNX file, enabling users to easily import pre-trained models into Burn. The code generation utilizes the end-user API, allowing the generated model to be fine-tuned and trained using the learner struct.
Please note that not all operations are currently supported, and assistance from the community is highly appreciated. For more details, please refer to the burn-import repository https://github.com/burn-rs/burn/tree/main/burn-import.
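A sketch of the typical usage from a build script (the file names are placeholders; `ModelGen` is burn-import's code-generation entry point):

```rust
// build.rs (sketch): generate Burn model code from an ONNX file at
// build time; "src/model/model.onnx" and "model/" are placeholders.
use burn_import::onnx::ModelGen;

fn main() {
    ModelGen::new()
        .input("src/model/model.onnx") // the ONNX file to convert
        .out_dir("model/")             // where the generated Rust code goes
        .run_from_script();
}
```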
Bug Fixes
- Backward pass issue when there is implicit broadcasting in add #181
Thanks 🙏
Thanks to all contributors: @nathanielsimard, @antimora, @agelas, @bioinformatist, @sunny-g
Thanks to current sponsors: @smallstepman
v0.6.0
Backend API
- Almost all tensor operations now receive owned tensors instead of references, which enables backend implementations to reuse tensor-allocated memory.
- Backends now have a different type for their int tensor, with its own set of operations.
- Removed the `IntegerBackend` type.
- Simpler `Element` trait with fewer functions.
- New index-related operations (`index_select`, `index_select_assign`, `index_select_dim` and `index_select_dim_assign`).
Tensor API
- The `Tensor` struct now has a third generic parameter `Kind`, with a default value of `Float`.
- There are three kinds of tensors: `Float`, `Bool`, and `Int`:
  - Float Tensor ⇒ `Tensor<B, D>` or `Tensor<B, D, Float>`
  - Bool Tensor ⇒ `Tensor<B, D, Bool>`
  - Int Tensor ⇒ `Tensor<B, D, Int>`
- You still don't have to import any trait to have functions enabled, but they now carry an extra constraint based on the kind of tensor, so you can't call `matmul` on a bool tensor. All of this with zero match or if statements, just pure zero-cost abstraction.
- The `BoolTensor` struct has been removed.
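A minimal sketch of the kind parameter in action (using the ndarray backend):

```rust
use burn::tensor::{Bool, Int, Tensor};
use burn_ndarray::NdArrayBackend;

type B = NdArrayBackend<f32>;

// The three kinds; `Float` is the default, so it can be omitted.
fn kinds(float: Tensor<B, 2>, int: Tensor<B, 2, Int>, bools: Tensor<B, 2, Bool>) {
    // Float-only functions are constrained by the kind parameter:
    let _ = float.clone().matmul(float); // compiles
    // `int.matmul(..)` or `bools.matmul(..)` would fail to compile.
    let _ = (int, bools);
}
```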
Autodiff
- Not all tensors are tracked by default; you now have to call `require_grad` (see the sketch after this list).
- The state is no longer always captured: operations now have to clone only the state they need for their backward step. This results in a massive performance enhancement.
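A sketch of the new opt-in tracking; the autodiff wrapper name and constructor details are assumptions based on this release's crates:

```rust
use burn::tensor::Tensor;
use burn_autodiff::ADBackendDecorator;
use burn_ndarray::NdArrayBackend;

// Tensors are untracked by default; `require_grad` opts a tensor in.
type B = ADBackendDecorator<NdArrayBackend<f32>>;

fn main() {
    let x = Tensor::<B, 2>::ones([2, 2]).require_grad();
    let y = x.clone().matmul(x.clone());
    let grads = y.backward();
    // `x.grad(..)` returns Some(..) only because `x` requires gradients.
    println!("{}", x.grad(&grads).is_some());
}
```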
No Std
- Some Burn crates don't require std anymore, which enables them to run on any platform:
- burn-core
- burn-ndarray
- burn-common
- burn-tensor
- We have a WebAssembly demo with MNIST inference. The code is also available here with a lot of details explaining the process of compiling a model to WebAssembly.
Performance
- The Tch backend now leverages in-place operations.
- The NdArray backend now leverages in-place operations.
- The convolution and maxpooling layers in the NdArray backend have been rewritten with much better performance.
- The cross-entropy loss module leverages the new `index_select` operation, resulting in a big performance boost when the number of classes is high.
And of course, a lot of fixes and enhancements everywhere.
Thanks to all the contributors for their work: @antimora, @twitchax, @h4rr9
v0.5.0
New Modules for Vision Tasks
- `Conv1D` and `Conv2D`, currently without support for stride, dilation, or group convolution
- `MaxPool2D`
- `BatchNorm2D`
New General Tensor Operations
- `log1p`, thanks to @bioinformatist
- `sin`, `cos`, `tanh`, thanks to @makroiss
Breaking Changes
- Devices are now passed by reference, thanks to feedback from @djdisodo.
- The shape function now returns an owned struct, and backends no longer need to cache each shape.
v0.4.0
Bump versions (#141)
v0.3.0
- Separated backend crates