# scalagrad

A tiny, scalar-valued, reverse-mode automatic differentiation library inspired by PyTorch and Andrej Karpathy's micrograd.

scalagrad was written to educate myself, and to serve as a building block for further experimentation (e.g. source code transformation, tape-based autodiff, ...).

There are no performance optimizations, and the current code won't scale past toy datasets. Building the dynamic graph of derivatives is heavy on heap allocations, and the backward pass that accumulates adjoint gradients requires a linear ordering of the nodes. The latter is obtained with a modified depth-first search over the graph (i.e. a topological sort). A good overview of this approach can be found in the UWashington CSE599W lecture notes: https://dlsys.cs.washington.edu/pdf/lecture4.pdf
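To make the backward pass concrete, here is a minimal, self-contained sketch of reverse-mode accumulation over a topologically ordered graph. It is illustrative only: the `Node` class, its `parents` list of (input, local derivative) pairs, and the field names are assumptions for this sketch, not scalagrad's internals.

```scala
// Minimal sketch of reverse-mode accumulation over a topological order.
// `Node` is a stand-in for an autodiff graph node, not scalagrad's real class.
class Node(var value: Double, val parents: Seq[(Node, Double)] = Seq.empty) {
  // `parents` pairs each input node with the local derivative d(this)/d(parent).
  var grad: Double = 0.0
}

def backward(output: Node): Unit = {
  // Topological sort via depth-first search: a node is appended only after
  // all of its inputs have been appended.
  val ordered = scala.collection.mutable.ArrayBuffer.empty[Node]
  val visited = scala.collection.mutable.Set.empty[Node]
  def dfs(n: Node): Unit =
    if (visited.add(n)) {
      n.parents.foreach { case (p, _) => dfs(p) }
      ordered += n
    }
  dfs(output)

  // Seed the output gradient, then propagate adjoints in reverse order.
  output.grad = 1.0
  ordered.reverseIterator.foreach { n =>
    n.parents.foreach { case (p, localGrad) => p.grad += n.grad * localGrad }
  }
}
```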

## Build

Build with `./gradlew build`.

## Examples

This section contains two basic examples of scalagrad in action. More can be found under `tests`.

### Differentiation

The code below shows how to compute the derivatives of the expression `z = x * y + sin(x)` with respect to its inputs.

```scala
import scalagrad.Var

val x = Var(0.5)
val y = Var(4.2)
val z = x * y + sin(x)

z.backward()
println(x.grad) // dz/dx = y + cos(x)
println(y.grad) // dz/dy = x
```
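For these inputs the backward pass should yield `x.grad = y + cos(x) = 4.2 + cos(0.5) ≈ 5.078` and `y.grad = x = 0.5`.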

### Training a neural network

The code below shows how to train a two-layer neural network to solve the Friedman1 regression problem.

```scala
import scalagrad.Var
import scalagrad.dataset.Friedman1
import scalagrad.loss.MSE
import scalagrad.network.Perceptron
import scalagrad.optim.GradientDescent // package path assumed

val randomState = Some(12345)
val epochs = 100 // number of training passes over the dataset
val dataset = new Friedman1(numberOfSamples = 100, randomState = randomState)
val model = Perceptron(2, List(16, 16, 1), randomState = randomState) // A 2-layer neural network
val optim = GradientDescent(model.parameters.toSeq)

val X = dataset.X.map(v => v.map(f => Var(f)))
val y = dataset.y.map(y => Var(y))

(0 until epochs).foreach { _ =>
  val output = X.map(x => model(x))
  val loss = MSE(y, output)
  // L2 regularization on the model parameters
  val regularization = model.parameters.map(p => p ** 2).reduce(_ + _) * Var(1e-4)
  val lossFunction = loss + regularization

  optim.zeroGrad()
  lossFunction.backward()
  optim.step()
}
```
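The optimizer's role in the loop above is simple: zero out every parameter's accumulated gradient before the backward pass, then move each parameter against its gradient. The sketch below illustrates that contract; it is not scalagrad's actual `GradientDescent`, and the `Param` type with mutable `value`/`grad` fields is an assumption for illustration.

```scala
// Illustrative gradient-descent optimizer; the Param type and its mutable
// `value` and `grad` fields are assumptions, not scalagrad's real API.
final case class Param(var value: Double, var grad: Double = 0.0)

class SimpleGradientDescent(parameters: Seq[Param], learningRate: Double = 0.01) {
  // Reset accumulated gradients before the next backward pass.
  def zeroGrad(): Unit = parameters.foreach(_.grad = 0.0)

  // Move each parameter one step in the direction of steepest descent.
  def step(): Unit = parameters.foreach(p => p.value -= learningRate * p.grad)
}
```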

## References