Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Added/New

Fixed/Removed

Internal

2.0.1 - 2024-10-25

Minor bug fixes and documentation polishing.

Added/New

Fixed/Removed

  • Deprecate Python 3.8 as it will reach its end of life in October 2024 (PR)

  • Improve intersphinx mapping to curvlinops objects (issue, PR)

Internal

  • Update GitHub action versions and cache pip (PR)

  • Re-activate Monte-Carlo tests, refactor, and reduce their run time (PR)

  • Add more matrices in visual tour code example and prettify plots (PR)

  • Prettify visualizations in spectral density example (PR)

2.0.0 - 2024-08-15

This major release is almost fully backward compatible with the 1.x.y releases; the only exception is one API change in KFACLinearOperator. Most notably, it adds support for HuggingFace LLMs, ships a linear operator for the inverse of KFAC, and offers many performance improvements.

Breaking changes to 1.x.y

  • Remove loss_average argument from KFACLinearOperator (PR)

Added/New

  • Support HuggingFace LLMs and provide an example (PR)

  • Add linear operator for the inverse of KFAC (KFACInverseLinearOperator) (PR)

    • Support exact and heuristic damping of the Kronecker factors when inverting (PR)

    • Add option to fall back to double precision if inversion fails in single precision (PR)

    • Add functionality to checkpoint a linear operator (PR)

  • Add estimation method for the squared Frobenius norm of a linear operator (PR)

    • Improve efficiency (PR)

  • Add support for BCEWithLogitsLoss in FisherMCLinearOperator and KFACLinearOperator (PR)

  • Improvements to KFACLinearOperator

    • Add functionality to compute exact trace, determinant, log determinant, and Frobenius norm of KFACLinearOperator (PR)

    • Add option to compute input-based curvature, known as FOOF/ISAAC (PR)

    • Compute KFAC matrices without overwriting values in .grad (PR)

    • Add functionality to checkpoint a linear operator (PR)

  • Add inverse linear operator LSMRInverseLinearOperator to multiply by solving a least-squares system with LSMR (PR)

  • Improve linear operator interface

    • Add num_data argument to manually specify number of data points in a data loader and avoid one pass through the data (PR)

    • Support block-diagonal approximations in HessianLinearOperator via a new block_sizes argument (PR)

  • Add option to multiply with KFAC and its inverse purely in PyTorch (PR)

  • Improve performance when multiplying linear operators onto a matrix (PR)

  • Improve performance of EFLinearOperator (PR1 PR2) and FisherMCLinearOperator (PR1 PR2)

  • Implement adjoint of SubmatrixLinearOperator (PR)
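The idea behind an LSMR-based inverse, multiplying by the inverse by solving a least-squares system, can be sketched with SciPy alone. This is a minimal illustration and not the library's implementation: the helper name `lsmr_inverse` and the tolerances are assumptions made for this example.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, aslinearoperator, lsmr


def lsmr_inverse(A, atol=1e-12, btol=1e-12):
    """Hypothetical helper: return an operator whose matvec solves A x = b with LSMR."""
    A = aslinearoperator(A)
    maxiter = 10 * max(A.shape)  # generous iteration budget for this sketch

    def matvec(b):
        # lsmr returns a tuple; its first entry is the least-squares solution.
        b = np.asarray(b, dtype=float).ravel()
        return lsmr(A, b, atol=atol, btol=btol, maxiter=maxiter)[0]

    # The inverse maps from the output space of A back to its input space.
    return LinearOperator(shape=A.shape[::-1], matvec=matvec, dtype=A.dtype)


A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
A_inv = lsmr_inverse(A)
x = A_inv.matvec(b)  # approximately solves A x = b
```

Because the inverse is never formed explicitly, each multiplication costs one iterative solve, which is the trade-off such an operator makes for large matrices.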

Fixed/Removed

  • Device error of random number generator for FisherMCLinearOperator and KFACLinearOperator when running on GPU (PR)

  • Broken parameter mapping for KFAC when loading a linear operator to a different device (PR)

  • Device errors in tests (PR)

  • Scaling issue for Fisher matrices and KFAC for model outputs with more than two dimensions and mean reduction (issue, PR1, PR2, PR3)

  • Bug introduced by the switch to Enums (PR)

  • Fix output shapes of KFAC's matvec for convolution weights (PR)

Internal

  • Use latest black (black==24.1.1) (PR)

  • Use module names instead of tensor addresses to identify parameters in KFAC (PR)

  • Include links to source code in the documentation (PR)

  • Run GitHub actions for pull requests to any branch (PR)

  • Deprecate pkg_resources (PR)

  • Migrate from setup.py to pyproject.toml (PR)

1.2.0 - 2024-01-12

This release ships with many new features and requires PyTorch 2:

Added/New

  • Linear operator for KFAC (Kronecker-Factored Approximate Curvature) with support for a broad range of options

    • Prototype (torch.nn.MSELoss and torch.nn.Linear) (PR)

    • Support with torch.nn.CrossEntropyLoss (PR)

    • Support empirical Fisher (using gradients from data distribution) (PR) and type-2 estimation (using columns from the Hessian's matrix square root) (PR)

    • Support arbitrary parameter order (PR), weight-only or bias-only layers (PR), and support treating weight and bias jointly (PR)

    • Support networks with in-place activations (PR)

    • Support models with >2d output (PR)

    • Support KFAC 'expand' and 'reduce' approximations for general weight-sharing layers (PR, paper)

    • Support torch.nn.Conv2d (PR)

  • Linear operator for taking sub-matrices of another linear operator (PR, example (PR))

  • Linear operator for approximate inversion via the Neumann series (PR, example (PR))

  • Linear operator for a neural network's output-parameter Jacobian (PR) and its transpose (PR)

  • Implement adjoint from scipy.sparse.linalg.LinearOperator interface (PR)

  • Example for Fisher-weighted model averaging (PR)

  • Trace estimation via vanilla Hutchinson (PR)

  • Trace estimation via Hutch++ (PR)

  • Diagonal estimation via Hutchinson (PR)

  • Experimental: Linear operator for the Hessian of the loss w.r.t. an intermediate feature (PR)
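For readers unfamiliar with the Hutchinson estimators listed above, the following is a minimal NumPy sketch of the underlying technique. It only assumes access to matrix-vector products; the function names `hutchinson_trace` and `hutchinson_diag` are made up for this illustration and are not the library's API.

```python
import numpy as np


def hutchinson_trace(matvec, dim, num_samples=1000, seed=0):
    """Estimate trace(A) as the average of v^T A v over Rademacher vectors v."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(num_samples):
        v = rng.choice([-1.0, 1.0], size=dim)  # random +/-1 entries
        total += v @ matvec(v)
    return total / num_samples


def hutchinson_diag(matvec, dim, num_samples=1000, seed=0):
    """Estimate diag(A) as the average of v * (A v) over Rademacher vectors v."""
    rng = np.random.default_rng(seed)
    estimate = np.zeros(dim)
    for _ in range(num_samples):
        v = rng.choice([-1.0, 1.0], size=dim)
        estimate += v * matvec(v)
    return estimate / num_samples


A = np.array([[2.0, 1.0], [1.0, 3.0]])
trace_estimate = hutchinson_trace(lambda v: A @ v, dim=2)
diag_estimate = hutchinson_diag(lambda v: A @ v, dim=2)
```

Both estimators are unbiased, and their error shrinks as the number of samples grows. Hutch++ reduces the variance further by first projecting onto a low-rank subspace; that refinement is not shown here.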

Fixed/Removed

  • Allow for partially specified boundaries of the spectrum inside the spectral density estimation methods and only estimate the missing boundary (PR)

  • Deprecate Python 3.7 (PR)

  • For future releases, we will abandon the development branch and switch to a workflow where new features are directly merged into main.

Internal

  • Switch from functorch to torch.func in reference implementation of tests (PR)

1.1.0 - 2023-02-19

Adds various new features:

Added/New

  • Inverses of linear operators with multiplication via conjugate gradients (PR, example)

  • Spectral density estimation methods from papyan2020traces (PR, basic example)

    • Add caching to recycle Lanczos iterations between densities with different hyperparameters (PR, demo 1, demo 2)

  • Example visualizing different supported curvature matrices (PR, example)

  • Linear operator for the uncentered gradient covariance matrix (aka 'empirical Fisher') (PR)

  • Example for computing eigenvalues with scipy.sparse.linalg.eigsh (PR, example)

  • Linear operator for a Monte-Carlo approximation of the Fisher (PR1, PR2, example)
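The eigenvalue and conjugate-gradient features above both rely on SciPy's matrix-free interface. The sketch below shows that pattern on a small synthetic matrix. It is an illustration under stated assumptions: the operator `H` is built by hand from a random symmetric positive definite matrix `M` and stands in for a curvature operator, and none of the names come from the library.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg, eigsh

rng = np.random.default_rng(0)
B = rng.standard_normal((20, 20))
M = B @ B.T + 20.0 * np.eye(20)  # synthetic symmetric positive definite matrix

# Matrix-free stand-in for a curvature operator: only matvecs are exposed.
H = LinearOperator(shape=M.shape, matvec=lambda v: M @ v, dtype=M.dtype)

# Three largest eigenvalues via Lanczos iterations.
top_evals = eigsh(H, k=3, which="LA", return_eigenvectors=False)

# Multiply by the inverse by solving H x = b with conjugate gradients.
b = rng.standard_normal(20)
x, info = cg(H, b)  # info == 0 signals convergence
```

Conjugate gradients requires a symmetric positive definite operator, which is why the example adds a multiple of the identity; curvature matrices that are only positive semi-definite typically need a similar damping term.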

Fixed/Removed

Internal

  • Refactor examples, extracting common functorch and array comparison methods (PR)

  • Add description of the library on the RTD landing page (PR)

  • Set up a proper test suite with cases (PR)

    • Add regression test cases (PR)

  • Update code to latest versions of linting CI (PR)

1.0.0 - 2022-09-30

Initial release