Releases: chainer/chainermn
ChainerMN 1.3.1, the last release
Important notice
This will be the last release of ChainerMN as an independent Python
package, but its maintenance will continue as a part of Chainer. The
latest code has been merged into the Chainer v5 release candidate. This
release is for Chainer v4.x and ChainerMN v1.3 users to get recent
bug fixes and enhancements.
enhancement
- Improve performance of fetching device memory (#270, thanks @levelfour!)
- Reduce CUDA kernel launch in BN (updated) (#282)
- Bump version and not allow Chainer 5.x (#296)
bug
- Bugfix bcast for FP16 (#271)
- bugfix bcast (#288)
- Override Optimizer.setup() method in multi-node optimizers (#292)
- Fix errors on 0-d array input to Communicator APIs (#293)
document
- Workaround forkserver (#290)
- Modify image of parallel convolution (#294, thanks @levelfour!)
experimental feature
- add mnbn with nccl (#289)
test
- Travis update (#279)
- added OMP_NUM_THREADS=1 (#284)
- Update Chainer version to 4.4.0 in .travis.yml (#286)
example
- Add parallel convolution example (#272, thanks @levelfour!)
v1.3.0
ChainerMN is a multi-node extension of the deep learning framework Chainer that adds scalability to over 1,000 GPUs. The 1.3.0 release adds several enhancements and bug fixes to 1.2, as well as support for the latest Chainer releases such as v4.0.0 and v4.1.0.
Notable enhancements are the optimization of PureNcclCommunicator for the double buffering optimizer and FP16 all-reduce support. With this version, ChainerMN can achieve high performance even in non-InfiniBand interconnect environments equipped with commodity network gear, or on cloud services such as Amazon Web Services.
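As a rough illustration of how these two enhancements are typically combined, the sketch below sets up a pure_nccl communicator with FP16 gradient all-reduce and a multi-node optimizer. It is a hedged sketch, not code from this release: the allreduce_grad_dtype keyword is an assumption based on the FP16 feature listed below, and SomeModel is a placeholder for a user-defined Chainer link.

```python
# Hedged sketch: the allreduce_grad_dtype keyword is assumed from the FP16
# all-reduce feature below, and SomeModel is a placeholder model.
import chainer
import chainermn

# NCCL-only communicator with (assumed) FP16 gradient all-reduce.
comm = chainermn.create_communicator('pure_nccl', allreduce_grad_dtype='float16')
device = comm.intra_rank  # one GPU per MPI process

model = chainer.links.Classifier(SomeModel())
chainer.cuda.get_device_from_id(device).use()
model.to_gpu()

# Wrap a regular optimizer so gradients are all-reduced across processes.
optimizer = chainermn.create_multi_node_optimizer(
    chainer.optimizers.MomentumSGD(), comm)
optimizer.setup(model)
```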
Features
- Expose intra- and inter- rank and size (#263)
- Add allreduce method to communicator interface with implementation (#237)
- Add FP16 and FP64 support to PureNcclCommunicator (#187)
- Add MultiNodeIterator as experimental (#186)
Enhancements
- Remove unused nccl comm and mpi comm (#257)
- Update supported Chainer versions (#223, #238)
- Expose CommunicatorBase as communicator interface with docs (#235)
- Clean up Communicator interface with changes (#232)
- Replace get_device (#231)
- Optimize PureNcclCommunicator to accelerate training with double buffering (#216)
Bugs
- Fix MultiNodeNStepRNN to use Chainer n_cells (#222)
- Fix send to avoid deadlock when inputs do not require grad (#214)
- Check contiguousness of outgoing arrays (#213)
Documents
v1.2.0
This is the release of ChainerMN v1.2.0. The highlighted differences are as follows: compatibility with Chainer v3.3.0 and v4.0.0b3, a double buffering feature to overlap communication and computation, and removal of the dependency on Cython.
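A minimal sketch of how the double buffering feature might be enabled is shown below; the double_buffering keyword and the surrounding setup are assumptions drawn from the description above, not a verified snippet from this release.

```python
# Hedged sketch: the double_buffering keyword is assumed from the feature
# description above; the linear model is only a placeholder.
import chainer
import chainermn

comm = chainermn.create_communicator('pure_nccl')
model = chainer.links.Linear(784, 10)  # placeholder model

# Overlap gradient all-reduce with backward computation (double buffering).
optimizer = chainermn.create_multi_node_optimizer(
    chainer.optimizers.Adam(), comm, double_buffering=True)
optimizer.setup(model)
```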
List of Changes
Features
Enhancement
Bugs
- Fix bugs in DoubleBufferingOptimizer and PureNcclCommunicator (#201)
Document
Tests
- Add Chainer 4.0.0b to Travis (#193)
- Adds test for Chainer 3.3 (#190)
- Fix import in some test cases (#184)
- Fix importing errors in unit tests (#178)
Installation
v1.1.0
ChainerMN 1.1.0 release notes
ChainerMN is a multi-node extension of the deep learning framework Chainer that adds scalability to over 1,000 GPUs. The 1.1.0 release is a minor update that adds several enhancements and bug fixes to 1.0, and supports the latest Chainer release.
New experimental features include multi-node checkpointing and resuming. This release also brings several enhancements to dataset distribution and support for dynamically changing networks. It adds support for the latest Chainer 3.2.0 and drops support for older Chainer versions such as the 1.x and 2.x series. Also, the pure_nccl communicator is now generally available and is the recommended communicator.
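As a rough picture of how the experimental checkpointing and resuming feature could be wired into an existing training loop, a hedged sketch follows; the create_multi_node_checkpointer name and its arguments are assumptions drawn from the feature list below, not a verified signature.

```python
# Hedged sketch of the experimental multi-node checkpointing feature; the
# helper name and arguments are assumptions, not a verified API.
import chainermn


def setup_checkpointing(trainer, optimizer, comm):
    # Create a checkpointer shared across all workers (assumed helper name).
    checkpointer = chainermn.create_multi_node_checkpointer(
        name='example_run', comm=comm)
    # Resume from existing snapshots if any are found.
    checkpointer.maybe_load(trainer, optimizer)
    # Take distributed snapshots periodically during training.
    trainer.extend(checkpointer, trigger=(1000, 'iteration'))
```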
bugfix
enhancement
- Support a wider range of dynamically initialized models for MultiNodeOptimizer (#148)
- Remove outdated cudnn variable to make compatible with CuPy v4 (#147, thanks @tkerola!)
- Avoid sending SubDataset and use broadcast for datasets (#140)
- Support tuple data communication (#139)
- Chainer v3 support (#123)
feature
- pure_nccl communicator is now generally available (#165)
- Add simple and distributed checkpointing and automatic recovery (#144)
- Support all-to-all (#135)
document
- Update supported Chainer version in the document (#162)
installation
- Update docs and add cupy as requirement (#171)
example
test
- Fix a bug of point to point with GPU (#174)
- Pass unit tests more than 3 processes (#172)
- Refactor test directory structure to align Chainer's test dir (#169)
- Move from nose to pytest (#167)
- Refactor tests directory (#155)
- Reduce the number of procs of MPI test for robust CI (#136)
- Add Chainer v3 Test to Travis CI (#141)
other
v1.0.0
This is ChainerMN v1.0.0, the first stable version. This version includes several new features, including NCCL2 support, model parallelism, and new examples.
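To give a feel for one of the new features, here is a hedged sketch of a small model using MultiNodeBatchNormalization, which aggregates batch statistics across all workers; the constructor arguments shown are assumptions, and the network itself is only illustrative.

```python
# Hedged sketch of MultiNodeBatchNormalization usage; constructor arguments
# are assumptions and the network is a toy example.
import chainer
import chainer.functions as F
import chainer.links as L
import chainermn


class SmallCNN(chainer.Chain):
    def __init__(self, comm):
        super(SmallCNN, self).__init__()
        with self.init_scope():
            self.conv = L.Convolution2D(None, 16, ksize=3)
            # Batch normalization whose statistics are all-reduced over `comm`.
            self.bn = chainermn.links.MultiNodeBatchNormalization(16, comm)
            self.fc = L.Linear(None, 10)

    def __call__(self, x):
        h = F.relu(self.bn(self.conv(x)))
        return self.fc(h)
```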
List of Changes
Features
- NCCL2 support (#105)
- MultiNodeBatchNormalization (#106)
- Model parallel interface
- DatasetSizeError (#111)
- Non-CUDA-aware communicator (#93)
- shuffle option to chainermn.scatter_dataset (#92)
Enhancement
- Refactor directories and files (#117)
- Adding comments (#107)
- Clear names for functions and variables (#103)
Examples
- Dcgan example (#99, thanks @corochann!)
- Seq2seq example (#63)
- Model-parallel MNIST example (#98)
Documents
- ChainerMN logo (#110)
- Mention sudo's env-var issue in the installation document (#87)
- Mention --gpu option in the MNIST tutorial (#85)
- Refactored API reference (#118)
- Minor fixes (#116, #90, #86)
Bug Fixes
Tests
v1.0.0b2
This is the second beta release of ChainerMN 1.0.0.
This release includes a minor API update and several bug fixes.
In addition, we confirm that ChainerMN works fine with Chainer v2.0.0, which was released on June 1st.
API change: chainermn.get_epoch_trigger has been marked as deprecated.
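Since sub-dataset sizes are now equalized (see #61 below), a plain epoch-based trigger can typically replace the deprecated helper. The sketch below shows this assumed migration; it is not text from the release notes.

```python
# Assumed migration sketch: with equalized sub-dataset sizes, an ordinary
# (1, 'epoch') trigger replaces chainermn.get_epoch_trigger.
from chainer.training import extensions


def add_reports(trainer):
    # `trainer` is a chainer.training.Trainer built as in the examples.
    trainer.extend(extensions.LogReport(trigger=(1, 'epoch')))
```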
Complete list of changes
bug
- Fix typo (#75)
- Fix assert for cases when use_nccl == False (#68)
- Fix bug of LogReport in MNIST example (#58)
enhancement
- Add --communicator option to MNIST example (#69)
- Add a base class for ChainerMN communicators (#65)
- Equalize subdataset sizes and deprecate chainermn.get_epoch_trigger (#61)
document
v1.0.0b1
This is the first beta release of ChainerMN! It enables distributed training with Chainer (both v1 and v2) based on the basic synchronized data-parallel approach. Specifically, it includes:
- Optimizer and evaluator wrappers for distributed training and evaluation
- Dataset utility functions
- Examples: MNIST and ImageNet
- Installation guide, tutorial, and API reference
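To make the data-parallel workflow above concrete, here is a hedged, MNIST-style sketch that combines the listed pieces (communicator, scattered dataset, optimizer and evaluator wrappers). It is a simplified illustration loosely modeled on the bundled examples, not the exact example code; the model and dataset handling are placeholders.

```python
# Hedged, simplified data-parallel sketch loosely modeled on the MNIST example.
import chainer
import chainer.links as L
from chainer.training import extensions
import chainermn

comm = chainermn.create_communicator()   # default MPI-based communicator
device = comm.intra_rank                 # one GPU per MPI process

model = L.Classifier(L.Linear(784, 10))  # placeholder model
chainer.cuda.get_device_from_id(device).use()
model.to_gpu()

# Rank 0 loads the dataset; equal shards are scattered to every worker.
if comm.rank == 0:
    train, test = chainer.datasets.get_mnist()
else:
    train, test = None, None
train = chainermn.scatter_dataset(train, comm)
test = chainermn.scatter_dataset(test, comm)

train_iter = chainer.iterators.SerialIterator(train, batch_size=100)
test_iter = chainer.iterators.SerialIterator(test, batch_size=100,
                                             repeat=False, shuffle=False)

# Optimizer and evaluator wrappers for distributed training and evaluation.
optimizer = chainermn.create_multi_node_optimizer(chainer.optimizers.Adam(), comm)
optimizer.setup(model)

updater = chainer.training.StandardUpdater(train_iter, optimizer, device=device)
trainer = chainer.training.Trainer(updater, (20, 'epoch'))
trainer.extend(chainermn.create_multi_node_evaluator(
    extensions.Evaluator(test_iter, model, device=device), comm))
trainer.run()
```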