Releases: chainer/chainermn

ChainerMN 1.3.1, the last release

23 Oct 08:06
v1.3.1
2bc2895

Important notice

This will be the last release of ChainerMN as an independent Python
package, but its maintenance will continue as part of Chainer. The
latest code has been merged into the Chainer v5 release candidate. This
release lets Chainer v4.x and ChainerMN v1.3 users pick up recent
bug fixes and enhancements.

enhancement

  • Improve performance of fetching device memory (#270, thanks @levelfour!)
  • Reduce CUDA kernel launch in BN (updated) (#282)
  • Bump version and not allow Chainer 5.x (#296)

bug

  • Bugfix bcast for FP16 (#271)
  • bugfix bcast (#288)
  • Override Optimizer.setup() method at multi-node optimizers (#292)
  • Fix errors on 0-d array input to Communicator APIs (#293)

document

  • Workaround forkserver (#290)
  • Modify image of parallel convolution (#294, thanks @levelfour!)

experimental feature

  • add mnbn with nccl (#289)

test

  • Travis update (#279)
  • added OMP_NUM_THREADS=1 (#284)
  • Update Chainer version to 4.4.0 in .travis.yml (#286)

v1.3.0

25 May 07:37

ChainerMN is a multi-node extension of the deep learning framework Chainer that adds scalability to over 1,000 GPUs. The 1.3.0 release adds several enhancements and bug fixes to 1.2 and supports the latest Chainer releases, such as v4.0.0 and v4.1.0.

Notable enhancements are the optimization of PureNcclCommunicator for the double-buffering optimizer and FP16 all-reduce support. With this version, ChainerMN can achieve high performance in non-InfiniBand interconnect environments equipped with commodity network gear, and on cloud services such as Amazon Web Services.
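
As a rough sketch of how the FP16 all-reduce is enabled on the NCCL-only communicator (the allreduce_grad_dtype keyword below is my reading of the v1.3 API and should be checked against the documentation):

    import numpy as np
    import chainermn

    # PureNcclCommunicator with gradients all-reduced in FP16 to roughly halve
    # network traffic (keyword name assumed from the v1.3 docs; verify before use).
    comm = chainermn.create_communicator('pure_nccl', allreduce_grad_dtype=np.float16)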

Features

  • Expose intra- and inter- rank and size (#263)
  • Add allreduce method to communicator interface with implementation (#237)
  • Add FP16 and FP64 support to PureNcclCommunicator (#187)
  • Add MultiNodeIterator as experimental (#186)

Enhancements

  • Remove unused nccl comm and mpi comm (#257)
  • Update supported Chainer versions (#223, #238)
  • Expose CommunicatorBase as communicator interface with docs (#235)
  • Clean up Communicator interface with changes (#232)
  • Replace get_device (#231)
  • Optimize PureNcclCommunicator to accelerate training with double buffering (#216)

Bugs

  • Fix MultiNodeNStepRNN to use Chainer n_cells (#222)
  • Fix send to avoid deadlock when inputs do not require grad (#214)
  • Check contiguousness of outgoing arrays (#213)

Documents

  • Add note on checkpoints and fix broken autofunction (#264)
  • Expose intra- and inter- rank and size (#263)
  • Update the description about using FP16 (#262)
  • Added an FAQ entry about MPI hang issue. (#249)
  • Expose CommunicatorBase as communicator interface with docs (#235)

v1.2.0

06 Feb 05:13
530799c

This is the release of ChainerMN v1.2.0. The highlighted differences are: compatibility with Chainer v3.3.0 and v4.0.0b3, a double buffering feature that overlaps communication with computation, and removal of the dependency on Cython.
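
As a hedged sketch of the two additions used together, device selection via intra_rank and the double_buffering flag on the optimizer wrapper (the keyword and attribute names are my reading of the v1.2 API):

    import chainer
    import chainermn

    comm = chainermn.create_communicator('pure_nccl')       # double buffering targets PureNcclCommunicator
    chainer.cuda.get_device_from_id(comm.intra_rank).use()  # one GPU per process within a node (#177)

    # Overlap the gradient all-reduce with the next iteration's computation (#134).
    optimizer = chainermn.create_multi_node_optimizer(
        chainer.optimizers.Adam(), comm, double_buffering=True)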

List of Changes

Features

  • Add intra_rank to all communicators (#177)
  • Support double buffering (#134)

Enhancement

  • Use nccl in CuPy (#181)
  • Change MultiNodeEvaluator to object patching from proxy pattern (#179)

Bugs

  • Fix bugs in DoubleBufferingOptimizer and PureNcclCommunicator (#201)

Document

  • Fixed typos (#197)
  • Fix typo (#182)
  • Add reference of ChainerMN paper (#180)

Tests

  • Add Chainer 4.0.0b to Travis (#193)
  • Adds test for Chainer 3.3 (#190)
  • Fix import in some test cases (#184)
  • Fix importing errors in unit tests (#178)

Installation

  • Delete files related to cython (#202)
  • Bump library versions, esp. Chainer to 3.3 (#200)
  • Update version to 1.2.0 (#199)
  • Fix Chainer version requirement (#198)
  • Remove cython dependency on installation (#44)

v1.1.0

22 Dec 09:18
cab94c9

ChainerMN 1.1.0 release notes

ChainerMN is a multi-node extension of the deep learning framework Chainer that adds scalability to over 1,000 GPUs. The 1.1.0 release is a minor update that adds several enhancements and bug fixes to 1.0 and supports the latest Chainer release.

New experimental features include multi-node checkpointing and resuming. This release also brings several enhancements to dataset distribution and support for dynamically changing networks. It adds support for the latest Chainer release, 3.2.0, and drops support for older Chainer versions such as the 1.x and 2.x series. Also, the pure_nccl communicator is now generally available and is the recommended communicator.
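
A rough sketch of the experimental checkpointing and resuming API described above; create_multi_node_checkpointer and maybe_load are my reading of the v1.1 interface, and trainer and optimizer are assumed to come from the usual ChainerMN setup (see the data-parallel sketch under v1.0.0b1 below):

    import chainermn

    comm = chainermn.create_communicator('pure_nccl')   # pure_nccl is now generally available

    # trainer and optimizer are assumed to be built with the usual Chainer/ChainerMN setup.
    checkpointer = chainermn.create_multi_node_checkpointer(name='example_job', comm=comm)
    checkpointer.maybe_load(trainer, optimizer)          # resume from the latest snapshot, if any (#144)
    trainer.extend(checkpointer, trigger=(1000, 'iteration'))
    trainer.run()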

bugfix

  • Fix array length bug of PureNcclCommunicator (#127)
  • fix setup.py (#119)

enhancement

  • Support a wider range of dynamically initialized models for MultiNodeOptimizer (#148)
  • Remove outdated cudnn variable to make compatible with CuPy v4 (#147, thanks @tkerola!)
  • Avoid sending SubDataset and use broadcast for datasets (#140)
  • Support tuple data communication (#139)
  • Chainer v3 support (#123)

feature

  • pure_nccl communicator is now generally available (#165)
  • Add simple and distributed checkpointing and automatic recovery (#144)
  • Support all-to-all (#135)

document

  • Update supported Chainer version in the document (#162)

installation

  • Update docs and add cupy as requirement (#171)

example

  • model-parallel seq2seq example (#122)
  • Dual parallel example (#121)

test

  • Fix a bug of point to point with GPU (#174)
  • Pass unit tests more than 3 processes (#172)
  • Refactor test directory structure to align Chainer's test dir (#169)
  • Move from nose to pytest (#167)
  • Refactor tests directory (#155)
  • Reduce the number of procs of MPI test for robust CI (#136)
  • Add Chainer v3 Test to Travis CI (#141)

other

  • Add chainer.utils.experimental to create_multi_node_n_step_rnn (#153)
  • Add chainer.utils.experimental to distributed_cpr (#152)
  • Update README (cache information for seq2seq) (#126)

v1.0.0

01 Sep 00:46

This is ChainerMN v1.0.0, the first stable version. This version includes several new features, including NCCL2 support, model parallelism, and new examples.
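
As a small illustration of the new shuffle option on the dataset scattering utility (#92), following the usual pattern of loading the dataset on rank 0 only (the dataset choice here is illustrative):

    import chainer
    import chainermn

    comm = chainermn.create_communicator('hierarchical')

    # Load on rank 0 only, then scatter shuffled, equally sized shards to every worker.
    train = chainer.datasets.get_mnist()[0] if comm.rank == 0 else None
    train = chainermn.scatter_dataset(train, comm, shuffle=True)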

List of Changes

Features

  • NCCL2 support (#105)
  • MultiNodeBatchNormalization (#106)
  • Model parallel interface
  • DatasetSizeError (#111)
  • Non-CUDA-aware communicator (#93)
  • shuffle option to chainermn.scatter_dataset (#92)

Enhancement

  • Refactor directories and files (#117)
  • Adding comments (#107)
  • Clear names for functions and variables (#103)

Examples

  • Dcgan example (#99, thanks @corochann!)
  • Seq2seq example (#63)
  • Model-parallel MNIST example (#98)

Documents

  • ChainerMN logo (#110)
  • Mention sudo's env-var issue in the installation document (#87)
  • Mention --gpu option in the MNIST tutorial (#85)
  • Refactored API reference (#118)
  • Minor fixes (#116, #90, #86)

Bug Fixes

  • None to the type specification (#109)
  • Fix imagenet models for v2 (#104, thanks @mitmul!)

Tests

  • Add MNIST test on GPU (#113)
  • Fix the false success on Travis-CI (#112)

v1.0.0b2

02 Jun 06:48
Pre-release

This is the second beta release of ChainerMN 1.0.0.
This release includes a minor API update and several bug fixes.
In addition, we have confirmed that ChainerMN works fine with Chainer v2.0.0, which was released on June 1st.

API change:

  • chainermn.get_epoch_trigger has been marked as deprecated.
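
The practical consequence, sketched below under the assumption that chainermn.scatter_dataset now hands every worker an equally sized shard (#61), is that an ordinary Chainer stop trigger can be passed to the trainer instead of calling chainermn.get_epoch_trigger:

    from chainer import training

    # updater is assumed to be built on a dataset distributed with chainermn.scatter_dataset;
    # since the shards have equal sizes, a plain epoch trigger behaves consistently on all workers.
    trainer = training.Trainer(updater, stop_trigger=(20, 'epoch'), out='result')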

Complete list of changes

bug

  • Fix typo (#75)
  • Fix assert for cases when use_nccl == False (#68)
  • Fix bug of LogReport in MNIST example (#58)

enhancement

  • Add --communicator option to MNIST example (#69)
  • Add a base class for ChainerMN communicators (#65)
  • Equalize subdataset sizes and deprecate chainermn.get_epoch_trigger (#61)

document

  • Fix broken search results on RTD (#57)
  • Fix broken link in search (#56)

v1.0.0b1

08 May 21:21
Pre-release

This is the first beta release of ChainerMN! It enables distributed training with Chainer (both v1 and v2) based on the basic synchronized data-parallel approach. Specifically, it includes:

  • Optimizer and evaluator wrappers for distributed training and evaluation
  • Dataset utility functions
  • Examples: MNIST and ImageNet
  • Installation guide, tutorial, and API reference
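
For orientation, here is a minimal data-parallel training sketch built from the wrappers and utilities listed above. It follows the shape of the MNIST example; the toy model, communicator name, and hyperparameters are illustrative, not prescriptive:

    import chainer
    import chainer.links as L
    from chainer import training
    from chainer.training import extensions
    import chainermn

    comm = chainermn.create_communicator('hierarchical')   # communicator choice depends on the cluster
    device = comm.intra_rank                                # one GPU per process within a node
    chainer.cuda.get_device_from_id(device).use()

    model = L.Classifier(L.Linear(784, 10))                 # toy model for illustration
    model.to_gpu()

    # Optimizer wrapper: all-reduces gradients across workers before each update.
    optimizer = chainermn.create_multi_node_optimizer(chainer.optimizers.Adam(), comm)
    optimizer.setup(model)

    # Dataset utilities: load on rank 0 only, then scatter equally sized shards.
    train, test = chainer.datasets.get_mnist() if comm.rank == 0 else (None, None)
    train = chainermn.scatter_dataset(train, comm)
    test = chainermn.scatter_dataset(test, comm)

    train_iter = chainer.iterators.SerialIterator(train, 100)
    test_iter = chainer.iterators.SerialIterator(test, 100, repeat=False, shuffle=False)

    updater = training.StandardUpdater(train_iter, optimizer, device=device)
    trainer = training.Trainer(updater, (20, 'epoch'), out='result')

    # Evaluator wrapper: aggregates validation results across all workers.
    evaluator = chainermn.create_multi_node_evaluator(
        extensions.Evaluator(test_iter, model, device=device), comm)
    trainer.extend(evaluator)
    trainer.run()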