
v1.3.0

Released by @shu65 on 25 May 07:37

ChainerMN is a multi-node extension of the deep learning framework Chainer that adds scalability to over 1,000 GPUs. The 1.3.0 release adds several enhancements and bug fixes over 1.2, as well as support for the latest Chainer releases such as v4.0.0 and v4.1.0.

Notable enhancements are the optimization of PureNcclCommunicator for the double-buffering optimizer and support for FP16 all-reduce. With this version, ChainerMN can achieve high performance even in environments without an InfiniBand interconnect, such as clusters built on commodity network gear or cloud services like Amazon Web Services.
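
The two can be combined roughly as follows. This is a minimal sketch assuming the allreduce_grad_dtype and double_buffering options described in the 1.3.0 documentation; the placeholder model is an illustrative assumption, not part of the release notes.

    # Sketch: FP16 gradient all-reduce plus the double-buffering optimizer.
    import numpy as np
    import chainer
    import chainermn

    model = chainer.links.Linear(784, 10)  # placeholder model (assumption)

    # Request FP16 for the gradient all-reduce on the pure NCCL communicator.
    comm = chainermn.create_communicator(
        'pure_nccl', allreduce_grad_dtype=np.float16)

    # Double buffering overlaps the gradient all-reduce with computation.
    optimizer = chainermn.create_multi_node_optimizer(
        chainer.optimizers.Adam(), comm, double_buffering=True)
    optimizer.setup(model)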

Features

  • Expose intra- and inter-node rank and size (#263; a usage sketch follows this list)
  • Add an allreduce method to the communicator interface, with an implementation (#237)
  • Add FP16 and FP64 support to PureNcclCommunicator (#187)
  • Add MultiNodeIterator as an experimental feature (#186)
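
A minimal sketch of the features above under a typical MPI launch (e.g. mpiexec -n 4 python train.py); the toy dataset and batch size are illustrative assumptions, and the exact reduction semantics of allreduce are as documented for the communicator.

    import chainer
    import chainermn
    import numpy as np

    comm = chainermn.create_communicator('pure_nccl')

    # Newly exposed rank/size attributes (#263): the position of this
    # process within its node, and of its node among all nodes.
    print(comm.intra_rank, comm.intra_size)
    print(comm.inter_rank, comm.inter_size)

    # New allreduce method on the communicator interface (#237);
    # reduces an ndarray element-wise across all processes.
    x = np.arange(4, dtype=np.float32)
    y = comm.allreduce(x)

    # Experimental MultiNodeIterator (#186): rank 0 drives the actual
    # iterator and its batches are delivered to every process.
    train = np.random.rand(100, 2).astype(np.float32)
    train_iter = chainermn.iterators.create_multi_node_iterator(
        chainer.iterators.SerialIterator(train, batch_size=10), comm)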

Enhancements

  • Remove unused NCCL and MPI communicator objects (#257)
  • Update the supported Chainer versions (#223, #238)
  • Expose CommunicatorBase as the communicator interface, with docs (#235; see the sketch after this list)
  • Clean up the communicator interface (#232)
  • Replace get_device (#231)
  • Optimize PureNcclCommunicator to accelerate training with double buffering (#216)
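
With the interface now exposed and documented (#235), downstream helpers can be written against CommunicatorBase rather than a concrete class. A minimal sketch, assuming CommunicatorBase is importable from the top-level chainermn namespace as of this release:

    import chainermn

    def sync_initial_params(comm, model):
        # Any concrete communicator (naive, hierarchical, pure_nccl, ...)
        # satisfies the documented CommunicatorBase interface.
        assert isinstance(comm, chainermn.CommunicatorBase)
        # Broadcast rank 0's model parameters to every process.
        comm.broadcast_data(model)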

Bugs

  • Fix MultiNodeNStepRNN to use Chainer's n_cells (#222)
  • Fix send to avoid a deadlock when no input requires grad (#214)
  • Check the contiguousness of outgoing arrays (#213)

Documents

  • Add note on checkpoints and fix broken autofunction (#264)
  • Expose intra- and inter-node rank and size (#263)
  • Update the description about using FP16 (#262)
  • Add an FAQ entry about an MPI hang issue (#249)
  • Expose CommunicatorBase as the communicator interface, with docs (#235)