Previous Versions

neon v2.5.0

Docs_

Optimized SSD MKL backend performance (~3X boost version over version)
Bumped aeon version to v1.3.0
Fixed inference performance issue of MKL batchnorm
Fixed batch prediction issue for gpu backend
Enabled subset_pct for MNIST_DCGAN example
Updated "make clean" to clean up mkl artifacts
Added dockerfile for IA mkl

neon v2.4.0

Docs_

Enabled pip install through pypi
Updated MKLML to version 20171007 with performance improve of ~3X for mnist datalayer/nondatalayer and ~1.6X for DCGAN/WGAN datalayer
Updated resnet model to optimize performance with MKLML 20171007
Updated Alexnet weight file and fixed bug for deep dream
Fixed faster-rcnn inference model loading issue
Added data_loading time measurement and enabled GAN networks benchmarking
Updated to Aeon version 1.2.0
Enabled neon build with mklEngine on Windows systems

neon v2.3.0

Docs_

Optimized DeepSpeech2 MKL backend performance (~7X improvement over the CPU backend)
Fused convolution and bias layer which significantly boosted AlexNet and VGG performance on Intel architectures with MKL backend
Made SSD and Faster-RNN use VGG weight files in new format
Fixed use of reset_cells hyperparameter
Fixed MKL backend bug for GAN and Faster-RCNN models

neon v2.2.0

Docs_

Update MKLML version 20170908 that fixes a bug related to data conversions)
Add SSD example for bounding box object detection that works for both GPU and MKL backend
Add DeepSpeech2 MKL backend optimization that features ~3X improvement
Update aeon to 1.0.0 including new version of manifest (doc/source/loading_data.rst#aeon-dataloader)
Add CHWD Support for Batch Normalization in mkl backend
Modify ResNet-50 model's last layer to match the original ResNet-50 model paper
Enable Seq2Seq testing and benchmarking

neon v2.1.0

neon v2.1.0 released August 2, 2017 supporting:

Set MKL backend (-b mkl) as the default CPU backend on Linux (use -b cpu to specify original CPU backend)
Update MKLML version 20170720 (AVX512 code paths enabled by default and conversion optimizations)
Simplify ResNet example
Makefiles now check for virtualenv and pkg-config (#383)
Fix Deep Speech2 model on MKL backend
Fix MKL installation for "make sysinstall"

neon v2.0.0

Docs_

neon v2.0.0 released June 27, 2017 supporting:

Added support for MKL backend (-b mkl) on Linux, which boosts neon CPU performance significantly
Added WGAN model examples for LSUN and MNIST data
Enabled WGAN and DCGAN model examples for Python3
Added fix (using file locking) to prevent race conditions running multiple jobs on the same machine with multiple GPUs
Added functionality to display some information about hardware, OS and model used
Updated appdirs to 1.4.3 to be compatibile on Centos 7.3 for appliance

neon v1.9.0

Docs_

neon v1.9.0 released May 3, 2017 supporting:

Add support for 3D deconvolution
Generative Adversarial Networks (GAN) implementation, and MNIST DCGAN example, following GoodFellow 2014 (http://arXiv.org/abs/1406.2661)
Implement Wasserstein GAN cost function and make associated API changes for GAN models
Add a new benchmarking script with per-layer timings
Add weight clipping for GDM, RMSProp, Adagrad, Adadelta and Adam optimizers
Make multicost an explicit choice in mnist_branch.py example
Enable NMS kernels to work with normalized boxes and offset
Fix missing links in api.rst [#366]
Fix docstring for --datatype option to neon [#367]
Fix perl shebang in maxas.py and allow for build with numpy 1.12 [#356]
Replace os.path.join for Windows interoperability [#351]
Update aeon to 0.2.7 to fix a seg fault on termination

neon v1.8.2

Docs_

neon v1.8.2 released February 23, 2017 supporting:

Make the whale calls example stable and shuffle dataset before splitting into subsets
Reduce default depth in cifar_msra example to 2
Fix the formatting of the conv layer description
Fix documentation error in the video-c3d example
Support greyscale videos

neon v1.8.1

Docs_

neon v1.8.1 released January 17, 2017 supporting:

Bug fix: Add dilation to object dict and assign defaults to dil_w = dil_h = 1 [#335, #336]
Bug fix: Prevent GPU backend from ignoring non-zero slope in Rectlinclip and change default slope to 0
Bug fix: Nesterov momentum was updating velocities incorrectly

neon v1.8.0

Docs_

neon v1.8.0 released December 28, 2016 supporting:

Skip Thought Vectors (http://arxiv.org/abs/1506.06726) example
Dilated convolution support
Nesterov Accelerated Gradient option to SGD optimizer
MultiMetric class to allow wrapping Metric classes
Support for serializing and deserializing encoder-decoder models
Allow specifying the number of time steps to evaluate during beam search
A new community-contributed Docker image
Improved error messages when a tensor is created with an invalid shape or reshaped to an incompatible size
Fix bugs in MultiCost support
Documentation fixes [#331]

neon v1.7.0

Docs_

neon v1.7.0 released November 11 2016 supporting:

Update Data Loader to aeon https://github.com/NervanaSystems/aeon
Add Neural Machine Translation model
Remove Fast RCNN model (use Faster RCNN model instead)
Remove music_genres example
Fix super blocking for small N with 1D conv
Fix update-direct conv kernel for small N
Add gradient clipping to Adam optimizer
Documentation updates and bug fixes

neon v1.6.0

Docs_

neon v1.6.0 released September 21 2016 supporting:

Faster RCNN model
Sequence to Sequence container and char_rae recurrent autoencoder model
Reshape Layer that reshapes the input [#221]
Pip requirements in requirements.txt updated to latest versions [#289]
Remove deprecated data loaders and update docs
Use NEON_DATA_CACHE_DIR envvar as archive dir to store DataLoader ingested data
Eliminate type conversion for FP16 for CUDA compute capability >= 5.2
Use GEMV kernels for batch size 1
Alter delta buffers for nesting of merge-broadcast layers
Support for ncloud real-time logging
Add fast_style Makefile target
Fix Python 3 builds on Ubuntu 16.04
Run setup.py for sysinstall to generate version.py [#282]
Fix broken link in mnist docs
Fix conv/deconv tests for CPU execution and fix i32 data type
Fix for average pooling with batch size 1
Change default scale_min to allow random cropping if omitted
Fix yaml loading
Fix bug with image resize during injest
Update references to the ModelZoo and neon examples to their new locations

neon v1.5.4

Docs_

neon v1.5.4 released July 15 2016 supporting:

Implement Binarized Neural Networks from http://arxiv.org/pdf/1602.02830v3.pdf
Bug fixes [#268]

neon v1.5.3

Docs_

neon v1.5.3 released July 7 2016 supporting:

Bug fixes [#267]

neon v1.5.2

Docs_

neon v1.5.2 released July 6 2016 supporting:

Bug fixes to audio loader

neon v1.5.1

Docs_

neon v1.5.1 released June 30 2016 supporting:

Bug fixes

neon v1.5.0

Docs_

neon v1.5.0 released June 29 2016 supporting:

Python2/Python3 compatibility [#191]
Support for Pascal GPUs
Persistent RNN kernels [#262]
Dataloader enhancements (audio loader with examples)
HDF5 file data iterator
Convolution kernel improvements
Winograd kernel for fprop/bprop and 5x5 stride 1 filters
API documentation improvements [#234, #244, #263]
Cache directory cleanup
Reorganization of all unit tests
Check for compatible shapes before doing a memcpy [#182, #183]
Bug fixes [#231, #241, #253, #257, #259]

neon v1.4.0

Docs_

neon v1.4.0 released Apr 29 2016 supporting:

VGG16 based Fast R-CNN model using winograd kernels
new, backward compatible, generic data loader
C3D video loader model trained on UCF101 dataset
Deep Dream example
make conv layer printout more informative [#222]
fix some examples to use new arg override capability
improve performance for relu for small N
better support for arbitrary batch norm layer placement
documentation updates [#210, #213, #236]

neon v1.3.0

Docs_

neon v1.3.0 released Mar 3 2016 supporting:

Winograd kernels and associated autotuning routines
benchmarking scripts
deprecation of deterministic argument for backend constructor
improve batch norm stability with fp16 backend
allow strided support for dimshuffle kernel
speed up zero momentum gradient descent

neon v1.2.2

Docs_

neon v1.2.2 released Feb 24 2016 supporting:

Benchmarking enhancements
fast dimshuffle, transpose, other kernel speedups and refactoring
batch norm states fix, deterministic updates
example fixes for fast rcnn and conv_autoencoder
image decoding rescaling method fix
deserialization fixes for RNN's, refactoring
caffe compatibility fixes
documentation updates

neon v1.2.1

Docs_

neon v1.2.1 released Feb 15 2016 supporting:

New MergeSum, Colornoise layers
support for aspect_ratio scaling augmentation
updated IMDB sentiment analysis example
generic CSV batchwriter
various build and deserialization bugfixes, doc updates

neon v1.2.0

Docs_

neon v1.2.0 released Jan 31 2016 supporting:

Kepler GPU kernel support [#80]
new dataloader format, updated docs [#115, #170]
new serialization format
FastRCNN implementation, ROI pooling support [#135]
deep residual nets implementation and example
expanded model zoo
Ticker dataset and copy, repeat copy tasks
autodiff transpose support [#173]
numerous bug fixes and documentation updates.

neon v1.1.5

Docs_

neon v1.1.5 released Jan 15 2016 supporting:

CUDA kernels for lookuptable layer (up to 4x speedup)
support for determinstic Conv layer updatesa
LRN layer support
custom dataset walkthrough utilizing bAbI data
reduced number of threads in deep reduction EW kernels [#171]
additional (de)serialization routines [#106]
CPU tensor slicing fix
corrections for PrecisionRecall, MultiLabelStats [#148]
explicitly specify python2.7 for virtualenv [#155]
default to SM50 when no working GPU found [#186]
Add alpha to ELU activation [#164]
deconv callback fix [#162]
various documentation updates [#151, #152]

neon v1.1.4

Docs_

neon v1.1.4 released Jan 4 2016 supporting:

Add support for bidirectional RNNs and LSTMs
added ELU, leaky ReLU activations
significantly faster GPU kernel builds (using ptx instead of cuda-c)
data shuffling enhancements, removal of old data loader code.
caffe conv, pool, dropout layer matching and compatibility flags
add scheduling support for RMSProp
callback enhancements, additional unit tests
documentation auditing, added links to introductory video tutorials

neon v1.1.3

Docs_

neon v1.1.3 released Dec 1 2015 supporting:

deconvolution and weight histogram visualization examples and documentation
CPU convolution and pooling layer speedups (~2x faster)
bAbI question and answer interactive demo, dataset support.
various ImageLoader enhancements.
interactive usage improvements (shortcut Callback import, multiple Callbacks init, doc updates, single item batch size support)
set default verbosity level to warning
CIFAR10 example normalization updates
CUDA detection enhancements [#132]
only parse batch_writer arguments when used as a script, allow undefined global_mean [#137, #140]

neon v1.1.2

Docs_

neon v1.1.2 released Nov 17 2015 supporting:

completely re-written C++ multithreaded dataloader
new weight initialization options for recurrent layers
Added deconvolution visualization support (guided backprop)
new bAbI question answering example network
Improved performance of cifar10_allcnn, word_lstm examples
new CUDA-C max and avg pooling kernels
Additional bugfixes and documentation updates

neon v1.1.1

Docs_

neon v1.1.1 released Nov 6 2015 supporting:

Callback initialization bug fix [#127]
IMDB LSTM example bug fix [#130]
Added cuda-convnet2 style binary dropout variant
Added benchmark function to model (separate fprop, bprop, update timings)
Remove h_buffer references in lieu of outputs for recurrent layers
Multi-cost output buffer bugfix for inference [#131]
New timeseries prediction and generation example
Change Callback initialization to re-support named arguments. Separate out these arguments in argparser. [#128]

neon v1.1.0

Docs_

neon v1.1.0 released Oct 30 2015 supporting:

Sentiment analysis support (LSTM lookupTable based), new IMDB example
Support for merge and branch layer stacks via LayerContainers
- Sequential, Tree, MergeBroadcast, MergeMultiStream
Support for freezing layer stacks
Adagrad optimizer support
new GPU kernels for fast compounding batch norm, conv and pooling engine updates, new kernel build system and flags.
Modifications for Caffe support
- conv, pooling, P/Q updates, dropout layer normalization more in-line with Caffe approach. NOTE: this breaks backwards compatibility with some strided conv/pool related models serialized using older versions of neon as the output sizes may now be different. See the FAQ for more info.
- serialization enhancements to make caffe model import/export easier
- use per-channel mean subtraction instead of single global. NOTE: this breaks backwards compatibility with ImgMaster saved datasets prior to this revision. To correct, please use the included update_dataset_cache.py script in the util directory.
Default training cost display during progress bar is now calculated on a rolling window basis rather than from the beginning of each epoch
Separate Layer configuration and initialization steps
YAML based alexnet example
Callback enhancements.
- now pass args instead of having to spell out callbacks in each example
- Changed validation callback to loss callback, validation_frequency now evaluation_frequency
- Generic metric callback.
Various bug fixes
- non-contiguous array get for GPUTensors
- 1D slicing returns 2D matrices
- bin/neon serialization fixes for RNNs
- 3D conv fixes for fprop, bprop
- batch norm inference fix
- bias layer size fix
Documentation updates and improvements

neon v1.0.0

Docs_

neon v1.0.0 released Sep 9 2015, a major top to bottom re-write of the codebase that features the following enhancements:

RNN/LSTM
- Code is cleaner and achieves state of the art results on the Penn Tree Bank dataset using RNN/LSTM/GRU
- Fast image captioning model (~200x faster than CPU based NeuralTalk) on flickr8k dataset
Basic automatic differentiation support
Framework for visualizations (supported via callbacks)
Top-down refactoring & redesign to enable quicker iteration while keeping the speedups offered by our nervanagpu kernels
- Datasets are easier to specify
- Backend now uses OpTrees (similar to nervanagpu) to support autodiff
- nervanagpu merged in as a neon backend to simplify development and use
- YAML syntax is simplified (but not backwards compatible)
- Better documentation and wider test coverage

neon v0.9.0

Docs_

neon v0.9.0 supports:

Hyperparameter optimization
Multi GPU

neon v0.8.2

Docs_

neon v0.8.2 supports:

Integration with our cudanet fork of Alex Krizhevsky's cuda-convnet2 library for Kepler GPU is

We will add support for previous generation GPUs, multi-GPU and hyperparameter optimization in the upcoming releases.

neon v0.8.1

Initial public release of neon.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

previous_versions.rst

previous_versions.rst

Previous Versions

neon v2.5.0

neon v2.4.0

neon v2.3.0

neon v2.2.0

neon v2.1.0

neon v2.0.0

neon v1.9.0

neon v1.8.2

neon v1.8.1

neon v1.8.0

neon v1.7.0

neon v1.6.0

neon v1.5.4

neon v1.5.3

neon v1.5.2

neon v1.5.1

neon v1.5.0

neon v1.4.0

neon v1.3.0

neon v1.2.2

neon v1.2.1

neon v1.2.0

neon v1.1.5

neon v1.1.4

neon v1.1.3

neon v1.1.2

neon v1.1.1

neon v1.1.0

neon v1.0.0

neon v0.9.0

neon v0.8.2

neon v0.8.1

Files

previous_versions.rst

Latest commit

History

previous_versions.rst

File metadata and controls

Previous Versions

neon v2.5.0

neon v2.4.0

neon v2.3.0

neon v2.2.0

neon v2.1.0

neon v2.0.0

neon v1.9.0

neon v1.8.2

neon v1.8.1

neon v1.8.0

neon v1.7.0

neon v1.6.0

neon v1.5.4

neon v1.5.3

neon v1.5.2

neon v1.5.1

neon v1.5.0

neon v1.4.0

neon v1.3.0

neon v1.2.2

neon v1.2.1

neon v1.2.0

neon v1.1.5

neon v1.1.4

neon v1.1.3

neon v1.1.2

neon v1.1.1

neon v1.1.0

neon v1.0.0

neon v0.9.0

neon v0.8.2

neon v0.8.1