Releases: asappresearch/sru

2.7.0-rc1 - Postpone CUDA initialization

04 Jan 21:18

CUDA initialization is now postponed until SRUCells are instantiated, to ensure that it happens in the process where the model will actually run.

3.0.0.dev6

17 Jun 23:33
2a344d3
Pre-release

  • More layer norm options
  • More info in __repr__()

GPU inference in TorchScript model; post layer norm

18 May 16:12
a698784
  • Support GPU/CUDA inference in TorchScript model
  • Support post layer norm
  • Support custom init value for weight_c
  • Add unit tests for GPU inference
  • Add unit tests for backward()
  • Add more unit tests for TorchScript

Support GPU/CUDA inference in TorchScript model; fix an issue

13 May 17:22

Dev1:

  • Support GPU/CUDA inference in TorchScript model
  • Support post layer norm
  • Support custom init value for weight_c

Dev2:

  • Fix an issue

Support GPU/CUDA inference in TorchScript model

13 May 22:39
  • Support GPU/CUDA inference in TorchScript model
  • Support post layer norm
  • Support custom init value for weight_c

Support GPU/CUDA inference in TorchScript model

12 May 15:08
  • Support GPU/CUDA inference in TorchScript model
  • Support post layer norm
  • Support custom init value for weight_c

v3.0.0.dev3

04 May 02:38
b1315fe

Fix a typo. Add an attention_last_n_layers option to use attention only in the last n layers. Replace the normalize_after option with normalization_type.
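The switch from a boolean normalize_after flag to a string-valued normalization_type leaves room for more than two layer-norm placements. A hedged sketch of what validating such an option might look like (the helper name and the set of accepted values are assumptions for illustration, not the actual SRU API):

```python
# Hypothetical helper illustrating a string-valued normalization option
# replacing a boolean flag. The accepted values below are assumptions.

def resolve_normalization(normalization_type: str = "none") -> str:
    allowed = {"none", "layer_norm", "post_layer_norm"}
    if normalization_type not in allowed:
        raise ValueError(f"unknown normalization_type: {normalization_type!r}")
    return normalization_type

# Roughly, the old normalize_after=True would map to a post-norm option:
assert resolve_normalization("post_layer_norm") == "post_layer_norm"
```

A named option also keeps call sites self-documenting, whereas a bare True/False argument does not say what is being normalized or where.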

v3.0.0.dev2 Bug fixes

18 Mar 17:56
Pre-release

Changes:

  • change weight_c_init from Optional[float] = None to float = 1.0

Bug fixes:

  • fix a potential memory leak in the custom op
  • fix a bug in cuda maskpad
  • TorchScript-compatible with torch 1.5.1 now
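The weight_c_init change above is a signature migration. A hedged sketch of the before/after from the caller's side (hypothetical functions, not the actual SRU source):

```python
# weight_c_init moves from Optional[float] = None to a plain
# float = 1.0, so the implementation no longer needs a None check
# and the default is visible in the signature itself.
from typing import Optional

def old_cell(weight_c_init: Optional[float] = None) -> float:
    # Old behavior: None meant "fall back to an internal default".
    return 1.0 if weight_c_init is None else weight_c_init

def new_cell(weight_c_init: float = 1.0) -> float:
    # New behavior: the default is explicit; no sentinel handling.
    return weight_c_init

assert old_cell() == new_cell() == 1.0
assert new_cell(0.5) == 0.5
```

Both versions behave identically for callers who never passed the argument, which is what makes the change safe as a default tightening.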

Full fp16 training; SRUpp release

05 Mar 17:33
81a657b
Pre-release

Note that the future 3.0.0 release and future 3.0.0 dev releases might not be backwards compatible with this dev release.

Key features / changes:

  • #160: SRU++ is now available. Unit tests are included for TorchScript compatibility and correctness. Example language model training code is available.
  • #166: fp16 training improvement. The recurrence kernel now runs in float16 when AMP is enabled. This gives an additional ~10% speedup on tested language model training, a ~20% reduction in GPU memory usage, and no regression in final results.
  • #167: Code clean-up. No autocast block is needed in sru.ops.elementwise_recurrence_gpu, which allows both native AMP and APEX AMP to work. (Credit: @visionscaper)

Other changes:

  • Fix a dtype error within adaptive embedding (#168)
  • Significant speed-up on BILLIONWORD training (#169)
  • LICENCE update requested by IPC (#165)

Dev release of v3

22 Jan 20:02
Pre-release

Note that future releases and dev releases of v3 might be backwards incompatible with this dev release.

This dev release:

  • custom_m renamed to transform_module
  • transform_module is always used now (the weight and weight_proj parameters have been removed)
  • projection_size can take a sequence of sizes, one per layer
  • n_proj in SRUCell renamed to projection_size, for consistency
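Accepting either a single projection_size or one per layer usually means normalizing the argument into a per-layer list up front. A hedged sketch of that normalization (hypothetical helper; the real SRU internals may differ):

```python
# Hypothetical helper: expand an int-or-sequence projection_size
# argument into one size per layer, validating the length.
from typing import Sequence, Union

def per_layer_sizes(projection_size: Union[int, Sequence[int]],
                    num_layers: int) -> list:
    if isinstance(projection_size, int):
        # A single int applies to every layer.
        return [projection_size] * num_layers
    sizes = list(projection_size)
    if len(sizes) != num_layers:
        raise ValueError("need exactly one projection size per layer")
    return sizes

assert per_layer_sizes(4, 3) == [4, 4, 4]
assert per_layer_sizes([2, 4, 8], 3) == [2, 4, 8]
```

Normalizing once at construction keeps the per-layer code simple: every layer just indexes into the list.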