v0.7.5

@snarayan21 snarayan21 released this 09 Apr 00:35
· 4 commits to main since this release
3ba9301

🚀 Streaming v0.7.5

Streaming v0.7.5 is released! Install via pip:

pip install --upgrade mosaicml-streaming==0.7.5

💎 New Features

1. Tensor/Sequence Parallelism Support

Using the replication argument, easily share data samples across multiple ranks, enabling sequence or tensor parallelism.

  • Replicating samples across devices (SP / TP enablement) by @knighton in #597
  • Expanded replication testing + documentation by @snarayan21 in #607
  • Make streaming use the correct number of unique samples with SP/TP by @snarayan21 in #619
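As a rough illustration of what replication means for rank layout, the sketch below groups ranks so that every rank in a group would receive the same samples. It is a conceptual model only, not the library's internals: the helper names `replica_group` and `num_unique_sample_streams` are hypothetical, and it assumes that consecutive ranks form a replication group and that the world size is divisible by the replication factor.

```python
from typing import List


def replica_group(rank: int, replication: int) -> int:
    """Return the index of the group of ranks that share identical samples.

    Assumption (not taken from the release notes): consecutive ranks are
    grouped together, so ranks 0..replication-1 form group 0, and so on.
    """
    if replication < 1:
        raise ValueError("replication must be a positive integer")
    if rank < 0:
        raise ValueError("rank must be non-negative")
    return rank // replication


def num_unique_sample_streams(world_size: int, replication: int) -> int:
    """Number of distinct sample streams across the whole job."""
    if replication < 1:
        raise ValueError("replication must be a positive integer")
    if world_size % replication != 0:
        raise ValueError("world_size must be divisible by replication")
    return world_size // replication


# Example: 8 ranks with a tensor-parallel degree of 2.
world_size, replication = 8, 2
groups: List[int] = [replica_group(r, replication) for r in range(world_size)]
print(groups)  # [0, 0, 1, 1, 2, 2, 3, 3]
print(num_unique_sample_streams(world_size, replication))  # 4
```

Under this model, a job with 8 ranks and a replication factor of 2 consumes 4 unique sample streams per step rather than 8, which is the quantity the sample-count fix in #619 is concerned with.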

2. Overhauled Streaming Documentation

New and improved Streaming documentation is now available -- please submit issues with any feedback.

3. batch_size is now required for StreamingDataset

Because we have seen multiple errors and performance degradations when users did not set the batch_size argument of StreamingDataset, batch_size is now required in order to iterate over the dataset.
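A minimal sketch of passing batch_size when constructing the dataset is shown below. The helper names `dataset_kwargs` and `build_loader` are hypothetical, the local path is a placeholder, and using the same per-device batch size for both the dataset and the PyTorch DataLoader is an assumption of this sketch rather than something stated in these notes.

```python
from typing import Any, Dict


def dataset_kwargs(local: str, batch_size: int, **extra: Any) -> Dict[str, Any]:
    """Collect StreamingDataset arguments, rejecting a missing or invalid batch_size."""
    if not isinstance(batch_size, int) or batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    return {"local": local, "batch_size": batch_size, **extra}


def build_loader(local: str, batch_size: int):
    """Build a DataLoader over a StreamingDataset using one shared batch size.

    Imports are deferred so the helper above can be used without
    mosaicml-streaming or torch installed.
    """
    from streaming import StreamingDataset
    from torch.utils.data import DataLoader

    dataset = StreamingDataset(**dataset_kwargs(local, batch_size))
    return DataLoader(dataset, batch_size=batch_size)


# Placeholder path; point this at a directory containing real dataset shards.
kwargs = dataset_kwargs("/tmp/my_dataset", batch_size=32)
print(kwargs)
```

Routing both constructors through one value keeps the dataset's batch_size from drifting out of sync with the one the DataLoader uses.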

4. Support for Python 3.11, deprecate Python 3.8

  • Add support for Python 3.11 and deprecate Python 3.8 by @karan6181 in #586

🐛 Bug Fixes

  • [easy typo fix] fix f-string by @bigning in #596
  • Change comparison in partitions to include equals by @JAEarly in #587
  • Use type int when initializing SharedMemory size by @bchiang2 in #604
  • COCO Dataset fix -- avoids allow_unsafe_types=True by @snarayan21 in #647

🔧 Improvements

What's Changed

New Contributors

Full Changelog: v0.7.4...v0.7.5