Releases: activeloopai/deeplake

1.3.3 🚀

09 Apr 07:56

🧭 What's Changed

  • to_pytorch now supports a new argument (key_list) that passes only the listed tensors to PyTorch, which speeds up iteration when the dataset contains extra tensors that are not needed. (#715) @AbhinavTuli
  • Caching within to_pytorch has been improved for tensors with dynamic shapes (previously only the current sample was saved in the cache). (#715) @AbhinavTuli
  • Added ability to store DatasetView as a new Dataset (#740) @AbhinavTuli
  • Introduces Windows and macOS tests to CircleCI (#719) @haiyangdeperci
  • Benchmark restructuring and memory profiling (#642) @benchislett
  • Changed default dtype of classlabel from uint16 to uint8 (#745) @AbhinavTuli
  • Updated humbug version (#728) @zomglings
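
One practical consequence of the classlabel dtype change can be sketched with Python's standard array module. This is only an illustration of the integer ranges involved, not Hub code; the assumption is that a uint8 default limits class indices to 0 through 255, so datasets with more classes would presumably need an explicitly wider dtype.

```python
from array import array

labels = [0, 17, 255]

# 'B' is an unsigned 8-bit integer (uint8): one byte per label.
uint8_bytes = len(array("B", labels).tobytes())

# 'H' is an unsigned short (uint16 on common platforms): typically two bytes per label.
uint16_bytes = len(array("H", labels).tobytes())

# A class index of 256 does not fit in uint8.
try:
    array("B", [256])
    overflowed = False
except OverflowError:
    overflowed = True

print(uint8_bytes, uint16_bytes, overflowed)
```

The smaller default saves storage per label, at the cost of a lower maximum class index.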

🗂️ Documentation

  • Add examples of dataset generation and modification using transforms, trainings with TensorFlow and PyTorch (#675) @kristinagrig06
  • Added code and testing notebook for running dataset transforms on a ray cluster. (#713) @kristinagrig06

⚙️ Who Contributed

@AbhinavTuli, @Diveafall, @benchislett, @haiyangdeperci, @imshashank, @kristinagrig06 and @zomglings

1.3.2

26 Mar 11:10
f688ec3

🧭 What's Changed

  • to_tensorflow now supports a new argument (key_list) that passes only the listed tensors to TensorFlow, which speeds up iteration when the dataset contains extra tensors that are not needed. (#689) @AbhinavTuli
  • Caching within to_tensorflow has been improved for tensors with dynamic shapes (previously only the current sample was saved in the cache). (#689) @AbhinavTuli
  • Adds the option to specify None as compressor while defining the schema (#689) @AbhinavTuli
  • Adds the ability to slice dynamically shaped tensors and obtain a list instead of iterating over them one by one. (#689) @AbhinavTuli
  • transform logic has been modified to work properly with multiple workers (#689) @AbhinavTuli
  • Added tags to usage and crash reports (#697) @zomglings
  • Added ipynb file with benchmark tests for dnafrag package (#676) @DebadityaPal
  • Relaxed hub requirements (#659) @haiyangdeperci
  • Updated Objectron dataset tensors from generic types to hub schema representations (#705) @haiyangdeperci
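
The idea behind key_list can be sketched in plain Python: each sample is reduced to the requested tensors before it is handed on, so unused tensors are never converted. This is a conceptual sketch only, not Hub's implementation; `iter_samples` and the sample dictionaries are hypothetical names introduced for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Optional


def iter_samples(
    samples: Iterable[Dict[str, object]],
    key_list: Optional[List[str]] = None,
) -> Iterator[Dict[str, object]]:
    """Yield samples, keeping only the tensors named in key_list.

    Hypothetical helper: mimics the idea behind key_list, where tensors
    that are not requested are skipped during iteration.
    """
    for sample in samples:
        if key_list is None:
            # No filter requested: pass the full sample through.
            yield sample
        else:
            yield {key: sample[key] for key in key_list}


dataset = [
    {"image": [[0, 1], [2, 3]], "label": 1, "metadata": "unused"},
    {"image": [[4, 5], [6, 7]], "label": 0, "metadata": "unused"},
]

filtered = list(iter_samples(dataset, key_list=["image", "label"]))
```

Here `filtered` contains only the image and label entries of each sample; the metadata entry is dropped before any downstream work happens.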

⚙️ Who Contributed

@AbhinavTuli, @DebadityaPal, @McCrearyD, @TakshPanchal, @dependabot-preview, @dependabot-preview[bot], @dhiganthrao, @george-zakharov, @haiyangdeperci, @hakanbakacak, @kevinlu1211, @kristinagrig06, @madhucharan, @mynameisvinn, @thisiseshan, @zomglings

1.3.1

26 Mar 08:57
06cab25

🧭 What's Changed

  • to_tensorflow now supports a new argument (key_list) that passes only the listed tensors to TensorFlow, which speeds up iteration when the dataset contains extra tensors that are not needed. (#689) @AbhinavTuli
  • Caching within to_tensorflow has been improved for tensors with dynamic shapes (previously only the current sample was saved in the cache). (#689) @AbhinavTuli
  • Adds the option to specify None as compressor while defining the schema (#689) @AbhinavTuli
  • Adds the ability to slice dynamically shaped tensors and obtain a list instead of iterating over them one by one. (#689) @AbhinavTuli
  • transform logic has been modified to work properly with multiple workers (#689) @AbhinavTuli
  • Added tags to usage and crash reports (#697) @zomglings
  • Added ipynb file with benchmark tests for dnafrag package (#676) @DebadityaPal
  • Relaxed hub requirements (#659) @haiyangdeperci
  • Updated Objectron dataset tensors from generic types to hub schema representations (#705) @haiyangdeperci

⚙️ Who Contributed

@AbhinavTuli, @DebadityaPal, @McCrearyD, @TakshPanchal, @dependabot-preview, @dependabot-preview[bot], @dhiganthrao, @george-zakharov, @haiyangdeperci, @hakanbakacak, @kevinlu1211, @kristinagrig06, @madhucharan, @mynameisvinn, @thisiseshan, @zomglings

1.3.0

08 Mar 15:31
853456a

🧭 What's Changed

  1. Version Control has been added to Hub Datasets! (#610) @AbhinavTuli
  2. to_tensorflow now properly supports Text datasets (#658) @AbhinavTuli
  3. Hub crash and system information reports using Bugout (#624) @zomglings
  4. Added support for multiple BBox and Classlabel, instead of Sequences. (#658) @AbhinavTuli
  5. CLI name has been changed from hub to activeloop (#631) @haiyangdeperci
  6. Notebook example for creating a dataset for object detection and instance segmentation added (#629) @haritsahm
  7. Tutorial for working with audio added (#592) @mynameisvinn

🚀 New

  1. Hub version command cli (#628) @sparkingdark
  2. Automatic Release Drafter added to repository (#598) @Anselmoo
  3. Improve Directory Structure of Examples (#630) @SauravMaheshkar
  4. Put zarr, tileDB, and hub benchmarks in one file (#534) @DebadityaPal
  5. Refactored Dataset Class (#576) @DebadityaPal
  6. Add Github Actions CI pipeline (#372) @ADI10HERO

🐛 Bug Fixes

  1. Removed Assertions from shape_detector.py and added exceptions (#616) @DebadityaPal
  2. Adds support for dataset views in sharded dataset (#557) @AbhinavTuli
  3. Advanced slicing added for Sharded Dataset (#558) @AbhinavTuli

🗂 Documentation

  1. README added in Korean (#621) @HyeongminLEE
  2. README added in Bahasa Indonesia (#645) @haritsahm
  3. README added in French (#640) @MargauxMasson
  4. README added in Turkish (#608) @hakanbakacak
  5. Chinese Readme Proofread and Update (#613) @Cynthia7979
  6. Change ds.commit() to ds.flush() throughout in README.md (#619) @galbwe
  7. Added explanation for local file system to docs (#634) @McCrearyD
  8. Replaced commit() with flush() in documentation. (#604) @dhiganthrao
  9. Add MinIO to Data Storage docs (#605) @gabriel-milan
  10. Updated example notebooks with pip (#585) @MojammelHossain
  11. Typos fixed (#591) @dPacc

🔗 Dependency Updates

  1. Bump pytest from 6.2.1 to 6.2.2 (#496) @dependabot-preview
  2. Bump ray from 1.0.0 to 1.2.0 (#554) @dependabot-preview
  3. Bump boto3 from 1.16.39 to 1.17.20 (#646) @dependabot-preview

⚙️ Who Contributed

@ADI10HERO, @AbhinavTuli, @Anselmoo, @Cynthia7979, @DebadityaPal, @HyeongminLEE, @MargauxMasson, @McCrearyD, @MojammelHossain, @SauravMaheshkar, @dPacc, @davidbuniat, @dhiganthrao, @gabriel-milan, @galbwe, @haiyangdeperci, @hakanbakacak, @haritsahm, @imshashank, @mikayelh, @mynameisvinn, @sparkingdark and @zomglings

1.2.3

07 Mar 04:44
63cf2d5

Release Notes

  1. Reverting shape checks for Mask schema to maintain backward compatibility.

1.2.2

17 Feb 10:12
bd47a0c

Release Notes

  1. Hotfix for a bug that resulted in incorrect slicing of TensorView.

1.2.1

16 Feb 03:13
d9ebc7e

Release Notes

  1. Dataset copying has been added allowing you to copy your own and other users' datasets easily. Datasets can be copied across gcs, s3, aws, local storage and hub storage. #454 (@AbhinavTuli)
  2. Many improvements to the benchmarks #508 #512 #531 #545 #550 (@haiyangdeperci @DebadityaPal)
  3. Development Roadmap added #511 (@mynameisvinn)
  4. Improved message for Hub transforms by displaying shard size #523 (@DebadityaPal)
  5. All Windows issues have now been fixed. #528 (@AbhinavTuli)
  6. Hub dataset filtering has been overhauled and a section has been added for the same in the documentation #539 (@AbhinavTuli)
  7. to_tensorflow issues with Datasets containing Sequences (such as coco) have been fixed #540 (@AbhinavTuli)
  8. Adds get_label parameter to .compute() and .numpy(), to directly retrieve string label from ClassLabel #489 (@DebadityaPal)
  9. Tutorial added for using Hub with Hugging Face transformers #536 (@DebadityaPal)
  10. Some unit tests have now been parameterized to cover multiple datatypes #527 (@drewpotter)
  11. From directory function has been implemented to directly ingest categorical image data #459 (@sparkingdark)
  12. Example use case added for creating a Hub dataset for Deep Learning prediction of crop yield #559 (@MargauxMasson)
  13. MPL Headers have been added to source files #494 (@KrishnaChaitanya1)

1.2.0

24 Jan 06:44
5811597

Release Notes

  1. Adds support for dataset filtering (#460) (@AbhinavTuli)
  2. Greatly improves to_tensorflow performance (#481) (@AbhinavTuli)
  3. Benchmarks added for Hub 1.x (#486) (@benchislett)
  4. Fixes a bug that caused issues on Windows machines (#472) (@FayazRahman)
  5. Fixes a bug that caused issues with TF 2.4.0 (#478) (@DebadityaPal)
  6. Fixes Docker build issue (#463) (@Darkborderman)
  7. Added Chinese README (#458) (@EYH0602)
  8. Better automatic determination of Dataset mode depending on permissions (#466) (@edogrigqv2)
  9. CoLA dataset uploaded to Hub, upload script added to examples (#487) (@mynameisvinn)
  10. Fixes a bug with dataset slicing (#480) (@AbhinavTuli)
  11. Adds support for custom S3 endpoints (including MinIO) (#482) (@AbhinavTuli)
  12. Adds the ability to set a name for a dataset so it is displayed better in the visualizer (#468) (@AbhinavTuli)

1.1.3

15 Jan 15:37
ace631f

Fixes an issue in to_pytorch when using a dataset that the user doesn't own.

1.1.0

10 Jan 20:05

Release Notes

  • Custom S3 storage that is 5-10x faster than S3FS
  • Faster PyTorch dataset with current chunk logic
  • Fixed caching with an in-memory, per-process cache that does not use LMDB
  • Better exception handling for loading a dataset, shape and type checks, and casting
  • Added examples, tutorials, and better GitHub issue handling
  • Added the ability to fill in additional information about the dataset, such as description, license, and citation
  • Native support with .compute() in the middle for nested tensors

Contributors include: @edogrigqv2 @AbhinavTuli @mynameisvinn @Anselmoo @sparkingdark @sanchitvj @Atom-101