
Releases: neuralmagic/sparsezoo

SparseZoo v1.7.0

15 Mar 01:18
0b58962

New Features:

  • Download support for LLMs (#379, #436)
  • Functionality to support channelwise quantization analysis (#441)
  • Chunked downloads for improved handling of large files (#446, #471)
  • SparseZoo Model Additions:
    • For various NL tasks, including chat, instruction tuning, code generation, summarization, question answering, and arithmetic reasoning:
      • Sparsified and baseline: Llama 7B (view) | Mistral 7B (view)
    • For code generation:
      • Sparsified and baseline: CodeLlama 7B (view)
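Chunked downloading of the kind added here can be sketched with the Python standard library alone. The helper below is an illustrative sketch, not SparseZoo's actual implementation; the function names and chunk size are assumptions:

```python
import io
import urllib.request

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read keeps memory bounded for large files

def iter_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield fixed-size chunks from any binary stream (HTTP response, file, ...)."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk

def download_chunked(url, dest_path, chunk_size=CHUNK_SIZE):
    """Stream a remote file to disk chunk by chunk instead of loading it whole."""
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        for chunk in iter_chunks(response, chunk_size):
            out.write(chunk)
```

Streaming in fixed-size chunks is what keeps memory use flat regardless of file size, which matters for multi-gigabyte LLM checkpoints.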

Changes:

  • Deployment directories are now downloaded directly as tar.gz archives and then extracted, enabling faster downloads. (#389)
  • Timestamps are appended with "nightly" to improve analytics aggregation. (#406)
  • The reference to LICENSE-NEURALMAGIC has been removed, and model license attributions have been consolidated into a single file. (#400, #409)
  • Legacy sparsezoo.analyze functionality has been renamed to sparsezoo.analyze_v1 for clarity. (#460)
  • Existing files can now be overwritten during download, enabling automatic recovery from corrupted model downloads. (#453)

Resolved Issues:

  • Reporting issues within GitHub Actions have been resolved. (#382)
  • GitHub Actions workflows can now be kicked off manually. (#382)
  • Multiple names can now be passed for registering a value in the RegistryMixin class. (#385)
  • Incorrect shape computations with the ONNX Runtime no longer result in incorrect FLOPs calculations through the analyze functionality. (#408)
  • Processing file paths during analysis no longer results in analyze pathways crashing for certain paths. (#425)
  • sparsezoo.model.download now works as intended; previously it did not download all of the files necessary for Transformers-based models. (#422)
  • Error handling improved: TypeErrors are now raised instead of ValueErrors where appropriate. (#427)
  • Instructions within the sparsezoo.analyze command-line tool were corrected. (#433)
  • SparseZoo's analysis functionality now results in correct values and no longer crashes when analyzing LLMs. (#421, #461, #462, #463, #438)
  • SparseZoo models can now correctly access metrics.yaml files, which were previously unavailable. (#431)
  • SparseZoo now handles external data for smaller models correctly, where smaller LLMs could previously fail to download correctly. (#443)
  • Handling of dictionary values for SparseZoo model objects no longer crashes LLM downloads. (#448)
  • Reloading previous sparsezoo.analyze results no longer results in serialization errors for channelwise quantized models. (#455)
  • An external data bug was resolved that had resulted in LLMs not downloading properly. (#468)

Known Issues:

  • None

SparseZoo v1.6.1 Patch Release

20 Dec 22:08
00c3cf4

This is a patch release for 1.6.0 that contains the following changes:

  • NOTICE contains an updated summarized list of models used in the SparseZoo with their appropriate license and GitHub repository attributions for easier user reference. Model Cards now contain a footer link to the updated SparseZoo NOTICE file. (#410)

Known Issues:

  • The compile time for dense LLMs can be very slow. This will be addressed in a forthcoming release.
  • Docker images are not currently pushing. A resolution is forthcoming for functional Docker builds. [RESOLVED]

SparseZoo v1.6.0

11 Dec 21:23
db5c6dc

Model Additions: Generative AI

  • CodeGen Mono 2B and 350M trained on BigQuery and ThePile datasets for base and one-shot pruned, quantized, and sparse quantized models (view)
  • CodeGen Multi 2B and 350M trained on BigQuery and ThePile datasets for base and one-shot pruned, quantized, and sparse quantized models (view)
  • Llama-2 7B baseline and quantized models for pre-trained and chat datasets (view)
  • Llama-2 7B dense, pruned, quantized, and sparse quantized models and recipes for the platypus instruction tuning dataset and gsm8k arithmetic reasoning dataset (view)
  • MPT 7B baseline, quantized, and sparse quantized models for pre-trained and chat datasets (view)
  • MPT 7B dense, pruned, quantized, and sparse quantized models and recipes for dolly instruction tuning dataset and gsm8k arithmetic reasoning dataset (view)
  • OPT 1.3b, 2.7B, 6.7B, and 13B dense, pruned, quantized, and sparse quantized models and recipes for the OPT pre-trained dataset (view)

Model Additions: Computer Vision

  • YOLOv8 n, s, m, l, and x dense pruned, quantized, and sparse quantized models and recipes for COCO and VOC datasets (view)

New Features:

  • Version support added:

  • The initial feature set for SparseZoo V3 web UI is now live, where the home page has been restructured to include and highlight generative AI models.

  • SparseZoo V2 model file structure and V2 stubs enabled, which expands the number of supported files and reduces the number of bytes that need to be downloaded for model checkpoints, folders, and files. It also simplifies the stubs used to access models in the SparseZoo. (Documentation: V2 file structure and stubs docs will be added in v1.7) (#286, #271, #355, #359, #354, #361, #363, #368, #370, #373)

  • SparseZoo Analyze CLI and APIs added to enable simple functions for quickly checking general and sparsification info for params, operations, reads/writes, and overall model layouts. (#288, #344, #345)

  • RegistryMixin class and patterns added, enabling a centralized and universal registry across Neural Magic's repos and products. (#365)
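The registry pattern behind such a mixin can be sketched as follows. All names and signatures below (`register`, `get`, `Optimizer`) are hypothetical stand-ins for illustration, not SparseZoo's actual RegistryMixin API:

```python
class RegistryMixin:
    """Illustrative registry base; not SparseZoo's actual RegistryMixin API."""

    _registry = {}

    @classmethod
    def register(cls, *names):
        """Register the decorated class under one or more names (aliases)."""
        def decorator(subclass):
            for name in names or (subclass.__name__,):
                cls._registry[name.lower()] = subclass
            return subclass
        return decorator

    @classmethod
    def get(cls, name):
        """Look up a registered implementation by any of its names."""
        return cls._registry[name.lower()]


class Optimizer(RegistryMixin):
    _registry = {}  # give this hierarchy its own namespace


# Registering under multiple names lets aliases resolve to one implementation
@Optimizer.register("sgd", "stochastic_gradient_descent")
class SGD(Optimizer):
    pass
```

Usage: `Optimizer.get("sgd")` and `Optimizer.get("stochastic_gradient_descent")` both resolve to `SGD`, which is the multi-name registration behavior referenced in #385.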

Model Changes: Computer Vision

  • EfficientNet-B0 to B5, EfficientNet V2 S, M, and L have been updated with example transfer recipes for base and quantized versions from the ImageNet dataset. (view)
  • MobileNet V1 models have been updated with corrected metrics and model card updates to include updated instructions for transfer and sparsification across dense, sparse, and quantized versions for the ImageNet dataset. (view)

Model Changes: Natural Language Processing

To address DeepSparse deployment pipelines failing due to missing files, the following models have been updated to include new tokenizer files in the deployment directory across dense, sparse, and sparse quantized versions, with the targeted datasets:

Product Changes:

  • Extra information about the benchmarking device each benchmark was run on has been added to the Python interface for benchmarking results. (#294)
  • README and documentation updated to include: Slack Community name change, Contact Us form introduction, Python version changes; corrections for YOLOv5 torchvision, Transformers, and SparseZoo broken links; and installation command. (#307)
  • Support for large ONNX files improved to speed up loading and limit memory issues, especially for LLMs. (#308, #320)
  • SparseZoo model folders that are downloaded through the Python API will now be saved locally under their repo name instead of model id. This is to enable easier tracking of which models have been downloaded to a user's system. (#317, #369)
  • Python 3.7 support is deprecated. (#348)
  • Pydantic version pinned to <2.0 preventing potential issues with untested versions. (#339)
  • File path endings are added to download logs, enabling more useful output information when downloading models. (#346, #360)
  • ONNX utility functions have been broken out into multiple files, enabling better structure for future enhancements. The namespace and imports all remain the same. (#353)

Resolved Issues:

  • A test for checking throughput values no longer fails, resulting in successful test cases passing. (#306)
  • Metric names were not matching due to different formatting, such as spaces and casing. For example, top1accuracy and Top 1 Accuracy will now match. (#310)
  • Google Analytics errors were being shown to the user if the libraries were used too frequently on the same system. (#318, #322, #324, #327)
  • In some cases, logging information was duplicated due to multiple streams being registered, such as when DeepSparse benchmarks were run. This is now fixed to ensure logs are no longer duplicated. (#330)
  • Unit and integration tests now remove temporary test files and limit test file creation, which were not being properly deleted. (#329)
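The metric-name fix in #310 amounts to normalizing formatting differences (spaces, casing, punctuation) before comparison. A minimal sketch of such a normalizer; the function name is hypothetical, not the actual internal helper:

```python
import re

def normalize_metric_name(name):
    """Lowercase and strip everything except letters and digits, so
    formatting variants of the same metric compare equal."""
    return re.sub(r"[^a-z0-9]", "", name.lower())
```

With this, `"Top 1 Accuracy"` and `"top1accuracy"` normalize to the same key and therefore match.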

Known Issues:

  • The compile time for dense LLMs can be very slow. This will be addressed in a forthcoming release.
  • Docker images are not currently pushing. A resolution is forthcoming for functional Docker builds. [RESOLVED]

SparseZoo v1.5.2 Patch Release

05 Jul 21:13
626e4d0

This is a patch release for 1.5.0 that contains the following changes:

  • Pinned dependency Pydantic, a data validation library for Python, to < v2.0, to prevent current workflows from breaking. Pydantic upgrade planned for future release. (#340)

SparseZoo v1.5.1 Patch Release

07 Jun 05:41
460d3f5

This is a patch release for 1.5.0 that contains the following changes:

  • SparseZoo, SparseML, and DeepSparse CLIs/APIs no longer crash on systems with no internet access. (#328)

SparseZoo v1.5.0

07 Jun 05:34
460d3f5

New Features:

  • SparseZoo V2 UI and backend which includes better performance and user experience for discovering and using models and recipes
  • SparseZoo Additions:
    • YOLOv5 and YOLOv5p6 additional sparsified models (view)
    • YOLOv8 baseline and sparsified models (view)
    • oBERTa NLP baseline and sparsified models (view)
    • RoBERTa NLP baseline and sparsified models (view)
  • sparsezoo.analyze CLI to enable easy analysis of ONNX models including performance and sparsity metrics (#263) (#281)
  • sparsezoo.deployment_package CLI to enable easy packaging of models from the SparseZoo for deployments (#261)
  • Product usage analytics tracking added; to disable, run the command export NM_DISABLE_ANALYTICS=True (#287)

Changes:

  • ModelAnalysis.from_onnx(...) updated to accept ModelProto objects rather than just ONNX files. (#253)
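The underlying pattern, accepting either a file path or an already-parsed object, can be sketched generically. The names below (`LoadedModel`, `load_from_path`, `from_source`) are stand-ins for illustration, not the ModelAnalysis implementation:

```python
class LoadedModel:
    """Stand-in for an already-parsed model object (e.g. an onnx ModelProto)."""
    def __init__(self, name):
        self.name = name

def load_from_path(path):
    """Stand-in for a file loader such as onnx.load(path)."""
    return LoadedModel(name=str(path))

def from_source(source):
    """Accept either a file path or an already-loaded model object."""
    if isinstance(source, (str, bytes)):
        return load_from_path(source)
    return source  # assume the caller passed a parsed model
```

Dispatching on input type like this spares callers who already hold a model in memory from writing it to disk just to re-read it.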

Resolved Issues:

  • None

Known Issues:

  • If running on a system with no internet access, SparseZoo, SparseML, and DeepSparse CLIs/APIs are crashing. Hotfix forthcoming.

SparseZoo v1.4.0

17 Feb 20:14
0160db9

New Features:

  • More performant YOLOv5s and YOLOv5l sparse quantized models
  • YOLOv5 sparse quantized models for m, l, x versions
  • YOLOv5p6 sparse quantized models for n, s, m, l, x versions
  • NLP multi-label use case models for BERT-base, DistilBERT, and BERT-Large on the GoEmotions dataset
  • Initial oBERTa models (RoBERTa style models) for SQuAD and GLUE tasks

Changes:

  • None

Resolved Issues:

  • Due to a breaking change in NumPy, its version was pinned to <=1.21.6 to prevent crashes from happening across SparseZoo, SparseML, and DeepSparse.

Known Issues:

  • None

SparseZoo v1.3.1 Patch Release

04 Jan 21:36
34ab6f0

This is a patch release for 1.3.0 that contains the following changes:

  • NumPy version pinned to <=1.21.6 to avoid deprecation warning/index errors in pipelines.

SparseZoo v1.3.0

21 Dec 19:02
34ab6f0

New Features:

  • BERT models added for the GoEmotions multi-label dataset.
  • BERT models added for SQuAD 2.0 dataset.
  • oBERTa base models added for GLUE datasets.
  • YOLOv5 and YOLOv5p6 models added for transfer learning.

Changes:

  • Minimum Python version changed to 3.7.
  • Benchmarking and accuracy metrics for a model propagated to the root Python class.

Resolved Issues:

  • None

Known Issues:

  • None

SparseZoo v1.2.0

28 Oct 00:44
69f96a3

New Features:

  • SparseZoo ONNX analysis API added to enable easy model analysis for sparsity, quantization, flops, parameters, and more.
  • BERT document classification models added, trained on the IMDB dataset.
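Sparsity, one of the statistics such an analysis reports, is simply the fraction of zero-valued parameters. A minimal pure-Python sketch (the function name is illustrative):

```python
def sparsity(weights):
    """Fraction of zero-valued entries in a flat sequence of weights."""
    if not weights:
        return 0.0
    return sum(1 for w in weights if w == 0) / len(weights)
```

For example, a layer with weights `[0, 0, 1.5, -2.0]` is 50% sparse.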

Changes:

  • tokenizer_config.json added as a required file for Transformers models.
  • Minimum Python version changed to 3.7, as 3.6 has reached EOL.

Resolved Issues:

  • SparseZoo README updated to reflect new APIs and flows that were released with 1.1 release.

Known Issues:

  • None