I have listed some awesome libraries that I found useful for most machine learning practice. These libraries can make things easier and boost your productivity. Short, hedged usage sketches for many of them follow the list.
- DeepLearningExamples: Reference implementations of many deep learning models, maintained by NVIDIA
- mmcv: OpenMMLab Computer Vision Foundation
- MMClassification: OpenMMLab Image Classification Toolbox and Benchmark
- MMDetection: OpenMMLab Detection Toolbox and Benchmark (an inference sketch follows the list)
- MMAction2: OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
- MMSegmentation: OpenMMLab Semantic Segmentation Toolbox and Benchmark
- OpenSelfSup: Self-Supervised Learning Toolbox and Benchmark
- autonlp: Automatically train state-of-the-art natural language processing models and deploy them in a scalable environment
- HuggingFaceTransformers: State-of-the-art natural language processing for PyTorch, TensorFlow, and JAX (see the pipeline example after the list)
- fairseq: Facebook AI Research Sequence-to-Sequence Toolkit written in Python (a torch.hub translation sketch follows the list)
- DALI: A GPU-accelerated library of highly optimized building blocks and an execution engine for data processing, used to accelerate deep learning training and inference (a pipeline sketch follows the list)
- AugLy: A data augmentations library for audio, image, text, and video (a text-augmentation example follows the list)
- Open3D: A Modern Library for 3D Data Processing (see the point cloud example after the list)
- HuggingFaceTokenizers: Fast State-of-the-Art Tokenizers optimized for Research and Production (an encoding example follows the list)
- HuggingFaceDatasets: The largest hub of ready-to-use NLP datasets for ML models, with fast, easy-to-use, and efficient data manipulation tools (a loading example follows the list)
- Apex: A PyTorch extension with tools for easy mixed-precision and distributed training (an AMP sketch follows the list)
- ApexDataPrefetcher: The data prefetcher pattern from Apex's ImageNet example, which overlaps host-to-device copies with compute to hide data I/O cost (a sketch of the pattern follows the list)
- Horovod: Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet (a PyTorch sketch follows the list)
- Checkpoint: PyTorch's built-in activation checkpointing (torch.utils.checkpoint), which saves memory by recomputing activations during the backward pass instead of storing them (a sketch follows the list)
- TorchPipe: A GPipe implementation in PyTorch
- PowerSGD Communication Hook: PowerSGD (Vogels et al., NeurIPS 2019) is a gradient compression algorithm that can achieve very high compression rates and accelerate bandwidth-bound distributed training (a registration sketch follows the list)
- Accelerate: A simple way to train and use PyTorch models with multi-GPU, TPU, and mixed precision (a training loop sketch follows the list)
- LightSeq: A high-performance library for sequence processing and generation
- Megatron: Ongoing research on training transformer language models at scale, including BERT and GPT-2
- DeepSpeed: A deep learning optimization library that makes distributed training easy, efficient, and effective (an initialization sketch follows the list)
- Ray: An open-source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library (a Tune example follows the list)
- TensorBoard: TensorFlow's Visualization Toolkit (a PyTorch logging sketch follows the list)
- KnockKnock: Get notified when your training ends with only two additional lines of code (a decorator example follows the list)
- Neptune: Lightweight experiment tracking tool for AI/ML individuals and teams. Fits any workflow.
- netron: Visualizer for neural network, deep learning, and machine learning models
- Scalene: A high-performance, high-precision CPU, GPU, and memory profiler for Python
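
MMDetection: a minimal inference sketch, assuming the high-level `mmdet.apis` interface from the 2.x releases; the config and checkpoint paths are placeholders for files from the MMDetection model zoo.

```python
from mmdet.apis import init_detector, inference_detector

# Placeholder paths: substitute a config and checkpoint from the model zoo.
config_file = "configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py"
checkpoint_file = "checkpoints/faster_rcnn_r50_fpn_1x_coco.pth"

# Build the model from the config and load the pretrained weights.
model = init_detector(config_file, checkpoint_file, device="cuda:0")

# Run inference on a single image; the result contains per-class bounding boxes.
result = inference_detector(model, "demo/demo.jpg")
```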
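
HuggingFaceTransformers: the `pipeline` API is the quickest way to try a pretrained model; a minimal sentiment analysis example (the first call downloads a default model).

```python
from transformers import pipeline

# Downloads and caches a default pretrained sentiment model on first use.
classifier = pipeline("sentiment-analysis")
print(classifier("This library makes NLP almost too easy."))
# [{'label': 'POSITIVE', 'score': 0.99...}]
```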
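
fairseq: pretrained translation models are published on `torch.hub`; a sketch assuming the `transformer.wmt19.en-de.single_model` entry and options from fairseq's README (the hub entry name and its availability may vary by release).

```python
import torch

# Load a pretrained English-to-German transformer via torch.hub
# (assumed hub entry name; see fairseq's README for the current list).
en2de = torch.hub.load(
    "pytorch/fairseq",
    "transformer.wmt19.en-de.single_model",
    tokenizer="moses",
    bpe="fastbpe",
)
print(en2de.translate("Machine learning is fun!"))
```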
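
DALI: a minimal pipeline sketch using the `@pipeline_def` decorator and the `fn` functional API (assuming a DALI release that ships both; `file_root` is a placeholder directory of images sorted into class subfolders).

```python
from nvidia.dali import pipeline_def
import nvidia.dali.fn as fn

@pipeline_def
def image_pipeline():
    # Read (file, label) pairs from disk, then decode JPEGs on the GPU.
    jpegs, labels = fn.readers.file(file_root="data/images")  # placeholder path
    images = fn.decoders.image(jpegs, device="mixed")
    return images, labels

pipe = image_pipeline(batch_size=8, num_threads=2, device_id=0)
pipe.build()
images, labels = pipe.run()
```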
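
AugLy: most transforms are plain functions with a consistent interface per modality; a small text example assuming the `simulate_typos` transform from `augly.text`.

```python
import augly.text as txtaugs

texts = ["Augment this sentence with realistic typos."]
# simulate_typos perturbs characters/words to mimic typing mistakes.
augmented = txtaugs.simulate_typos(texts)
print(augmented)
```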
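
Open3D: a minimal point cloud example that reads a file and opens the interactive viewer (the `.ply` path is a placeholder).

```python
import open3d as o3d

# Read a point cloud from disk (placeholder path).
pcd = o3d.io.read_point_cloud("fragment.ply")
print(pcd)  # summary, including the number of points

# Render it in Open3D's interactive viewer.
o3d.visualization.draw_geometries([pcd])
```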
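
HuggingFaceTokenizers: a minimal encoding example, assuming a release that supports `Tokenizer.from_pretrained`.

```python
from tokenizers import Tokenizer

# Load a pretrained tokenizer from the Hugging Face hub.
tokenizer = Tokenizer.from_pretrained("bert-base-uncased")

encoding = tokenizer.encode("Fast tokenizers are written in Rust.")
print(encoding.tokens)  # subword tokens
print(encoding.ids)     # corresponding vocabulary ids
```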
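
HuggingFaceDatasets: loading a public dataset takes one call; a minimal sketch with the `imdb` dataset.

```python
from datasets import load_dataset

# Downloads and caches the IMDB dataset on first use.
dataset = load_dataset("imdb")
print(dataset["train"].num_rows)
print(dataset["train"][0])  # a single {'text': ..., 'label': ...} example
```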
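
Apex: the classic `apex.amp` API patches an existing model and optimizer for mixed precision in a few lines; a minimal sketch (the model and loss are placeholders, and `opt_level="O1"` is the commonly recommended default).

```python
import torch
from apex import amp

model = torch.nn.Linear(10, 2).cuda()  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# Patch model and optimizer for mixed-precision training.
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

inputs = torch.randn(8, 10, device="cuda")
loss = model(inputs).sum()  # placeholder loss

# Scale the loss to avoid gradient underflow in fp16.
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
optimizer.step()
```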
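
ApexDataPrefetcher: the prefetcher is a pattern rather than an importable API; a condensed sketch of the idea, copying the next batch to the GPU on a side CUDA stream while the current batch is being processed (the DataLoader should use `pin_memory=True`).

```python
import torch

class DataPrefetcher:
    """Overlap host-to-device copies with compute using a side CUDA stream."""

    def __init__(self, loader):
        self.loader = iter(loader)
        self.stream = torch.cuda.Stream()
        self._preload()

    def _preload(self):
        try:
            self.next_input, self.next_target = next(self.loader)
        except StopIteration:
            self.next_input = None
            return
        with torch.cuda.stream(self.stream):
            # Asynchronous copies; require pinned host memory.
            self.next_input = self.next_input.cuda(non_blocking=True)
            self.next_target = self.next_target.cuda(non_blocking=True)

    def next(self):
        if self.next_input is None:
            return None, None
        # Block the compute stream until the copy has finished.
        torch.cuda.current_stream().wait_stream(self.stream)
        inputs, targets = self.next_input, self.next_target
        # Tell the caching allocator these tensors are used on the compute stream.
        inputs.record_stream(torch.cuda.current_stream())
        targets.record_stream(torch.cuda.current_stream())
        self._preload()
        return inputs, targets
```

Usage: wrap a DataLoader (`prefetcher = DataPrefetcher(loader)`) and call `prefetcher.next()` in the training loop instead of iterating the loader directly.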
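
Horovod: a single-process PyTorch script becomes data-parallel with a handful of calls; a minimal sketch with a placeholder model (launch with the `horovodrun` launcher, e.g. `horovodrun -np 4 python train.py`).

```python
import torch
import horovod.torch as hvd

hvd.init()                             # one process per GPU
torch.cuda.set_device(hvd.local_rank())

model = torch.nn.Linear(10, 2).cuda()  # placeholder model
# A common convention: scale the learning rate by the number of workers.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3 * hvd.size())

# Average gradients across workers during backward.
optimizer = hvd.DistributedOptimizer(
    optimizer, named_parameters=model.named_parameters()
)
# Start all workers from the same initial state.
hvd.broadcast_parameters(model.state_dict(), root_rank=0)
hvd.broadcast_optimizer_state(optimizer, root_rank=0)
```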
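
Checkpoint: `torch.utils.checkpoint` trades compute for memory; activations inside the checkpointed segment are discarded after the forward pass and recomputed during backward. A minimal sketch:

```python
import torch
from torch.utils.checkpoint import checkpoint

# A placeholder block whose intermediate activations we don't want to store.
block = torch.nn.Sequential(
    torch.nn.Linear(1024, 1024),
    torch.nn.ReLU(),
    torch.nn.Linear(1024, 1024),
)

x = torch.randn(32, 1024, requires_grad=True)

# Activations inside `block` are recomputed on backward instead of stored.
y = checkpoint(block, x)
y.sum().backward()
```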
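
PowerSGD Communication Hook: PowerSGD ships with PyTorch as a DDP communication hook; a registration sketch assuming `torch.distributed` has already been initialized and the model is wrapped in `DistributedDataParallel`.

```python
import torch
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.distributed.algorithms.ddp_comm_hooks import powerSGD_hook as powerSGD

model = torch.nn.Linear(1024, 1024).cuda()  # placeholder model
ddp_model = DDP(model)                      # assumes an initialized process group

# Low-rank gradient compression: rank-2 approximation after a warm-up period.
state = powerSGD.PowerSGDState(
    process_group=None,           # None means the default process group
    matrix_approximation_rank=2,
    start_powerSGD_iter=1000,     # run vanilla allreduce for the first steps
)
ddp_model.register_comm_hook(state, powerSGD.powerSGD_hook)
```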
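
Accelerate: the training loop stays almost unchanged; prepare the objects once and replace `loss.backward()` with `accelerator.backward(loss)`. A minimal sketch with placeholder model and data:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()  # picks up GPUs/TPU and mixed-precision settings

model = torch.nn.Linear(10, 2)  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loader = DataLoader(
    TensorDataset(torch.randn(64, 10), torch.randn(64, 2)), batch_size=8
)

# Move everything to the right device(s) and wrap for distributed training.
model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

for inputs, targets in loader:
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(inputs), targets)
    accelerator.backward(loss)  # replaces loss.backward()
    optimizer.step()
```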
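
DeepSpeed: everything is driven from a config (dict or JSON file) passed to `deepspeed.initialize`; a minimal sketch with illustrative config values (scripts are normally launched with the `deepspeed` launcher).

```python
import torch
import deepspeed

model = torch.nn.Linear(10, 2)  # placeholder model

# Illustrative config: batch size, optimizer, and fp16 all come from here.
ds_config = {
    "train_batch_size": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-3}},
    "fp16": {"enabled": True},
}

# Returns an engine that owns the optimizer step, loss scaling, ZeRO, etc.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config
)

inputs = torch.randn(8, 10, device=model_engine.device, dtype=torch.half)
loss = model_engine(inputs).sum()  # placeholder loss
model_engine.backward(loss)
model_engine.step()
```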
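
Ray: a minimal Tune sketch in the classic `tune.run`/`tune.report` style (the API has evolved across Ray releases): a trainable function reports a metric, and Tune searches the config space.

```python
from ray import tune

def trainable(config):
    # Placeholder "training": the score is a simple function of the hyperparameter.
    score = (config["lr"] - 0.1) ** 2
    tune.report(loss=score)

analysis = tune.run(
    trainable,
    config={"lr": tune.grid_search([0.01, 0.1, 1.0])},
)
print(analysis.get_best_config(metric="loss", mode="min"))
```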
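
TensorBoard: from PyTorch, logging goes through `torch.utils.tensorboard.SummaryWriter`; a minimal sketch.

```python
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter("runs/demo")  # event files land under ./runs/demo

for step in range(100):
    fake_loss = 1.0 / (step + 1)     # placeholder metric
    writer.add_scalar("train/loss", fake_loss, global_step=step)

writer.close()
# View with: tensorboard --logdir runs
```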
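
KnockKnock: the two extra lines are an import and a decorator; a sketch with the Telegram notifier (the bot token and chat ID are placeholders, and other channels such as email and Slack follow the same pattern).

```python
from knockknock import telegram_sender

# Placeholder credentials for a Telegram bot.
@telegram_sender(token="<your-bot-token>", chat_id=123456789)
def train():
    ...  # your training loop goes here
    return {"best_acc": 0.93}  # the return value is included in the notification

train()
```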