Open Machine Learning

ML Frameworks

1. Acme

Acme is a library of reinforcement learning (RL) building blocks that strives to expose simple, efficient, and readable agents. These agents serve first and foremost as reference implementations and as strong baselines for algorithm performance. However, the baseline agents exposed by Acme should also provide enough flexibility and simplicity that they can be used as a starting point for novel research. Finally, the building blocks of Acme are designed in such a way that the agents can be run at multiple scales (e.g. single-stream vs. distributed agents).

language	source	license
python	Github	Apache-2.0 license

2. AdaNet

AdaNet is a lightweight TensorFlow-based framework for automatically learning high-quality models with minimal expert intervention. AdaNet builds on recent AutoML efforts to be fast and flexible while providing learning guarantees. Importantly, AdaNet provides a general framework for not only learning a neural network architecture, but also for learning to ensemble to obtain even better models.

This project is based on the AdaNet algorithm, presented in “AdaNet: Adaptive Structural Learning of Artificial Neural Networks” at ICML 2017, for learning the structure of a neural network as an ensemble of subnetworks.
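
As a rough illustration of what "learning the structure as an ensemble of subnetworks" can look like, the toy sketch below greedily adds whichever candidate most reduces a complexity-penalized loss. It is not AdaNet's API or its actual objective: the candidate predictions, the per-candidate complexity values, the `penalty` weight, and the uniform averaging are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical candidate format: name -> (predictions, complexity score).
Candidates = Dict[str, Tuple[List[float], float]]


def mse(pred: List[float], target: List[float]) -> float:
    """Mean squared error between two equally long sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def average(preds: List[List[float]]) -> List[float]:
    """Uniformly average the predictions of several ensemble members."""
    return [sum(col) / len(col) for col in zip(*preds)]


def grow_ensemble(candidates: Candidates, target: List[float],
                  penalty: float = 0.01) -> List[str]:
    """Greedy forward selection under a complexity-penalized loss."""
    chosen: List[str] = []
    best = float("inf")
    while True:
        best_name = None
        for name in candidates:
            if name in chosen:
                continue
            trial = chosen + [name]
            loss = mse(average([candidates[n][0] for n in trial]), target)
            loss += penalty * sum(candidates[n][1] for n in trial)
            if loss < best:  # keep only strict improvements
                best, best_name = loss, name
        if best_name is None:  # no candidate improves the objective
            return chosen
        chosen.append(best_name)


target = [1.0, 2.0, 3.0]
candidates = {
    "small_a": ([1.5, 2.5, 3.5], 1.0),   # cheap, biased high
    "small_b": ([0.5, 1.5, 2.5], 1.0),   # cheap, biased low
    "large": ([1.1, 2.1, 3.1], 50.0),    # accurate but heavily penalized
}
selected = grow_ensemble(candidates, target)
```

In this toy setup the two cheap candidates cancel each other's bias, so the penalized objective prefers them together over the single large model.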

AdaNet has the following goals:

Ease of use: Provide familiar APIs (e.g. Keras, Estimator) for training, evaluating, and serving models.
Speed: Scale with available compute and quickly produce high quality models.
Flexibility: Allow researchers and practitioners to extend AdaNet to novel subnetwork architectures, search spaces, and tasks.
Learning guarantees: Optimize an objective that offers theoretical learning guarantees.

For more information, you may read the docs.

language	source	license
python	Github	Apache-2.0 license

3. Analytics Zoo

Analytics Zoo is an open source Big Data AI platform, and includes the following features for scaling end-to-end AI to distributed Big Data:

Orca: seamlessly scale out TensorFlow and PyTorch for Big Data (using Spark & Ray)
RayOnSpark: run Ray programs directly on Big Data clusters
BigDL Extensions: high-level Spark ML pipeline and Keras-like APIs for BigDL
Chronos: scalable time series analysis using AutoML
PPML: privacy preserving big data analysis and machine learning (experimental)

For more information, you may read the docs.

language	source	license
python	Github	Apache-2.0 license

4. Apache MXNet

Apache MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to mix symbolic and imperative programming to maximize efficiency and productivity. At its core, MXNet contains a dynamic dependency scheduler that automatically parallelizes both symbolic and imperative operations on the fly. A graph optimization layer on top of that makes symbolic execution fast and memory efficient. MXNet is portable and lightweight, scalable to many GPUs and machines.
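
To make the difference between the two styles concrete, here is a toy sketch in plain Python rather than MXNet's API. The imperative function computes its result immediately, while the symbolic version first builds an expression graph and evaluates it later, which is what gives a framework the opportunity to optimize the graph before running it. The `Node`, `var`, and `evaluate` names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


def imperative(a: float, b: float) -> float:
    """Imperative style: every statement runs as soon as it is reached."""
    c = a + b
    return c * b


@dataclass(frozen=True)
class Node:
    """Symbolic style: operators only record what should be computed."""
    op: str
    args: Tuple["Node", ...] = ()
    name: str = ""

    def __add__(self, other: "Node") -> "Node":
        return Node("add", (self, other))

    def __mul__(self, other: "Node") -> "Node":
        return Node("mul", (self, other))


def var(name: str) -> Node:
    """Create a named input to be bound at evaluation time."""
    return Node("var", name=name)


def evaluate(node: Node, env: Dict[str, float]) -> float:
    """Walk the recorded graph and compute its value for the given inputs."""
    if node.op == "var":
        return env[node.name]
    left, right = (evaluate(arg, env) for arg in node.args)
    return left + right if node.op == "add" else left * right


a, b = var("a"), var("b")
graph = (a + b) * b  # builds the graph; nothing is computed yet
result = evaluate(graph, {"a": 2.0, "b": 3.0})
```

Both paths produce the same number for the same inputs; the difference is when the work happens and whether the whole computation is visible before execution.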

Apache MXNet is more than a deep learning project. It is a community on a mission of democratizing AI. It is a collection of blueprints and guidelines for building deep learning systems, and of interesting insights into DL systems for hackers.

For more information, visit mxnet.apache.org

language	source	license
cpp, python	Github	Apache-2.0 license

5. Apache Spark

Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, pandas API on Spark for pandas workloads, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.

https://spark.apache.org/

language	source	license
scala, python, java, etc	Github	Apache-2.0 license

6. auto_ml

Automated machine learning for analytics & production. [UNMAINTAINED]

language	source	license
python	Github	MIT License

7. BigDL

BigDL seamlessly scales your data analytics & AI applications from laptop to cloud, with the following libraries:

Orca: Distributed Big Data & AI (TF & PyTorch) Pipeline on Spark and Ray
Nano: Transparent Acceleration of Tensorflow & PyTorch Programs
DLlib: “Equivalent of Spark MLlib” for Deep Learning
Chronos: Scalable Time Series Analysis using AutoML
Friesian: End-to-End Recommendation Systems
PPML (experimental): Secure Big Data and AI (with SGX Hardware Security)

For more information, you may read the docs.

language	source	license
scala, python, java, etc	Github	Apache-2.0 license

8. Blocks

Blocks is a framework that helps you build neural network models on top of Theano. Currently it supports and provides:

Constructing parametrized Theano operations, called "bricks"
Pattern matching to select variables and bricks in large models
Algorithms to optimize your model
Saving and resuming of training
Monitoring and analyzing values during training progress (on the training set as well as on test sets)
Application of graph transformations, such as dropout

In the future we also hope to support:

Dimension, type and axes-checking

Please see the documentation for more information.

language	source	license
python	Github	LICENSE

9. Caffe

Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by Berkeley AI Research (BAIR)/The Berkeley Vision and Learning Center (BVLC) and community contributors.

Check out the project site for all the details.

language	source	license
cpp, python	Github	LICENSE

10. ConvNetJS

ConvNetJS is a JavaScript implementation of neural networks, together with nice browser-based demos. It currently supports:

Common Neural Network modules (fully connected layers, non-linearities)
Classification (SVM/Softmax) and Regression (L2) cost functions
Ability to specify and train Convolutional Networks that process images
An experimental Reinforcement Learning module, based on Deep Q Learning
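
For readers unfamiliar with the cost functions named above, the following is a minimal Python sketch of a softmax (cross-entropy) classification loss and an L2 regression loss. It illustrates the general formulas only and is not ConvNetJS code; in particular, the 0.5 factor in the L2 loss is a common convention assumed here.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Turn raw class scores into probabilities (shifted for stability)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_loss(scores: List[float], label: int) -> float:
    """Negative log-probability assigned to the correct class."""
    return -math.log(softmax(scores)[label])


def l2_loss(pred: List[float], target: List[float]) -> float:
    """Half the sum of squared differences (0.5 factor is an assumption)."""
    return 0.5 * sum((p - t) ** 2 for p, t in zip(pred, target))


probs = softmax([1.0, 2.0, 3.0])
uniform_loss = softmax_loss([0.0, 0.0], label=0)
regression_loss = l2_loss([1.0, 2.0], [1.0, 4.0])
```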

For much more information, see the main page at convnetjs.com

language	source	license
javascript	Github	MIT license

11. DatumBox

The Datumbox Machine Learning Framework is an open-source framework written in Java which allows the rapid development of Machine Learning and Statistical applications. The main focus of the framework is to include a large number of machine learning algorithms & statistical methods and to be able to handle large datasets.

Datumbox comes with a large number of pre-trained models which allow you to perform Sentiment Analysis (Document & Twitter), Subjectivity Analysis, Topic Classification, Spam Detection, Adult Content Detection, Language Detection, Commercial Detection, Educational Detection and Gender Detection. To get the binary models check out the Datumbox Zoo.

The Framework currently supports performing multiple parametric & non-parametric statistical tests, calculating descriptive statistics on censored & uncensored data, performing ANOVA, Cluster Analysis, Dimension Reduction, Regression Analysis, Timeseries Analysis, Sampling and calculation of probabilities from the most common discrete and continuous distributions. In addition, it provides several implemented algorithms including Max Entropy, Naive Bayes, SVM, Bootstrap Aggregating, Adaboost, Kmeans, Hierarchical Clustering, Dirichlet Process Mixture Models, Softmax Regression, Ordinal Regression, Linear Regression, Stepwise Regression, PCA and several other techniques that can be used for feature selection, ensemble learning, linear programming solving and recommender systems.

https://www.datumbox.com

language	source	license
java	Github	Apache-2.0 license

12. deepdetect

DeepDetect (https://www.deepdetect.com/) is a machine learning API and server written in C++11. It makes state-of-the-art machine learning easy to work with and integrate into existing applications. It has support for both training and inference, with automatic conversion to embedded platforms with TensorRT (NVIDIA GPU) and NCNN (ARM CPU).

It implements support for supervised and unsupervised deep learning of images, text, time series and other data, with a focus on simplicity and ease of use, testing and integration into existing applications. It supports classification, object detection, segmentation, regression, autoencoders, ...

And it relies on external machine learning libraries through a very generic and flexible API. At the moment it has support for:

the deep learning libraries Caffe, TensorFlow, Caffe2, Torch, NCNN, TensorRT and Dlib
distributed gradient boosting library XGBoost
clustering with T-SNE
similarity search with Annoy and FAISS

Please join the community on Gitter, where we help users get through with installation, API, neural nets and connection to external applications.

https://www.deepdetect.com/

language	source	license
cpp	Github	License

13. DL4J

The Eclipse Deeplearning4J (DL4J) ecosystem is a set of projects intended to support all the needs of a JVM based deep learning application. This means starting with the raw data, loading and preprocessing it from wherever and whatever format it is in to building and tuning a wide variety of simple and complex deep learning networks.

Because Deeplearning4J runs on the JVM you can use it with a wide variety of JVM based languages other than Java, like Scala, Kotlin, Clojure and many more.

The DL4J stack consists of:

DL4J: High level API to build MultiLayerNetworks and ComputationGraphs with a variety of layers, including custom ones. Supports importing Keras models from h5, including tf.keras models (as of 1.0.0-beta7) and also supports distributed training on Apache Spark
ND4J: General purpose linear algebra library with over 500 mathematical, linear algebra and deep learning operations. ND4J is based on the highly-optimized C++ codebase LibND4J that provides CPU (AVX2/512) and GPU (CUDA) support and acceleration by libraries such as OpenBLAS, OneDNN (MKL-DNN), cuDNN, cuBLAS, etc
SameDiff: Part of the ND4J library, SameDiff is our automatic differentiation / deep learning framework. SameDiff uses a graph-based (define then run) approach, similar to TensorFlow graph mode. Eager execution (as in TensorFlow 2.x eager mode or PyTorch) is planned. SameDiff supports importing TensorFlow frozen model format .pb (protobuf) models. Import for ONNX, TensorFlow SavedModel and Keras models is planned. Deeplearning4j also has full SameDiff support for easily writing custom layers and loss functions.
DataVec: ETL for machine learning data in a wide variety of formats and files (HDFS, Spark, Images, Video, Audio, CSV, Excel etc)
LibND4J: C++ library that underpins everything. For more information on how the JVM accesses native arrays and operations, refer to JavaCPP.
Python4J: Bundled cpython execution for the JVM
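
The "define then run" approach with automatic differentiation mentioned for SameDiff can be sketched in a few lines. The example is written in Python for brevity and is a toy illustration of the idea, not SameDiff's Java API: the `Node`, `placeholder`, and `run` names are hypothetical, and only addition and multiplication are supported.

```python
from typing import Dict, List, Tuple


class Node:
    """One operation in a graph that is defined first and executed later."""

    def __init__(self, op: str, inputs: Tuple["Node", ...] = (), name: str = ""):
        self.op, self.inputs, self.name = op, tuple(inputs), name

    def __add__(self, other: "Node") -> "Node":
        return Node("add", (self, other))

    def __mul__(self, other: "Node") -> "Node":
        return Node("mul", (self, other))


def placeholder(name: str) -> Node:
    """A named graph input whose value is supplied when the graph runs."""
    return Node("input", name=name)


def run(output: Node, feeds: Dict[str, float]) -> Tuple[float, Dict[str, float]]:
    """Run the graph forward, then backpropagate d(output)/d(input)."""
    values: Dict[Node, float] = {}
    order: List[Node] = []  # nodes in the order they finish evaluating

    def forward(node: Node) -> float:
        if node in values:
            return values[node]
        if node.op == "input":
            val = feeds[node.name]
        else:
            a, b = (forward(i) for i in node.inputs)
            val = a + b if node.op == "add" else a * b
        values[node] = val
        order.append(node)
        return val

    result = forward(output)
    grads = {node: 0.0 for node in order}
    grads[output] = 1.0
    for node in reversed(order):  # consumers are visited before their inputs
        if node.op == "input":
            continue
        a, b = node.inputs
        if node.op == "add":
            grads[a] += grads[node]
            grads[b] += grads[node]
        else:  # product rule for multiplication
            grads[a] += grads[node] * values[b]
            grads[b] += grads[node] * values[a]
    return result, {n.name: g for n, g in grads.items() if n.op == "input"}


x, y = placeholder("x"), placeholder("y")
z = x * y + x  # graph definition only
value, gradients = run(z, {"x": 3.0, "y": 4.0})
```

Because the whole graph exists before execution, the backward pass can be derived mechanically from the recorded operations.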

All projects in the DL4J ecosystem support Windows, Linux and macOS. Hardware support includes CUDA GPUs (10.0, 10.1, 10.2 except OSX), x86 CPU (x86_64, avx2, avx512), ARM CPU (arm, arm64, armhf) and PowerPC (ppc64le).

http://deeplearning4j.konduit.ai/

language	source	license
java, cpp	Github	Apache-2.0 license

14. detectron2

Detectron2 is Facebook AI Research's next generation library that provides state-of-the-art detection and segmentation algorithms. It is the successor of Detectron and maskrcnn-benchmark. It supports a number of computer vision research projects and production applications in Facebook.

https://detectron2.readthedocs.io/en/latest/

language	source	license
python	Github	Apache-2.0 license

NLP Frameworks

1. AllenNLP

An Apache 2.0 NLP research library, built on PyTorch, for developing state-of-the-art deep learning models on a wide variety of linguistic tasks.

https://allennlp.org

language	source	license
python	Github	Apache-2.0 license

2. Apache OpenNLP

The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text.

This toolkit is written completely in Java and provides support for common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, coreference resolution, language detection and more!

These tasks are usually required to build more advanced text processing services.

The goal of the OpenNLP project is to be a mature toolkit for the above mentioned tasks.

An additional goal is to provide a large number of pre-built models for a variety of languages, as well as the annotated text resources that those models are derived from.

Presently, OpenNLP includes common classifiers such as Maximum Entropy, Perceptron and Naive Bayes.

OpenNLP can be used either programmatically through its Java API or from a terminal through its CLI. The OpenNLP API can be easily plugged into distributed streaming data pipelines like Apache Flink, Apache NiFi, and Apache Spark.

https://opennlp.apache.org

language	source	license
java	Github	Apache-2.0 license

3. ERNIE

Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.

language	source	license
python	Github	Apache-2.0 license

4. flair

A powerful NLP library. Flair allows you to apply our state-of-the-art natural language processing (NLP) models to your text, such as named entity recognition (NER), part-of-speech tagging (PoS), special support for biomedical data, sense disambiguation and classification, with support for a rapidly growing number of languages.
A text embedding library. Flair has simple interfaces that allow you to use and combine different word and document embeddings, including our proposed Flair embeddings, BERT embeddings and ELMo embeddings.
A PyTorch NLP framework. Our framework builds directly on PyTorch, making it easy to train your own models and experiment with new approaches using Flair embeddings and classes.

language	source	license
python	Github	MIT License

5. gensim

Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. The target audience is the natural language processing (NLP) and information retrieval (IR) community.

language	source	license
python	Github	MIT License

6. icecaps

Microsoft Icecaps is an open-source toolkit for building neural conversational systems. Icecaps provides an array of tools from recent conversation modeling and general NLP literature within a flexible paradigm that enables complex multi-task learning setups.

Icecaps is currently on version 0.2.0. In this version we introduced several functionalities:

Personalization embeddings for transformer models
Early stopping variant for performing validation across all saved checkpoints
Implementations for both SpaceFusion and StyleFusion
New text data processing features, including sorting and trait grounding
Tree data processing features from JSON files using the new JSONDataProcessor

language	source	license
python	Github	MIT License

7. jiant

The multitask and transfer learning toolkit for natural language processing research.

Why should I use jiant?

jiant supports multitask learning
jiant supports transfer learning
jiant supports 50+ natural language understanding tasks
jiant supports the following benchmarks:
- GLUE
- SuperGLUE
- XTREME
jiant is a research library and users are encouraged to extend, change, and contribute to match their needs!

A few additional things you might want to know about jiant:

jiant is configuration file driven
jiant is built with PyTorch
jiant integrates with datasets to manage task data
jiant integrates with transformers to manage models and tokenizers.

language	source	license
python	Github	MIT License

8. NeuralCoref

NeuralCoref is a pipeline extension for spaCy 2.1+ which annotates and resolves coreference clusters using a neural network. NeuralCoref is production-ready, integrated in spaCy's NLP pipeline and extensible to new training datasets.

For a brief introduction to coreference resolution and NeuralCoref, please refer to our blog post. NeuralCoref is written in Python/Cython and comes with a pre-trained statistical model for English only.

NeuralCoref is accompanied by a visualization client NeuralCoref-Viz, a web interface powered by a REST server that can be tried online. NeuralCoref is released under the MIT license.

language	source	license
c, python	Github	MIT License

9. NLP Architect

NLP Architect is an open source Python library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing and Natural Language Understanding Neural Networks.

NLP Architect is an NLP library designed to be flexible and easy to extend, to allow easy and rapid integration of NLP models into applications, and to showcase optimized models.

Features:

Core NLP models used in many NLP tasks and useful in many NLP applications
Novel NLU models showcasing novel topologies and techniques
Optimized NLP/NLU models showcasing different optimization algorithms on neural NLP/NLU models
Model-oriented design:
- Train and run models from command-line.
- API for using models for inference in python.
- Procedures to define custom processes for training, inference or anything related to processing.
- CLI sub-system for running procedures
Based on optimized Deep Learning frameworks:
- [TensorFlow]
- [PyTorch]
- [Dynet]
Essential utilities for working with NLP models - Text/String pre-processing, IO, data-manipulation, metrics, embeddings.

https://intellabs.github.io/nlp-architect

language	source	license
python	Github	Apache-2.0 license

10. Natural Language Toolkit (NLTK)

NLTK -- the Natural Language Toolkit -- is a suite of open source Python modules, data sets, and tutorials supporting research and development in Natural Language Processing. NLTK requires Python version 3.7, 3.8, 3.9 or 3.10.

https://www.nltk.org

language	source	license
python	Github	Apache-2.0 license

11. Pattern

Pattern is a web mining module for Python. It has tools for:

Data Mining: web services (Google, Twitter, Wikipedia), web crawler, HTML DOM parser
Natural Language Processing: part-of-speech taggers, n-gram search, sentiment analysis, WordNet
Machine Learning: vector space model, clustering, classification (KNN, SVM, Perceptron)
Network Analysis: graph centrality and visualization.

It is well documented, thoroughly tested with 350+ unit tests and comes bundled with 50+ examples. The source code is licensed under BSD.
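
The vector space model listed under Machine Learning can be illustrated with a short sketch: each document becomes a bag-of-words vector, and documents are compared by the cosine of the angle between their vectors. This is plain Python written for illustration, not Pattern's API; the whitespace tokenization and raw term counts are simplifying assumptions.

```python
import math
from collections import Counter
from typing import Dict


def vectorize(text: str) -> Dict[str, int]:
    """Represent a document as term counts (naive whitespace tokenizer)."""
    return Counter(text.lower().split())


def cosine(u: Dict[str, int], v: Dict[str, int]) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_u * norm_v)


doc_a = vectorize("the cat sat on the mat")
doc_b = vectorize("the cat lay on the rug")
doc_c = vectorize("stock prices fell sharply")
close = cosine(doc_a, doc_b)
far = cosine(doc_a, doc_c)
```

Documents that share many terms score close to 1, while documents with no terms in common score 0.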

https://github.com/clips/pattern/wiki

language	source	license
python	Github	BSD-3-Clause license

12. spaCy: Industrial-strength NLP

spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products.

spaCy comes with pretrained pipelines and currently supports tokenization and training for 70+ languages. It features state-of-the-art speed and neural network models for tagging, parsing, named entity recognition, text classification and more, multi-task learning with pretrained transformers like BERT, as well as a production-ready training system and easy model packaging, deployment and workflow management. spaCy is commercial open-source software, released under the MIT license.

https://spacy.io/

language	source	license
python, cython	Github	MIT License

13. Stanford CoreNLP

Stanford CoreNLP provides a set of natural language analysis tools written in Java. It can take raw human language text input and give the base forms of words, their parts of speech, whether they are names of companies, people, etc., normalize and interpret dates, times, and numeric quantities, mark up the structure of sentences in terms of phrases or word dependencies, and indicate which noun phrases refer to the same entities. It was originally developed for English, but now also provides varying levels of support for (Modern Standard) Arabic, (mainland) Chinese, French, German, Hungarian, Italian, and Spanish. Stanford CoreNLP is an integrated framework, which makes it very easy to apply a bunch of language analysis tools to a piece of text. Starting from plain text, you can run all the tools with just two lines of code. Its analyses provide the foundational building blocks for higher-level and domain-specific text understanding applications. Stanford CoreNLP is a set of stable and well-tested natural language processing tools, widely used by various groups in academia, industry, and government. The tools variously use rule-based, probabilistic machine learning, and deep learning components.

The Stanford CoreNLP code is written in Java and licensed under the GNU General Public License (v2 or later). Note that this is the full GPL, which allows many free uses, but not its use in proprietary software that you distribute to others.

http://stanfordnlp.github.io/CoreNLP/

language	source	license
java	Github	GPL-3.0 license

14. SumEval

Well tested & Multi-language evaluation framework for Text Summarization.

Well tested
- The ROUGE-X scores are tested against the original Perl script (ROUGE-1.5.5.pl).
- The BLEU score is calculated by SacréBLEU, which produces the same values as the official script (mteval-v13a.pl) used by WMT.
Multi-language
- In addition to English, Japanese and Chinese are supported. Other languages can be added easily.

Of course, the implementation is pure Python!
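
For context, the core of a ROUGE-N score is a clipped n-gram overlap count between a candidate summary and a reference. The sketch below shows that computation in plain Python; it is not SumEval's API, and it assumes whitespace tokenization with no stemming or stop-word removal, so its numbers will not necessarily match ROUGE-1.5.5.pl.

```python
from collections import Counter
from typing import Dict, List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(summary: str, reference: str, n: int = 1) -> Dict[str, float]:
    """Recall, precision and F1 over clipped n-gram overlap."""
    summary_counts = ngrams(summary.lower().split(), n)
    reference_counts = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((summary_counts & reference_counts).values())
    recall = overlap / max(sum(reference_counts.values()), 1)
    precision = overlap / max(sum(summary_counts.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


scores = rouge_n("the cat sat", "the cat sat on the mat", n=1)
bigram_scores = rouge_n("the cat sat", "the cat sat on the mat", n=2)
```

Recall is measured against the reference and precision against the summary, which is why a short summary that copies part of the reference scores high precision but lower recall.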

language	source	license
python	Github	Apache-2.0 license

15. Texar-PyTorch

Texar-PyTorch is a toolkit aiming to support a broad set of machine learning tasks, especially natural language processing and text generation. Texar provides a library of easy-to-use ML modules and functionalities for composing arbitrary models and algorithms. The tool is designed for both researchers and practitioners for fast prototyping and experimentation. Texar-PyTorch was originally developed, and is actively contributed to, by Petuum and CMU in collaboration with other institutes. A mirror of this repository is maintained by Petuum Open Source.

Texar-PyTorch integrates many of the best features of TensorFlow into PyTorch, delivering highly usable and customizable modules superior to PyTorch native ones.

Key Features

Two Versions, (Mostly) Same Interfaces. Texar-PyTorch (this repo) and Texar-TF have mostly the same interfaces. Both further combine the best design of TF and PyTorch:
- Interfaces and variable sharing in PyTorch convention
- Excellent factorization and rich functionalities in TF convention.
Versatile to support broad needs:
- data processing, model architectures, loss functions, training and inference algorithms, evaluation, ...
- encoder(s) to decoder(s), sequential- and self-attentions, memory, hierarchical models, classifiers, ...
- maximum likelihood learning, reinforcement learning, adversarial learning, probabilistic modeling, ...
Fully Customizable at multiple abstraction levels -- both novice-friendly and expert-friendly.
- Free to plug in whatever external modules, since Texar is fully compatible with the native PyTorch APIs.
Modularized for maximal re-use and clean APIs, based on principled decomposition of Learning-Inference-Model Architecture.
Rich Pre-trained Models, Rich Usage with Uniform Interfaces. BERT, GPT2, XLNet, etc, for encoding, classification, generation, and composing complex models with other Texar components!
Clean, detailed documentation and rich examples.

https://asyml.io/

language	source	license
python	Github	Apache-2.0 license

16. TextBlob

TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more.

https://textblob.readthedocs.io/

language	source	license
python	Github	MIT License

17. Thinc

Thinc is a lightweight deep learning library that offers an elegant, type-checked, functional-programming API for composing models, with support for layers defined in other frameworks such as PyTorch, TensorFlow and MXNet. You can use Thinc as an interface layer, a standalone toolkit or a flexible way to develop new models. Previous versions of Thinc have been running quietly in production in thousands of companies, via both spaCy and Prodigy.

https://thinc.ai/

language	source	license
python, cython	Github	MIT License

18. torchtext

Data loaders and abstractions for text and NLP

This repository consists of:

torchtext.datasets: The raw text iterators for common NLP datasets
torchtext.data: Some basic NLP building blocks
torchtext.transforms: Basic text-processing transformations
torchtext.models: Pre-trained models
torchtext.vocab: Vocab and Vectors related classes and factory functions
examples: Example NLP workflows with PyTorch and torchtext library.

https://pytorch.org/text

language	source	license
python, cpp	Github	BSD-3-Clause license

19. transformers

Transformers provides thousands of pretrained models to perform tasks on different modalities such as text, vision, and audio.

These models can be applied on:

Text, for tasks like text classification, information extraction, question answering, summarization, translation, text generation, in over 100 languages.
Images, for tasks like image classification, object detection, and segmentation.
Audio, for tasks like speech recognition and audio classification.

Transformer models can also perform tasks on several modalities combined, such as table question answering, optical character recognition, information extraction from scanned documents, video classification, and visual question answering.

Transformers provides APIs to quickly download and use those pretrained models on a given text, fine-tune them on your own datasets and then share them with the community on our model hub. At the same time, each python module defining an architecture is fully standalone and can be modified to enable quick research experiments.

Transformers is backed by the three most popular deep learning libraries — Jax, PyTorch and TensorFlow — with a seamless integration between them. It's straightforward to train your models with one before loading them for inference with the other.

https://huggingface.co/transformers

language	source	license
python	Github	Apache-2.0 license

Computer Vision Libraries

1. libfacedetection

This is an open source library for CNN-based face detection in images. The CNN model has been converted to static variables in C source files. The source code does not depend on any other libraries. What you need is just a C++ compiler. You can compile the source code under Windows, Linux, ARM and any platform with a C++ compiler.

SIMD instructions are used to speed up the detection. You can enable AVX2 if you use Intel CPU or NEON for ARM.

The model files are provided in src/facedetectcnn-data.cpp (C++ arrays) & the model (ONNX) from OpenCV Zoo. You can try our scripts (C++ & Python) in opencv_dnn/ with the ONNX model. View the network architecture here.

Please note that OpenCV DNN does not support the latest version of YuNet with dynamic input shape. Please ensure you have the exact same input shape as the one in the ONNX model to run the latest YuNet with OpenCV DNN.

examples/detect-image.cpp and examples/detect-camera.cpp show how to use the library.

The library was trained by libfacedetection.train.

language	source	license
cpp	Github	License

2. PyTorch-YOLOv3

A minimal PyTorch implementation of YOLOv3, with support for training, inference and evaluation.

https://pjreddie.com/darknet/yolo/

language	source	license
python	Github	GPL-3.0 license

3. raster-vision

Raster Vision is an open source Python framework for building computer vision models on satellite, aerial, and other large imagery sets (including oblique drone imagery).

It allows users (who don't need to be experts in deep learning!) to quickly and repeatably configure experiments that execute a machine learning pipeline including: analyzing training data, creating training chips, training models, creating predictions, evaluating models, and bundling the model files and configuration for easy deployment.

language	source	license
python	Github	License

4. DeOldify

Quick Start: The easiest way to colorize images using open source DeOldify (for free!) is here: DeOldify Image Colorization on DeepAI

Desktop: Want to run open source DeOldify for photos on Windows desktop? ColorfulSoft made such a thing here and it really works- https://github.com/ColorfulSoft/DeOldify.NET. No GPU required!

The most advanced version of DeOldify image colorization is available here, exclusively. Try a few images for free! MyHeritage In Color

Huggingface Web Demo: Integrated to Huggingface Spaces with Gradio. See demo: Hugging Face Spaces

language	source	license
python	Github	MIT License

5. SOD

SOD is an embedded, modern cross-platform computer vision and machine learning software library that exposes a set of APIs for deep learning and advanced media analysis and processing, including real-time, multi-class object detection and model training on embedded systems with limited computational resources and on IoT devices.

SOD was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in open source as well as commercial products.

Designed for computational efficiency and with a strong focus on real-time applications, SOD includes a comprehensive set of both classic and state-of-the-art deep neural networks with their pre-trained models. Built with SOD:

Convolutional Neural Networks (CNN) for multi-class (20 and 80) object detection & classification.
Recurrent Neural Networks (RNN) for text generation (e.g. Shakespeare, 4chan, Kant, Python code, etc.).
Decision trees for single class, real-time object detection.
A brand new architecture written specifically for SOD named RealNets.

https://sod.pixlab.io/

language	source	license
c	Github	License

6. makesense.ai

makesense.ai is a free-to-use online tool for labeling photos. Because it runs in the browser, it does not require any complicated installation - just visit the website and you are ready to go. It also doesn't matter which operating system you're running on - we do our best to be truly cross-platform. It is perfect for small computer vision deep learning projects, making the process of preparing a dataset much easier and faster. Prepared labels can be downloaded in one of the multiple supported formats. The application is written in TypeScript and is based on the React/Redux duo.

https://makesense.ai/

language	source	license
typescript	Github	GPL-3.0 license

7. DeepPrivacy

DeepPrivacy is a fully automatic anonymization technique for images.

This repository contains the source code for the paper "DeepPrivacy: A Generative Adversarial Network for Face Anonymization", published at ISVC 2019, and "Image Inpainting with Learnable Feature Imputation", published at GCPR 2020.

[Interactive Demo]

language	source	license
python	Github	MIT License

8. face_recognition

The world's simplest facial recognition api for Python and the command line

Recognize and manipulate faces from Python or from the command line with the world's simplest face recognition library

language	source	license
python	Github	MIT License

9. DeepFaceLab

DeepFaceLab is the leading software for creating deepfakes. More than 95% of deepfake videos are created with DeepFaceLab.

language	source	license
python	Github	GPL-3.0 license

10. faceswap

FaceSwap is a tool that utilizes deep learning to recognize and swap faces in pictures and videos.

https://www.faceswap.dev/

language	source	license
python	Github	GPL-3.0 license

11. jeelizFaceFilter

This JavaScript library detects and tracks the face in real time from the camera video feed captured with WebRTC. Then it is possible to overlay 3D content for augmented reality applications. We provide various demonstrations using the main WebGL 3D engines. We have included in this repository the release versions of the 3D engines so that the demonstrations work with a specific version (they are in /libs/<name of the engine>/).

This library is lightweight and it does not include any 3D engine or third party library. We want to keep it framework agnostic so the outputs of the library are raw: if the face is detected or not, the position and the scale of the detected face and the rotation Euler angles. But thanks to the featured helpers, examples and boilerplates, you can quickly deal with a higher level context (for motion head tracking, for face filter or face replacement...).

https://jeeliz.com/

language	source	license
javascript	Github	Apache-2.0 license

12. OpenCV

Open Source Computer Vision Library

Homepage: https://opencv.org
  - Courses: https://opencv.org/courses
Docs: https://docs.opencv.org/4.x/
Q&A forum: https://forum.opencv.org
  - Previous forum (read only): http://answers.opencv.org
Issue tracking: https://github.com/opencv/opencv/issues
Additional OpenCV functionality: https://github.com/opencv/opencv_contrib

language	source	license
cpp	Github	Apache-2.0 license

13. Luminoth

Luminoth is an open source toolkit for computer vision. Currently, we support object detection, but we are aiming for much more. It is built in Python, using TensorFlow and Sonnet.

https://tryolabs.com/

language	source	license
python	Github	BSD-3-Clause license

ML Tools [ ↑ ]

AIX360 | Apollo | DVC

1. AI Explainability 360

The AI Explainability 360 toolkit is an open-source library that supports interpretability and explainability of datasets and machine learning models. The AI Explainability 360 Python package includes a comprehensive set of algorithms that cover different dimensions of explanations along with proxy explainability metrics.

The AI Explainability 360 interactive experience provides a gentle introduction to the concepts and capabilities by walking through an example use case for different consumer personas. The tutorials and example notebooks offer a deeper, data scientist-oriented introduction. The complete API is also available.

http://aix360.mybluemix.net/

language	source	license
python	Github	Apache-2.0 license

2. Apollo

A high performance and flexible architecture which accelerates the development, testing, and deployment of Autonomous Vehicles.

http://apollo.auto/

language	source	license
cpp, python, etc.	Github	Apache-2.0 license

3. Data Version Control (DVC)

Data Version Control (DVC) is a command-line tool and VS Code extension that helps you develop reproducible machine learning projects:

Version your data and models. Store them in your cloud storage but keep their version info in your Git repo.
Iterate fast with lightweight pipelines. When you make changes, only run the steps impacted by those changes.
Track experiments in your local Git repo (no servers needed).
Compare any data, code, parameters, model, or performance plots.
Share experiments and automatically reproduce anyone's experiment.
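The "only run the steps impacted by those changes" idea can be illustrated with a small, self-contained sketch. This is not DVC's implementation or API; it is a hypothetical stand-in that records a content hash for each dependency in a JSON lock file (`pipeline.lock.json` is an invented name) and skips a stage when the recorded hashes still match.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_hash(path: Path) -> str:
    """Return a content hash so changes are detected by content, not timestamp."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def run_stage_if_changed(name, deps, command, lock_path):
    """Run command() only if a dependency hash differs from the recorded one.

    Returns True when the stage ran and False when it was skipped.
    """
    lock = json.loads(lock_path.read_text()) if lock_path.exists() else {}
    current = {str(dep): file_hash(dep) for dep in deps}
    if lock.get(name) == current:
        return False  # nothing changed, so the stage is skipped
    command()
    lock[name] = current
    lock_path.write_text(json.dumps(lock, indent=2))
    return True


with tempfile.TemporaryDirectory() as workdir:
    data = Path(workdir) / "data.csv"
    lock_file = Path(workdir) / "pipeline.lock.json"
    data.write_text("x,y\n1,2\n")

    runs = []  # records each time the hypothetical "train" stage executes

    def train():
        runs.append("train")

    first = run_stage_if_changed("train", [data], train, lock_file)
    second = run_stage_if_changed("train", [data], train, lock_file)
    data.write_text("x,y\n1,3\n")  # edit the data, which invalidates the stage
    third = run_stage_if_changed("train", [data], train, lock_file)

print(first, second, third)
```

Because the lock file is small and plain text, it is the kind of artifact that can be committed to Git while the data itself lives elsewhere, which mirrors the first bullet above in spirit.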

https://dvc.org/

language	source	license
python	Github	Apache-2.0 license

ML hosting [ ↑ ]

BentoML | Streamlit | Acumos | Ray | Turi

1. BentoML

BentoML makes it easy to create Machine Learning services that are ready to deploy and scale.

Documentation - Overview of the BentoML docs and related resources
Tutorial: Intro to BentoML - Learn by doing! In under 10 minutes, you'll serve a model via a REST API and generate a Docker image for deployment.
Main Concepts - A step-by-step tour for learning main concepts in BentoML
Examples - Gallery of sample projects using BentoML
ML Framework Guides - Best practices and example usages by the ML framework of your choice
Advanced Guides - Learn about BentoML's internals, architecture and advanced features
Need help? Join BentoML Community Slack 💬

https://bentoml.com

language	source	license
python	Github	Apache-2.0 license

2. Streamlit

The fastest way to build and share data apps.

Streamlit lets you turn data scripts into shareable web apps in minutes, not weeks. It’s all Python, open-source, and free! And once you’ve created an app you can use our Community Cloud platform to deploy, manage, and share your app!

https://streamlit.io/

language	source	license
python, javascript, typescript	Github	Apache-2.0 license

3. Acumos

Acumos is a platform and an open source framework that makes it easy to build, share, and deploy AI apps. It is an LF AI Graduate project.

https://www.acumos.org/

source	type
Github	organization

4. Ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a toolkit of libraries (Ray AIR) for accelerating ML workloads.

https://ray.io/

language	source	license
python, cpp, java, etc.	Github	Apache-2.0 license

5. Turi

Turi Create simplifies the development of custom machine learning models. You don't have to be a machine learning expert to add recommendations, object detection, image classification, image similarity or activity classification to your app.

Easy-to-use: Focus on tasks instead of algorithms
Visual: Built-in, streaming visualizations to explore your data
Flexible: Supports text, images, audio, video and sensor data
Fast and Scalable: Work with large datasets on a single machine
Ready To Deploy: Export models to Core ML for use in iOS, macOS, watchOS, and tvOS apps

language	source	license
cpp, python, javascript, etc.	Github	BSD-3-Clause license

Contributing [ ↑ ]

Pull requests are welcome!
For major changes, please open an issue first to
discuss what you would like to change.
Please make sure to update tests as appropriate.

Code of Conduct	Contributing

License [ ↑ ]

MIT License

Copyright (c) 2022 rs (cx0y)

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

License

[ ↑ ]
