Machine learning acronyms and abbreviations 🤖

A comprehensive list of ML and AI acronyms and abbreviations. Feel free to ⭐ it!

Machine learning is growing rapidly, and with it the number of acronyms and abbreviations, which can be hard to follow, especially for beginners. This list started when I collected all the acronyms from my Ph.D. thesis. Surprised by how many there were, I searched the web for a list I could copy and paste from to save time on writing. I found a few lists, but none covered everything I needed, so I decided to gather all this information in a single table to make it easier for fellow ML enthusiasts.

Sources 📖

Contributing 📝

Feel free to:

add any ML-related abbreviation,
add the definition alone,
add an issue.

Currently, ~30% of abbreviations have definitions, so feel free to add them! A definition should be a brief, concise one-liner rather than an explanation of the whole subject. The purpose is to let readers quickly find the meaning of an abbreviation; the definition helps them check whether it matches their context. Abbreviations should be in alphabetical order.

I have added a link to the online doc with all abbreviations to make it easier for you to contribute. Feel free to add a new one and sort the table automatically. You can copy the table from Google Sheets to the markdown table generator: https://www.tablesgenerator.com/markdown_tables.
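
If you prefer to do the sorting and conversion locally, the step can be sketched in a few lines of Python. This is only an illustration, not part of the repository: the function name `to_markdown_table` is hypothetical, and it assumes the table was copied from the spreadsheet as tab-separated text with the header row first.

```python
def to_markdown_table(tsv_text: str) -> str:
    """Sort tab-separated glossary rows by acronym and render a markdown table.

    Assumes the first non-empty line is the header row
    (e.g. Acronym, Description, Definition).
    """
    lines = [ln for ln in tsv_text.splitlines() if ln.strip()]
    header, *rows = [ln.split("\t") for ln in lines]
    width = len(header)

    def normalize(row):
        # Escape pipes so cell text cannot break the markdown table.
        cells = [c.strip().replace("|", "\\|") for c in row]
        if len(cells) > width:
            # Fold stray extra cells into the last column instead of dropping them.
            cells = cells[:width - 1] + [" ".join(cells[width - 1:])]
        # Pad rows that have no definition yet.
        return cells + [""] * (width - len(cells))

    # Case-insensitive sort on the acronym column. sorted() is stable, so
    # duplicated acronyms (such as several "CV" rows) keep their relative order.
    body = sorted((normalize(r) for r in rows), key=lambda r: r[0].casefold())

    out = ["| " + " | ".join(c.strip() for c in header) + " |",
           "|" + "|".join(" --- " for _ in header) + "|"]
    out += ["| " + " | ".join(r) + " |" for r in body]
    return "\n".join(out)
```

Calling `to_markdown_table(open("acronyms.tsv", encoding="utf-8").read())` (the file name is an assumption) returns a string that can be pasted into the README.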

The list 📑

Acronym	Description	Definition
ACC	ACCuracy	Accuracy is a metric for evaluating classification models.
ACE	Alternating conditional expectation (ACE) algorithm	An algorithm to find the optimal transformations between the response variable and predictor variables in regression analysis.
ADA	AdaBoosted Decision Trees	Using AdaBoost to improve performance in decision trees.
AdaBoost	Adaptive Boosting	A statistical classification meta-algorithm that can be used in conjunction with many other types of learning algorithms to improve performance.
AdR	AdaBoostRegressor	Using AdaBoost to improve performance in regression.
ADT	Automatic Drum Transcription	Methods that aim to detect drum events in polyphonic music
AE	AutoEncoder	A type of artificial neural network used to learn efficient codings of unlabeled data (unsupervised learning)
AGI	Artificial General Intelligence	The hypothetical ability of an intelligent agent to understand or learn any intellectual task that a human being can
AI	Artificial Intelligence	The simulation of human intelligence in machines that are programmed to think like humans and mimic their actions.
AIWPSO	Adaptive Inertia Weight Particle Swarm Optimization	An optimization algorithm using an individual search ability (ISA) to indicate whether each particle lacks global exploration or local exploitation abilities in each dimension.
AM	Activation Maximization	A method for visualizing neural networks that aims to maximize the activation of certain neurons.
AMT	Automatic Music Transcription	Computational algorithms that convert acoustic music signals into some form of music notation
ANN	Artificial Neural Network	A collection of connected computational units or nodes called neurons arranged in multiple computational layers.
AR	Augmented Reality	An interactive experience of a real-world environment where the objects that reside in the real world are enhanced by computer-generated perceptual information sometimes across multiple sensory modalities.
ARNN	Anticipation Recurrent Neural Network
AUC	Area Under the (ROC) Curve	Probability that the model ranks a randomly chosen positive instance higher than a randomly chosen negative one.
BDT	Boosted Decision Tree
BERT	Bidirectional Encoder Representations from Transformers	Commonly used transformer-based language model.
BiFPN	Bidirectional Feature Pyramid Network
BILSTM	Bidirectional Long Short-Term Memory	A bidirectional recurrent neural network architecture (see LSTM).
BLEU	Bilingual Evaluation Understudy	A score of the effectiveness of translating one language into another one.
BN	Bayesian Network	A probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG).
BNN	Bayesian Neural Network	A type of artificial neural network built by introducing random variations into the network, either by giving the network's artificial neurons stochastic transfer functions or by giving them stochastic weights.
BP	BackPropagation	A widely used algorithm for training feedforward neural networks.
BPMF	Bayesian Probabilistic Matrix Factorization
BPTT	Backpropagation Through Time	A gradient-based technique for training certain types of recurrent neural networks (e.g. Elman networks).
BQML	Big Query Machine Learning
BRNN	Bidirectional Recurrent Neural Network
BRR	Bayesian Ridge Regression
CAE	Contractive AutoEncoder
CALA	Continuous Action-set Learning Automata
CART	Classification And Regression Tree
CAV	Concept Activation Vectors	Explainability method that provides an interpretation of a neural net's internal state in terms of human-friendly concepts.
CBI	Counterfactual Bias Insertion
CBOW	Continuous Bag of Words
CDBN	Convolutional Deep Belief Networks	A type of deep artificial neural network composed of multiple layers of convolutional restricted Boltzmann machines stacked together.
CE	Cross-Entropy
CEC	Constant Error Carousel
CF	Common Features
CLNN	ConditionaL Neural Networks
CMAC	Cerebellar Model Articulation Controller
CMMs	Conditional Markov Model	A graphical model for sequence labeling that combines features of hidden Markov models (HMMs) and maximum entropy (MaxEnt) models. Also known as maximum-entropy Markov model (MEMM).
CNN	Convolutional Neural Network	A class of artificial neural network (ANN) most commonly applied to analyze visual imagery
ConvNet	Convolutional Neural Network	A class of artificial neural network (ANN) most commonly applied to analyze visual imagery
CRBM	Conditional Restricted Boltzmann Machine
CRFs	Conditional Random Fields
CRNN	Convolutional Recurrent Neural Network
CSLR	Continuous Sign Language Recognition	Recognition and understanding of sign language in continuous form (whole phrases rather than single words); the knowledge it provides about the meaning of signs is essential for SLT.
CTC	Connectionist Temporal Classification
CTR	Collaborative Topic Regression
CV	Coefficient of Variation	Ratio of the standard deviation to the mean; used as a measure of intra-cluster similarity when evaluating clustering models.
CV	Computer Vision
CV	Cross Validation	Resampling method that trains and evaluates a model over several iterations, each using different portions of the full data set.
DAAF	Data Augmentation and Auxiliary Feature
DAE	Denoising AutoEncoder or Deep AutoEncoder
DBM	Deep Boltzmann Machine
DBN	Deep Belief Network
DBSCAN	Density-Based Spatial Clustering of Applications with Noise
DCGAN	Deep Convolutional Generative Adversarial Network
DCMDN	Deep Convolutional Mixture Density Network
DE	Differential Evolution
DeconvNet	DeConvolutional Neural Network
DeepLIFT	Deep Learning Important FeaTures
DL	Deep Learning
DNN	Deep Neural Network
DQN	Deep Q-Network
DR	Detection Rate	Represents the sensitivity or detection rate of a model
DSN	Deep Stacking Network
DT	Decision Tree
DTD	Deep Taylor Decomposition
DWT	Discrete Wavelet Transform
ELECTRA	Efficiently Learning an Encoder that Classifies Token Replacements Accurately
ELM	Extreme Learning Machine
ELMo	Embeddings from Language Models
ELU	Exponential Linear Unit
EM	Expectation maximization
EMD	Entropy Minimization Discretization
ERNIE	Enhanced Representation through kNowledge IntEgration
ETL Pipeline	Extract Transform Load Pipeline
EXT	Extremely Randomized Trees
F1 Score	Harmonic Precision-Recall Mean
FALA	Finite Action-set Learning Automata
FC	Fully-Connected	Layers where all the inputs from one layer are connected to every activation unit of the next layer.
FC-CNN	Fully Convolutional Convolutional Neural Network	A neural network that only performs convolution (and subsampling or upsampling) operations.
FC-LSTM	Fully Connected Long Short-Term Memory	A fully connected neural network to combine the spatial information of surrounding stations (see LSTM and FC).
FCM	Fuzzy C-Means
FCN	Fully Convolutional Network	A neural network that only performs convolution (and subsampling or upsampling) operations.
FFT	Fast Fourier transform
FLOP	Floating Point Operations	A unit of measure of the amount of mathematical computations often used to describe the complexity of a neural network.
FLOPS	Floating Point Operations Per Second	A unit of measure of computer performance
FNN	Feedforward Neural Network
FNR	False Negative Rate	Proportion of actual positives predicted as negatives
FPN	Feature Pyramid Network
FPR	False Positive Rate	Proportion of actual negatives predicted as positives
FST	Finite state transducer
FWIoU	Frequency Weighted Intersection over Union	Metric in segmentation/object detection tasks. Weighted average of IoU's over classes, where weights depend on class frequency.
GA	Genetic Algorithm
GALE	Global Aggregations of Local Explanations	Explainability method that aggregates local explanations (of single predictions) into an explanation of how the whole model works.
GAM	Generalized Additive Model
GAM	Global Attribution Mapping
GAMLSS	Generalized Additive Models for Location, Scale and Shape
GAN	Generative Adversarial Network	A deep-learning-based generative model that uses "indirect" training through the discriminator, another neural network that can tell how "realistic" an input is and that is itself updated dynamically.
GAP	Global Average Pooling
GBRCN	Gradient-Boosting Random Convolutional Network
GD	Gradient Descent	An optimization algorithm used to minimize some function by iteratively moving in the direction of steepest descent as defined by the negative of the gradient
GEBI	Global Explanation for Bias Identification	Explainability method that aggregates local explanations (of single predictions) into a global explanation, with the goal of finding biases and systematic errors in decision making.
GFNN	Gradient Frequency Neural Networks
GLCM	Gray Level Co-occurrence Matrix
Gloss2Text	Gloss to Text	A task of transforming raw glosses into meaningful sentences.
GloVe	Global Vectors
GMM	Gaussian mixture model	A probabilistic model that assumes all the data points are generated from a mixture of a finite number of Gaussian distributions with unknown parameters.
GPR	Gaussian Process Regression
GPT	Generative Pre-trained Transformer	An autoregressive language model that uses deep learning to produce human-like text.
GradCAM	GRADient-weighted Class Activation Mapping
HamNoSys	Hamburg Sign Language Notation System	An annotation system that describes sign language symbols
HAN	Hierarchical Attention Network
HCA	Hierarchical Clustering Analysis
HDP	Hierarchical Dirichlet process
HHDS	HipHop Dataset
hLDA	Hierarchical Latent Dirichlet allocation
HMM	Hidden Markov Model
HNN	Hopfield Neural Network
i.i.d	Independent and Identically Distributed
ID3	Iterative Dichotomiser 3
IDR	Input dependence rate
IIR	Input independence rate
INFD	Explanation Infidelity
IoU	Intersection over Union (Jaccard index)	Metric in segmentation/object detection tasks. Ratio of the area of intersection to the area of union of two regions (boxes or segmentation masks), e.g. the prediction and the label.
ISIC	International Skin Imaging Collaboration
k-NN	k-Nearest Neighbor
KDE	Kernel Density Estimation
KL	Kullback Leibler (KL) divergence
kNN	k-Nearest Neighbours	A non-parametric supervised learning method used for classification and regression.
KRR	Kernel Ridge Regression
LDA	Latent Dirichlet Allocation	A generative statistical model that allows sets of observations to be explained by unobserved groups that explain why some parts of the data are similar.
LDA	Linear Discriminant Analysis
LDADE	Latent Dirichlet Allocation Differential Evolution
LightGBM	Light Gradient-Boosting Machine	Gradient boosting framework that uses tree based learning algorithms, originally developed by Microsoft
LIME	Local Interpretable Model-agnostic Explanations
LRP	Layer-wise Relevance Propagation
LSA	Latent semantic analysis
LSI	Latent Semantic Indexing
LSTM	Long Short-Term Memory	A recurrent neural network architecture that can process not only single data points (such as images) but also entire sequences of data (such as speech or video).
LTR	Learning To Rank
LVQ	Learning Vector Quantization
MADE	Masked Autoencoder for Distribution Estimation
MAE	Mean Absolute Error	Average of the absolute error between the actual and predicted values
MAF	Masked Autoregressive Flows
MAP	Maximum A Posteriori (MAP) Estimation
MAPE	Mean Absolute Percentage Error	Average of the absolute error between the actual and predicted values, expressed as a percentage of the actual values
MARS	Multivariate Adaptive Regression Spline	Non-parametric regression technique that extends linear models. Note that the name is trademarked, so open-source implementations are often called "EARTH".
MART	Multiple Additive Regression Tree
MaxEnt	Maximum Entropy	Entropy is a scientific concept, as well as a measurable physical property, that is most commonly associated with a state of disorder, randomness, or uncertainty.
MCLNN	Masked ConditionaL Neural Networks
MCMC	Markov Chain Monte Carlo
MCS	Model contrast score
MDL	Minimum description length (MDL) principle
MDN	Mixture Density Network
MDP	Markov Decision Process
MDRNN	Multidimensional recurrent neural network
MER	Music Emotion Recognition
MINT	Mutual Information based Transductive Feature Selection
MIoU	Mean Intersection over Union	Metric in segmentation/object detection tasks. Mean of IoU's over classes.
ML	Machine Learning	The study of computer algorithms that can improve automatically through experience and by the use of data.
MLE	Maximum Likelihood Estimation
MLM	Music Language Models
MLP	Multi-Layer Perceptron	A fully connected class of feedforward artificial neural network
MPA	Mean Pixel Accuracy	Metric in segmentation/object detection tasks. Average ratio of correctly classified pixels by class.
MRR	Mean Reciprocal Rank
MRS	Music Recommender System
MSDAE	Modified Sparse Denoising Autoencoder
MSE	Mean Squared Error	Average of the squares of the error between the actual and predicted values
MSR	Music Style Recognition
NAS	Neural Architecture Search	A technique for automating the design of artificial neural networks.
NB	Naïve Bayes
NBKE	Naïve Bayes with Kernel Estimation
NER	Named Entity Recognition
NERQ	Named Entity Recognition in Query
NF	Normalizing Flow
NFL	No Free Lunch (NFL) theorem
NLP	Natural Language Processing
NMS	Non Maximum Suppression	A technique used in object detection for removing redundant overlapping bounding boxes
NMT	Neural Machine Translation	An approach to translation that uses a neural network to predict a sequence of words.
NN	Neural Network
NNMODFF	Neural Network based Multi-Onset Detection Function Fusion
NPE	Neural Physical Engine
NRMSE	Normalized RMSE	RMSE divided by a normalizing factor, such as the range or the mean of the actual values, which makes errors comparable across scales.
NST	Neural Style Transfer	A method that uses deep neural networks to transfer style.
NTM	Neural Turing Machine
ODF	Onset Detection Function
OLR	Ordinary Linear Regression
OLS	Ordinary Least Squares
PA	Pixel Accuracy	Metric in segmentation/object detection tasks. Ratio of correctly classified over total number of pixels.
PACO	Poisson Additive Co-Clustering
PCA	Principal Component Analysis	The process of computing the principal components and using them to perform a change of basis on the data sometimes using only the first few principal components and ignoring the rest.
PEGASUS	Pre-training with Extracted Gap-Sentences for Abstractive Summarization
PLSI	Probabilistic Latent Semantic Indexing
PM	Project Manager
PMF	Probabilistic Matrix Factorization
PMI	Pointwise Mutual Information
PNN	Probabilistic Neural Network
POC	Proof of Concept
POMDP	Partially Observable Markov Decision Process
POS	Part of Speech (POS) Tagging
PPMI	Positive Pointwise Mutual Information
PReLU	Parametric Rectified Linear Unit
PU	Positive-Unlabeled	Machine learning paradigm for learning from only positive and unlabeled data.
PYTM	Pitman-Yor Topic Modeling
RandNN	Random Neural Network
RANSAC	RANdom SAmple Consensus
RBF	Radial Basis Function
RBFNN	Radial Basis Function Neural Network
RBM	Restricted Boltzmann Machine
ReLU	Rectified Linear Unit	An activation function that allows fast and effective training of deep neural architectures on large and complex datasets.
REPTree	Reduced Error Pruning Tree
RF	Random Forest
RGB	Red Green Blue color model	An additive color model used for display of images
RICNN	Rotation Invariant Convolutional Neural Network
RIM	Recurrent Inference Machines
RIPPER	Repeated Incremental Pruning to Produce Error Reduction
RL	Reinforcement Learning
RLFM	Regression based latent factors
RMSE	Root MSE	Square root of MSE
RNN	Recurrent Neural Network
RNNLM	Recurrent Neural Network Language Model (RNNLM)
RoBERTa	Robustly Optimized BERT Pretraining Approach	Commonly used transformer-based language model.
ROC	Receiver Operating Characteristic	Curve that plots TPR versus FPR at different classification thresholds
ROI	Region Of Interest
RR	Ridge Regression
RTRL	Real-Time Recurrent Learning
SAE	Stacked AE
SARSA	State-Action-Reward-State-Action
SBM	Stochastic block model
SBO	Structured Bayesian optimization
SBSE	Search-based software engineering
SCH	Stochastic convex hull
SDAE	Stacked DAE
seq2seq	Sequence to Sequence Learning	Describes a training approach for converting sequences from one domain (e.g. sentences in English) to sequences in another domain (e.g. the same sentences translated to French).
SER	Sentence Error Rate
SGBoost	Stochastic Gradient Boosting
SGD	Stochastic Gradient Descent
SGVB	Stochastic Gradient Variational Bayes
SHAP	SHapley Additive exPlanation
SHLLE	Supervised Hessian Locally Linear Embedding
Sign2(Gloss+Text)	Sign to Gloss and Text	A two-step process that requires joint learning of sign language recognition and translation.
Sign2Gloss	Sign to Gloss	A one-to-one translation from a single sign to a single gloss.
Sign2Text	Sign to Text	A task of full translation from sign language into spoken language, including grammar and syntax.
SLP	Single-Layer Perceptron
SLRT	Sign Language Recognition Transformer	An encoder transformer model trained to predict sign gloss sequences; it takes spatial embeddings and learns spatio-temporal representations.
SLT	Sign Language Translation	A full translation of signs to a spoken language.
SLTT	Sign Language Translation Transformer	An autoregressive transformer decoder model trained on output from SLRT to predict one word at a time, generating the corresponding spoken language sentence.
SMBO	Sequential Model-Based Optimization
SOM	Self-Organizing Map	A self-organizing map (SOM) or self-organizing feature map (SOFM) is an unsupervised machine learning technique used to produce a low-dimensional (typically two-dimensional) representation of a higher dimensional data set while preserving the topological structure of the data
SpRay	Spectral Relevance Analysis	Global explainability method using spectral clustering and local explanations (LRP).
SSD	Single-Shot Detector	A type of object detector that consists of a single stage. Some examples are YOLO, RetinaNet, and EfficientDet.
SSL	Self-Supervised Learning
SSVM	Smooth support vector machine
ST	Style Transfer	An algorithm that transfers properties of one object to another (e.g. transferring a painting style to a photograph).
STDA	Style Transfer Data Augmentation	A method that uses style transfer to augment a dataset.
STL	Self-Taught Learning
SVD	Singing Voice Detection
SVD	Singular Value Decomposition
SVM	Support Vector Machine	Supervised learning models with associated learning algorithms that analyze data for classification and regression analysis.
SVR	Support Vector Regression	Supervised learning models with associated learning algorithms that analyze data for regression analysis.
SVS	Singing Voice Separation
t-SNE	t-distributed stochastic neighbor embedding
T5	Text-To-Text Transfer Transformer	Transformer based language model that uses a text-to-text approach.
TD	Temporal Difference
TDA	Targeted Data Augmentation
TGAN	Temporal Generative Adversarial Network
THAID	THeta Automatic Interaction Detection
TINT	Tree-Interpreter
TLFN	Time-Lagged Feedforward Neural Network
TNR	True Negative Rate	Proportion of actual negatives that are correctly predicted
TPR	True Positive Rate	Proportion of actual positives that are correctly predicted
TRPO	Trust Region Policy Optimization
ULMFiT	Universal Language Model Fine-Tuning
V-Net	Volumetric Convolutional neural network	3D image segmentation based on a volumetric fully convolutional neural network
VAD	Voice Activity Detection
VAE	Variational AutoEncoder	An artificial neural network architecture belonging to the families of probabilistic graphical models and variational Bayesian methods.
VGG	Visual Geometry Group	Popular deep convolutional model designed for classification.
VPNN	Vector Product Neural Network
VQ-VAE	Vector Quantized Variational Autoencoders
VR	Virtual Reality
WER	Word Error Rate	Metric for measuring the performance of NLP solutions, e.g. automatic speech recognition (ASR).
WFST	Weighted finite-state transducer (WFST)
WMA	Weighted Majority Algorithm
WPE	Weighted Prediction Error
XAI	Explainable Artificial Intelligence	A set of processes and methods to make machine learning algorithms and their results more interpretable.
XGBoost	eXtreme Gradient Boosting
YOLO	You Only Look Once	Fast object detection algorithm.
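
Several of the evaluation metrics in the list (ACC, TPR, FNR, TNR, FPR, F1 Score, MAE, MSE, RMSE, IoU) are simple formulas, so a short sketch may help connect the one-line definitions. The function names below are illustrative only and do not come from any particular library; the code assumes binary classification counts with non-zero denominators and boxes given as `(x_min, y_min, x_max, y_max)`.

```python
import math


def confusion_rates(tp, fp, tn, fn):
    """Classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one actual positive,
    one actual negative, and one predicted positive).
    """
    precision = tp / (tp + fp)
    tpr = tp / (tp + fn)  # true positive rate (recall, sensitivity)
    return {
        "ACC": (tp + tn) / (tp + fp + tn + fn),
        "TPR": tpr,
        "FNR": fn / (tp + fn),  # equals 1 - TPR
        "TNR": tn / (tn + fp),
        "FPR": fp / (tn + fp),  # equals 1 - TNR
        "F1": 2 * precision * tpr / (precision + tpr),  # harmonic mean
    }


def regression_errors(actual, predicted):
    """MAE, MSE and RMSE for two equally long sequences of numbers."""
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse)}


def box_iou(a, b):
    """IoU of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

For example, `confusion_rates(tp=8, fp=2, tn=6, fn=4)` gives an accuracy of 0.7 and a false positive rate of 0.25, and `box_iou((0, 0, 2, 2), (1, 1, 3, 3))` gives 1/7, since the boxes overlap in a unit square and cover seven unit squares in total.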

AgaMiko/machine-learning-acronyms

Folders and files

Latest commit

History

Repository files navigation

Machine learning acronyms and abbreviations 🤖

Sources 📖

Contributing 📝

The list 📑

About

Resources

License

Stars

Watchers

Forks