Detect Anomalies in Text Data Using Variational Autoencoder (VAE) in MATLAB®

This example shows how to detect out-of-distribution text data using a variational autoencoder (VAE).

Overview

VAEs are a neural network architecture composed of two parts:

An encoder that maps the input data to a lower-dimensional latent space.
A decoder that reconstructs the input data by mapping the lower-dimensional representation back into the original space.

You can use a VAE to detect anomalies in your dataset. To do this, train a VAE on your data. Then, encode and decode a test data point. Compare the output of the decoder with the input data. If the input and output are similar, then the data is in-distribution. If the input and output are dissimilar, then the data is out-of-distribution, or anomalous.
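The comparison described above can be sketched as a reconstruction-error check. This is a minimal illustration, not the code in the live script: the matrices are made-up placeholder values, mean squared error is an assumed error measure, and the threshold of 0.1 is hypothetical. In practice you would choose the threshold from the errors observed on the training data.

```matlab
% Each column of X is one input; XRecon holds the matching decoder outputs.
% These values are illustrative placeholders, not results from this example.
X      = [1 0 1; 0 1 1; 1 1 0];
XRecon = [0.9 0.1 0.2; 0.1 0.9 0.1; 0.8 0.9 0.9];

% Assumed error measure: mean squared error per data point (per column).
reconError = mean((X - XRecon).^2, 1);

% Hypothetical threshold; errors above it are treated as anomalous.
threshold = 0.1;
isAnomaly = reconError > threshold;

disp(reconError)
disp(isAnomaly)
```

Here the first two columns are reconstructed closely and fall below the threshold, while the third is reconstructed poorly and is flagged as out-of-distribution.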

This example includes three steps.

Load and preprocess the text data.
Set up and train the encoder and decoder networks.
Use the VAE to detect anomalies in test data.

Setup

Clone the repository into a local directory. If you would like to use this repository with MATLAB Online, click the corresponding link on the repository page.

The main live script is AnomalyDetectionwithTextusingVAE.mlx. The other .m files are supporting functions for sampling from the latent space, projecting and reshaping the sampled values, and initializing the project-and-reshape layer. To run the demo, either open the .mlx file directly or open the .prj file, which automatically opens the .mlx file.
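For readers unfamiliar with latent-space sampling, the following is a minimal sketch of the standard reparameterization trick that VAE sampling layers commonly use. It is not taken from samplingLayer.m, which may differ. The variable names and sizes (mu, logSigmaSq, numLatent, batchSize) are assumptions chosen for illustration.

```matlab
rng(0);                      % fix the random seed so the run is repeatable

numLatent = 4;               % assumed latent dimension
batchSize = 3;               % assumed number of observations

% Assumed encoder outputs: a mean and a log-variance per latent dimension.
mu         = ones(numLatent, batchSize);
logSigmaSq = zeros(numLatent, batchSize);   % log-variance 0 means sigma = 1

% Reparameterization: draw noise from N(0,1), then scale and shift it.
epsilon = randn(numLatent, batchSize);
sigma   = exp(0.5 * logSigmaSq);
z       = mu + sigma .* epsilon;

disp(size(z))
```

Sampling the noise separately from mu and sigma keeps the path from the encoder outputs to z differentiable, which is what allows the encoder to be trained with gradient-based methods.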

Before running the file, get the data using the following steps:

Go to https://www.mathworks.com/help/textanalytics/ug/create-simple-text-model-for-classification.html.
Click the "Copy Command" button at the top right of the page and paste the command into the MATLAB Command Window. This opens the example in the directory where the .csv file is stored.
Copy the .csv file from the example directory and paste it into the cloned repository.
If you save the file in a different location, update the code in the .mlx file that points to it.
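Once the file is in place, a quick check such as the following can confirm that MATLAB finds it and can tokenize the text. This is a sketch under assumptions, not the code in the live script: it assumes the copied file is named factoryReports.csv and that the text is in a column named Description. Adjust both to match your copy of the data.

```matlab
% Assumption: the copied file is named factoryReports.csv and is in the
% current folder. Change the path if you saved it elsewhere.
filename  = "factoryReports.csv";
fileFound = isfile(filename);

if fileFound
    % Read text columns as string arrays.
    data = readtable(filename, "TextType", "string");

    % Assumption: the report text is stored in a column named Description.
    documents = tokenizedDocument(lower(data.Description));
    documents = erasePunctuation(documents);

    % Show the first few tokenized documents.
    disp(documents(1:min(3, numel(documents))))
else
    warning("Could not find %s. Copy the .csv file into this folder or update filename.", filename);
end
```

The tokenizedDocument and erasePunctuation functions require Text Analytics Toolbox.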

Required Products

MATLAB (R2023a or later)
Text Analytics Toolbox™ (R2023a or later)
Deep Learning Toolbox™ (R2023a or later)

Contact

Sohini Sarkar, ssarkar@mathworks.com

License

The license is available in the license.txt file in this GitHub repository.

Community Support

MATLAB Central
