Python Deep Outlier/Anomaly Detection (DeepOD)

DeepOD is an open-source Python library for deep learning-based outlier detection and anomaly detection. DeepOD supports tabular anomaly detection and time-series anomaly detection.

DeepOD includes 27 deep outlier detection / anomaly detection algorithms (in the unsupervised and weakly-supervised paradigms). More baseline algorithms will be included later.

DeepOD features:

  • Unified APIs across various algorithms.
  • SOTA models, including the latest reconstruction-based, representation-learning-based, and self-supervised deep learning methods.
  • A comprehensive testbed that can be used to directly test different models on benchmark datasets (highly recommended for academic research).
  • Versatility across data types, including tabular and time-series data (DeepOD will support other data types such as images, graphs, logs, and traces in the future; PRs are welcome 🔭).
  • Diverse network structures that can be plugged into detection models. We currently support LSTM, GRU, TCN, Conv, and Transformer for time-series data (PRs are welcome as well ✨).

If you are interested in our project, we would be pleased to have your stars and forks 👍 🍻.

Installation

The DeepOD framework can be installed via:

pip install deepod

Alternatively, install the development version (strongly recommended):

git clone https://github.com/xuhongzuo/DeepOD.git
cd DeepOD
pip install .

Usages

Directly use detection models in DeepOD:

DeepOD can be used in a few lines of code. The API style follows scikit-learn and PyOD.

For tabular anomaly detection:

# unsupervised methods
from deepod.models.tabular import DeepSVDD
clf = DeepSVDD()
clf.fit(X_train, y=None)
scores = clf.decision_function(X_test)

# weakly-supervised methods
from deepod.models.tabular import DevNet
clf = DevNet()
clf.fit(X_train, y=semi_y) # semi_y uses 1 for known anomalies, and 0 for unlabeled data
scores = clf.decision_function(X_test)

# evaluation of tabular anomaly detection
from deepod.metrics import tabular_metrics
auc, ap, f1 = tabular_metrics(y_test, scores)
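
The weakly-supervised example above expects `semi_y` to mark a few known anomalies with 1 and everything else with 0. As a minimal sketch of how such labels could be derived from fully labeled toy data, the helper below samples a subset of the true anomalies. `make_semi_labels` and its parameters are hypothetical names and not part of DeepOD; the result is a plain list, which you would likely convert to a NumPy array before passing it to `fit`.

```python
import random


def make_semi_labels(y_true, n_known, seed=0):
    """Return weak labels: 1 for a sampled subset of true anomalies, 0 elsewhere."""
    rng = random.Random(seed)
    # Indices of all true anomalies in the ground truth.
    anomaly_idx = [i for i, y in enumerate(y_true) if y == 1]
    # Keep at most n_known of them as "known" anomalies.
    known = set(rng.sample(anomaly_idx, min(n_known, len(anomaly_idx))))
    # All other samples (normal or anomalous) are treated as unlabeled (0).
    return [1 if i in known else 0 for i in range(len(y_true))]


y_train = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]  # toy ground truth, assumed for illustration
semi_y = make_semi_labels(y_train, n_known=2)
print(semi_y)
```

Unlabeled data may still contain anomalies, which is why only the sampled indices are set to 1.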

For time-series anomaly detection:

# time series anomaly detection methods
from deepod.models.time_series import TimesNet
clf = TimesNet()
clf.fit(X_train)
scores = clf.decision_function(X_test)

# evaluation of time series anomaly detection
from deepod.metrics import ts_metrics
from deepod.metrics import point_adjustment # applies point adjustment for time-series anomaly detection
eval_metrics = ts_metrics(labels, scores)
adj_eval_metrics = ts_metrics(labels, point_adjustment(labels, scores))
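
Point adjustment is commonly described as follows: if any point inside a ground-truth anomaly segment is detected, the whole segment counts as detected. A score-based variant replaces every score inside a labeled segment with that segment's maximum score. The sketch below is a pure-Python illustration of that idea under this assumption; it is not DeepOD's implementation, and `point_adjust_scores` is a hypothetical name.

```python
def point_adjust_scores(labels, scores):
    """Raise all scores within each labeled anomaly segment to the segment maximum."""
    adjusted = list(scores)
    n = len(labels)
    i = 0
    while i < n:
        if labels[i] == 1:
            # Find the end (exclusive) of this contiguous anomaly segment.
            j = i
            while j < n and labels[j] == 1:
                j += 1
            seg_max = max(scores[i:j])
            for k in range(i, j):
                adjusted[k] = seg_max
            i = j
        else:
            # Scores outside anomaly segments are left untouched.
            i += 1
    return adjusted


labels = [0, 1, 1, 1, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.9, 0.3, 0.4, 0.1, 0.5, 0.6]
adjusted = point_adjust_scores(labels, scores)
print(adjusted)
```

Adjusted metrics are usually higher than raw ones, so it helps to report both, as the snippet above does.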

Testbed usage:

The testbed covers the whole process of testing an anomaly detection model, including data loading, preprocessing, anomaly detection, and evaluation.

Please refer to the testbed/ directory:

  • testbed/testbed_unsupervised_ad.py is for testing unsupervised tabular anomaly detection models.
  • testbed/testbed_unsupervised_tsad.py is for testing unsupervised time-series anomaly detection models.

Key arguments:

  • --input_dir: name of the folder that contains the datasets (.csv, .npy)
  • --dataset: "FULL" tests all files in the folder; otherwise, a comma-separated list of dataset names (e.g., "10_cover*,20_letter*")
  • --model: name of the anomaly detection model
  • --runs: number of times to run the detection model; the average performance and its standard deviation are reported
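
To make the arguments above concrete, here is a minimal sketch of how such a command line could be parsed and how the `--dataset` value could be expanded into file names. This is an illustration only, not the testbed's actual code: the default values, the use of `fnmatch` for the wildcard patterns, and the helper names `build_parser` and `select_datasets` are all assumptions.

```python
import argparse
from fnmatch import fnmatch


def build_parser():
    """Build a parser with the four key arguments (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Sketch of a testbed command line")
    parser.add_argument("--input_dir", type=str, default="ADBench")
    parser.add_argument("--dataset", type=str, default="FULL")
    parser.add_argument("--model", type=str, default="DeepSVDD")
    parser.add_argument("--runs", type=int, default=5)
    return parser


def select_datasets(available, spec):
    """Return all files for "FULL", else files matching any comma-separated pattern."""
    if spec == "FULL":
        return list(available)
    patterns = [p.strip() for p in spec.split(",") if p.strip()]
    return [name for name in available if any(fnmatch(name, p) for p in patterns)]


args = build_parser().parse_args(
    ["--model", "DeepIsolationForest", "--runs", "5", "--dataset", "10_cover*,20_letter*"]
)
files = ["10_cover.csv", "20_letter.csv", "2_annthyroid.csv"]  # toy file names
selected = select_datasets(files, args.dataset)
print(args.model, args.runs, selected)
```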

Example:

  1. Download the ADBench datasets.
  2. Set the dataset_root variable to the directory that contains the datasets.
  3. input_dir is the name of a sub-folder of dataset_root, e.g., Classical or NLP_by_BERT.
  4. Run the following commands in bash:
cd DeepOD
pip install .
cd testbed
python testbed_unsupervised_ad.py --model DeepIsolationForest --runs 5 --input_dir ADBench
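
The `--runs 5` setting in the command above means the model is trained and evaluated several times and the results are summarized. The following is a minimal sketch of that aggregation step, assuming each run returns a single metric value. `repeat_runs` and `fake_run` are hypothetical names, `fake_run` is a stand-in for actual training and evaluation, and the population standard deviation used here may differ from what the testbed reports.

```python
import random
import statistics


def repeat_runs(run_once, runs):
    """Call run_once(seed) for each run and return (mean, population std)."""
    values = [run_once(seed) for seed in range(runs)]
    return statistics.mean(values), statistics.pstdev(values)


def fake_run(seed):
    # Stand-in for one train-and-evaluate cycle; returns a made-up AUC-like value.
    return 0.90 + random.Random(seed).uniform(-0.02, 0.02)


mean_auc, std_auc = repeat_runs(fake_run, runs=5)
print(f"AUC-ROC: {mean_auc:.4f} +/- {std_auc:.4f}")
```

Using a different seed per run keeps the runs independent while leaving the experiment reproducible.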

Implemented Models

Tabular Anomaly Detection models:

| Model | Venue | Year | Type | Title |
|---|---|---|---|---|
| Deep SVDD | ICML | 2018 | unsupervised | Deep One-Class Classification [1] |
| REPEN | KDD | 2018 | unsupervised | Learning Representations of Ultrahigh-dimensional Data for Random Distance-based Outlier Detection [2] |
| RDP | IJCAI | 2020 | unsupervised | Unsupervised Representation Learning by Predicting Random Distances [3] |
| RCA | IJCAI | 2021 | unsupervised | RCA: A Deep Collaborative Autoencoder Approach for Anomaly Detection [4] |
| GOAD | ICLR | 2020 | unsupervised | Classification-Based Anomaly Detection for General Data [5] |
| NeuTraL | ICML | 2021 | unsupervised | Neural Transformation Learning for Deep Anomaly Detection Beyond Images [6] |
| ICL | ICLR | 2022 | unsupervised | Anomaly Detection for Tabular Data with Internal Contrastive Learning [7] |
| DIF | TKDE | 2023 | unsupervised | Deep Isolation Forest for Anomaly Detection [8] |
| SLAD | ICML | 2023 | unsupervised | Fascinating Supervisory Signals and Where to Find Them: Deep Anomaly Detection with Scale Learning [9] |
| DevNet | KDD | 2019 | weakly-supervised | Deep Anomaly Detection with Deviation Networks [10] |
| PReNet | KDD | 2023 | weakly-supervised | Deep Weakly-supervised Anomaly Detection [11] |
| Deep SAD | ICLR | 2020 | weakly-supervised | Deep Semi-Supervised Anomaly Detection [12] |
| FeaWAD | TNNLS | 2021 | weakly-supervised | Feature Encoding with AutoEncoders for Weakly-supervised Anomaly Detection [13] |
| RoSAS | IP&M | 2023 | weakly-supervised | RoSAS: Deep semi-supervised anomaly detection with contamination-resilient continuous supervision [14] |

Time-series Anomaly Detection models:

| Model | Venue | Year | Type | Title |
|---|---|---|---|---|
| DCdetector | KDD | 2023 | unsupervised | DCdetector: Dual Attention Contrastive Representation Learning for Time Series Anomaly Detection [15] |
| TimesNet | ICLR | 2023 | unsupervised | TIMESNET: Temporal 2D-Variation Modeling for General Time Series Analysis [16] |
| AnomalyTransformer | ICLR | 2022 | unsupervised | Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy [17] |
| NCAD | IJCAI | 2022 | unsupervised | Neural Contextual Anomaly Detection for Time Series [18] |
| TranAD | VLDB | 2022 | unsupervised | TranAD: Deep Transformer Networks for Anomaly Detection in Multivariate Time Series Data [19] |
| COUTA | arXiv | 2022 | unsupervised | Calibrated One-class Classification for Unsupervised Time Series Anomaly Detection [20] |
| USAD | KDD | 2020 | unsupervised | USAD: UnSupervised Anomaly Detection on Multivariate Time Series |
| DIF | TKDE | 2023 | unsupervised | Deep Isolation Forest for Anomaly Detection [21] |
| TcnED | TNNLS | 2021 | unsupervised | An Evaluation of Anomaly Detection and Diagnosis in Multivariate Time Series [22] |
| Deep SVDD (TS) | ICML | 2018 | unsupervised | Deep One-Class Classification [23] |
| DevNet (TS) | KDD | 2019 | weakly-supervised | Deep Anomaly Detection with Deviation Networks [24] |
| PReNet (TS) | KDD | 2023 | weakly-supervised | Deep Weakly-supervised Anomaly Detection [25] |
| Deep SAD (TS) | ICLR | 2020 | weakly-supervised | Deep Semi-Supervised Anomaly Detection [26] |

NOTE:

  • For Deep SVDD, DevNet, PReNet, and Deep SAD, we employ network structures that can handle time-series data. The classes of these models have a parameter named network; change it to use a different network.
  • We currently support 'TCN', 'GRU', 'LSTM', 'Transformer', 'ConvSeq', and 'DilatedConv'.

Citation

If you use this library in your work, please cite this paper:

Hongzuo Xu, Guansong Pang, Yijie Wang and Yongjun Wang, "Deep Isolation Forest for Anomaly Detection," in IEEE Transactions on Knowledge and Data Engineering, doi: 10.1109/TKDE.2023.3270293.

You can also use the BibTeX entry below for citation.

@ARTICLE{xu2023deep,
   author={Xu, Hongzuo and Pang, Guansong and Wang, Yijie and Wang, Yongjun},
   journal={IEEE Transactions on Knowledge and Data Engineering}, 
   title={Deep Isolation Forest for Anomaly Detection}, 
   year={2023},
   volume={},
   number={},
   pages={1-14},
   doi={10.1109/TKDE.2023.3270293}
}

Reference


  1. Ruff, Lukas, et al. "Deep one-class classification". ICML. 2018.
  2. Pang, Guansong, et al. "Learning representations of ultrahigh-dimensional data for random distance-based outlier detection". KDD (pp. 2041-2050). 2018.
  3. Wang, Hu, et al. "Unsupervised Representation Learning by Predicting Random Distances". IJCAI (pp. 2950-2956). 2020.
  4. Liu, Boyang, et al. "RCA: A Deep Collaborative Autoencoder Approach for Anomaly Detection". IJCAI (pp. 1505-1511). 2021.
  5. Bergman, Liron, and Yedid Hoshen. "Classification-Based Anomaly Detection for General Data". ICLR. 2020.
  6. Qiu, Chen, et al. "Neural Transformation Learning for Deep Anomaly Detection Beyond Images". ICML. 2021.
  7. Shenkar, Tom, et al. "Anomaly Detection for Tabular Data with Internal Contrastive Learning". ICLR. 2022.
  8. Xu, Hongzuo, et al. "Deep Isolation Forest for Anomaly Detection". TKDE. 2023.
  9. Xu, Hongzuo, et al. "Fascinating supervisory signals and where to find them: deep anomaly detection with scale learning". ICML. 2023.
  10. Pang, Guansong, et al. "Deep Anomaly Detection with Deviation Networks". KDD. 2019.
  11. Pang, Guansong, et al. "Deep Weakly-supervised Anomaly Detection". KDD. 2023.
  12. Ruff, Lukas, et al. "Deep Semi-Supervised Anomaly Detection". ICLR. 2020.
  13. Zhou, Yingjie, et al. "Feature Encoding with AutoEncoders for Weakly-supervised Anomaly Detection". TNNLS. 2021.
  14. Xu, Hongzuo, et al. "RoSAS: Deep semi-supervised anomaly detection with contamination-resilient continuous supervision". IP&M. 2023.
  15. Yang, Yiyuan, et al. "DCdetector: Dual Attention Contrastive Representation Learning for Time Series Anomaly Detection". KDD. 2023.
  16. Wu, Haixu, et al. "TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis". ICLR. 2023.
  17. Xu, Jiehui, et al. "Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy". ICLR. 2022.
  18. Carmona, Chris U., et al. "Neural Contextual Anomaly Detection for Time Series". IJCAI. 2022.
  19. Tuli, Shreshth, et al. "TranAD: Deep Transformer Networks for Anomaly Detection in Multivariate Time Series Data". VLDB. 2022.
  20. Xu, Hongzuo, et al. "Calibrated One-class Classification for Unsupervised Time Series Anomaly Detection". arXiv:2207.12201. 2022.
  21. Xu, Hongzuo, et al. "Deep Isolation Forest for Anomaly Detection". TKDE. 2023.
  22. Garg, Astha, et al. "An Evaluation of Anomaly Detection and Diagnosis in Multivariate Time Series". TNNLS. 2021.
  23. Ruff, Lukas, et al. "Deep one-class classification". ICML. 2018.
  24. Pang, Guansong, et al. "Deep Anomaly Detection with Deviation Networks". KDD. 2019.
  25. Pang, Guansong, et al. "Deep Weakly-supervised Anomaly Detection". KDD. 2023.
  26. Ruff, Lukas, et al. "Deep Semi-Supervised Anomaly Detection". ICLR. 2020.