
ADGym 🏃‍♂️: Design Choices for Deep Anomaly Detection

ADGym performs a large-scale benchmark and automatic selection of AD design choices. The AD design dimensions and choices (listed in the table below) are obtained by decoupling the standard AD research pipeline:

Data Augmentation → Data Preprocessing → Network Construction → Network Training

Currently, ADGym is mainly designed for tabular data.

Each part of the pipeline can be instantiated by multiple components (core components are marked in bold):

| Pipeline | Detailed Components | Value |
|----------|---------------------|-------|
| Data Augmentation | – | [Oversampling, SMOTE, Mixup, GAN] |
| Data Preprocessing | – | [MinMax, Normalization] |
| Network Construction | Network Architecture | [MLP, AutoEncoder, ResNet, FTTransformer] |
| | Hidden Layers | [[20], [100, 20], [100, 50, 20]] |
| | Activation | [Tanh, ReLU, LeakyReLU] |
| | Dropout | [0.0, 0.1, 0.3] |
| | Initialization | [PyTorch default, Xavier, Kaiming] |
| Network Training | Loss Function | [BCE, Focal, Minus, Inverse, Hinge, Deviation, Ordinal] |
| | Optimizer | [SGD, Adam, RMSprop] |
| | Batch Resampling | [False, True] |
| | Epochs | [20, 50, 100] |
| | Batch Size | [16, 64, 256] |
| | Learning Rate | [1e-2, 1e-3] |
| | Weight Decay | [1e-2, 1e-4] |
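Concretely, one point in this design space corresponds to picking a single value per row of the table above. The sketch below (dictionary keys are illustrative only and do not match ADGym's internal configuration fields) shows how large the full grid is and why it has to be subsampled, e.g. via a bounded `grid_size`:

```python
# Illustrative sketch of the design-choice grid (hypothetical key names,
# not ADGym's actual configuration fields).
import random

design_space = {
    'data_augmentation': ['Oversampling', 'SMOTE', 'Mixup', 'GAN'],
    'data_preprocessing': ['MinMax', 'Normalization'],
    'network_architecture': ['MLP', 'AutoEncoder', 'ResNet', 'FTTransformer'],
    'hidden_layers': [[20], [100, 20], [100, 50, 20]],
    'activation': ['Tanh', 'ReLU', 'LeakyReLU'],
    'dropout': [0.0, 0.1, 0.3],
    'initialization': ['PyTorch default', 'Xavier', 'Kaiming'],
    'loss_function': ['BCE', 'Focal', 'Minus', 'Inverse', 'Hinge', 'Deviation', 'Ordinal'],
    'optimizer': ['SGD', 'Adam', 'RMSprop'],
    'batch_resampling': [False, True],
    'epochs': [20, 50, 100],
    'batch_size': [16, 64, 256],
    'learning_rate': [1e-2, 1e-3],
    'weight_decay': [1e-2, 1e-4],
}

# Size of the full grid: the product of the number of options per dimension.
total = 1
for values in design_space.values():
    total *= len(values)
print(f'{total} combinations in the full grid')

# Draw one random combination; repeating this is roughly what sampling a
# bounded number of combinations (e.g. grid_size=1000) amounts to.
random.seed(0)
one_choice = {name: random.choice(values) for name, values in design_space.items()}
print(one_choice)
```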

Quick Start with ADGym!

  • For the experimental results of all the components, open test_ADGym.py and run:
```python
adgym = ADGym(la=5, grid_mode='small', grid_size=1000, suffix='test')
adgym.run()
```
  • For the experimental results of all the current SOTA semi-supervised or supervised models, open test_SOTA.py and run:
```python
pipeline = RunPipeline(suffix='SOTA', parallel='semi-supervise', mode='nla')
pipeline.run()

pipeline = RunPipeline(suffix='SOTA', parallel='supervise', mode='nla')
pipeline.run()
```
  • For the experimental results of the meta classifier (and its counterpart baseline), open meta.py and run (a conceptual sketch of the two-stage idea follows this list):
```python
# two-stage meta classifier, using the meta-feature extractor in MetaOD
run(suffix='', grid_mode='small', grid_size=1000, mode='two-stage')

# end-to-end meta classifier
run(suffix='', grid_mode='small', grid_size=1000, mode='end-to-end')
```
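For intuition only, the two-stage idea can be sketched as a regressor that maps dataset meta-features plus an encoded design choice to an expected performance rank. The snippet below uses random placeholder data and a generic scikit-learn regressor; it is a minimal sketch of the concept, not ADGym's implementation:

```python
# Conceptual sketch of a two-stage meta predictor (placeholder data; not ADGym's code).
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n_pairs = 500                                      # (dataset, design choice) pairs seen during meta-training
X_meta = rng.normal(size=(n_pairs, 8))             # hypothetical dataset meta-features
X_choice = rng.integers(0, 3, size=(n_pairs, 5))   # hypothetical encoded design choices
y_rank = rng.random(n_pairs)                       # hypothetical performance rank ratios in [0, 1]

X = np.hstack([X_meta, X_choice.astype(float)])
meta_predictor = GradientBoostingRegressor().fit(X, y_rank)

# At test time: extract meta-features of a new dataset, score each candidate
# design choice, and keep the one with the best predicted rank
# (assuming here that a lower rank ratio means a better design choice).
candidates = np.hstack([np.tile(X_meta[:1], (10, 1)), X_choice[:10].astype(float)])
best = int(np.argmin(meta_predictor.predict(candidates)))
```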

Python Package Requirements

  • iteration_utilities==0.11.0
  • metaod==0.0.6
  • scikit-learn==0.24
  • imbalanced-learn==0.7.0
  • torch==1.9.0
  • tensorflow==2.8.0
  • tabgan==1.2.1
  • rtdl==0.0.13
  • protobuf==3.20.*
  • numpy==1.21.6

Update Logs

  • 2022.11.17: run the experiments of current component combinations
  • 2022.11.23: add the GAN-based data augmentation method
  • 2022.11.25: add the oversampling and SMOTE data augmentation methods
  • 2022.11.25: add the binary cross entropy loss and focal loss
  • 2023.01.04: add the Mixup data augmentation method
  • 2023.01.04: add different network initialization methods
  • 2023.01.04: add the ordinal loss in the PReNet model
  • 2023.01.04: use the number (instead of the ratio) of labeled anomalies
  • 2023.02.20: restart ADGym
  • 2023.02.20: add two baselines: random selection and model selection based on the partially labeled data
  • 2023.02.22: provide both two-stage and end-to-end versions of the meta predictor
  • 2023.02.23: improve the training efficiency of the meta classifier
  • 2023.02.28: support the GPU version of the meta predictors and fix some bugs
  • 2023.03.01: provide an ML-based meta predictor
  • 2023.03.01: use the performance rank ratio (instead of raw performance) as the training target
  • 2023.04.23: add a learning-to-rank + ensemble strategy
  • 2023.05.09: add two loss functions for the meta predictor
  • 2023.05.09: add an early-stopping mechanism for the meta predictor
  • 2023.05.09: add the CORAL method for transfer learning on meta-features
  • 2023.05.13: provide ensembled top-k components
  • 2023.05.22: fix the bug that modified the original data while running the experiments
  • 2023.05.22: optimize code efficiency
  • 2023.05.29: fix the bug in the REPEN model
  • 2023.06.01: fix the bug in the data preprocessing of the meta predictor (end-to-end mode)
  • 2023.06.01: replace LightGBM with XGBoost (which is faster) for the ML-based meta predictor