Code for the paper "Triggering Dark Showers with Conditional Dual Auto-Encoders" (https://arxiv.org/pdf/2306.12955)

Conditional Dual Auto-Encoders for Anomaly Detection in HEP

We use auto-encoders in an anomaly detection setting to search for SUEP (soft unclustered energy patterns) and SVJ (semi-visible jets) signals in a background of QCD events.

In our paper we propose a family of Conditional Dual Auto-Encoder (CoDAE) models that can learn multiple anomaly detection scores from raw images of particle collisions.

(CoDAE architecture figure: codae-arch)

  • There are two encoders: one with high capacity ($f_R$) to capture details in its large bottleneck, $Z$, and a smaller one ($f_m$) responsible for learning a discriminative 2-dimensional latent space, $Z_m$, that can be directly used for anomaly detection.
  • The encoder $f_m$ is learned by conditioning (operation denoted by blue circles and paths in the figure) $Z$ on $Z_m$.
  • Then, conditioning occurs multiple times at different resolutions of the decoder, $D$.
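
To make the dual-encoder layout above concrete, here is a minimal Keras sketch. The image shape, layer sizes, and the concatenation-based conditioning are illustrative assumptions only; the actual architecture is defined in ad/ and trained in codae.ipynb.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Illustrative sizes only: the image shape and latent dimensions below are
# assumptions, not the values used in the paper.
IMAGE_SHAPE = (64, 64, 1)
Z_DIM, ZM_DIM = 64, 2

def conv_block(x, filters):
    # Strided convolution that halves the spatial resolution.
    return layers.Conv2D(filters, 3, strides=2, padding='same', activation='relu')(x)

def condition(feature_map, z_m):
    # One plausible conditioning operation: broadcast Z_m over the spatial
    # dimensions and concatenate it channel-wise (the blue circles in the
    # figure may denote a different operation).
    h, w = feature_map.shape[1], feature_map.shape[2]
    tiled = layers.Reshape((1, 1, ZM_DIM))(z_m)
    tiled = layers.UpSampling2D(size=(h, w))(tiled)
    return layers.Concatenate()([feature_map, tiled])

inputs = layers.Input(shape=IMAGE_SHAPE)

# High-capacity encoder f_R -> large bottleneck Z.
h = conv_block(inputs, 32)
h = conv_block(h, 64)                                   # 16 x 16 feature map
z = layers.Dense(Z_DIM, name='z')(layers.Flatten()(h))

# Small encoder f_m -> 2-dimensional latent Z_m, usable directly as anomaly scores.
g = conv_block(inputs, 8)
g = conv_block(g, 16)
z_m = layers.Dense(ZM_DIM, name='z_m')(layers.Flatten()(g))

# Condition Z on Z_m before decoding.
z_cond = layers.Concatenate(name='z_conditioned')([z, z_m])

# Decoder D, conditioned again on Z_m at each resolution.
d = layers.Dense(16 * 16 * 32, activation='relu')(z_cond)
d = layers.Reshape((16, 16, 32))(d)
d = condition(d, z_m)
d = layers.Conv2DTranspose(32, 3, strides=2, padding='same', activation='relu')(d)  # 32 x 32
d = condition(d, z_m)
d = layers.Conv2DTranspose(16, 3, strides=2, padding='same', activation='relu')(d)  # 64 x 64
reconstruction = layers.Conv2D(1, 3, padding='same', activation='sigmoid',
                               name='reconstruction')(d)

codae = Model(inputs, [reconstruction, z, z_m], name='codae_sketch')
codae.summary()
```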

We can define multiple anomaly scores:

  • From the latent space, $Z$: like the KL-divergence w.r.t. a prior, $p(Z)$.
  • From the auxiliary bottleneck, $Z_m$, where each component can be considered a score.
  • Lastly, from the decoder's reconstructions: for example, the reconstruction error.
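
As an illustration of how such scores might be computed, the sketch below assumes a variational model that returns (reconstruction, z_mean, z_log_var, z_m); this interface is an assumption for the example, and the actual score definitions live in ad/ and the notebooks.

```python
import tensorflow as tf

def anomaly_scores(model, images):
    """Compute example anomaly scores for a batch of images.
    The (reconstruction, z_mean, z_log_var, z_m) output signature assumed
    here is illustrative, not the exact interface used in this repository."""
    reconstruction, z_mean, z_log_var, z_m = model(images, training=False)

    # 1) KL-divergence of the approximate posterior q(Z|x) w.r.t. a standard
    #    normal prior p(Z), summed over the latent dimensions.
    kl = -0.5 * tf.reduce_sum(
        1.0 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var), axis=-1)

    # 2) Each component of the 2-d auxiliary latent Z_m is itself a score.
    zm_scores = {f'z_m[{i}]': z_m[:, i].numpy() for i in range(z_m.shape[-1])}

    # 3) Per-image reconstruction error (mean squared error over pixels).
    mse = tf.reduce_mean(tf.square(images - reconstruction), axis=[1, 2, 3])

    return {'kl': kl.numpy(), 'mse': mse.numpy(), **zm_scores}
```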

Code description

The repository is organized as follows:

  • ad/ contains the source code that defines layers, models, metrics, etc.
  • weights/ contains pre-trained weights.
  • data/ contains an example of the data used in our experiments.
  • The notebooks codae.ipynb, categorical_codvae.ipynb, dirichlet_vae.ipynb, and qcd_or_what_model.ipynb show the training of the respective models (and the evaluation of only the last two).
  • supervised_cct.ipynb is used to train the supervised classifier.
  • n_tracks.ipynb provides a comparison of our models against both physics-based and supervised baselines.
  • tf-lite_convert.ipynb shows how to optimize (quantize) a CoDAE model and measure its inference time.
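
The conversion follows TensorFlow's standard post-training quantization workflow. The snippet below is a generic sketch of that workflow (with `codae` as a placeholder for a trained Keras model), not the notebook's exact code.

```python
import time
import numpy as np
import tensorflow as tf

# Convert a trained Keras model (here `codae`, as a placeholder) to a
# quantized TF-Lite model with default post-training optimizations.
converter = tf.lite.TFLiteConverter.from_keras_model(codae)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# Measure single-image inference time with the TF-Lite interpreter.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]

image = np.random.rand(*input_details['shape']).astype(np.float32)  # dummy input

start = time.perf_counter()
for _ in range(100):
    interpreter.set_tensor(input_details['index'], image)
    interpreter.invoke()
elapsed_ms = (time.perf_counter() - start) / 100 * 1e3
print(f'average inference time: {elapsed_ms:.2f} ms')
```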

Usage

Installation with a virtual environment (alternatively, open the notebooks in, e.g., Google Colab):

  1. Clone the repository: git clone https://github.com/Luca96/dark-autoencoders.git.
  2. Change directory: cd dark-autoencoders.
  3. Create the virtual environment (named "venv"): python -m venv venv.
  4. Activate it: venv\Scripts\activate (Windows) or source venv/bin/activate (UNIX).
  5. Install dependencies: pip install -r requirements.txt.
  6. (optional) Install Jupyter notebook (or lab): pip install notebook or pip install jupyterlab.

Citation

Please consider citing our paper if you use any of the provided code or approach in your own research or project.

@article{anzalone2023triggering,
  title={Triggering Dark Showers with Conditional Dual Auto-Encoders},
  author={Anzalone, Luca and Chhibra, Simranjit Singh and Maier, Benedikt and Chernyavskaya, Nadezda and Pierini, Maurizio},
  journal={arXiv preprint arXiv:2306.12955},
  year={2023}
}