
MarekWadinger/adaptive-interpretable-ad


AID: Adaptable and Interpretable Framework for Anomaly Detection


Online outlier detection service for industrial SCADA-based infrastructures, designed for low-latency detection and change-point adaptation. The service provides dynamic operating limits that account for changing environmental conditions and sensor aging. The implementation builds on the open-source libraries river, streamz, and human_security, among others. Make sure to check out their great work!

Highlights:

  • Interpretable anomaly detector with self-supervised adaptation
  • Demonstrates interpretability by providing dynamic operating limits
  • Leverages self-learning approach on streamed IoT data
  • Utilizes existing SCADA-based industrial infrastructure
  • Offers faster response time to incidents due to root cause isolation
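To make the idea of dynamic operating limits concrete, here is a minimal sketch of a streaming detector that keeps a running mean and variance and reports a band around the mean. It is an illustration only and not the AID algorithm itself: the class name RunningLimits, the use of Welford's update, and the band width k=3 are all assumptions made for this example.

```python
import math


class RunningLimits:
    """Running mean/variance (Welford's method) with mean +/- k*sigma limits.

    Hypothetical helper for illustration; not part of this repository.
    """

    def __init__(self, k: float = 3.0) -> None:
        self.k = k        # band width in standard deviations (assumed value)
        self.n = 0        # number of samples seen so far
        self.mean = 0.0
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        """Fold one new sample into the running statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def limits(self) -> tuple:
        """Return (level_low, level_high) for the data seen so far."""
        # Sample standard deviation; zero until at least two samples exist.
        std = math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0
        return self.mean - self.k * std, self.mean + self.k * std

    def is_anomaly(self, x: float) -> bool:
        """Flag a value that falls outside the current limits."""
        low, high = self.limits()
        return not (low <= x <= high)


detector = RunningLimits(k=3.0)
for reading in [20.0, 21.0, 19.0, 20.0, 20.0]:
    detector.update(reading)

low, high = detector.limits()
print(f"limits: [{low:.2f}, {high:.2f}]")
print("35.0 anomalous?", detector.is_anomaly(35.0))
```

Because the limits are recomputed from the stream, they move as the signal drifts, and the same two numbers that drive the decision can be shown to an operator. That is the sense in which such limits are interpretable.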


⚡️ Quickstart

Get your hands on the algorithm using the following Jupyter notebooks and play around with open-source example data:

  1. Case Study 0: Outlier Detection on Inverter Temperature
  2. Case Study 1: Anomaly Detection on BESS Temperature
  3. Case Study 2: Anomaly Detection on Battery Module Temperature
  4. Comparison Study: One-Class SVM and HalfSpace Trees on SCAB Dataset

🏃 Run the services

Our framework is ready to face your challenges with a diverse set of supported publish-subscribe services.

NOTE: Messaging can be signed and encrypted for most of the services. If you find any related bugs, feel free to open an issue.

Example Service Usage: MQTT

We demonstrate the usage of the service using the MQTT protocol. The service is based on the paho-mqtt library. The data source is a real coffee machine streaming data to an MQTT broker.

To start the service, run the following command in your terminal:

python rpc_client.py -t "shellies/Shelly3EM-Main-Switchboard-C/emeter/0/power"

Note: You can modify the source data stream using the following options:

  • [-f | --config-file] path to config.ini (NOTE: the first valid key-value pair is used)
  • [-t | --topic] topic to subscribe to, or column in the CSV file
  • [-k | --key-path] path to the ssh keys of the sender and receiver (NOTE: if empty, the keys are created)

To start the consumer, run the following command:

python consumer.py -t "shellies/Shelly3EM-Main-Switchboard-C/emeter/0/dynamic_limits"

Note: You can modify the consumed data stream using the following options:

  • [-f | --config-file] path to config.ini (NOTE: the first valid key-value pair is used)
  • [-t | --topic] MQTT topic or column of the pd.DataFrame
  • [-k | --key-path] path to the ssh keys of the sender and receiver (NOTE: if empty, the keys are created)

The query service responds with printed messages such as:

Received message: {"time": "1970-01-01 03:17:11", "anomaly": "0", "level_high":"658.396223558289", "level_low": "635.8731097750442"}
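Every field in the message above arrives as a JSON string, so a consumer has to convert the limits to numbers before comparing against them. The following is a minimal sketch of that step. The helper within_limits is hypothetical, and reading "anomaly": "0" as "no anomaly" is an assumption based on the example.

```python
import json

# Payload copied from the example message above.
raw = ('{"time": "1970-01-01 03:17:11", "anomaly": "0", '
       '"level_high":"658.396223558289", "level_low": "635.8731097750442"}')

message = json.loads(raw)

# All values are strings in the payload, so convert before comparing.
level_low = float(message["level_low"])
level_high = float(message["level_high"])

# Assumption: "0" means the sample was not flagged as anomalous.
is_anomaly = message["anomaly"] != "0"


def within_limits(value: float) -> bool:
    """Return True when value lies inside the reported operating limits."""
    return level_low <= value <= level_high


print("flagged:", is_anomaly)
print("640.0 within limits:", within_limits(640.0))
print("700.0 within limits:", within_limits(700.0))
```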

Example Service Usage: Streamed DataFrame

If you want to stream the example dataset, use

python rpc_client.py -t "Average Cell Temperature"

where your config.ini contains

[file]
path=examples/data/input/average_temperature.csv
output=examples/data/output/dynamic_limits.json
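To check that a config.ini like the one above is well-formed before starting the service, you can read it with Python's standard configparser. This is a sketch for inspection only. How rpc_client.py itself loads the file is not shown here, so do not take this as the service's own loading code.

```python
import configparser

# Same content as the [file] section shown above.
CONFIG_TEXT = """
[file]
path=examples/data/input/average_temperature.csv
output=examples/data/output/dynamic_limits.json
"""

config = configparser.ConfigParser()
# For a file on disk, use config.read("config.ini") instead.
config.read_string(CONFIG_TEXT)

input_path = config["file"]["path"]
output_path = config["file"]["output"]

print("input :", input_path)
print("output:", output_path)
```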

Now, let's query the latest limits from examples/data/output/dynamic_limits.json

python consumer.py -t "Average Cell Temperature"

The response is the latest entry in dynamic_limits.json

{
    "time": datetime.datetime(1970, 1, 1, 14, 52, 42),
    "anomaly": 0,
    "level_high": 1180.92,
    "level_low": 1151.15,
}

Note: You can pass an additional option to retrieve thresholds at any date:

  • [-d | --date] date as 'Y-m-d H:M:S'
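The 'Y-m-d H:M:S' format corresponds to the strptime pattern "%Y-%m-%d %H:%M:%S". The sketch below shows one way a date lookup could work: parse the query date and return the most recent record at or before it. The record layout, the first sample record, and the helper limits_at are hypothetical. The actual structure of dynamic_limits.json and the consumer's lookup rule may differ.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"

# Hypothetical records; the real layout of dynamic_limits.json may differ.
records = [
    {"time": "1970-01-01 14:52:40", "anomaly": 0,
     "level_high": 1180.10, "level_low": 1150.90},
    {"time": "1970-01-01 14:52:42", "anomaly": 0,
     "level_high": 1180.92, "level_low": 1151.15},
]


def parse_time(record):
    """Parse the record's timestamp into a datetime object."""
    return datetime.strptime(record["time"], DATE_FORMAT)


def limits_at(date_string, records):
    """Return the latest record at or before date_string, or None."""
    query = datetime.strptime(date_string, DATE_FORMAT)
    candidates = [r for r in records if parse_time(r) <= query]
    if not candidates:
        return None
    return max(candidates, key=parse_time)


print(limits_at("1970-01-01 14:52:41", records))
print(limits_at("1970-01-01 00:00:00", records))
```

A date that does not match the format raises ValueError from strptime, which is a convenient place to report a malformed -d argument.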

🛠 Installation

python -m venv .env
source .env/bin/activate
pip install -r requirements.txt

👐 Contributing

Feel free to contribute in any way you like; we're always open to new ideas and approaches.

  • Feel welcome to open an issue if you think you've spotted a bug or a performance issue.

💬 Citation

If the service or the algorithm has been useful to you and you would like to cite it in a scientific publication, please refer to the paper published in Expert Systems with Applications:

@article{WADINGER2024123200,
  title    = {Adaptable and Interpretable Framework for Anomaly Detection in SCADA-based industrial systems},
  journal  = {Expert Systems with Applications},
  pages    = {123200},
  year     = {2024},
  issn     = {0957-4174},
  doi      = {https://doi.org/10.1016/j.eswa.2024.123200},
  url      = {https://www.sciencedirect.com/science/article/pii/S0957417424000654},
  author   = {Marek Wadinger and Michal Kvasnica},
  keywords = {Anomaly detection, Root cause isolation, Iterative learning, Statistical learning, Self-supervised learning},
}