Stream-AD/MStream

MSᴛʀᴇᴀᴍ

Implementation of MSᴛʀᴇᴀᴍ: Fast Anomaly Detection in Multi-Aspect Streams. The Web Conference (WWW), 2021.

MSᴛʀᴇᴀᴍ detects group anomalies in a multi-aspect data stream using constant time and memory, and outputs an anomaly score for each record. It builds on MIDAS to work in multi-aspect settings such as event-log data and multi-attributed graphs.

Demo

  1. Run bash run.sh KDD to compile the code and run it on the KDD dataset.
  2. Run bash run.sh DOS to compile the code and run it on the DOS dataset.
  3. Run bash run.sh UNSW to compile the code and run it on the UNSW dataset.

MSᴛʀᴇᴀᴍ

  1. Change directory to the MSᴛʀᴇᴀᴍ folder: cd mstream
  2. Run make to compile the code and create the binary.
  3. Run ./mstream -n numericalfile -c categoricalfile -t timefile
  4. Run make clean to remove the binaries.

Command line options

  • -h --help: produce help message
  • -n --numerical: Numerical file name
  • -c --categorical: Categorical file name
  • -t --time: Timestamps file name
  • -o --output: Output file name (default: scores.txt)  
  • -r --rows: Number of Hash Functions (default: 2)  
  • -b --buckets: Number of Buckets (default: 1024)
  • -a --alpha: Temporal Decay Factor (default: 0.6)
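
The options above can be assembled programmatically. The following is a minimal Python sketch, not part of the repository: the helper name build_mstream_command and the input file names are hypothetical, and the defaults simply mirror the values listed above.

```python
import shlex


def build_mstream_command(numerical, categorical, time,
                          output="scores.txt", rows=2, buckets=1024,
                          alpha=0.6, binary="./mstream"):
    """Return the argument list for one MStream run.

    Hypothetical helper: only the flags documented in the option list
    above are used, with the same defaults.
    """
    return [
        binary,
        "-n", numerical,     # numerical features file
        "-c", categorical,   # categorical features file
        "-t", time,          # timestamps file
        "-o", output,        # where the anomaly scores are written
        "-r", str(rows),     # number of hash functions
        "-b", str(buckets),  # number of buckets
        "-a", str(alpha),    # temporal decay factor
    ]


# Hypothetical input file names, with a non-default bucket count.
cmd = build_mstream_command("numeric.txt", "categ.txt", "time.txt", buckets=2048)
print(shlex.join(cmd))
# ./mstream -n numeric.txt -c categ.txt -t time.txt -o scores.txt -r 2 -b 2048 -a 0.6
```

Once the binary has been built with make, the returned list can be passed to subprocess.run(cmd, check=True) from inside the mstream folder.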

Input file format for MSᴛʀᴇᴀᴍ

MSᴛʀᴇᴀᴍ expects the input multi-aspect record stream to be stored in three files:

  1. Numerical file: contains the comma-separated numerical features.
  2. Categorical file: contains the comma-separated categorical features.
  3. Time file: contains the timestamps.

The numerical and categorical files hold the corresponding features of each multi-aspect record. Records must be sorted in non-decreasing order of their timestamps, and the column delimiter must be a comma (,).
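
As an illustration of this layout, the Python sketch below splits a list of records into the three files. It is not part of the repository and makes several assumptions: one record per line, the same line order in all three files, and integer-coded categorical values. The function name, file names, and sample records are all hypothetical.

```python
import os
import tempfile


def write_mstream_inputs(records, numerical_path, categorical_path, time_path):
    """Write (timestamp, numerical, categorical) records to three files.

    Assumes one record per line in each file. Records are sorted by
    timestamp first; the sort is stable, so ties keep their input order.
    """
    ordered = sorted(records, key=lambda record: record[0])
    with open(numerical_path, "w") as num_file, \
         open(categorical_path, "w") as cat_file, \
         open(time_path, "w") as time_file:
        for timestamp, numerical, categorical in ordered:
            num_file.write(",".join(str(v) for v in numerical) + "\n")
            cat_file.write(",".join(str(v) for v in categorical) + "\n")
            time_file.write(f"{timestamp}\n")
    return len(ordered)


# Hypothetical records, deliberately out of time order.
records = [
    (2, [0.5, 10.0], [3, 1]),
    (1, [1.5, 20.0], [7, 1]),
    (2, [0.1, 30.0], [3, 4]),
]

with tempfile.TemporaryDirectory() as workdir:
    paths = [os.path.join(workdir, name)
             for name in ("numeric.txt", "categ.txt", "time.txt")]
    write_mstream_inputs(records, *paths)
    with open(paths[2]) as time_file:
        print(time_file.read().split())  # ['1', '2', '2']
```

Sorting inside the helper keeps the three files aligned line by line, which is what the non-decreasing timestamp requirement relies on.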

Datasets

  1. KDDCUP99
  2. CICIDS-DoS
  3. UNSW-NB 15
  4. CICIDS-DDoS

Citation

If you use this code for your research, please consider citing our WWW paper.

@inproceedings{bhatia2021mstream,
    title={Fast Anomaly Detection in Multi-Aspect Streams},
    author={Siddharth Bhatia and Arjit Jain and Pan Li and Ritesh Kumar and Bryan Hooi},
    booktitle={The Web Conference (WWW)},
    year={2021}
}