rsamei/Anomaly-Detection

Dataset
Time series (not available in this repository)
Problem
Anomaly detection
Problem statements
A company needs to monitor one of its machines so that it is notified immediately when the machine malfunctions. The machine is dedicated to the mechanical processing of pieces. Each time it processes a piece, a sensor installed on it records the electric consumption. Abnormal behavior, mainly due to wear or malfunction, is visible in the resulting time series.
However, the time series is very irregular, so software is needed to analyze it and detect the anomalies it contains.
Moreover, because a malfunction requires immediate intervention, detection should happen while the data is being generated.
Dataset description
The time series are generated by sensors installed in the client's assembly line. Each sensor collects data at a frequency of 40 Hz.
Some datasets contain anomalies that denote a failure in the machine. The length and appearance of the anomalies can vary.
Every entry of the dataset has a label indicating the times at which the machine had a failure. Two datasets are provided: a training set and a test set.
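Because detection has to run while the 40 Hz data is being generated, one common approach is to summarize the most recent readings in a sliding window and pass those summaries to a model. The sketch below illustrates that idea only; it is not taken from the repository. The function name `rolling_features`, the one-second window, and the chosen statistics are all assumptions.

```python
from collections import deque
from statistics import mean, pstdev

SAMPLE_RATE_HZ = 40  # sampling frequency stated in the dataset description


def rolling_features(samples, window_seconds=1.0):
    """Yield one feature dict per incoming sample once the window is full.

    window_seconds is a hypothetical choice, not a value from the report.
    """
    size = int(SAMPLE_RATE_HZ * window_seconds)
    window = deque(maxlen=size)  # the oldest reading drops out automatically
    for value in samples:
        window.append(value)
        if len(window) == size:
            yield {
                "mean": mean(window),
                "std": pstdev(window),
                "min": min(window),
                "max": max(window),
            }


# Synthetic readings for illustration: one second at 1.0, one second at 5.0.
readings = [1.0] * 40 + [5.0] * 40
features = list(rolling_features(readings))
```

Since the generator emits a feature vector as soon as each new sample arrives, a downstream classifier could score the machine state sample by sample instead of waiting for a complete recording.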

Final report:
Anomaly detection was carried out using several machine learning models: Logistic Regression, Random Forest, Isolation Forest, and XGBoost. The two best-performing models, XGBoost and Random Forest, were combined into an ensemble in which their posterior probabilities were used as inputs to a final logistic regression model. A threshold of 0.4 on the positive-class probability was chosen for the final prediction. Due to time constraints, additional models and hyperparameter tuning were not explored.
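The final decision step of such an ensemble can be sketched as follows: the two base-model probabilities are combined by a logistic function, and the result is compared with the 0.4 threshold. This is a minimal illustration, not the repository's code. The function names and the coefficients `w_xgb`, `w_rf`, and `bias` are hypothetical placeholders; in a real pipeline they would be fitted on held-out predictions from the base models.

```python
import math

THRESHOLD = 0.4  # positive-class threshold stated in the report


def stacked_probability(p_xgb, p_rf, w_xgb, w_rf, bias):
    """Logistic-regression meta-model over two base-model probabilities."""
    z = w_xgb * p_xgb + w_rf * p_rf + bias
    return 1.0 / (1.0 + math.exp(-z))


def predict_anomaly(p_xgb, p_rf, w_xgb=3.0, w_rf=3.0, bias=-3.0):
    """Return True when the stacked probability reaches the threshold.

    The default coefficients are made up for illustration only.
    """
    return stacked_probability(p_xgb, p_rf, w_xgb, w_rf, bias) >= THRESHOLD


# Both base models confident: flagged as an anomaly.
confident = predict_anomaly(0.9, 0.8)
# Borderline case: the stacked probability is about 0.43, which is below
# 0.5 but above 0.4, so the lowered threshold still flags it.
borderline = predict_anomaly(0.5, 0.4)
# Both base models see nothing unusual: not flagged.
normal = predict_anomaly(0.1, 0.1)
```

A threshold below 0.5 trades some extra false alarms for fewer missed failures, which fits a setting where immediate intervention matters.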