Ardawanism/Fault-Detection-and-Identification-2023

Fault-Detection-and-Identification-2023

This repo teaches several Machine Learning and Deep Learning algorithms that are useful for Fault Detection and Identification.

Context

Machine Learning and Deep Learning algorithms play an important role in Intelligent Condition Monitoring and Data-Driven Fault Detection and Identification. Some useful algorithms are implemented from scratch and investigated in depth in this repo. First, the PCA and LDA dimensionality reduction methods are implemented from scratch and investigated. Then, a Multi-Layer Perceptron (MLP) is used to solve Classification and Regression tasks. An RBF Neural Network is also implemented from scratch and used for a classification task on a simple generated data set. Finally, Dense, RBF, Convolutional and Convolutional-Recurrent Neural Networks are used for Fault Identification on the Case Western bearing health data set.

Dataset

A custom synthetic data set is used for the PCA and LDA investigation. A custom synthetic data set is also used for teaching MLP for Regression and Classification, and RBF for Classification. For the Fully Connected Autoencoder for dimensionality reduction, a custom bearing data set containing 12 statistical features is used. Moreover, the Case Western Reserve University (CWRU) bearing data set, which contains signals of various bearing health states, is used for training the MLP, RBF, Convolutional and ConvLSTM networks. The signals are segmented by sliding a window of size 420 in a non-overlapping manner. The window size is chosen based on the sampling frequency and the motor rotational speed. The data set contains 4 classes, namely Healthy, Inner Race, Ball and Outer Race.
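The non-overlapping segmentation described above can be sketched as follows. This is a minimal illustration, not the repo's own code: the function name `segment_signal` and the synthetic `recording` array are assumptions, and dropping the incomplete tail window is one possible choice.

```python
import numpy as np

def segment_signal(signal, window_size=420):
    """Split a 1-D signal into non-overlapping windows.

    Samples left over after the last full window are dropped
    (an assumption; the repo may handle the tail differently).
    """
    signal = np.asarray(signal)
    n_windows = len(signal) // window_size
    return signal[: n_windows * window_size].reshape(n_windows, window_size)

# Synthetic stand-in for one vibration recording (not real CWRU data).
recording = np.arange(1000, dtype=float)
segments = segment_signal(recording)
print(segments.shape)  # (2, 420)
```

Each row of `segments` is then one training sample of length 420.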

The CWRU data set is taken from the website https://engineering.case.edu/bearingdatacenter/download-data-file

Brief list of implemented algorithms and neural networks

  1. PCA and LDA
  2. MLP for Classification and Regression tasks
  3. RBF for Classification task
  4. Fully Connected Autoencoder for dimensionality reduction
  5. MLP for CWRU bearing Fault Classification
  6. RBF for CWRU bearing Fault Classification
  7. CNN for CWRU bearing Fault Classification
  8. ConvLSTM for CWRU bearing Fault Classification

PCA and LDA

Two experiments are carried out in this section. For the 1st experiment, a simple data set with a Gaussian distribution is generated. The principal directions for transforming the data are then obtained from PCA. The generated distribution and the corresponding principal directions are depicted in the figure below.
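A from-scratch PCA of the kind used in this experiment can be sketched via an eigendecomposition of the sample covariance matrix. The covariance values, sample count and function name below are illustrative assumptions, not taken from the repo.

```python
import numpy as np

def pca(X):
    """Return eigenvalues (descending) and principal directions (as columns)."""
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)
    # eigh is appropriate because the covariance matrix is symmetric.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

# Hypothetical 2-D Gaussian data set.
rng = np.random.default_rng(0)
true_cov = np.array([[3.0, 1.5], [1.5, 1.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=true_cov, size=2000)

eigvals, directions = pca(X)
# Projecting onto the principal directions decorrelates the data.
projected = (X - X.mean(axis=0)) @ directions
```

The columns of `directions` are the arrows one would draw over the scatter plot, scaled if desired by the square roots of `eigvals`.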

For the 2nd experiment, a simple data set with 3 classes is generated. The generated data set is depicted in the figure below.

The principal components are then obtained from PCA and drawn on the scatter plot of the data in 2-dimensional space. The result is depicted in the figure below.

The data is then projected onto each of the obtained directions. The result is depicted in the figure below.

Furthermore, using the same data set, the discriminant directions for transforming the data are obtained from LDA. The figure below shows the data transformed without dimensionality reduction, as well as the data projected onto each of the directions (reducing one dimension).
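For reference, a from-scratch LDA can be sketched with the within-class and between-class scatter matrices. The class means, noise level and function name are illustrative assumptions, and the repo's implementation may differ in detail.

```python
import numpy as np

def lda_directions(X, y):
    """Return eigenvalues (descending) and discriminant directions (as columns)."""
    n_features = X.shape[1]
    overall_mean = X.mean(axis=0)
    S_w = np.zeros((n_features, n_features))  # within-class scatter
    S_b = np.zeros((n_features, n_features))  # between-class scatter
    for c in np.unique(y):
        X_c = X[y == c]
        mean_c = X_c.mean(axis=0)
        S_w += (X_c - mean_c).T @ (X_c - mean_c)
        diff = (mean_c - overall_mean).reshape(-1, 1)
        S_b += len(X_c) * (diff @ diff.T)
    # Eigenvectors of S_w^{-1} S_b maximize between/within class separation.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
    order = np.argsort(eigvals.real)[::-1]
    return eigvals.real[order], eigvecs.real[:, order]

# Hypothetical 3-class data set in 2 dimensions.
rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0], [4.0, 1.0], [8.0, 0.0]])
X = np.vstack([rng.normal(loc=m, scale=0.5, size=(100, 2)) for m in means])
y = np.repeat(np.arange(3), 100)

eigvals, W = lda_directions(X, y)
X_1d = X @ W[:, :1]  # projection onto the leading direction only
```

Unlike PCA, the directions depend on the class labels `y`, so the leading direction is the one that best separates the classes rather than the one with the largest variance.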

MLP for Regression

The target function for regression is depicted in the figure below.

The error during the training procedure is depicted in the figure below.

The values predicted by the regression network for samples from the train and test data sets are depicted in the figure below.

Apparently, the predicted values are not accurate enough for large values of the independent variable x, so powers of x (x^2, x^3, ...) are added as inputs in order to achieve a better regression model. The error during the training procedure is depicted in the figure below.
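The feature expansion described above can be sketched as below. The helper name `polynomial_features` and the choice of degree 3 are assumptions for illustration; the repo does not state here which powers it uses.

```python
import numpy as np

def polynomial_features(x, degree=3):
    """Stack x, x^2, ..., x^degree as the columns of a feature matrix."""
    x = np.asarray(x, dtype=float).reshape(-1)
    return np.column_stack([x ** p for p in range(1, degree + 1)])

x = np.array([1.0, 2.0, 3.0])
features = polynomial_features(x, degree=3)
print(features)  # rows: [1, 1, 1], [2, 4, 8], [3, 9, 27]
```

The expanded matrix replaces the single-column input of the network. Because high powers grow quickly, standardizing each column before training is usually helpful.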

The achieved result is depicted in the figure below. The predictions are more accurate for large values of the independent variable x.

MLP and RBF for Classification

The synthetic data set used for training the MLP and RBF networks on the Classification task is depicted in the figure below.

The loss and accuracy on the train and validation data during training of the MLP are depicted in the figure below.

The MLP performs well on the test data set. The confusion matrix of the test data set is depicted in the figure below.

The loss and accuracy on the train and validation data during training of the RBF network are depicted in the figure below.

The RBF network also performs well on the test data set. The confusion matrix of the test data set is depicted in the figure below.
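To make the structure of an RBF network concrete, here is a minimal sketch: a hidden layer of Gaussian units followed by a linear output layer. Everything specific in it is an assumption (two synthetic blobs, 10 centers picked from the training samples, `gamma=1.0`). The output weights are fitted by least squares for brevity, which is a swapped-in technique; the loss curves above suggest the repo trains iteratively instead.

```python
import numpy as np

def rbf_features(X, centers, gamma):
    """Gaussian activations exp(-gamma * ||x - c||^2) for every sample/center pair."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dist)

# Hypothetical two-class data set.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[-2.0, 0.0], scale=0.5, size=(100, 2)),
    rng.normal(loc=[2.0, 0.0], scale=0.5, size=(100, 2)),
])
y = np.array([0] * 100 + [1] * 100)
Y = np.eye(2)[y]  # one-hot targets

# Centers drawn at random from the training samples (an assumption).
centers = X[rng.choice(len(X), size=10, replace=False)]
Phi = np.hstack([rbf_features(X, centers, gamma=1.0), np.ones((len(X), 1))])

# Linear output layer fitted in closed form.
W, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
pred = (Phi @ W).argmax(axis=1)
accuracy = (pred == y).mean()
```

The choice of centers and of `gamma` controls how local each hidden unit is. k-means centers are a common alternative to random sampling.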

Fully Connected Autoencoder for Dimensionality Reduction

The architecture of the FC Autoencoder (AE) is depicted in the figure below.

First, a 2-dimensional visualization of the data is obtained with t-SNE. The scatter plot of the raw data in 2-dimensional space is depicted in the figure below.

The reconstruction error histogram before training the AE is depicted in the figure below.

The reconstruction error during training of the AE is depicted in the figure below.

The reconstruction error histogram after training the AE is depicted in the figure below. The distribution is much tighter after training, which indicates that the model has learned a good latent representation of the raw data and can reduce its dimensionality with little or no information loss.
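The before/after comparison of reconstruction error can be reproduced in miniature with the sketch below. It is a deliberately simplified stand-in: a linear autoencoder without activations, trained by plain gradient descent on synthetic data with 12 features and a 2-dimensional latent space. The repo's actual architecture, data and optimizer are not reproduced here, and all sizes and hyperparameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 12-feature data with 2 underlying factors (not the bearing data).
latent = rng.normal(size=(500, 2))
X = latent @ rng.normal(size=(2, 12)) + 0.05 * rng.normal(size=(500, 12))
X = (X - X.mean(axis=0)) / X.std(axis=0)

W_enc = 0.1 * rng.normal(size=(12, 2))  # encoder: 12 -> 2
W_dec = 0.1 * rng.normal(size=(2, 12))  # decoder: 2 -> 12

def reconstruction_error(X, W_enc, W_dec):
    """Per-sample mean squared reconstruction error."""
    return ((X @ W_enc @ W_dec - X) ** 2).mean(axis=1)

error_before = reconstruction_error(X, W_enc, W_dec)

lr = 0.01
for _ in range(2000):
    Z = X @ W_enc                 # latent codes
    R = Z @ W_dec - X             # reconstruction residual
    grad_dec = Z.T @ R / len(X)
    grad_enc = X.T @ (R @ W_dec.T) / len(X)
    W_enc -= lr * grad_enc
    W_dec -= lr * grad_dec

error_after = reconstruction_error(X, W_enc, W_dec)
```

Plotting `error_before` and `error_after` with a histogram (for example `matplotlib.pyplot.hist`) gives the kind of comparison described above.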

CWRU bearing data set Classification using MLP, RBF, CNN, and ConvLSTM

The CWRU bearing health data set is sliced into signals of length 420. The signal length is chosen based on the sampling frequency and the motor rotational speed. One sample of the segmented signals is depicted in the figure below.
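One plausible reading of this choice is that a segment should cover about one shaft revolution. The sketch below checks that arithmetic under assumed values: a 12 kHz sampling rate and shaft speeds of 1797 and 1730 rpm, which are figures commonly quoted for CWRU recordings. The repo does not state which values it used, so treat both as assumptions.

```python
def samples_per_revolution(sampling_rate_hz, shaft_speed_rpm):
    """Number of samples recorded during one full shaft revolution."""
    return sampling_rate_hz * 60.0 / shaft_speed_rpm

# Assumed sampling rate and shaft speeds (not stated in this README).
for rpm in (1797, 1730):
    print(rpm, round(samples_per_revolution(12_000, rpm), 1))
# 1797 400.7
# 1730 416.2
```

Under these assumptions, a window of 420 samples spans slightly more than one revolution at either speed.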

The loss and accuracy during training of the MLP on Drive Signals are depicted in the figure below.

The confusion matrix of the Drive Signals test data set is depicted in the figure below.

The loss and accuracy during training of the MLP on Fan Signals are depicted in the figure below.

The confusion matrix of the Fan Signals test data set is depicted in the figure below.

The loss and accuracy during training of the MLP on Drive-Fan Signals are depicted in the figure below.

The confusion matrix of the Drive-Fan Signals test data set is depicted in the figure below.

The performance comparison between using Drive Signals, Fan Signals and the concatenation of Drive-Fan Signals is shown in the table below.

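| Input signals | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%) |
| --- | --- | --- | --- | --- |
| Drive Signals | 94.42 | 95.05 | 94.43 | 94.44 |
| Fan Signals | 97.5 | 97.11 | 97.5 | 97.15 |
| Drive-Fan Signals | 97.72 | 98.57 | 97.72 | 98.07 |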
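In the table above, the recall column matches the accuracy column to within rounding, which is what support-weighted averaging over the classes would produce. The sketch below computes the four metrics from a confusion matrix under that assumption. The confusion matrix values are made up for illustration, and the repo may compute its metrics differently.

```python
import numpy as np

def weighted_metrics(cm):
    """Accuracy and support-weighted precision, recall and F1.

    cm[i, j] counts samples of true class i predicted as class j.
    """
    cm = np.asarray(cm, dtype=float)
    true_pos = np.diag(cm)
    support = cm.sum(axis=1)    # true samples per class
    predicted = cm.sum(axis=0)  # predicted samples per class
    zeros = np.zeros_like(true_pos)
    precision = np.divide(true_pos, predicted, out=zeros.copy(), where=predicted > 0)
    recall = np.divide(true_pos, support, out=zeros.copy(), where=support > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=zeros.copy(), where=denom > 0)
    weights = support / support.sum()
    return {
        "accuracy": true_pos.sum() / cm.sum(),
        "precision": (weights * precision).sum(),
        "recall": (weights * recall).sum(),
        "f1": (weights * f1).sum(),
    }

# Hypothetical 4-class confusion matrix (Healthy, Inner Race, Ball, Outer Race).
cm = [[50, 0, 0, 0],
      [0, 45, 5, 0],
      [0, 3, 47, 0],
      [0, 0, 0, 50]]
metrics = weighted_metrics(cm)
```

With weighted averaging, recall always equals accuracy, so the two columns carry the same information.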

Before the RBF network is used, PCA is applied to the data. The number of useful features is selected based on the Information/Feature graph depicted in the figure below. Only the first 10 features after the transformation are considered useful and kept; the others are omitted.
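An Information/Feature graph of this kind is typically the cumulative explained variance of the PCA components. A minimal sketch of choosing the number of components from it follows. The 95% threshold, the synthetic data and the function name are assumptions; the repo selected 10 features by inspecting its graph rather than by a fixed threshold.

```python
import numpy as np

def n_components_for_variance(X, threshold=0.95):
    """Smallest number of principal components whose cumulative
    explained-variance ratio reaches the threshold."""
    X_centered = X - X.mean(axis=0)
    eigvals = np.linalg.eigvalsh(np.cov(X_centered, rowvar=False))[::-1]
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative values
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(cumulative)), cumulative

# Synthetic data: 20 features driven by 3 underlying factors plus small noise.
rng = np.random.default_rng(0)
factors = rng.normal(size=(400, 3))
X = factors @ rng.normal(size=(3, 20)) + 0.05 * rng.normal(size=(400, 20))

k, cumulative = n_components_for_variance(X, threshold=0.95)
```

Plotting `cumulative` against the component index gives the curve from which a cut-off can be read.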

The loss and accuracy during training of the RBF network on Drive-Fan Signals are depicted in the figure below.

The confusion matrix of the Drive-Fan Signals test data set for the RBF network is depicted in the figure below.

The loss and accuracy during training of the CNN on Drive-Fan Signals are depicted in the figure below.

The confusion matrix of the Drive-Fan Signals test data set for the CNN is depicted in the figure below.

The loss and accuracy during training of the ConvLSTM on Drive-Fan Signals are depicted in the figure below.

The confusion matrix of the Drive-Fan Signals test data set for the ConvLSTM is depicted in the figure below.

The performance comparison between the different models trained on Drive Signals, Fan Signals and the concatenation of Drive-Fan Signals is shown in the table below.

| Model | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%) |
| --- | --- | --- | --- | --- |
| MLP (Trained on Drive Signals) | 94.42 | 95.05 | 94.43 | 94.44 |
| MLP (Trained on Fan Signals) | 97.5 | 97.11 | 97.5 | 97.15 |
| MLP (Trained on Drive-Fan Signals) | 97.72 | 98.57 | 97.72 | 98.07 |
| RBF (Trained on Drive-Fan Signals) | 93.03 | 92.44 | 93.03 | 92.5 |
| CNN (Trained on Drive-Fan Signals) | 100.0 | 100.0 | 100.0 | 100.0 |
| ConvLSTM (Trained on Drive-Fan Signals) | 100.0 | 100.0 | 100.0 | 100.0 |