yangzhen-cdut/Unsupervised-Clustering

Time Series Contrastive Clustering (TSCC)

Cite This Paper:

Z. Yang, H. Li, X. Tuo, L. Li and J. Wen, "Unsupervised Clustering of Microseismic Signals Using a Contrastive Learning Model," in IEEE Transactions on Geoscience and Remote Sensing, vol. 61, pp. 1-12, 2023, Art no. 5903212, doi: 10.1109/TGRS.2023.3240728.

Link: https://doi.org/10.1109/TGRS.2023.3240728

Getting Started

Clone the project to your local machine:

git clone https://github.com/yangzhen-cdut/Unsupervised-Clustering.git
cd Unsupervised-Clustering

Requirements

The recommended requirements for TSCC are specified as follows:

  • Python 3.7
  • torch==1.8.1
  • scipy==1.7.3
  • numpy==1.21.6
  • pandas==1.3.5
  • scikit_learn==0.24.2
  • matplotlib==3.5.2
  • bottleneck==1.3.4
  • seaborn==0.11.2

Install the dependencies with:

pip install -r requirements.txt

Note that CUDA should be installed to run the code.
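
Because the code expects a CUDA setup, it can help to check the PyTorch installation before launching a run. The following is a minimal sketch and not part of the repository: the helper name cuda_status is hypothetical, and it relies only on torch.cuda.is_available() and torch.cuda.get_device_name().

```python
def cuda_status():
    """Return a short description of the local PyTorch/CUDA setup."""
    try:
        import torch  # imported lazily so the check itself never crashes
    except ImportError:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return "PyTorch {} found, but no usable CUDA device".format(torch.__version__)
    # Report the first visible GPU; index 0 is an assumption of this sketch.
    return "PyTorch {} with CUDA device: {}".format(
        torch.__version__, torch.cuda.get_device_name(0)
    )


if __name__ == "__main__":
    print(cuda_status())
```

If the message reports no usable CUDA device, the installed torch build or the GPU driver is the first thing to check.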

Usage

To train TSCC on a microseismic dataset, run the following command:

python run.py <dataset_name> <dataset_name> --pretraining_epoch <pretraining_epoch> --batch-size <batch_size> --MaxIter <MaxIter> --repr-dims <repr_dims>

After training, the encoder from the pre-training phase, the encoder from the fine-tuning phase, and the cluster centers can be found in ./<dataset_name>_Pretraining_phase, ./<dataset_name>_Finetuning_phase, and ./<dataset_name>_Centers, respectively.
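
For scripted experiments, the training command can be assembled from Python instead of being typed by hand. This is only a sketch: the flag names and output paths are copied from this README, while the helper names and all argument values (my_dataset, 10, 32, 100, 64) are placeholders chosen for illustration, not recommended settings.

```python
import shlex


def build_train_command(dataset_name, pretraining_epoch, batch_size, max_iter, repr_dims):
    """Build the argument list for run.py as documented in the README."""
    # The README lists <dataset_name> twice as positional arguments,
    # so this sketch passes the same name in both slots.
    return [
        "python", "run.py", dataset_name, dataset_name,
        "--pretraining_epoch", str(pretraining_epoch),
        "--batch-size", str(batch_size),
        "--MaxIter", str(max_iter),
        "--repr-dims", str(repr_dims),
    ]


def expected_outputs(dataset_name):
    """Paths where the README says the training results can be found."""
    return [
        "./{}_Pretraining_phase".format(dataset_name),
        "./{}_Finetuning_phase".format(dataset_name),
        "./{}_Centers".format(dataset_name),
    ]


# Placeholder values for illustration only.
cmd = build_train_command("my_dataset", 10, 32, 100, 64)
print(" ".join(shlex.quote(arg) for arg in cmd))
print(expected_outputs("my_dataset"))
```

The list returned by build_train_command can be passed to subprocess.run(cmd, check=True) from the repository root to launch the run.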

To evaluate TSCC on a microseismic dataset, run the following command:

python evaluation.py <dataset_name> <dataset_name> --pretraining_epoch <pretraining_epoch> --batch-size <batch_size> --MaxIter <MaxIter> --repr-dims <repr_dims>

Two examples are given in evaluation.py: eval_with_real_data and eval_with_synthetic_data. You can call these two functions directly; the output representations can be found in ./Eval_Representations.npy and ./Eval_Syn_Representations.npy, respectively.
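
Once the evaluation has written its .npy files, they can be inspected with NumPy. The sketch below assumes each file holds a plain numeric array; its shape and dtype are not documented here, so the helper (a hypothetical name, summarize_representations) simply reports what it finds.

```python
import os

import numpy as np


def summarize_representations(path):
    """Load a saved representation array and report its basic properties."""
    reps = np.load(path)  # assumes a plain numeric array, not a pickled object
    return {
        "shape": reps.shape,
        "dtype": str(reps.dtype),
        "has_nan": bool(np.isnan(reps).any()),
    }


if __name__ == "__main__":
    # File names as given in the README; skipped if evaluation has not been run.
    for path in ("./Eval_Representations.npy", "./Eval_Syn_Representations.npy"):
        if os.path.exists(path):
            print(path, summarize_representations(path))
```

The loaded array can then be passed to any downstream classifier or visualization tool.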

Results

The architecture of TSCC used in our study.

Clustering performance comparison.

Visualization of learned latent representations.

Latent representations of synthetic waveforms

Latent representations of real microseismic waveforms

Representative cluster distribution

Typical microseismic waveforms

Typical noise waveforms

Classification performance comparison of supervised classifiers using raw time series and the features (R) generated by TSCC.

Methods   Time series                       Feature R
          ACC (%)   NMI (%)   AUPRC (%)     ACC (%)   NMI (%)   AUPRC (%)
Linear    71.59     15.38     66.47         99.11     92.63     98.71
KNN       90.58     55.55     87.65         98.13     86.62     97.36
SVM       97.65     83.91     99.81         98.94     91.56     99.91
TSCC      98.07     86.26     97.15         --        --        --
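
Cluster IDs produced by an unsupervised model are arbitrary, so an accuracy figure for a clustering method is commonly computed after matching clusters to classes. This README does not state how ACC is obtained for the TSCC row, so the following is only an illustrative sketch of that common convention, with a hypothetical helper name and made-up labels. It tries every one-to-one mapping, which is practical only for a handful of clusters.

```python
from itertools import permutations


def clustering_accuracy(true_labels, cluster_ids):
    """Accuracy under the best one-to-one mapping from cluster IDs to classes."""
    if len(true_labels) != len(cluster_ids) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_ids))
    if len(clusters) > len(classes):
        raise ValueError("this sketch needs at least as many classes as clusters")
    best_hits = 0
    # Brute force: assign a distinct class to every cluster and keep the best.
    for assigned in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, assigned))
        hits = sum(mapping[c] == t for t, c in zip(true_labels, cluster_ids))
        best_hits = max(best_hits, hits)
    return best_hits / len(true_labels)


# Made-up labels: cluster 1 mostly covers class 0, cluster 0 covers class 1.
truth = [0, 0, 0, 1, 1, 1]
pred = [1, 1, 0, 0, 0, 0]
print(clustering_accuracy(truth, pred))  # 5 of the 6 samples match
```

For more than a few clusters, the same matching is usually solved with the Hungarian algorithm, for example scipy.optimize.linear_sum_assignment, rather than by enumeration.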
