flaneur-ml/TDA4TS_CLF

Topological Data Analysis for Time Series Classification

This project uses topological data analysis (TDA) to extract topological features from the UCR time series classification datasets. We argue that topological features can help identify informative patterns in data where shape carries meaning. This repo contains the pipeline for extracting the topological features and for evaluating multiple classification algorithms on these features. The features are extracted from the input data via persistent homology using giotto-tda, a high-performance topological machine learning toolbox in Python built on top of scikit-learn, and are then combined with machine learning methods.
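To make the idea of "persistent homology features of a time series" concrete, here is a minimal pure-Python sketch. It is not the giotto-tda pipeline used in this repo. It computes 0-dimensional persistence of the sublevel-set filtration of a 1-D series (each local minimum is born at its value and dies when its valley merges into a deeper one) and then summarizes the diagram with a few scalar features. The function names `sublevel_persistence_0d` and `persistence_features` are hypothetical and exist only for this illustration.

```python
import math
from typing import Dict, List, Tuple


def sublevel_persistence_0d(series: List[float]) -> List[Tuple[float, float]]:
    """Return (birth, death) pairs of connected components of the
    sublevel-set filtration of a 1-D series (illustrative helper).

    The component of the global minimum never merges into another one;
    by convention it is reported here with death = max(series).
    """
    n = len(series)
    if n == 0:
        return []
    parent = [-1] * n   # -1 marks points not yet added to the filtration
    birth = [0.0] * n   # birth value of the component rooted at each index

    def find(i: int) -> int:
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs: List[Tuple[float, float]] = []
    # Add points in order of increasing value.
    for i in sorted(range(n), key=lambda k: series[k]):
        parent[i] = i
        birth[i] = series[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] != -1:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component with the larger birth value dies.
                young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                if series[i] > birth[young]:  # skip zero-persistence pairs
                    pairs.append((birth[young], series[i]))
                parent[young] = old
    pairs.append((min(series), max(series)))
    return pairs


def persistence_features(pairs: List[Tuple[float, float]]) -> Dict[str, float]:
    """Summarize a persistence diagram with a few scalar features."""
    lifetimes = [death - birth for birth, death in pairs]
    total = sum(lifetimes)
    entropy = 0.0
    if total > 0:
        entropy = -sum((l / total) * math.log(l / total)
                       for l in lifetimes if l > 0)
    return {
        "n_pairs": float(len(pairs)),
        "total_persistence": total,
        "max_persistence": max(lifetimes, default=0.0),
        "persistence_entropy": entropy,
    }


if __name__ == "__main__":
    diagram = sublevel_persistence_0d([0.0, 2.0, 1.0, 3.0, 0.5, 4.0])
    print(diagram)
    print(persistence_features(diagram))
```

Feature vectors like the one returned by `persistence_features` can be computed for every series in a dataset and passed to any standard classifier, which is the general shape of the approach described above.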

Repository structure

  • TDA.ipynb - the main notebook that demonstrates the application, evaluation and analysis of topological features for time series classification
  • src/TFE - contains routines for extracting persistence diagrams and the implemented topological features
  • src/nn and src/ae - contain the neural network and VAE implementations
  • src/utils.py - contains helper functions
  • extract_tda_dataset.py - script that can be used to generate datasets with topological features from initial UCR datasets
  • evaluation.py - script that can be used for evaluation of extracted topological features datasets
  • Texas_sharpshooter.ipynb - notebook that was used to build Texas Sharpshooter plot
  • CD_diagram.ipynb - notebook that was used to build the critical difference (CD) diagram
  • results_wit_metrics_on_42.csv - CSV file that contains metrics on the test set for some of the datasets
  • results_with_accuracies_on_110.csv - CSV file that contains the test-set accuracies for every dataset

Extracted topological features from the initial UCR datasets can be found on Google Drive.

Setup

  • Clone this repo:
git clone https://github.com/SamirMoustafa/Time-Series-Classification.git	
  • Install dependencies:
pip install -r requirements.txt	

License

The project is licensed under the MIT License - see the LICENSE.md file for details.