Time-series Modelling Case Studies

This repo contains the code for two time-series modelling case studies. It was part of my interview process for the Research Intern in Deep Learning role at Huawei Ireland Research Center.

First round: Technical

First round interview questions can be found here

Second round: Case studies

The report for these case studies can be found here

The purpose of the second interview round is to assess your problem-solving skills on real-world data science problems in the areas of time-series anomaly detection, prediction, and clustering.

Data are contained in two subfolders:

AD&P (data for anomaly detection and prediction case studies)
C (data for clustering case study)

Case study 1: Anomaly Detection

The task is to produce a time-series point-anomaly detection model for each of the 25 Key Performance Indicator (KPI) time-series contained in the AD&P folder.

Specific requirements

Each row associates a “time-stamp” with a “kpi_value”. The anomaly detector should be trained to detect point-anomalies in “kpi_value”.
For the majority of the time-series the problem is unsupervised (no point-anomaly labels). For some KPIs (datasets 201-206) labelled data are also available; in those cases the labels can optionally be taken into account.
Each of the 25 time-series needs to be modelled independently of the others.
Slides describing the anomaly detection method chosen.
Slides visualising the anomaly detection output over the whole time-series, for each of the 25 time-series.
During the presentation, issues of scalability, adaptation, model selection, and generalisation will be discussed.
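
As an illustration of what an unsupervised point-anomaly detector for a single KPI series could look like, here is a minimal sketch based on a robust z-score (median and MAD over a trailing window). It is not the method used in this repo; the function name `rolling_mad_anomalies`, the window length, and the threshold are all assumptions chosen for the example.

```python
import numpy as np

def rolling_mad_anomalies(values, window=25, threshold=3.5):
    """Flag points that deviate strongly from the median of the preceding window.

    Hypothetical helper for illustration only. The deviation is scaled by the
    median absolute deviation (MAD) of the same window, so earlier outliers
    have little effect on the scale estimate.
    """
    values = np.asarray(values, dtype=float)
    flags = np.zeros(values.shape[0], dtype=bool)
    for i in range(window, values.shape[0]):
        history = values[i - window:i]            # only information before time i
        median = np.median(history)
        mad = np.median(np.abs(history - median))
        # 1.4826 makes the MAD comparable to a standard deviation for Gaussian data;
        # the small constant avoids division by zero on flat windows.
        scale = 1.4826 * mad if mad > 0 else 1e-9
        flags[i] = abs(values[i] - median) / scale > threshold
    return flags
```

The first `window` points are never flagged because they have no full history. Each of the 25 series would be passed through the function separately, which keeps the models independent; for datasets 201-206 the labels could be used to tune `window` and `threshold`.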

Case study 2: Prediction

The task is to produce a time-series prediction model for each of the 25 Key Performance Indicator (KPI) time-series contained in the AD&P folder.

Specific requirements

For each of the 25 time-series, train a model to forecast the kpi_value at times t+1, t+2, t+3, t+4, and t+5, given information up to time t.
Slides describing the prediction method chosen.
Slides with an assessment of train/test prediction performance.
Slides visualising predicted versus actual time-series values.
During the presentation, issues of scalability, adaptation, model selection, and generalisation will be discussed.
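
The five-step-ahead setup above can be sketched as a direct multi-horizon linear autoregression: one least-squares fit maps the last `n_lags` values to the next five. This is only a baseline written for illustration, not the model used in this repo; all function names, the lag count, the 80/20 split, and the synthetic `kpi` series are assumptions.

```python
import numpy as np

def make_windows(series, n_lags, horizon=5):
    """Turn one series into (lag window, next `horizon` values) pairs."""
    series = np.asarray(series, dtype=float)
    n = len(series) - n_lags - horizon + 1
    X = np.stack([series[i:i + n_lags] for i in range(n)])
    Y = np.stack([series[i + n_lags:i + n_lags + horizon] for i in range(n)])
    return X, Y

def fit_linear_forecaster(series, n_lags=10, horizon=5):
    """Least-squares fit; returns coefficients of shape (n_lags + 1, horizon)."""
    X, Y = make_windows(series, n_lags, horizon)
    X = np.hstack([X, np.ones((len(X), 1))])      # intercept column
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coef

def forecast(coef, history):
    """Predict t+1..t+horizon from the most recent values in `history`."""
    n_lags = coef.shape[0] - 1
    window = np.asarray(history, dtype=float)[-n_lags:]
    return np.append(window, 1.0) @ coef

def mae_per_horizon(coef, series, n_lags, horizon=5):
    """Mean absolute error for each forecast step on a held-out segment."""
    X, Y = make_windows(series, n_lags, horizon)
    pred = np.hstack([X, np.ones((len(X), 1))]) @ coef
    return np.abs(pred - Y).mean(axis=0)

# Chronological split: fit on the first 80%, score on the rest (no shuffling).
kpi = 0.5 * np.arange(200.0) + 3.0                # synthetic stand-in for one KPI series
split = int(0.8 * len(kpi))
coef = fit_linear_forecaster(kpi[:split], n_lags=4)
test_mae = mae_per_horizon(coef, kpi[split:], n_lags=4)
next_five = forecast(coef, kpi)                   # forecasts for t+1 .. t+5
```

Splitting by time rather than at random keeps future values out of the training set, which matters for an honest train/test comparison on time-series data.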

Case study 3: Clustering

The task is to cluster the 23 time-series contained in the C folder.

Specific requirements

Monthly time-series data: “value” column associated with “date” column.
The optimal number of clusters should emerge from your analysis.
Slides with description of the clustering method chosen.
Visualisation of clustering results.
During the presentation, issues of scalability and model selection will be discussed.
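
One way to let the number of clusters emerge from the data is to run a clustering method for several values of k and keep the k with the best silhouette score. The sketch below does this with k-means on z-normalised series. It is an illustrative choice, not the method used in this repo, and it assumes all series have the same length; every function name, the deterministic farthest-first initialisation, and the range of k are assumptions.

```python
import numpy as np

def znorm(series_matrix):
    """Scale each row (one series) to zero mean and unit variance."""
    x = np.asarray(series_matrix, dtype=float)
    std = x.std(axis=1, keepdims=True)
    return (x - x.mean(axis=1, keepdims=True)) / np.where(std > 0, std, 1.0)

def farthest_first_centers(x, k):
    """Deterministic init: start at row 0, then keep adding the farthest row."""
    centers = [x[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(x - c, axis=1) for c in centers], axis=0)
        centers.append(x[d.argmax()])
    return np.array(centers)

def kmeans(x, k, n_iter=50):
    """Plain Lloyd iterations; returns one integer label per row of x."""
    centers = farthest_first_centers(x, k)
    for _ in range(n_iter):
        labels = np.linalg.norm(x[:, None] - centers[None], axis=2).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):               # an empty cluster keeps its old centre
                centers[j] = x[labels == j].mean(axis=0)
    return np.linalg.norm(x[:, None] - centers[None], axis=2).argmin(axis=1)

def silhouette(x, labels):
    """Mean silhouette coefficient using Euclidean distance."""
    dists = np.linalg.norm(x[:, None] - x[None], axis=2)
    scores = []
    for i in range(len(x)):
        same = labels == labels[i]
        if same.sum() == 1:                       # singleton cluster scores 0 by convention
            scores.append(0.0)
            continue
        a = dists[i, same].sum() / (same.sum() - 1)
        b = min(dists[i, labels == c].mean() for c in np.unique(labels) if c != labels[i])
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return float(np.mean(scores))

def choose_k(x, k_values=range(2, 7)):
    """Return (k, silhouette, labels) for the best-scoring k."""
    best_k, best_score, best_labels = None, -np.inf, None
    for k in k_values:
        labels = kmeans(x, k)
        if len(np.unique(labels)) < 2:            # silhouette needs at least two clusters
            continue
        score = silhouette(x, labels)
        if score > best_score:
            best_k, best_score, best_labels = k, score, labels
    return best_k, best_score, best_labels
```

Calling `choose_k(znorm(data))` on a 23-row array (one row per monthly series) would return the selected k together with a label per series. Z-normalising first makes the clustering respond to shape rather than scale, which may or may not be what the task intends.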
