kaylode/time-series-modelling


Time-series Modelling Case Studies

This repo contains the code implementation for two case studies of time series modelling. This was a part of my interview process for the Research Intern in Deep Learning role at Huawei Ireland Research Center.


First round: Technical

First round interview questions can be found here

Second round: Case studies

The report for these case studies can be found here

The purpose of the second interview round is to assess your problem-solving skills on real-world data science problems in time-series anomaly detection, prediction, and clustering.

Data are contained in two subfolders:

  1. AD&P (data for anomaly detection and prediction case studies)
  2. C (data for clustering case study)



Case study 1: Anomaly Detection

The task is to produce a time-series point-anomaly detection model for each of the 25 Key Performance Indicator (KPI) time-series contained in the AD&P folder.

Specific requirements
  • Each row associates a “time-stamp” with a “kpi_value”. The anomaly detector should be trained to detect point-anomalies in “kpi_value”.
  • For the majority of time-series the problem is unsupervised (no point-anomaly labels). For some KPIs (datasets 201-206) we also have access to labelled data. In those cases the labels can be taken into account if needed (optional).
  • Each of the 25 time-series needs to be modelled independently of others.
  • Slides with description of the anomaly detection method chosen.
  • Slides with visualisation of anomaly detection output in whole time-series, for each of the 25 time-series.
  • During presentation, issues of scalability, adaptation, model selection, generalisation will be discussed.
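
As a rough illustration of the unsupervised setting described above, here is a minimal sketch of a rolling median/MAD point-anomaly detector. It is not necessarily the method used in this repo or in the report; the function name `detect_point_anomalies` and the `window` and `threshold` defaults are assumptions chosen for the example, and `values` stands for the "kpi_value" column of a single KPI file, ordered by time-stamp.

```python
from statistics import median


def detect_point_anomalies(values, window=24, threshold=3.5):
    """Flag points that deviate strongly from the preceding `window` points.

    Hypothetical helper: `window` and `threshold` are illustrative defaults,
    not values taken from the case-study report.
    """
    flags = [False] * len(values)
    for i in range(window, len(values)):
        history = values[i - window:i]
        med = median(history)
        # Median absolute deviation, scaled so it is comparable to a
        # standard deviation for normally distributed data.
        mad = median(abs(v - med) for v in history)
        scale = max(1.4826 * mad, 1e-9)  # guard against a constant window
        flags[i] = abs(values[i] - med) / scale > threshold
    return flags
```

Because each point is scored only against its own recent history, the same function can be run on each of the 25 series independently, and the first `window` points are never flagged.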

Case study 2: Prediction

The task is to produce a time-series prediction model for each of the 25 Key Performance Indicator (KPI) time-series contained in the AD&P folder.

Specific requirements
  • For each of the 25 time-series, train a model to forecast the kpi_value at times t+1, t+2, t+3, t+4, and t+5, given information up to time t.
  • Slides with description of the prediction method chosen.
  • Slides with train/test prediction performance assessment.
  • Slides with visualisation of prediction versus actual time-series values.
  • During presentation, issues of scalability, adaptation, model selection, generalisation will be discussed.
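
One simple way to read the t+1 to t+5 requirement is the "direct" strategy: fit a separate model per horizon. The sketch below is a baseline under that assumption, not the approach taken in the report. Each horizon gets a one-lag linear regression fitted in closed form, and the helper names (`fit_direct_forecasters`, `forecast`, `mean_absolute_errors`) are hypothetical.

```python
def fit_direct_forecasters(series, horizons=(1, 2, 3, 4, 5)):
    """Fit y[t+h] = intercept + slope * y[t] separately for each horizon h."""
    models = {}
    for h in horizons:
        x, y = series[:-h], series[h:]
        n = len(x)
        mean_x, mean_y = sum(x) / n, sum(y) / n
        var_x = sum((xi - mean_x) ** 2 for xi in x)
        cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        slope = cov / var_x if var_x > 0 else 0.0
        models[h] = (mean_y - slope * mean_x, slope)
    return models


def forecast(models, last_value):
    """Predict the value at t+h for every fitted horizon, given the value at t."""
    return {h: a + b * last_value for h, (a, b) in models.items()}


def mean_absolute_errors(models, series):
    """Mean absolute error per horizon on a (held-out) series."""
    errors = {}
    for h, (a, b) in models.items():
        actual = series[h:]
        preds = [a + b * x for x in series[:-h]]
        errors[h] = sum(abs(p - y) for p, y in zip(preds, actual)) / len(actual)
    return errors
```

For the train/test assessment, a chronological split (fit on the first part of a series, then call `mean_absolute_errors` on the remaining tail) avoids leaking future values into training.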

Case study 3: Clustering

The task is to cluster the 23 time-series contained in the C folder.

Specific requirements
  • Monthly time-series data: “value” column associated with “date” column.
  • The optimal number of clusters should emerge from your analysis.
  • Slides with description of the clustering method chosen.
  • Visualisation of clustering results.
  • During presentation, issues of scalability and model selection will be discussed.
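
To illustrate how the number of clusters can emerge from the analysis rather than being fixed up front, here is a minimal sketch that z-normalises each series, merges clusters by average linkage, and keeps the cluster count with the highest mean silhouette score. This is one possible approach, not necessarily the one in the report. It assumes all series have the same length, and the function names and the `max_k` default are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def z_normalise(series):
    """Rescale a series to zero mean and unit variance (shape, not level)."""
    mu, sigma = mean(series), pstdev(series)
    return [(v - mu) / sigma if sigma > 0 else 0.0 for v in series]


def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(dist, k):
    """Average-linkage merging until k clusters (lists of indices) remain."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = mean(dist[a][b] for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def silhouette(dist, clusters):
    """Mean silhouette score; singleton clusters contribute 0."""
    scores = []
    for ci, members in enumerate(clusters):
        for p in members:
            if len(members) == 1:
                scores.append(0.0)
                continue
            a = mean(dist[p][q] for q in members if q != p)
            b = min(mean(dist[p][q] for q in other)
                    for cj, other in enumerate(clusters) if cj != ci)
            scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(scores)


def cluster_series(series_list, max_k=6):
    """Return (clusters, score) for the cluster count with the best silhouette."""
    normed = [z_normalise(s) for s in series_list]
    dist = [[euclidean(a, b) for b in normed] for a in normed]
    best = None
    for k in range(2, min(max_k, len(normed) - 1) + 1):
        clusters = agglomerate(dist, k)
        score = silhouette(dist, clusters)
        if best is None or score > best[0]:
            best = (score, clusters)
    return best[1], best[0]
```

The pairwise distance matrix grows quadratically with the number of series, which is negligible for 23 series but relevant to the scalability discussion.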