
InceptionTime SageMaker Algorithm

The Time Series Classification (Inception) Algorithm from AWS Marketplace performs time series classification with the InceptionTime model. It implements both training and inference from CSV data and supports both CPU and GPU instances. The training and inference Docker images were built by extending the PyTorch 2.1.0 Python 3.10 SageMaker containers.

Model Description

InceptionTime is an ensemble model. Each model in the ensemble has the same architecture and uses the same hyperparameters. The only difference between the models is in the initial values of the weights, which are generated from the Glorot uniform distribution.

Each model consists of a stack of blocks, where each block includes three convolutional layers with kernel sizes of 10, 20 and 40 and a max pooling layer. The block input is processed by the four layers in parallel, and the four outputs are concatenated before being passed to a batch normalization layer followed by a ReLU activation.
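
For illustration, a minimal PyTorch sketch of one such block is shown below. The module name, the pooling window and the bias settings are assumptions made for the sketch; they are not taken from the algorithm's source code.

```python
import torch
from torch import nn

class InceptionBlock(nn.Module):
    # Minimal sketch of one block as described above; names and settings
    # are illustrative, not taken from the algorithm's source code.
    def __init__(self, in_channels: int, filters: int) -> None:
        super().__init__()
        # three parallel convolutions with kernel sizes 10, 20 and 40
        self.convs = nn.ModuleList([
            nn.Conv1d(in_channels, filters, kernel_size=k, padding="same", bias=False)
            for k in (10, 20, 40)
        ])
        # parallel max pooling branch (stride 1 preserves the time dimension)
        self.pool = nn.MaxPool1d(kernel_size=3, stride=1, padding=1)
        # batch normalization over the concatenated branch outputs
        self.bn = nn.BatchNorm1d(3 * filters + in_channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, channels, time)
        branches = [conv(x) for conv in self.convs] + [self.pool(x)]
        return torch.relu(self.bn(torch.cat(branches, dim=1)))
```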

A residual connection is applied between the input time series and the output of the second block, and thereafter between every three blocks. The residual connection processes the inputs with an additional convolutional layer with a kernel size of 1 followed by a batch normalization layer; the processed inputs are then added to the block output, and the sum is passed through a ReLU activation.
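
Under the same assumptions, the shortcut could be sketched as follows; the module name is again illustrative.

```python
import torch
from torch import nn

class ResidualConnection(nn.Module):
    # Sketch of the shortcut: a kernel-size-1 convolution and batch
    # normalization on the shortcut inputs, added to the block output.
    def __init__(self, in_channels: int, out_channels: int) -> None:
        super().__init__()
        self.conv = nn.Conv1d(in_channels, out_channels, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm1d(out_channels)

    def forward(self, inputs: torch.Tensor, outputs: torch.Tensor) -> torch.Tensor:
        # add the transformed shortcut inputs to the block output
        return torch.relu(outputs + self.bn(self.conv(inputs)))
```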

The output of each model is passed to an average pooling layer which removes the time dimension, and then to a final output layer with softmax activation. The parameters of each model are learned separately by minimizing the cross-entropy loss.
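
The head of each model could then be sketched as below. The softmax is written out explicitly to match the description above; in practice it is usually folded into the cross-entropy loss during training for numerical stability.

```python
import torch
from torch import nn

class ClassificationHead(nn.Module):
    # Sketch of the model head: average pooling removes the time
    # dimension, then a linear output layer with softmax activation.
    def __init__(self, channels: int, num_classes: int) -> None:
        super().__init__()
        self.pool = nn.AdaptiveAvgPool1d(1)
        self.linear = nn.Linear(channels, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.pool(x).squeeze(-1)                  # (batch, channels)
        return torch.softmax(self.linear(z), dim=-1)  # class probabilities
```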

At inference time, the class probabilities predicted by the different models are averaged in order to obtain a unique predicted probability for each class.
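
A minimal sketch of this averaging rule, assuming each model maps a batch of time series to class probabilities of shape (batch, classes):

```python
import torch

def ensemble_predict(models, x):
    # Stack the per-model class probabilities and average them over the
    # ensemble dimension (a sketch of the rule described above).
    with torch.no_grad():
        probs = torch.stack([model(x) for model in models])  # (models, batch, classes)
    return probs.mean(dim=0)                                 # (batch, classes)
```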

Figure: InceptionTime architecture (source: doi 10.1007/s10618-020-00710-y)

Model Resources: [Paper] [Code]

SageMaker Algorithm Description

The algorithm implements the model as described above with no changes.

Training

The training algorithm has two input data channels: training and validation. The training channel is mandatory, while the validation channel is optional.

The data should be provided in a CSV file containing the time series and their class labels. The CSV file should not contain any index column or column headers. Each row of the CSV file represents a time series, while each column represents a time step. The class labels should be stored in the first column, and the time series in the subsequent columns. All the time series should have the same length and should not contain missing values. The time series are scaled internally by the algorithm; there is no need to scale them beforehand. See the sample input files train.csv and valid.csv.
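
For illustration only, the snippet below writes a synthetic dataset in this layout; the number of series, their length and the class labels are arbitrary.

```python
import numpy as np
import pandas as pd

# Write a synthetic dataset in the expected layout: the class label in
# the first column, one time step per subsequent column, and no header
# row or index column.
labels = np.random.randint(0, 2, size=100)  # 100 series, 2 classes
series = np.random.randn(100, 60)           # 100 time series of length 60
data = np.column_stack([labels, series])
pd.DataFrame(data).to_csv("train.csv", header=False, index=False)
```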

See notebook.ipynb for an example of how to launch a training job.
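
As a rough sketch of what such a launch looks like with the SageMaker Python SDK (the algorithm ARN, role, S3 paths and instance type are placeholders, and notebook.ipynb remains the authoritative example):

```python
import sagemaker
from sagemaker.algorithm import AlgorithmEstimator

# Sketch of a training job launch; replace the placeholders with your
# own algorithm ARN, role, bucket and instance type.
estimator = AlgorithmEstimator(
    algorithm_arn="arn:aws:sagemaker:<region>:<account>:algorithm/<name>",
    role=sagemaker.get_execution_role(),
    instance_count=1,
    instance_type="ml.m5.2xlarge",
)
estimator.fit({
    "training": "s3://<bucket>/train.csv",
    "validation": "s3://<bucket>/valid.csv",  # optional channel
})
```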

Distributed Training

The algorithm supports multi-GPU training on a single instance, which is implemented through torch.nn.DataParallel. The algorithm does not support multi-node (or distributed) training across multiple instances.
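
In principle, this corresponds to wrapping the model as follows (a sketch, not the algorithm's actual code):

```python
import torch
from torch import nn

model = nn.Linear(8, 2)  # stand-in for the InceptionTime model
# nn.DataParallel replicates the module across the visible GPUs of one
# instance and splits each input batch among them.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)
```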

Incremental Training

The algorithm supports incremental training. The model artifacts generated by a previous training job can be used to continue training the model on the same dataset or to fine-tune the model on a different dataset.
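
One common way to wire this up with the SageMaker SDK is the estimator's model_uri parameter, which exposes the previous artifacts as an additional input channel; whether this algorithm consumes the channel in exactly this way should be verified against notebook.ipynb. A sketch, continuing from the training example above:

```python
import sagemaker
from sagemaker.algorithm import AlgorithmEstimator

# Reuse the artifacts of the previous training job. model_uri is a
# standard SageMaker SDK parameter; verify against notebook.ipynb that
# this algorithm reads the resulting channel this way.
previous_artifacts = estimator.model_data  # s3://.../model.tar.gz

new_estimator = AlgorithmEstimator(
    algorithm_arn="arn:aws:sagemaker:<region>:<account>:algorithm/<name>",
    role=sagemaker.get_execution_role(),
    instance_count=1,
    instance_type="ml.m5.2xlarge",
    model_uri=previous_artifacts,
)
new_estimator.fit({"training": "s3://<bucket>/other_train.csv"})
```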

Hyperparameters

The training algorithm takes as input the following hyperparameters (illustrative values are shown in the sketch after this list):

  • filters: int. The number of filters of the convolutional layers of each model.
  • depth: int. The number of blocks of each model.
  • models: int. The number of models.
  • lr: float. The learning rate used for training.
  • lr-decay: float. The learning rate decay factor.
  • batch-size: int. The batch size used for training.
  • epochs: int. The number of training epochs.
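
For illustration, the hyperparameters can be set on the estimator from the training sketch above; the values below are arbitrary, and the algorithm listing defines the actual defaults and valid ranges.

```python
# Illustrative values only; consult the algorithm listing for defaults
# and valid ranges.
estimator.set_hyperparameters(**{
    "filters": 32,      # convolutional filters per layer
    "depth": 6,         # blocks per model
    "models": 5,        # ensemble size
    "lr": 0.001,
    "lr-decay": 0.99,
    "batch-size": 64,
    "epochs": 100,
})
```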

Metrics

The training algorithm logs the following metrics:

  • train_loss: float. Training loss.
  • train_accuracy: float. Training accuracy.

If the validation channel is provided, the training algorithm also logs the following additional metrics:

  • valid_loss: float. Validation loss.
  • valid_accuracy: float. Validation accuracy.

See notebook.ipynb for an example of how to launch a hyperparameter tuning job.
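
As a sketch, a tuning job over the learning rate that maximizes validation accuracy could be launched as follows. For a marketplace algorithm the metric definitions come from the algorithm specification itself, so none are passed here; the objective metric name is assumed to match the valid_accuracy metric listed above.

```python
from sagemaker.tuner import ContinuousParameter, HyperparameterTuner

# Sketch of a hyperparameter tuning job over the learning rate; the
# range and job counts are placeholders.
tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name="valid_accuracy",
    objective_type="Maximize",
    hyperparameter_ranges={"lr": ContinuousParameter(1e-4, 1e-2)},
    max_jobs=10,
    max_parallel_jobs=2,
)
tuner.fit({
    "training": "s3://<bucket>/train.csv",
    "validation": "s3://<bucket>/valid.csv",
})
```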

Inference

The inference algorithm takes as input a CSV file containing the time series. The CSV file should not contain any index column or column headers. Each row of the CSV file represents a time series, while each column represents a time step. All the time series should have the same length and should not contain missing values. The time series are scaled internally by the algorithm; there is no need to scale them beforehand. See the sample input file test_data.csv.

The inference algorithm outputs the predicted class labels, which are returned in CSV format. See the sample output files batch_predictions.csv and real_time_predictions.csv.

See notebook.ipynb for an example of how to launch a batch transform job.
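
A rough sketch of such a job, continuing from the training example above (paths and instance type are placeholders):

```python
# Sketch of a batch transform job over the test data.
transformer = estimator.transformer(
    instance_count=1,
    instance_type="ml.m5.2xlarge",
)
transformer.transform(
    data="s3://<bucket>/test_data.csv",
    content_type="text/csv",
)
transformer.wait()
# the predicted class labels are written to transformer.output_path
```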

Endpoints

The algorithm supports only real-time inference endpoints. The inference image is too large to be deployed to a serverless inference endpoint.

See notebook.ipynb for an example of how to deploy the model to an endpoint, invoke the endpoint and process the response.
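
A hedged sketch of deployment and invocation with the SageMaker SDK, continuing from the training example above (the instance type and the test data are placeholders):

```python
from sagemaker.serializers import CSVSerializer
from sagemaker.deserializers import CSVDeserializer

# Deploy the trained model to a real-time endpoint and invoke it with
# CSV-serialized time series (no labels, one series per row).
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.2xlarge",
    serializer=CSVSerializer(),
    deserializer=CSVDeserializer(),
)
test_series = [[0.1] * 60, [0.2] * 60]  # two illustrative length-60 series
predictions = predictor.predict(test_series)
predictor.delete_endpoint()  # clean up when done
```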

Additional Resources: [Sample Notebook]

References

  • H. Ismail Fawaz, B. Lucas, G. Forestier, C. Pelletier, D.F. Schmidt, J. Weber, G.I. Webb, L. Idoumghar, P.A. Muller and F. Petitjean, "InceptionTime: Finding AlexNet for Time Series Classification," Data Mining and Knowledge Discovery, vol. 34, no. 6, pp. 1936-1962, 2020, doi: 10.1007/s10618-020-00710-y.