The Time Series Classification (Inception) Algorithm from AWS Marketplace performs time series classification with the InceptionTime model. It implements both training and inference from CSV data and supports both CPU and GPU instances. The training and inference Docker images were built by extending the PyTorch 2.1.0 Python 3.10 SageMaker containers.
InceptionTime is an ensemble model. Each model in the ensemble has the same architecture and uses the same hyperparameters. The only difference between the models is in the initial values of the weights, which are generated from the Glorot uniform distribution.
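To make the role of the random initialization concrete, the sketch below draws weights from the Glorot (Xavier) uniform distribution, U(-a, a) with a = sqrt(6 / (fan_in + fan_out)), once per ensemble member. It is a plain-Python illustration and not the algorithm's code: the function name `glorot_uniform`, the seeds and the layer sizes are assumptions chosen for the example.

```python
import math
import random


def glorot_uniform(fan_in, fan_out, seed):
    """Draw a (fan_out x fan_in) weight matrix from the Glorot uniform distribution.

    For a 1D convolution, fan_in = in_channels * kernel_size and
    fan_out = out_channels * kernel_size.
    """
    rng = random.Random(seed)
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_in)] for _ in range(fan_out)]


# Hypothetical ensemble of 3 models: identical layer shape, different seeds,
# so the models differ only in their initial weights.
ensemble_weights = [glorot_uniform(fan_in=8, fan_out=4, seed=s) for s in range(3)]
```

Because every member shares the architecture and hyperparameters, the seed is the only source of diversity in this sketch.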
Each model consists of a stack of blocks, where each block includes three convolutional layers with kernel sizes of 10, 20 and 40 and a max pooling layer. The block input is processed by the four layers in parallel, and the four outputs are concatenated before being passed to a batch normalization layer followed by a ReLU activation.
A residual connection is applied between the input time series and the output of the second block, and after that between every three blocks. The residual connection processes the inputs using an additional convolutional layer with a kernel size of 1 followed by a batch normalization layer. The processed inputs are then added to the output, which is transformed by a ReLU activation.
The output of each model is passed to an average pooling layer which removes the time dimension, and then to a final output layer with softmax activation. The parameters of each model are learned separately by minimizing the cross-entropy loss.
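The head of each model can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the algorithm's implementation: the final output layer is assumed to be a dense (linear) layer, and the helper names, weights and feature values are made up for the example.

```python
import math


def global_average_pool(features):
    """Average each channel over time: (channels x time) -> (channels,)."""
    return [sum(channel) / len(channel) for channel in features]


def linear(inputs, weights, bias):
    """Dense layer: one output per row of `weights` (classes x channels)."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over the class logits."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[label])


# Toy example: 2 channels x 3 time steps, 2 classes (all values are hypothetical).
features = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]
pooled = global_average_pool(features)  # the time dimension is removed here
probs = softmax(linear(pooled, weights=[[1.0, 0.0], [0.0, 1.0]], bias=[0.0, 0.0]))
loss = cross_entropy(probs, label=0)
```

During training, this loss would be averaged over a batch and minimized independently for each model in the ensemble.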
At inference time, the class probabilities predicted by the different models are averaged in order to obtain a unique predicted probability for each class.
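The averaging step can be illustrated with a few lines of plain Python. The probabilities below are invented for the example, and taking the argmax of the averaged probabilities as the predicted label is an assumption about how the final class is selected.

```python
def average_probabilities(per_model_probs):
    """Average class probabilities over models.

    per_model_probs[m][c] is the probability of class c predicted by model m.
    """
    n_models = len(per_model_probs)
    n_classes = len(per_model_probs[0])
    return [sum(p[c] for p in per_model_probs) / n_models for c in range(n_classes)]


# Hypothetical outputs of a 3-model ensemble for one time series and 3 classes.
per_model_probs = [
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.6, 0.3, 0.1],
]
avg = average_probabilities(per_model_probs)

# Assumption: the predicted label is the class with the highest averaged probability.
predicted_class = max(range(len(avg)), key=avg.__getitem__)
```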
InceptionTime architecture (source: doi: 10.1007/s10618-020-00710-y)
Model Resources: [Paper] [Code]
The algorithm implements the model as described above with no changes.
The training algorithm has two input data channels: training and validation. The training channel is mandatory, while the validation channel is optional.
The data should be provided in a CSV file containing the time series and their class labels.
The CSV file should not contain any index column or column headers.
Each row of the CSV file represents a time series, while each column represents a time step.
The class labels should be stored in the first column, while the time series should be stored in the subsequent columns.
All the time series should have the same length and should not contain missing values.
The time series are scaled internally by the algorithm, so there is no need to scale them beforehand.
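The layout rules above can be captured in a small helper that writes a file in the expected shape. This is a minimal sketch using only the standard library; the function name `write_training_csv`, the integer class labels and the sample values are assumptions made for the example.

```python
import csv
import io
import math


def write_training_csv(fileobj, labels, series):
    """Write one row per time series: class label first, then the time steps.

    No header row and no index column are written.
    """
    if len(labels) != len(series):
        raise ValueError("need exactly one label per time series")
    if len({len(values) for values in series}) != 1:
        raise ValueError("all time series must have the same length")
    writer = csv.writer(fileobj)
    for label, values in zip(labels, series):
        if any(v is None or math.isnan(v) for v in values):
            raise ValueError("time series must not contain missing values")
        writer.writerow([label, *values])


# Two hypothetical time series of length 3 with class labels 0 and 1.
buffer = io.StringIO()
write_training_csv(buffer, labels=[0, 1], series=[[0.1, 0.2, 0.3], [1.5, 1.4, 1.3]])
csv_text = buffer.getvalue()
```

To write to disk instead of an in-memory buffer, pass a file opened with `open(path, "w", newline="")`.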
See the sample input files train.csv and valid.csv.
See notebook.ipynb for an example of how to launch a training job.
The algorithm supports multi-GPU training on a single instance, which is implemented through torch.nn.DataParallel. The algorithm does not support multi-node (or distributed) training across multiple instances.
The algorithm supports incremental training. The model artifacts generated by a previous training job can be used to continue training the model on the same dataset or to fine-tune the model on a different dataset.
The training algorithm takes as input the following hyperparameters:
- filters: int. The number of filters of the convolutional layers of each model.
- depth: int. The number of blocks of each model.
- models: int. The number of models.
- lr: float. The learning rate used for training.
- lr-decay: float. The learning rate decay factor.
- batch-size: int. The batch size used for training.
- epochs: int. The number of training epochs.
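As an illustration, the hyperparameters can be collected in a dictionary and type-checked before a training job is launched. The names and types come from the list above; the values are purely illustrative (they are not the algorithm's defaults or recommendations), and the helper `check_hyperparameters` is hypothetical.

```python
# Hyperparameter names and types as listed in this document.
HYPERPARAMETER_TYPES = {
    "filters": int,
    "depth": int,
    "models": int,
    "lr": float,
    "lr-decay": float,
    "batch-size": int,
    "epochs": int,
}


def check_hyperparameters(hyperparameters):
    """Reject unknown names and values of the wrong type; return the dict unchanged."""
    unknown = set(hyperparameters) - set(HYPERPARAMETER_TYPES)
    if unknown:
        raise ValueError(f"unknown hyperparameters: {sorted(unknown)}")
    for name, value in hyperparameters.items():
        expected = HYPERPARAMETER_TYPES[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise TypeError(f"{name} must be of type {expected.__name__}")
    return hyperparameters


# Illustrative values only.
hyperparameters = check_hyperparameters({
    "filters": 32,
    "depth": 6,
    "models": 5,
    "lr": 0.001,
    "lr-decay": 0.99,
    "batch-size": 64,
    "epochs": 100,
})
```

A dictionary of this shape is what would typically be passed as the `hyperparameters` argument of an estimator in the SageMaker Python SDK; notebook.ipynb remains the reference for the exact call.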
The training algorithm logs the following metrics:
- train_loss: float. Training loss.
- train_accuracy: float. Training accuracy.
If the validation channel is provided, the training algorithm also logs the following additional metrics:
- valid_loss: float. Validation loss.
- valid_accuracy: float. Validation accuracy.
See notebook.ipynb for an example of how to launch a hyperparameter tuning job.
The inference algorithm takes as input a CSV file containing the time series.
The CSV file should not contain any index column or column headers.
Each row of the CSV file represents a time series, while each column represents a time step.
All the time series should have the same length and should not contain missing values.
The time series are scaled internally by the algorithm, so there is no need to scale them beforehand.
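Before sending data for inference, it can be useful to check a file against the rules above. The following sketch parses CSV text with the standard library and raises on headers, missing values and unequal lengths; the function name `parse_inference_csv` and the sample values are assumptions made for the example.

```python
import csv
import io
import math


def parse_inference_csv(text):
    """Parse CSV text into a list of time series (one list of floats per row)."""
    rows = []
    for line_number, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        try:
            values = [float(cell) for cell in row]
        except ValueError:
            # A header row or an empty cell cannot be converted to float.
            raise ValueError(
                f"row {line_number}: non-numeric or empty cell (header or missing value?)"
            ) from None
        if any(math.isnan(v) for v in values):
            raise ValueError(f"row {line_number}: missing value (NaN)")
        rows.append(values)
    if len({len(values) for values in rows}) > 1:
        raise ValueError("all time series must have the same length")
    return rows


# Two hypothetical time series of length 3; note that there is no label column.
series = parse_inference_csv("0.1,0.2,0.3\n1.5,1.4,1.3\n")
```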
See the sample input file test_data.csv.
The inference algorithm outputs the predicted class labels, which are returned in CSV format.
See the sample output files batch_predictions.csv and real_time_predictions.csv.
See notebook.ipynb for an example of how to launch a batch transform job.
The algorithm supports only real-time inference endpoints. The inference image is too large to be uploaded to a serverless inference endpoint.
See notebook.ipynb for an example of how to deploy the model to an endpoint, invoke the endpoint and process the response.
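For orientation, invoking a deployed endpoint could look roughly like the sketch below, which uses the `invoke_endpoint` call of the boto3 SageMaker Runtime client. Several details are assumptions and should be checked against notebook.ipynb: the `text/csv` content type, the response containing one predicted label per line, the helper name `predict` and the endpoint name in the usage comment.

```python
def predict(runtime_client, endpoint_name, series):
    """Send time series to a real-time endpoint and return the predicted labels.

    `runtime_client` is expected to behave like boto3.client("sagemaker-runtime").
    """
    # One row per time series, one column per time step, no header and no index.
    body = "\n".join(",".join(str(v) for v in row) for row in series)
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",  # assumption: CSV payloads use this MIME type
        Accept="text/csv",
        Body=body,
    )
    text = response["Body"].read().decode("utf-8")
    # Assumption: the response holds one predicted class label per line.
    return [line.strip() for line in text.splitlines() if line.strip()]


# Usage (requires AWS credentials and a deployed endpoint; the name is hypothetical):
#   import boto3
#   labels = predict(boto3.client("sagemaker-runtime"), "my-endpoint", [[0.1, 0.2, 0.3]])
```

Passing the client in as an argument keeps the helper easy to exercise with a stub when no endpoint is available.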
Additional Resources: [Sample Notebook]
- H. Ismail Fawaz, B. Lucas, G. Forestier, C. Pelletier, D.F. Schmidt, J. Weber, G.I. Webb, L. Idoumghar, P.A. Muller and F. Petitjean, "InceptionTime: Finding AlexNet for Time Series Classification," Data Mining and Knowledge Discovery, vol. 34, no. 6, pp. 1936-1962, 2020, doi: 10.1007/s10618-020-00710-y.