Binder

Human Activity Recognition from Smartphone Accelerometer Data

Author: Abhijit Khuperkar, Data Scientist
Email: akhuperkar@yahoo.com
Follow on: LinkedIn | Twitter | Getpocket

Introduction

This project demonstrates how to predict the type of physical activity (e.g., walking, climbing stairs) from tri-axial smartphone accelerometer data using supervised machine learning. Smartphone accelerometers are very precise, and different physical activities give rise to unique patterns of acceleration.

Input Data

The input data used for training in this project consists of two files.

  1. The first file, train_time_series.csv, contains the raw accelerometer data, which was collected using the Beiwe research platform, and has the following format:

    timestamp, UTC time, accuracy, x, y, z

    The time series signals are sampled at 10 Hz (0.1 seconds per sample), and the file contains a total of 3744 samples, each with 3 acceleration components. The timestamp column is the time variable. The last three columns, labeled x, y, and z, correspond to measurements of linear acceleration along each of the three orthogonal axes.

  2. The second file, train_labels.csv, contains the activity labels. Different activities have been encoded with integers as follows:

  • 1 = standing,

  • 2 = walking,

  • 3 = stairs down,

  • 4 = stairs up.

    The activity labels are sampled at 1 Hz (1 second per sample), and the file contains 375 samples. Because the accelerometer is sampled at a higher frequency, the labels in train_labels.csv are provided only for every 10th observation in train_time_series.csv. A minimal loading sketch follows this list.
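
The sketch below shows one way to load the two files with pandas and align the 1 Hz labels to the 10 Hz signals on the shared timestamp column. The name of the label column ("label") is an assumption, since the README does not spell it out.

```python
import pandas as pd

# Raw 10 Hz accelerometer signals: timestamp, UTC time, accuracy, x, y, z
signals = pd.read_csv("train_time_series.csv")

# 1 Hz activity labels; the label column name is assumed to be "label"
labels = pd.read_csv("train_labels.csv")

# Labels exist only for every 10th accelerometer observation,
# so align the two files on the shared timestamp column.
labeled = signals.merge(labels[["timestamp", "label"]], on="timestamp", how="inner")
print(signals.shape, labels.shape, labeled.shape)
```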

Approach

Here, the goal is to classify the four physical activities from smartphone accelerometer signals as accurately as possible. My approach to building a machine learning classifier is as follows:

  1. The train_labels.csv file contains a label for every 10th observation in train_time_series.csv, which implies that each label corresponds to a window of 10 consecutive accelerometer samples.
  2. I combined the 3 axis components in the training and test time series into a 4th component, the combined magnitude, computed as the square root of the sum of their squares: sqrt(x^2 + y^2 + z^2)
  3. From 1 & 2 above, I transformed the train_time_series dataframe of shape (3744, 3) into a numpy array of shape (375, 10, 4); see the windowing sketch after this list.
  4. From this array, I extracted features for each of the 4 components using frequency transformations, e.g. fast Fourier transform (FFT) values, power spectral density (PSD) values, and autocorrelation; see the feature-extraction sketch below.
  5. I split this array of all features into training and validation arrays in an 80:20 ratio. The training set (300 observations) is used to train the classifier. I shuffled the training set to avoid the classifier becoming biased toward a particular temporal pattern in the time series.
  6. To address the activity class imbalance, I oversampled the minority classes with SMOTE on the training set; see the sketch below.
  7. The validation set (75 observations) is used to estimate validation accuracy as an indicator of test accuracy.
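
Steps 2 and 3 can be sketched as follows. How the 3744 raw samples are reconciled with the 375 × 10 = 3750 window slots is not stated in the README, so the zero-padding of the final, incomplete window below is my assumption.

```python
import numpy as np
import pandas as pd

signals = pd.read_csv("train_time_series.csv")

# Step 2: 4th component -- combined magnitude of the three axes.
signals["m"] = np.sqrt(signals["x"]**2 + signals["y"]**2 + signals["z"]**2)

# Step 3: reshape into one window of 10 samples per label.
values = signals[["x", "y", "z", "m"]].to_numpy()
n_windows, window = 375, 10
padded = np.zeros((n_windows * window, 4))
padded[: len(values)] = values[: n_windows * window]
windows = padded.reshape(n_windows, window, 4)   # shape (375, 10, 4)
```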
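
The feature extraction in step 4 could look like the sketch below, which continues from the windows array above. The exact feature set (FFT magnitudes, Welch PSD estimates, first autocorrelation lags) is an illustrative assumption, not the project's final feature list.

```python
import numpy as np
from scipy.signal import welch

def extract_features(windows):
    """One feature row per window, built from FFT, PSD and autocorrelation values."""
    rows = []
    for w in windows:                        # w has shape (10 samples, 4 components)
        feats = []
        for c in range(w.shape[1]):
            sig = w[:, c]
            fft_mag = np.abs(np.fft.rfft(sig))            # FFT magnitudes
            _, psd = welch(sig, fs=10, nperseg=len(sig))  # power spectral density at 10 Hz
            centered = sig - sig.mean()
            ac = np.correlate(centered, centered, mode="full")[len(sig) - 1:]
            feats.extend(fft_mag)
            feats.extend(psd)
            feats.extend(ac[:3])                          # first few autocorrelation lags
        rows.append(feats)
    return np.asarray(rows)

X = extract_features(windows)   # windows from the previous sketch, shape (375, 10, 4)
```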
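
Steps 5 to 7 map onto scikit-learn and imbalanced-learn as in the sketch below. The random forest is only a placeholder, since the README does not name the final classifier, and the label column name is again assumed to be "label".

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier    # placeholder classifier (assumption)
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import SMOTE

# Activity codes (1-4), one per window, aligned with the rows of X.
y = pd.read_csv("train_labels.csv")["label"].to_numpy()

# Step 5: shuffled 80:20 split -> 300 training / 75 validation observations.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=42)

# Step 6: oversample the minority activity classes on the training set only.
X_train_bal, y_train_bal = SMOTE(random_state=42).fit_resample(X_train, y_train)

# Step 7: validation accuracy as an indicator of test accuracy.
clf = RandomForestClassifier(random_state=42).fit(X_train_bal, y_train_bal)
print("Validation accuracy:", clf.score(X_val, y_val))
```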
