
Source of the original data: https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip

Original description: http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones

The attached R script (run_analysis.R) performs the following steps to clean up the data:

Merges the training and test sets

The rows of both sets are appended into new variables named subjectCombined, xCombined and yCombined
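Appending rows in R is typically done with `rbind`. The following is a minimal sketch of that step, not the contents of run_analysis.R: the small data frames and the names `xTrain`, `xTest`, `yTrain`, `yTest`, `subjectTrain` and `subjectTest` are hypothetical stand-ins for the files read from the dataset.

```r
# Toy stand-ins for the training and test sets (hypothetical values,
# not taken from the UCI HAR files).
xTrain <- data.frame(V1 = c(0.1, 0.2), V2 = c(1.0, 1.1))
xTest  <- data.frame(V1 = 0.3, V2 = 1.2)
yTrain <- data.frame(V1 = c(1, 2))
yTest  <- data.frame(V1 = 3)
subjectTrain <- data.frame(V1 = c(1, 1))
subjectTest  <- data.frame(V1 = 2)

# Append the test rows below the training rows, one call per data set.
xCombined       <- rbind(xTrain, xTest)
yCombined       <- rbind(yTrain, yTest)
subjectCombined <- rbind(subjectTrain, subjectTest)
```

The three combined data frames keep the same row order, so row i of each still refers to the same observation.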

Reads file features.txt and extracts only the measurements on the mean and standard deviation for each measurement

Only the relevant columns of xCombined (mean and standard deviation features) are kept; the rest are discarded
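Each entry of features.txt names one column of the measurement table, so the selection can be sketched with `grep` on the feature names. This is an illustration under assumptions: the pattern below matches only names containing the literal text `mean()` or `std()`, which may differ from the pattern the script uses, and the toy tables and the name `xSelected` are hypothetical.

```r
# Toy feature table in the two-column layout of features.txt (index, name).
features <- data.frame(
  V1 = 1:4,
  V2 = c("tBodyAcc-mean()-X", "tBodyAcc-std()-X",
         "tBodyAcc-mad()-X", "fBodyAcc-meanFreq()-X"),
  stringsAsFactors = FALSE
)

# Toy measurements: one column per feature (hypothetical values).
xCombined <- data.frame(V1 = c(0.1, 0.2), V2 = c(0.3, 0.4),
                        V3 = c(0.5, 0.6), V4 = c(0.7, 0.8))

# Assumed pattern: keep features whose name contains "mean()" or "std()".
keep <- grep("mean\\(\\)|std\\(\\)", features$V2)

# Keep only the matching columns and label them with the feature names.
xSelected <- xCombined[, keep, drop = FALSE]
names(xSelected) <- features$V2[keep]
```

With this pattern, `meanFreq()` features are left out; a looser pattern such as `"mean|std"` would include them.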

Reads activity_labels.txt and applies descriptive activity names to name the activities in the data set

Replaces the activity codes in yCombined with human-readable names
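One way to do this replacement is a lookup with `match`. The sketch below assumes the two-column layout of activity_labels.txt (numeric code, label) and uses only a subset of the labels; the column names `V1` and `V2` are the defaults `read.table` would assign and are an assumption here.

```r
# Toy subset of the label table (code, label).
activityLabels <- data.frame(
  V1 = 1:3,
  V2 = c("WALKING", "WALKING_UPSTAIRS", "WALKING_DOWNSTAIRS"),
  stringsAsFactors = FALSE
)

# Toy activity codes, one per observation (hypothetical values).
yCombined <- data.frame(V1 = c(3, 1, 2, 1))

# Look up each code in the label table and replace it with its label.
yCombined$V1 <- activityLabels$V2[match(yCombined$V1, activityLabels$V1)]
```

Unlike `merge`, this lookup leaves the row order of yCombined unchanged, which matters because the rows must stay aligned with xCombined and subjectCombined.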

All data sets are combined into one set, combinedData.
The combined data is split by activity and subject

Furthermore, the mean of each column of those splits is calculated
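The split-and-average step can be sketched with `split` and `colMeans`. The column names `subject`, `activity`, `meanX` and `stdX`, the values, and the name `groupMeans` are hypothetical; the script may organise this step differently.

```r
# Toy combined data: subject, activity and two measurement columns.
combinedData <- data.frame(
  subject  = c(1, 1, 2, 2),
  activity = c("WALKING", "WALKING", "WALKING", "SITTING"),
  meanX    = c(0.1, 0.3, 0.5, 0.7),
  stdX     = c(1.0, 2.0, 3.0, 4.0),
  stringsAsFactors = FALSE
)

# Split the measurement columns by (activity, subject);
# drop = TRUE skips combinations that never occur.
groups <- split(combinedData[, c("meanX", "stdX")],
                list(combinedData$activity, combinedData$subject),
                drop = TRUE)

# Average every column within each group: one row per group.
groupMeans <- t(sapply(groups, colMeans))
```

The rows of `groupMeans` are named after the group, for example `"WALKING.1"` for activity WALKING and subject 1.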

The result is saved as cleandData.txt
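Saving such a result as a text file could look like the sketch below. The data frame is a toy example, `row.names = FALSE` is an assumption about the output format, and the file is written to a temporary directory here rather than to the working directory.

```r
# Toy result table (hypothetical values).
groupMeans <- data.frame(
  activity = c("SITTING", "WALKING"),
  subject  = c(2, 1),
  meanX    = c(0.7, 0.2)
)

# Write the table as space-separated text with a header row.
outFile <- file.path(tempdir(), "cleandData.txt")
write.table(groupMeans, file = outFile, row.names = FALSE)

# Read it back to confirm that the file round-trips.
check <- read.table(outFile, header = TRUE)
```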