
Action Recognition

This project aims to accurately recognize a user's actions in a series of video frames through a combination of convolutional neural networks (CNNs) and long short-term memory (LSTM) networks.

Project Overview

  • This project explores prominent action recognition models on the UCF-101 dataset

  • The performance of the different models is compared, and an analysis of the experimental results is provided

File Structure of the Repo

rnn_practice: Practice with RNN models and LSTMs, following online tutorials and other useful resources

data: Training and testing data. (NOTE: please don't add large data files to this repo; add them to .gitignore instead)

models: Definitions of the model architectures

utils: Utility scripts for dataset preparation, input pre-processing and other helper functions

train_CNN: Training the CNN models. The program loads the corresponding model, sets the training parameters, and initializes network training, as sketched below
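
A minimal sketch of what a train_CNN-style entry point does, assuming an in-memory batch of single-frame images; the model stand-in and the hyperparameters are assumptions, not the repo's actual loader or settings:

```python
import numpy as np
from keras.applications.resnet50 import ResNet50
from keras.optimizers import SGD

# Load a model, set the training parameters, and start training.
model = ResNet50(weights=None, classes=101)        # stand-in for the repo's model
model.compile(optimizer=SGD(lr=1e-4, momentum=0.9),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

x = np.random.rand(8, 224, 224, 3)                 # dummy single-frame batch
y = np.eye(101)[np.random.randint(0, 101, 8)]      # dummy one-hot labels
model.fit(x, y, batch_size=8, epochs=1)
```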

process_CNN: Processing videos with the CNN models. The CNN component is pre-trained and kept fixed while the LSTM cells are trained, so we can use it to pre-process the frames of each video and store the intermediate results for feeding into the LSTMs later. This procedure significantly improves the training efficiency of the LRCN model
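
Because the CNN stays fixed, each video only needs one forward pass through it. A minimal sketch of that pre-processing step, assuming 224x224 RGB frames and one .npy feature file per video (both assumptions, not the repo's exact conventions):

```python
import numpy as np
from keras.applications.resnet50 import ResNet50, preprocess_input

# Headless ResNet50 with global average pooling: one 2048-d vector per frame.
extractor = ResNet50(weights='imagenet', include_top=False, pooling='avg')

def encode_video(frames):
    """frames: array of shape (num_frames, 224, 224, 3), RGB in [0, 255]."""
    return extractor.predict(preprocess_input(frames.astype('float32')))

features = encode_video(np.random.rand(10, 224, 224, 3) * 255)  # dummy video
np.save('video_0001_features.npy', features)  # (10, 2048), fed to the LSTM later
```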

train_RNN: Training the LRCN model

predict: Calculating the overall testing accuracy on the entire testing set
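
One common way to turn a frame-level model into a per-video prediction is to average the softmax scores over a video's frames; a sketch, where `model` and the (frames, label) pairs are assumptions rather than the repo's actual interfaces:

```python
import numpy as np

def overall_accuracy(model, test_videos):
    """test_videos: iterable of (frames, label); frames is (num_frames, H, W, C)."""
    correct, total = 0, 0
    for frames, label in test_videos:
        scores = model.predict(frames)                  # (num_frames, num_classes)
        correct += int(scores.mean(axis=0).argmax() == label)
        total += 1
    return correct / total
```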

Model Descriptions

  • Fine-tuned ResNet50, trained solely on single-frame image data. Each frame of a video is treated as a separate image for training and testing, which provides a natural form of data augmentation. The ResNet50 comes from the Keras repo, with weights pre-trained on ImageNet (a sketch follows this list). ./models/finetuned_resnet.py

  • LRCN (a CNN feature extractor, here the fine-tuned ResNet50, followed by LSTMs). The input of the LRCN is a sequence of frames uniformly sampled from each video. The fine-tuned ResNet directly reuses the result of [1] without extra training (cf. the Long-term Recurrent Convolutional Networks paper cited below).

    Produce the intermediate data with ./process_CNN.py, then train and predict with ./models/RNN.py (a sketch follows this list)

  • Simple CNN model trained on stacked optical-flow data: one stacked optical-flow volume is generated from each video and used as the input of the network (a sketch follows this list). ./models/temporal_CNN.py

  • Two-stream model, which combines the models in [2] and [3] with an extra fusion layer that outputs the final result; models [3] and [4] follow the two-stream paper cited below (a fusion sketch follows this list). ./models/two_stream.py
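
A minimal sketch of the fine-tuning idea in model [1]: take the Keras ResNet50 with ImageNet weights, freeze the backbone, and attach a fresh 101-way classifier. The frozen/unfrozen split and the dropout are assumptions, not the repo's exact recipe:

```python
from keras.applications.resnet50 import ResNet50
from keras.layers import Dense, Dropout
from keras.models import Model

base = ResNet50(weights='imagenet', include_top=False, pooling='avg')
for layer in base.layers:
    layer.trainable = False                      # freeze the ImageNet backbone
x = Dropout(0.5)(base.output)
out = Dense(101, activation='softmax')(x)        # UCF-101 has 101 classes
spatial_model = Model(inputs=base.input, outputs=out)
spatial_model.compile(optimizer='adam', loss='categorical_crossentropy',
                      metrics=['accuracy'])
```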
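
For model [2], once the per-frame features from ./process_CNN.py are on disk, the recurrent part reduces to an LSTM over feature sequences. A sketch, with the sequence length and hidden size chosen arbitrarily rather than taken from ./models/RNN.py:

```python
from keras.layers import Dense, Input, LSTM
from keras.models import Model

SEQ_LEN, FEAT_DIM, NUM_CLASSES = 10, 2048, 101   # assumed shapes

inp = Input(shape=(SEQ_LEN, FEAT_DIM))           # precomputed ResNet features
h = LSTM(256, dropout=0.5)(inp)                  # hidden size is an assumption
out = Dense(NUM_CLASSES, activation='softmax')(h)
lrcn = Model(inputs=inp, outputs=out)
lrcn.compile(optimizer='adam', loss='categorical_crossentropy',
             metrics=['accuracy'])
```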
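
For model [3], the stacked input is the x/y flow fields of several consecutive frames stacked along the channel axis. A sketch with an assumed input of 9 flow frames at 216x216; the layer sizes loosely follow the two-stream paper, not the repo's exact ./models/temporal_CNN.py:

```python
from keras.layers import Conv2D, Dense, Dropout, Flatten, MaxPooling2D
from keras.models import Sequential

FLOW_SHAPE = (216, 216, 18)   # assumed: 9 flow frames x (dx, dy) channels

temporal_cnn = Sequential([
    Conv2D(96, (7, 7), strides=2, activation='relu', input_shape=FLOW_SHAPE),
    MaxPooling2D(2),
    Conv2D(256, (5, 5), strides=2, activation='relu'),
    MaxPooling2D(2),
    Conv2D(512, (3, 3), activation='relu'),
    Flatten(),
    Dense(2048, activation='relu'),
    Dropout(0.5),
    Dense(101, activation='softmax'),
])
temporal_cnn.compile(optimizer='adam', loss='categorical_crossentropy',
                     metrics=['accuracy'])
```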
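
Finally, for model [4], one simple way to realize the extra fusion layer is to concatenate the two streams' class scores and learn a small dense layer on top. Whether the repo fuses scores or features this way is not specified here, so treat this purely as an illustration; it reuses `lrcn` and `temporal_cnn` from the sketches above:

```python
from keras.layers import Dense, concatenate
from keras.models import Model

merged = concatenate([lrcn.output, temporal_cnn.output])  # fuse the two streams
final = Dense(101, activation='softmax')(merged)          # the extra fusion layer
two_stream = Model(inputs=[lrcn.input, temporal_cnn.input], outputs=final)
two_stream.compile(optimizer='adam', loss='categorical_crossentropy',
                   metrics=['accuracy'])
```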

Citations

If you use this code or ideas from these papers in your research, please cite them:

@inproceedings{lrcn2014,
  author    = {Jeff Donahue and Lisa Anne Hendricks and Sergio Guadarrama and
               Marcus Rohrbach and Subhashini Venugopalan and Kate Saenko and
               Trevor Darrell},
  title     = {Long-term Recurrent Convolutional Networks for Visual
               Recognition and Description},
  booktitle = {CVPR},
  year      = {2015}
}
@article{DBLP:journals/corr/SimonyanZ14,
  author    = {Karen Simonyan and
               Andrew Zisserman},
  title     = {Two-Stream Convolutional Networks for Action Recognition in Videos},
  journal   = {CoRR},
  volume    = {abs/1406.2199},
  year      = {2014},
  url       = {http://arxiv.org/abs/1406.2199},
  archivePrefix = {arXiv},
  eprint    = {1406.2199},
  timestamp = {Mon, 13 Aug 2018 16:47:39 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/SimonyanZ14},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}