# Setup for action recognition training

## Download datasets

To use the training code, download our preprocessed metadata files, `infos.tar.gz` (12GB). Place them so that the directory structure is as follows:

```
data/
---ntu/
------info/              # provided in infos.tar.gz, metadata
------rgb/avi/           # download the videos from the original dataset (137GB) => (3.6GB resized version)
------*cache.mat         # provided in infos.tar.gz, cached body pose data
---uestc/
------info/              # provided in infos.tar.gz, metadata
------RGBvideo/          # download the videos from the original dataset (82GB)
------mat_from_skeleton/ # download the Kinect joints from the original dataset (5.5GB)
---surreact/             # provided in infos.tar.gz, rest of the structure as explained in README.md
------ntu/hmmr/info/
------ntu/vibe/info/
------uestc/hmmr/info/
------uestc/vibe/info/
```
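For reference, a minimal extraction sketch is shown below. It assumes the archive's internal paths mirror the `ntu/`, `uestc/`, and `surreact/` entries it provides; check the exact layout after unpacking:

```bash
# Sketch: unpack the metadata archive at the location referenced by the data/ tree above.
# Assumes infos.tar.gz contains the info/ and *cache.mat entries marked "provided" above.
mkdir -p data
tar -xzf infos.tar.gz -C data/
# Spot-check a few of the expected directories:
ls data/ntu/info data/uestc/info data/surreact
```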

## Python environment

```bash
cd training/
# Set up symbolic links to the datasets
mkdir data
ln -s <replace_with_surreact_path> data/surreact
ln -s <replace_with_ntu_path> data/ntu
ln -s <replace_with_uestc_path> data/uestc
# Create the surreact_env environment with dependencies
conda env create -f environment.yml
conda activate surreact_env
```
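Optionally, you can sanity-check that the links resolve before launching anything; this is a plain-bash sketch, and the paths above are placeholders you must fill in first:

```bash
# Optional sanity check: confirm each symlink points at an existing directory.
for d in data/surreact data/ntu data/uestc; do
  if [ -d "$d" ]; then
    echo "ok: $d -> $(readlink -f "$d")"
  else
    echo "missing or broken: $d"
  fi
done
```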

## Run training and testing

We provide sample training and testing launches in the `exp` folder for the various datasets and configurations used in the paper. You can run them as follows:

```bash
cd training/
bash exp/surreact_train.sh
```
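If you want to keep a record of a run, one simple optional pattern is to tee the output to a timestamped log file; this makes no assumptions about the launch script's own flags:

```bash
# Optional: capture the training output in a timestamped log file.
mkdir -p logs
bash exp/surreact_train.sh 2>&1 | tee "logs/surreact_train_$(date +%Y%m%d_%H%M%S).log"
```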

Notes:

  1. To be able to run flow streams, you need to download the pretrained flow estimation network and place it under the `pretrained_models/` folder.

  2. The code originally supported more functionalities (estimating segmentation, flow, body pose, etc.), which have been removed to simplify it; there could be leftovers. The current version is only tested with the action recognition trainings in the `exp/` folder. A few of the functionalities are kept, such as using segmentation or flow as input. You are welcome to try these, but the code has not been exhaustively tested and I might not remember the details to provide support.

  3. If you see `hri40` in the data or code, it refers to the UESTC dataset.

  4. Filenames have been slightly modified for the UESTC dataset. Check the `misc/uestc` folder to see how they were preprocessed.

  5. For the experiments in the paper, the video frames for NTU were preprocessed and downsampled to 480x270 resolution to allow faster loading, by setting `load_res=0.25` (a minimal resizing sketch follows this list).
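The preprocessing script itself is not shown here; the following is only a sketch of such a resize using `ffmpeg`, with placeholder filenames:

```bash
# Sketch: downsample one NTU video to 480x270 with ffmpeg; filenames are placeholders.
# NTU RGB videos are 1920x1080, so 480x270 corresponds to load_res=0.25.
ffmpeg -i input.avi -vf scale=480:270 output_480x270.avi
```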