Multimodal_Video_Recognition_Co-Trainning_Model

This is a model for an MLP course project. The project focuses on hand gesture recognition in video, which is achieved with two co-trained I3D networks. Our model takes two different modalities of video frames as input: RGB and optical flow. The co-trained network is expected to perform better on hand gesture recognition than a single-branch network. In this project, the spatiotemporal semantic alignment optimization method is applied to optimize the video recognition system. The designed model achieves an accuracy of more than 99.3% on the EgoGesture dataset, which contains hand gestures performed by different subjects in various scenes.
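The alignment idea between the two streams can be illustrated with a small NumPy sketch. It assumes the alignment term compares the correlation matrices of the RGB and optical-flow feature maps taken from the same depth of the two I3D networks. The function names `correlation_matrix` and `alignment_loss` and the feature shapes are hypothetical, chosen for illustration only; the loss actually used in `train_ego_ssa.py` may differ.

```python
import numpy as np


def correlation_matrix(features):
    """Correlation between all spatiotemporal positions of one feature map.

    features: array of shape (T, H, W, C), an assumed layout for an
    intermediate I3D feature map. Returns a (d, d) matrix, d = T * H * W.
    """
    # Flatten the time and space axes so each row is one position.
    flat = features.reshape(-1, features.shape[-1]).astype(np.float64)
    # Center and L2-normalize each row so entries behave like correlations.
    flat = flat - flat.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    flat = flat / np.maximum(norms, 1e-12)
    return flat @ flat.T


def alignment_loss(rgb_features, flow_features):
    """Squared Frobenius distance between the two correlation matrices."""
    diff = correlation_matrix(rgb_features) - correlation_matrix(flow_features)
    return float(np.sum(diff ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rgb = rng.normal(size=(2, 3, 3, 8))   # stand-in for RGB-stream features
    flow = rng.normal(size=(2, 3, 3, 8))  # stand-in for flow-stream features
    print("loss, identical streams:", alignment_loss(rgb, rgb))
    print("loss, different streams:", alignment_loss(rgb, flow))
```

In a co-training setup, a term like this would be added to each stream's classification loss so that the two networks are encouraged to agree on which spatiotemporal positions are related, while each stream keeps its own weights.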

Environment

Tensorflow-gpu-1.5

Tensorflow_probability-0.7

Sonnet-1.25

Opencv-3.4.2

Imageio

