
3D-ResNet-for-Keras

A module for creating 3D ResNets based on the work of He et al. [1]. It contains convenient functions to build the popular ResNet architectures: ResNet-18, -34, -50, -101 and -152. It is also possible to create customised network architectures.

Installation

Install the package, e.g. with:

$ python -m pip install git+https://github.com/thauptmann/3D-ResNet-for-Keras.git

and import the module with:

import three_d_resnet_builder

Usage

The convenience functions (build_three_d_resnet_*) need only an input shape, an output shape and an output activation function to create a network. Additionally customisable are the use of regularization, the kernel type and squeeze-and-excitation layers. By setting squeeze_and_excitation to True, the network will be built with squeeze-and-excitation layers [2]. In the best case this improves the capabilities of the network with very little computational overhead and without increasing its depth.

The input shape has to be (frames, height, width, channels):

three_d_resnet_builder.build_three_d_resnet_18(input_shape, output_shape, activation_function, regularizer,
                                               squeeze_and_excitation, kernel_name)
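
For illustration, a minimal sketch of building a ResNet-18 for video classification; the concrete shapes, the keyword usage and the argument values below are assumptions for this example, not package defaults:

import three_d_resnet_builder

input_shape = (32, 112, 112, 3)  # (frames, height, width, channels), illustrative values
output_shape = 101               # number of classes, illustrative

model = three_d_resnet_builder.build_three_d_resnet_18(
    input_shape,
    output_shape,
    'softmax',                    # output activation for multi-class classification
    regularizer=None,             # assumed: None disables regularization
    squeeze_and_excitation=True,  # build with squeeze-and-excitation layers [2]
    kernel_name='3D',
)
model.summary()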

The general function (build_three_d_resnet) allows changing the architecture of the network:

three_d_resnet_builder.build_three_d_resnet(input_shape, output_shape, repetitions, output_activation, regularizer,
                                            squeeze_and_excitation, use_bottleneck, kernel_size, kernel_name)
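
As a sketch of a custom architecture: repetitions controls how many residual blocks each stage gets, so (2, 2, 2, 2) without bottlenecks corresponds to ResNet-18, while (3, 4, 6, 3) with use_bottleneck=True approximates a ResNet-50-style network. The keyword usage and argument values below are illustrative assumptions:

model = three_d_resnet_builder.build_three_d_resnet(
    input_shape=(16, 112, 112, 3),
    output_shape=101,
    repetitions=(3, 4, 6, 3),     # blocks per stage; (2, 2, 2, 2) would give ResNet-18
    output_activation='softmax',
    regularizer=None,
    squeeze_and_excitation=False,
    use_bottleneck=True,          # bottleneck blocks, as in the deeper ResNets [1]
    kernel_size=3,
    kernel_name='3D',
)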

Kernel

The package contains different kernel types. The type can be chosen via the kernel_name argument of the build functions. Possible kernels are: '3D', '(2+1)D' [4], 'P3D-B' [5], 'FAST' [6] and 'split-FAST' [6].
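
For instance, a network with factorised (2+1)D convolutions [4] could be built like this (a sketch; the positional arguments follow the signature shown above, and the shapes are illustrative):

model = three_d_resnet_builder.build_three_d_resnet_18(
    (16, 112, 112, 3), 101, 'softmax', None, False,
    kernel_name='(2+1)D',  # spatial 2D convolution followed by a temporal 1D convolution [4]
)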

Demo

For testing purposes I chose the UCF101 dataset [3] to train on. It contains approximately 13,000 videos of different actions with 101 labels. The purpose of the demo is to show how to use the package, not to achieve the best results. Therefore, important preprocessing steps (e.g. shuffling and data augmentation) are missing.

Caveat: The demo will download approximately 6.5 GiB and run for several hours on a GPU. Without a GPU it will probably crash. (Unfortunately, I have not found a smaller and easier example dataset; I am open to suggestions.)

To decode the videos, ffmpeg needs to be installed:

$ apt install ffmpeg

To install the requirements and run the demo, run the following commands from the repository root:

$ python -m pip install -r requirements.txt
$ python demo.py
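
Independent of the demo, a quick sanity check on random data can verify the installation. This is a minimal sketch with illustrative shapes; it assumes NumPy and the argument order shown in the Usage section:

import numpy as np
import three_d_resnet_builder

# Tiny network and random data, purely to check that building and training work.
model = three_d_resnet_builder.build_three_d_resnet_18(
    (8, 64, 64, 3), 10, 'softmax', None, False, '3D')
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
x = np.random.rand(4, 8, 64, 64, 3).astype('float32')  # 4 clips of 8 frames each
y = np.random.randint(0, 10, size=(4,))                # 4 random labels
model.fit(x, y, epochs=1, batch_size=2)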

Results

Model          Top-1 Accuracy    Top-5 Accuracy
ResNet-18      %                 %
SE-ResNet-18   %                 %

[1] K. He, X. Zhang, S. Ren and J. Sun, "Deep Residual Learning for Image Recognition," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, 2016, pp. 770-778, doi: 10.1109/CVPR.2016.90.

[2] J. Hu, L. Shen and G. Sun, "Squeeze-and-Excitation Networks," 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, 2018, pp. 7132-7141, doi: 10.1109/CVPR.2018.00745.

[3] K. Soomro, A. Roshan Zamir and M. Shah, "UCF101: A Dataset of 101 Human Action Classes From Videos in The Wild", CRCV-TR-12-01, November, 2012.

[4] D. Tran, H. Wang, L. Torresani, J. Ray, Y. LeCun and M. Paluri, "A Closer Look at Spatiotemporal Convolutions for Action Recognition," 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, 2018, pp. 6450-6459.

[5] Z. Qiu, T. Yao and T. Mei, "Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks," 2017 IEEE International Conference on Computer Vision (ICCV), Venice, 2017, pp. 5534-5542.

[6] A. Stergiou and R. Poppe, "Spatio-Temporal FAST 3D Convolutions for Human Action Recognition," 2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA), 2019, doi: 10.1109/ICMLA.2019.00036.
