Video Content Description

Video processing with artificial intelligence is a growing trend in the research community. Trending tasks include:

Video classification.
Video content description.
Video question answering (VQA).

Main Idea

The main idea is to generate descriptions for unconstrained videos, which can be used in video retrieval, blind navigation, and video subtitling.

Examples

Dataset

We use the Microsoft Research Video Description Corpus (MSVD) dataset.

Extracted Visual Feature

We extracted the visual features of the dataset using:

VGG-16 (as in the paper): gdrive link
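
Per-frame feature extraction usually starts by choosing which frames to pass through the network. The following is a minimal sketch of that step only, assuming frames are sampled evenly across the clip. The sampling scheme and the name `sample_frame_indices` are illustrative assumptions, not taken from this repository, and the VGG-16 forward pass itself is not shown.

```python
def sample_frame_indices(num_frames, num_samples):
    """Return up to `num_samples` evenly spaced frame indices in [0, num_frames).

    Hypothetical helper: the repository's actual sampling scheme is not
    documented here, so even spacing is an assumption.
    """
    if num_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= num_frames:
        # Short clip: keep every frame.
        return list(range(num_frames))
    step = num_frames / num_samples
    # Split the clip into `num_samples` equal segments and take the
    # frame at the middle of each segment.
    return [int(step * i + step / 2) for i in range(num_samples)]


if __name__ == "__main__":
    # Pick 4 frames from a 100-frame clip.
    print(sample_frame_indices(100, 4))
```

Each selected frame would then be resized and passed through VGG-16 to obtain one feature vector per frame.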

Architecture

Here is our architecture.

Checkpoints

We have trained the model using different techniques.

Base model, as in the Sequence to Sequence -- Video to Text paper: gdrive link
Dropout on features: gdrive link
Temporal attention: gdrive link
Dropout and temporal attention: gdrive link
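
The two techniques named in the list above can be sketched in plain Python. This is an illustration under stated assumptions rather than the code behind these checkpoints: the dropout is standard inverted dropout, the attention scores are plain dot products (the trained model may use a different scoring function, such as an additive one), and the function names are hypothetical.

```python
import math
import random


def feature_dropout(features, rate, rng):
    """Inverted dropout: zero each value with probability `rate`
    and rescale the kept values by 1 / (1 - rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [
        [x / keep if rng.random() < keep else 0.0 for x in frame]
        for frame in features
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(frame_features, decoder_state):
    """Weight frames by their relevance to the current decoder state.

    Assumption: relevance is a dot product between each frame feature
    and the decoder state, so both must have the same dimension.
    Returns (weights, context), where context is the weighted sum of
    the frame features.
    """
    scores = [
        sum(f * h for f, h in zip(frame, decoder_state))
        for frame in frame_features
    ]
    weights = softmax(scores)
    dim = len(frame_features[0])
    context = [
        sum(w * frame[d] for w, frame in zip(weights, frame_features))
        for d in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    rng = random.Random(0)
    frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy 2-D frame features
    dropped = feature_dropout(frames, rate=0.5, rng=rng)
    weights, context = temporal_attention(dropped, decoder_state=[1.0, 0.0])
    print(weights, context)
```

At each decoding step, the context vector replaces a fixed video summary, so the decoder can focus on different frames for different words.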

Results

Across the experiments described above, the best results came from combining attention and dropout. Our model outperforms the original paper's model on all metrics used, as shown in the following table:

Authors

Contribute

Contributions are always welcome!

Please read the contribution guidelines first.

License

This project is licensed under the GNU General Public License v3.0. See the LICENSE file for details.

