#

msvd

Here are 16 public repositories matching this topic...

Pandla-Vijay / Video-Captioning-using-Spatio-temporal-features-and-Gaussian-Attention

This project utilizes advanced deep learning techniques to automatically generate contextually relevant captions for videos by extracting spatial and temporal features, while incorporating Gaussian attention to focus on important regions. This enhances video indexing, retrieval, and accessibility for visually impaired individuals.

lstm gru video-captioning spatio-temporal-data msvd

Updated Jul 24, 2023
Jupyter Notebook

willyfh / msvd-indonesian

MSVD-Indonesian: A Benchmark for Multimodal Video-Text Tasks in Indonesian (Bahasa Indonesia).

deep-learning neural-network bahasa-indonesia video-captioning msvd video-retrieval video-description multimodal-dataset video-text indonesian-dataset msvd-indonesian

Updated Aug 4, 2023

HaydenFaulkner / Attributes_SVO_Video_Captioning

LSTM RNN and Transformer networks video captioning on MSVD and MSR-VTT using attributes and SVOS

pytorch transformer lstm video-captioning msvd msr-vtt

Updated Feb 15, 2021
Jupyter Notebook

suyash-chintawar / Spatio-Temporal-Attention-based-Video-Captioning

To build attention based encoder-decoder model for video captioning on the MSVD dataset

lstm gru attention-mechanism video-captioning msvd

Updated Apr 30, 2022
Jupyter Notebook

MikeWangWZHL / VidIL

Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners

clip blip msvd youcook2 vision-language msrvtt vatex gpt-3 video-language vlep

Updated Sep 15, 2022
Python

tuyunbin / Enhancing-the-Alignment-between-Target-Words-and-Corresponding-Frames-for-Video-Captioning

[Pattern Rcognition 2021] This is the Theano code for our paper "Enhancing the Alignment between Target Words and Corresponding Frames for Video Captioning".

msvd video-captoning

Updated Oct 22, 2020
Python

jssprz / attentive_specialized_network_video_captioning

Source code of the paper titled *Attentive Visual Semantic Specialized Network for Video Captioning*

deep-learning video-captioning video-to-text msvd video-description msr-vtt icpr2020

Updated Apr 6, 2021
Python

jssprz / video_features_extractor

Python implementation of extraction of several visual features representations from videos

cnn c3d video-captioning video-representation msvd activitynet-captions visual-representation trecvid tgif-dataset msr-vtt vatex

Updated Jul 19, 2021
Python

jssprz / visual_syntactic_embedding_video_captioning

Source code of the paper titled *Improving Video Captioning with Temporal Composition of a Visual-Syntactic Embedding*

deep-learning representation-learning pos-tagging encoder-decoder video-captioning video-to-text msvd video-description msr-vtt wacv2021 syntactic-representations

Updated Apr 16, 2021
Python

jssprz / video_captioning_datasets

Summary about Video-to-Text datasets. This repository is part of the review paper *Bridging Vision and Language from the Video-to-Text Perspective: A Comprehensive Review*

review video-captioning state-of-the-art vision-and-language charades video-to-text msvd video-dataset video-description activitynet-captions trecvid tgif-dataset msr-vtt vatex

Updated Oct 27, 2023
Jupyter Notebook

WingsBrokenAngel / delving-deeper-into-the-decoder-for-video-captioning

Source code for Delving Deeper into the Decoder for Video Captioning

tensorflow semantics decoder video-captioning state-of-the-art professional-learning msvd msr-vtt

Updated Jun 1, 2021
Jupyter Notebook

xuguohai / X-CLIP

An official implementation for "X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval"

multimodal activitynet msvd msrvtt video-text-retrieval lsmdc didemo

Updated Apr 6, 2024
Python

nasib-ullah / video-captioning-models-in-Pytorch

A PyTorch implementation of state of the art video captioning models from 2015-2019 on MSVD and MSRVTT datasets.

video deep-learning pytorch sequence-to-sequence video-captioning s2vt msvd pytorch-implementation msrvtt marn video-captioning-models recnet

Updated Jul 30, 2023
Python

tuyunbin / Video-Description-with-Spatial-Temporal-Attention

[ACM MM 2017 & IEEE TMM 2020] This is the Theano code for the paper "Video Description with Spatial Temporal Attention"

attention-mechanism video-caption msvd

Updated Oct 20, 2020
Python

WingsBrokenAngel / Semantics-AssistedVideoCaptioning

Source code for Semantics-Assisted Video Captioning Model Trained with Scheduled Sampling Strategy

tensorflow python3 state-of-the-art msvd video-captioning-model youtube2text msrvtt

Updated Jul 31, 2021
Python

ArrowLuo / CLIP4Clip

An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"

search retrieval ranking clip multimodality multimodal-learning multimodal activitynet retrieval-model msvd msrvtt video-text-retrieval lsmdc didemo video-clip-retrieval

Updated Apr 12, 2024
Python

Improve this page

Add a description, image, and links to the msvd topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the msvd topic, visit your repo's landing page and select "manage topics."