[arXiv 23] PyTorch code for "Overcoming Weak Visual-Textual Alignment for Video Moment Retrieval"
Updated May 28, 2024 - Python
[EMNLP 2022] PyTorch code for "Modal-specific Pseudo Query Generation for Video Corpus Moment Retrieval"
Awesome papers & datasets specifically focused on long-term videos.
[ICCV2023] UniVTG: Towards Unified Video-Language Temporal Grounding
Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)
Official PyTorch repository for CG-DETR: "Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Grounding"
[CVPR2022] Animal Kingdom: A Large and Diverse Dataset for Animal Behavior Understanding
PyTorch implementation of the paper "Gaussian Mixture Proposals with Pull-Push Learning Scheme to Capture Diverse Events for Weakly Supervised Temporal Video Grounding" (AAAI 2024).
PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models
Official implementation of the paper "Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos"
Scanning Only Once: An End-to-end Framework for Fast Temporal Grounding in Long Videos
awesome grounding: A curated list of research papers in visual grounding
Official TensorFlow implementation of the AAAI-2020 paper "Temporally Grounding Language Queries in Videos by Contextual Boundary-aware Prediction"
Paper list on Video Moment Retrieval (VMR), also called Natural Language Video Localization (NLVL) or Temporal Sentence Grounding in Videos (TSGV)
TensorFlow reproduction of the EMNLP-2018 paper "Temporally Grounding Natural Sentence in Video"
Official PyTorch implementation of "Explore-And-Match: Bridging Proposal-Based and Proposal-Free With Transformer for Sentence Grounding in Videos"
Code for the paper "Multimodal Dialogue State Tracking" (NAACL 2022)
"Video Moment Retrieval from Text Queries via Single Frame Annotation" in SIGIR 2022.
Official PyTorch implementation of the Position-aware Location Regression Network (PLRN), presented in the paper "Position-aware Location Regression Network for Temporal Video Grounding".
Implementation of the paper "Not All Frames Are Equal: Weakly-Supervised Video Grounding with Contextual Similarity and Visual Clustering Losses"