FrVD: French Video Description dataset
Updated Jun 22, 2023
Leveraging Self-Supervised Training for Unintentional Action Recognition (ECCVW 2022)
[NAACL 2024] Z-GMOT: Zero-shot Generic Multiple Object Tracking
Code for the paper "Quasi-Online Detection of Take and Release Actions from Egocentric Videos" (International Conference on Image Analysis and Processing, 2023).
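Pre-trained and Reproduced Deep Learning Models (the official PaddlePaddle model zoo, containing a variety of deep learning models from cutting-edge research and validated in industrial scenarios)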
Tool employed to visualize synchronized FrVD metadata and videos simultaneously.
Undergraduate Thesis @ Department of Automation, Tsinghua -- Understanding Few-shot Video with Pretrained Image-Text Models
[IJCNN 2024] Unifying Global and Local Scene Entities Modelling for Precise Action Spotting
The code for 3DTDS-Net with PyTorch
Video understanding with C3D
The code for FSTA-Net with PyTorch
📚 Paper Notes (Computer vision)
The code for L3AM loss with PyTorch
[ICCV 2021] On the hidden treasure of dialog in video question answering
Source code for "Visually aligned sound generation via sound-producing motion parsing" (Published at Neurocomputing)
Official code for the CVPR 2024 paper "Audio-Visual Segmentation via Unlabeled Frame Exploitation"
[CVPR 2018] Non-local Neural Networks
The code for PB-Net with PyTorch
We use visual data alone to learn a control policy for a robotic arm by observing expert video demonstrations. We implement and test several models, accomplishing an 85% success rate for a pick-and-place task.