
CVPR2025 | TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation

GAP-LAB-CUHK-SZ/TASTE-Rob

TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation

CVPR 2025
Hongxiang Zhao*  Xingchen Liu*  Mutian Xu  Yiming Hao  Weikai Chen  Xiaoguang Han§
CUHKSZ GAP-Lab
*Indicates Equal Contribution §Indicates Corresponding Author

📖 Project Page | 📄 Paper Link | 🎥 Dataset Form

We introduce TASTE-Rob: 1) a dataset of 100,856 task-oriented hand-object interaction videos, and 2) a three-stage pose-refinement video generation pipeline. Together, these contributions let TASTE-Rob generate realistic interactions and open up the possibility of transferring them to robots.

If you find our work useful in your research, please consider citing:

@InProceedings{Zhao_2025_CVPR,
    author    = {Zhao, Hongxiang and Liu, Xingchen and Xu, Mutian and Hao, Yiming and Chen, Weikai and Han, Xiaoguang},
    title     = {TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {27683-27693}
}

📣 News

  • [6/7/2025] The TASTE-Rob Dataset Download Tool has been released!!!
  • [5/3/2025] The TASTE-Rob Dataset has been released!!!
  • [3/14/2025] TASTE-Rob has been released on arXiv!!!
  • [2/27/2025] 🎉🎉🎉TASTE-Rob has been accepted by CVPR 2025!!!🎉🎉🎉

🚩 Plan

  • Paper Released.
  • Dataset will be released before 05/05/2025.
  • Source Code and Pretrained Weights.

🎥 Dataset

TASTE-Rob contains 100,856 task-oriented ego-centric hand-object interaction videos across different environments. We provide a OneDrive link to download the full data. Please fill out this form, and we will send the download link and password to your e-mail soon.

We split the full data into SingleHand/DoubleHand and then by environment; the total size is about 1.55 TB.

Folder structure

|-- TASTE_Rob
    |-- SingleHand  # stores videos with single-hand interaction captions
        |-- Bathroom
            |-- 50254.mp4
            |-- 50255.mp4
            |-- 50256.mp4
            |-- ...
        |-- Bedroom
        |-- Dinning
        |-- DressingTable
        |-- Kitchen
        |-- Office
    |-- DoubleHand  # stores videos with double-hand interaction captions
        |-- Bathroom
        |-- Dinning
        |-- Kitchen
        |-- Office
    |-- captions.xlsx  # stores captions
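
Once part of the dataset is on disk, it can help to check how many videos each folder holds. The following is a minimal sketch, assuming the local copy keeps the `<hand folder>/<scene folder>/<id>.mp4` layout shown above; `count_videos` is a hypothetical helper and not part of this repository.

```python
from collections import Counter
from pathlib import Path


def count_videos(root):
    """Count .mp4 files per (hand folder, scene folder) under a local TASTE_Rob copy."""
    counts = Counter()
    # Videos are assumed to sit exactly two levels below the dataset root.
    for video in Path(root).glob("*/*/*.mp4"):
        scene_dir = video.parent
        counts[(scene_dir.parent.name, scene_dir.name)] += 1
    return counts
```

For example, `count_videos("TASTE_Rob")` returns a `Counter` keyed by pairs such as `("SingleHand", "Bathroom")`. Files elsewhere in the tree, such as captions.xlsx, are ignored.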

In captions.xlsx, the sheet Single-Hand stores single-hand interaction captions, and the sheet Double-Hand stores double-hand interaction captions. Each sheet has three attributes: id, scene, and caption. You can look up the ids of the videos you want in this Excel file.
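
The caption lookup described above can also be scripted. The sketch below rests on several assumptions that should be checked against the real file: the column headers are exactly `id`, `scene` and `caption`; the `scene` values match the folder names; and each `id` is the video's file name without the extension. `load_rows` and `find_videos` are hypothetical helpers, and `load_rows` needs pandas and openpyxl installed.

```python
from pathlib import PurePosixPath

# Sheet name -> top-level folder, following the description above.
SHEET_TO_FOLDER = {"Single-Hand": "SingleHand", "Double-Hand": "DoubleHand"}


def load_rows(xlsx_path, sheet):
    """Read one sheet of captions.xlsx into a list of dicts."""
    import pandas as pd  # imported lazily so find_videos works without pandas

    return pd.read_excel(xlsx_path, sheet_name=sheet).to_dict("records")


def find_videos(rows, sheet, keyword, scene=None):
    """Return dataset-relative paths of videos whose caption contains `keyword`."""
    folder = SHEET_TO_FOLDER[sheet]
    paths = []
    for row in rows:
        if keyword.lower() not in str(row["caption"]).lower():
            continue
        if scene is not None and row["scene"] != scene:
            continue
        # Assumption: <folder>/<scene>/<id>.mp4 mirrors the folder structure.
        paths.append(str(PurePosixPath(folder) / str(row["scene"]) / f"{row['id']}.mp4"))
    return paths
```

A typical call would be `find_videos(load_rows("captions.xlsx", "Single-Hand"), "Single-Hand", "cup", scene="Kitchen")`, where the keyword `"cup"` is only an illustration.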

🎥 Dataset Download Tool

You can use download_tool_taste_rob.py to download zip files or mp4 files as follows:

python download_tool_taste_rob.py \
    --file_list downlist.txt \
    --url {our_given_url} \
    --download_folder {local_path_of_downloaded_files} \
    --force True

For the download to succeed, replace {our_given_url} with the URL we send you and {local_path_of_downloaded_files} with your local target folder, and list the paths of the files you want in downlist.txt.
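
If the file list is generated programmatically, the preparation step might look like the sketch below. It is not the authors' tooling: `write_downlist` and `build_command` are hypothetical helpers, and the one-path-per-line layout of downlist.txt is an assumption, since the exact format expected by the tool is not documented here. The command-line flags are the ones shown in the command above.

```python
import sys
from pathlib import Path


def write_downlist(paths, list_file="downlist.txt"):
    """Write the desired file paths, one per line (assumed layout)."""
    Path(list_file).write_text("\n".join(paths) + "\n", encoding="utf-8")
    return list_file


def build_command(url, download_folder, list_file="downlist.txt"):
    """Assemble the documented download command as an argument list."""
    return [
        sys.executable, "download_tool_taste_rob.py",
        "--file_list", list_file,
        "--url", url,
        "--download_folder", download_folder,
        "--force", "True",
    ]
```

The resulting list can be passed to `subprocess.run(..., check=True)` from the directory that contains download_tool_taste_rob.py. Passing the arguments as a list avoids shell quoting problems with the URL.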

License

The data is released under the TASTE-Rob Terms of Use.

Copyright (c) 2025
