
EgoExoLearn

This repository contains the data and benchmark code of the following paper:

EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World
Yifei Huang, Guo Chen, Jilan Xu, Mingfang Zhang, Lijin Yang, Baoqi Pei, Hongjie Zhang, Lu Dong, Yali Wang, Limin Wang, Yu Qiao
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
Presented by OpenGVLab in Shanghai AI Lab

Paper · Project Page · HuggingFace · Video

🔥 News

📣 Overview

(Figure: overall structure of EgoExoLearn.)

We propose EgoExoLearn, a dataset that emulates the human demonstration-following process, in which individuals record egocentric videos as they execute tasks guided by exocentric-view demonstration videos. Focusing on potential applications in daily assistance and professional support, EgoExoLearn contains egocentric and demonstration video data spanning 120 hours, captured in daily-life scenarios and specialized laboratories. Along with the videos, we record high-quality gaze data and provide detailed multimodal annotations, forming a playground for modeling the human ability to bridge asynchronous procedural actions across different viewpoints.

🎓 Benchmarks

(Figure: benchmarks overview.)

Please visit each subfolder for code and annotations. More updates coming soon.

We design benchmarks for 1) cross-view association, 2) cross-view action understanding (action segmentation, action anticipation, action planning), 3) cross-view referenced skill assessment, and 4) cross-view referenced video captioning. Each benchmark is meticulously defined, annotated, and supported by baseline implementations. In addition, we explore the role of gaze in these tasks for the first time. We hope our dataset can serve as a resource for future work on bridging asynchronous procedural actions in ego- and exo-centric perspectives, thereby inspiring the design of AI agents adept at learning from real-world human demonstrations and mapping procedural actions onto robot-centric views.

📑 Data access

Option 1: Google Drive links

Videos (320p, mp4)

Gaze (processed, npy)

CLIP features (5 fps)

I3D RGB features

CLIP features of gaze cropped videos

I3D RGB features of gaze cropped videos
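
As a rough illustration, the downloaded gaze files and pre-extracted features can be inspected with NumPy. The file names, directory layout, and array shapes in the sketch below are assumptions made for illustration only; please consult the benchmark subfolders and annotations for the exact format.

```python
# Minimal sketch, assuming illustrative file names and array layouts
# (not taken from the repository documentation).
import numpy as np

# Load processed gaze for one egocentric video; we assume each .npy stores
# per-frame gaze records (e.g., frame index plus gaze coordinates).
# Check the actual files to confirm the layout.
gaze = np.load("gaze/ego_video_0001.npy")
print("gaze array shape:", gaze.shape)

# Load pre-extracted CLIP features (5 fps) for the same video; we assume a
# 2-D array of shape (num_frames, feature_dim).
clip_feats = np.load("clip_features_5fps/ego_video_0001.npy")
print("CLIP features shape:", clip_feats.shape)
```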

Option 2: BaiduYun link

EgoExoLearn

Extraction code: tm1g

✒️ Citation

If you find our repo useful for your research, please consider citing our paper:

 @InProceedings{huang2024egoexolearn,
     title={EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World},
     author={Huang, Yifei and Chen, Guo and Xu, Jilan and Zhang, Mingfang and Yang, Lijin and Pei, Baoqi and Zhang, Hongjie and Dong, Lu and Wang, Yali and Wang, Limin and Qiao, Yu},
     booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
     year={2024}
 }

♥️ Acknowledgement

Led by Shanghai AI Laboratory, Nanjing University, and Shenzhen Institute of Advanced Technology, this project was jointly carried out by researchers from multiple institutions, including The University of Tokyo, Fudan University, Zhejiang University, and University of Science and Technology of China.

📬 Primary contact: Yifei Huang ( hyf at iis.u-tokyo.ac.jp )
