Jason-cs18/awesome-avatar

awesome-avatar

This repository organizes papers, code, and other resources related to avatars (talking-face and talking-body).

🔆 This project is ongoing; pull requests are welcome!

If you have any suggestions (missing papers, new papers, key researchers, or typos), please feel free to edit the list and open a pull request.

TO DO LIST

  • Main paper list
  • Researchers list
  • Toolbox for avatar
  • Add paper link
  • Add paper notes
  • Add code links where available
  • Add project pages where available
  • Datasets and metrics
  • Related links

Researchers and labs

  1. NVIDIA Research
  2. Aliaksandr Siarohin @ Snap Research
  3. Ziwei Liu @ Nanyang Technological University
  4. Xiaodong Cun @ Tencent AI Lab
  5. Max Planck Institute for Informatics

Papers

Example: [Conference'year] Title, First-author Affiliation, ProjectPage, Code

Avatar (face+body)

[arXiv 2024.01] From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations, Meta Reality Labs Research, ProjectPage, Code

2D talking-face synthesis

3D talking-face synthesis

Talking-body synthesis

Co-speech gesture synthesis

Pose2video

Datasets

Talking-face

Audio-Visual Datasets for English Speakers
| Dataset name | Environment | Year | Resolution | Subjects | Duration | Sentences |
|---|---|---|---|---|---|---|
| VoxCeleb1 | Wild | 2017 | 360p~720p | 1251 | 352 hours | 100k |
| VoxCeleb2 | Wild | 2018 | 360p~720p | 6112 | 2442 hours | 1128k |
| HDTF | Wild | 2020 | 720p~1080p | 300+ | 15.8 hours | |
| LSP | Wild | 2021 | 720p~1080p | 4 | 18 minutes | 100k |
Audio-Visual Datasets for Chinese Speakers
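| Dataset name | Environment | Year | Resolution | Subjects | Duration | Sentences |
|---|---|---|---|---|---|---|
| CMLR | Lab | 2019 | | 11 | | 102k |
| MAVD | Lab | 2023 | 1920x1080 | 64 | 24 hours | 12k |
| CN-Celeb | Wild | 2020 | | 3000 | 1200 hours | |
| CN-Celeb-AV | Wild | 2023 | | 1136 | 660 hours | |
| CN-CVS | Wild | 2023 | | 2500+ | 300+ hours | |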

Talking-body

Metrics

Talking-face

Lip-Sync
| Metric name | Description | Code/Paper |
|---|---|---|
| LMD↓ | Mouth landmark distance | |
| MA↑ | Intersection-over-Union (IoU) of the predicted mouth area and the ground-truth mouth area | |
| Sync↑ | Confidence score from SyncNet | wav2lip |
| LSE-C↑ | Lip Sync Error - Confidence | wav2lip |
| LSE-D↓ | Lip Sync Error - Distance | wav2lip |
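To make the first two lip-sync metrics concrete, here is a minimal sketch of LMD and MA for a single frame. It assumes mouth landmarks are given as matching lists of (x, y) points and mouth regions as binary masks; the function names `landmark_distance` and `mask_iou` are illustrative only and do not come from any of the listed toolkits. Papers differ in how they normalize landmarks and average over frames, so treat this as one plausible reading rather than a reference implementation.

```python
import math


def landmark_distance(pred, gt):
    """Mean Euclidean distance between matching (x, y) mouth landmarks.

    Assumption: both lists hold the same landmarks in the same order.
    """
    if not pred or len(pred) != len(gt):
        raise ValueError("landmark lists must be non-empty and equally long")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


def mask_iou(pred_mask, gt_mask):
    """Intersection-over-Union of two binary masks (nested lists of 0/1)."""
    inter = 0
    union = 0
    for row_p, row_g in zip(pred_mask, gt_mask):
        for p, g in zip(row_p, row_g):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Two empty masks are treated as a perfect match (a convention, not a rule).
    return inter / union if union else 1.0


pred_points = [(0.0, 0.0), (3.0, 4.0)]
gt_points = [(0.0, 0.0), (0.0, 0.0)]
print(landmark_distance(pred_points, gt_points))  # 2.5

# One shared pixel out of three covered pixels gives an IoU of 1/3.
print(mask_iou([[1, 1], [0, 0]], [[1, 0], [1, 0]]))
```

A video-level score would typically average these per-frame values over all frames.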
Image Quality (identity preserving)
| Metric name | Description | Code/Paper |
|---|---|---|
| MAE↓ | Mean Absolute Error between images | mmagic |
| MSE↓ | Mean Squared Error between images | mmagic |
| PSNR↑ | Peak Signal-to-Noise Ratio | mmagic |
| SSIM↑ | Structural similarity between images | mmagic |
| FID↓ | Fréchet Inception Distance | mmagic |
| IS↑ | Inception Score | mmagic |
| NIQE↓ | Natural Image Quality Evaluator | mmagic |
| CSIM↑ | Cosine similarity of identity embeddings | InsightFace |
| CPBD↑ | Cumulative probability of blur detection | python-cpbd |
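The simpler image-quality metrics reduce to a few lines of arithmetic. The sketch below shows MAE, MSE, PSNR, and the cosine similarity behind CSIM in plain Python. It assumes images are already flattened into equally long lists of pixel values in the 0-255 range, and that identity embeddings have already been extracted by some face-recognition model; the function names are illustrative and are not the mmagic or InsightFace APIs. Real evaluations would normally use those libraries on arrays.

```python
import math


def mae(a, b):
    """Mean absolute error between two equally long pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def mse(a, b):
    """Mean squared error between two equally long pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(a, b, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    err = mse(a, b)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


def cosine_similarity(u, v):
    """Cosine similarity of two embedding vectors (the core of CSIM)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    if norm == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / norm


generated = [10, 20]
reference = [12, 16]
print(mae(generated, reference))  # 3.0
print(mse(generated, reference))  # 10.0
print(psnr(generated, reference))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

SSIM, FID, IS, and NIQE depend on windowed statistics or pretrained networks and are better taken from an existing implementation than rewritten by hand.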
Diversity
| Metric name | Description | Code/Paper |
|---|---|---|
| Diversity of head motions↑ | Standard deviation of the head-motion feature embeddings extracted from the generated frames using Hopenet (Ruiz et al., 2018) | SadTalker |
| Beat Align Score↑ | Alignment between the audio and the generated head motions, computed as in Bailando (Siyao et al., 2022) | SadTalker |
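As a rough illustration of the head-motion diversity score, the sketch below computes the standard deviation of per-frame embeddings. It assumes the embeddings have already been extracted (for example by a head-pose network such as Hopenet) and makes one specific choice that the table does not spell out: the population standard deviation is taken per embedding dimension across frames and then averaged over dimensions. The function name `head_motion_diversity` is hypothetical, and the exact aggregation used in SadTalker may differ.

```python
from statistics import pstdev


def head_motion_diversity(embeddings):
    """Average per-dimension standard deviation across frames.

    embeddings: list of equally long feature vectors, one per frame.
    Assumption: per-dimension population std, averaged over dimensions.
    """
    if len(embeddings) < 2:
        raise ValueError("need at least two frames")
    dims = list(zip(*embeddings))  # transpose: one tuple per dimension
    return sum(pstdev(d) for d in dims) / len(dims)


# Two frames whose first feature changes and whose second stays fixed.
frames = [[0.0, 0.0], [2.0, 0.0]]
print(head_motion_diversity(frames))  # 0.5
```

A static head yields identical embeddings in every frame and therefore a score of 0, which is why higher values indicate more varied motion.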

Talking-body

TBD

Toolbox

  1. A general toolbox for AIGC, including common metrics and models https://github.com/open-mmlab/mmagic
  2. face3d: Python tools for processing 3D face https://github.com/yfeng95/face3d
  3. 3DMM model fitting using PyTorch https://github.com/ascust/3DMM-Fitting-Pytorch
  4. OpenFace: a facial behavior analysis toolkit https://github.com/TadasBaltrusaitis/OpenFace
  5. autocrop: Automatically detects and crops faces from batches of pictures https://github.com/leblancfg/autocrop
  6. OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation https://github.com/CMU-Perceptual-Computing-Lab/openpose
  7. GFPGAN: Practical Algorithm for Real-world Face Restoration https://github.com/TencentARC/GFPGAN
  8. CodeFormer: Robust Blind Face Restoration https://github.com/sczhou/CodeFormer

Related Links

If you are interested in avatars and digital humans, we also recommend checking out these related collections: