KID-7391/seeking-the-shape-of-sound

Seeking the Shape of Sound

An implementation of the CVPR 2021 paper "Seeking the Shape of Sound: An Adaptive Framework for Learning Voice-Face Association".

Environments

  • Ubuntu 16.04
  • CUDA 10.2
  • Python 3.7.3
  • PyTorch 1.4.0

See requirement.txt for the full dependency list.
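To confirm that a local setup matches the versions listed above, a small check along the following lines may help. This is a minimal sketch and not part of the repository: the helper names `same_release` and `report` are hypothetical, and the comparison only looks at the major and minor version numbers.

```python
import platform

# Versions listed in the Environments section of this README.
EXPECTED = {"python": "3.7.3", "torch": "1.4.0", "cuda": "10.2"}


def same_release(found, expected, parts=2):
    """Return True when the first `parts` version components match."""
    def head(version):
        # Drop local suffixes such as "+cu102" before splitting on dots.
        return version.split("+")[0].split(".")[:parts]
    return head(found) == head(expected)


def report():
    """Print the installed versions next to the ones this README lists."""
    found = {"python": platform.python_version()}
    try:
        import torch
        found["torch"] = torch.__version__
        # torch.version.cuda is None for CPU-only builds.
        found["cuda"] = torch.version.cuda or "none"
    except ImportError:
        found["torch"] = "not installed"
    for name, expected in EXPECTED.items():
        actual = found.get(name, "unknown")
        status = "ok" if same_release(actual, expected) else "differs"
        print(f"{name}: found {actual}, README lists {expected} ({status})")
```

Calling `report()` prints one line per component. A "differs" status does not necessarily mean the code will fail, only that the setup is not the one the results were obtained with.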

Data preparation

Download VoxCeleb and VGGFace, then unzip them into ./data.

Because of file size limits, only some of the query lists are included in ./data. The other lists used in the paper can be downloaded from Google drive or Baidu drive (passwd: rfri).

Training

  1. Download pretrained models for backbones into ./pretrained_models.

Google drive:

SE-ResNet-50

Thin-ResNet-34

Baidu drive:

SE-ResNet-50 (passwd: jy55)

Thin-ResNet-34 (passwd: tc6i)

  2. Train the model and update identity weights:
python3 train.py config/train_reweight.yaml
  3. Extract identity weights from the saved model file:
python3 extract_id_weight.py config/train_reweight.yaml
  4. Retrain the final model:
python3 train.py config/train_main.yaml

The model and log are saved in save/vox1_train/Voice2Face/main by default.
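The three training commands above have to run in order, since each stage reads what the previous one saved. One way to chain them is sketched below. This wrapper is not part of the repository: `build_commands` and `run_all` are hypothetical names, and the sketch assumes it is launched from the repository root after the backbones have been downloaded.

```python
import subprocess
import sys

# Script and config pairs copied from the training steps in this README.
STEPS = [
    ("train and update identity weights",
     ["train.py", "config/train_reweight.yaml"]),
    ("extract identity weights",
     ["extract_id_weight.py", "config/train_reweight.yaml"]),
    ("retrain the final model",
     ["train.py", "config/train_main.yaml"]),
]


def build_commands(python=sys.executable):
    """Return (label, argv) pairs for the stages, in execution order."""
    return [(label, [python] + args) for label, args in STEPS]


def run_all(dry_run=False):
    """Run every stage in order, stopping at the first failure."""
    for label, cmd in build_commands():
        print(f"==> {label}: {' '.join(cmd)}")
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so a failed stage prevents the later ones from starting.
            subprocess.run(cmd, check=True)
```

`run_all(dry_run=True)` only prints the commands, which is a quick way to confirm the order before starting a long run.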

Evaluation

  1. Download the pretrained model from Google drive or Baidu drive (passwd: 4vyf).
  2. Edit config/train_main.yaml: set resume_eval to the path where the downloaded model is saved.
  3. Run the evaluation:
python3 eval.py config/train_main.yaml
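The config change in step 2 can be made by hand. For scripted runs, a text-level edit such as the following may be enough. It is a sketch under assumptions: `set_resume_eval` is a hypothetical helper, the layout of config/train_main.yaml is not shown in this README, and the code assumes `resume_eval` appears as a single-line `key: value` entry and that the path needs no YAML quoting.

```python
import re

# Matches one "resume_eval: ..." line at any indentation level.
_RESUME_EVAL = re.compile(r"^([ \t]*)resume_eval[ \t]*:.*$", flags=re.MULTILINE)


def set_resume_eval(config_text, model_path):
    """Return config_text with the first resume_eval value set to model_path."""
    def replace(match):
        # Keep the original indentation so nested keys stay nested.
        return f"{match.group(1)}resume_eval: {model_path}"

    new_text, count = _RESUME_EVAL.subn(replace, config_text, count=1)
    if count == 0:
        raise KeyError("resume_eval not found in config text")
    return new_text


# Example usage (the checkpoint path is a placeholder, not a real file name):
# with open("config/train_main.yaml") as f:
#     text = f.read()
# with open("config/train_main.yaml", "w") as f:
#     f.write(set_resume_eval(text, "path/to/downloaded_model"))
```

Working on the text rather than parsing the YAML keeps the rest of the file untouched. Any trailing comment on the `resume_eval` line itself is dropped.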

Expected results (%):

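|               | 1:2 Matching (U) | 1:2 Matching (G) | Verification (U) | Verification (G) | Retrieval |
|---------------|------------------|------------------|------------------|------------------|-----------|
| Voice-to-Face | 87.2             | 77.7             | 87.2             | 77.5             | 5.5       |
| Face-to-Voice | 86.5             | 75.3             | 87.0             | 76.1             | 5.8       |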

The results might slightly differ from the above due to random factors in the training process.

References

If this code is helpful to you, please consider citing our paper:

@inproceedings{wen2021seeking,
  title={Seeking the shape of sound: An adaptive framework for learning voice-face association},
  author={Wen, Peisong and Xu, Qianqian and Jiang, Yangbangyan and Yang, Zhiyong and He, Yuan and Huang, Qingming},
  booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2021}
}
