Lip-to-Speech Synthesis in the Wild with Multi-task Learning

Overview

This repository contains a video demo for the paper "Lip-to-Speech Synthesis in the Wild with Multi-task Learning," submitted to the IEEE International Conference on Acoustics, Speech and Signal Processing.

The official code is now available here.

Demo video

The demo video contains the original speech, the speech generated by the previous state-of-the-art method [1], and the speech generated by the proposed method, for three speakers each from the LRS2 and LRS3 datasets and six speakers from the LRW dataset. The video is located in the demo-video folder of this repository and is also available on YouTube:

LRS2 and LRS3 [Demo Video]
LRW [Demo Video]

References

[1] Rodrigo Mira, Alexandros Haliassos, Stavros Petridis, Björn W. Schuller, and Maja Pantic, "SVTS: Scalable video-to-speech synthesis," arXiv preprint arXiv:2205.02058, 2022.
