Given only audio with no corresponding text, is it possible to do something like speech continuation based on the audio's timbre and prosodic style? #3

Open
zhiqiuiyiye opened this issue Nov 22, 2023 · 4 comments

Comments

@zhiqiuiyiye

zhiqiuiyiye commented Nov 22, 2023

If only the speaker's audio is available as the reference audio, can zero-shot functionality be achieved?

@cantabile-kwok
Member

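This task differs somewhat from the setting in our paper. The main issue is that CTX-txt2vec is currently trained with audio and text fully aligned, so for now it cannot continue speech from audio alone without the text. If, during training, a separate audio segment without text were cut out and used as the prompt context, this might become possible, but we have not tried it yet.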

@zhiqiuiyiye
Author

OK, thanks for the answer. Do you have any plans to try work in this direction in the future?

@cantabile-kwok
Member

We have considered it, but at the moment we don't have any good ideas in progress, haha.

@zhiqiuiyiye
Author

Sounds good. We have also been working in this direction on our side and hope to exchange ideas more in the future. Thanks!
