This repository has been archived by the owner on Nov 11, 2023. It is now read-only.

Comparison of content extraction methods? #16

Open
hongwen-sun opened this issue Mar 13, 2023 · 11 comments

Comments

@hongwen-sun

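Hello, this is a great project.
I see you use layer 9 of HuBERT. What was the reasoning behind that? How do you weigh the trade-off between losing content information and timbre leakage? Have you compared it with other layers, or with an approach like Whisper?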

@hongwen-sun
Author

One more question:

logits = model.extract_features(**inputs)
feats = model.final_proj(logits[0])

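This code extracts the layer-9 features and then passes them through final_proj. Is that a bug or intentional? My understanding is that final_proj only makes sense when applied to the last layer's output.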

@Geraint-Dou Geraint-Dou added the help wanted The issue author is asking for help label Mar 19, 2023
@leng-yue
Contributor

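Personally, I think using layer 9 was intentional, but the final proj was an accident...
Quite a few papers discuss how features from different layers differ, so layer 9 probably came from those papers.
In principle the features should not go through final proj, but doing so may not matter much; at worst it loses a little information. You'd have to ask whoever developed the first version for the details...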
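
To make the point above concrete: every transformer layer in a HuBERT-style encoder emits frames of the same width, so a projection trained on the last layer also accepts layer-9 frames without raising any error. The sketch below uses plain Python lists in place of tensors. The toy sizes (4-wide hidden frames, 2-wide projection) and the names `linear`, `FINAL_PROJ_W` and `FINAL_PROJ_B` are illustrative assumptions, not values or identifiers from so-vits-svc or fairseq.

```python
from typing import List

Frame = List[float]


def linear(frame: Frame, weight: List[List[float]], bias: List[float]) -> Frame:
    """Apply y = W x + b to one feature frame (what a final projection does)."""
    return [
        sum(w * x for w, x in zip(row, frame)) + b
        for row, b in zip(weight, bias)
    ]


# Hypothetical toy projection: 4-wide hidden frames -> 2-wide outputs.
FINAL_PROJ_W = [[0.5, 0.0, 0.0, 0.0],
                [0.0, 0.5, 0.0, 0.0]]
FINAL_PROJ_B = [0.0, 0.0]

# Stand-ins for one frame taken from layer 9 and one from the last layer.
layer9_frame = [1.0, 2.0, 3.0, 4.0]
last_layer_frame = [0.2, 0.4, 0.6, 0.8]

# Both calls run, because both frames have the width the projection expects.
projected_layer9 = linear(layer9_frame, FINAL_PROJ_W, FINAL_PROJ_B)
projected_last = linear(last_layer_frame, FINAL_PROJ_W, FINAL_PROJ_B)
```

Both calls succeed and both shrink a 4-value frame to 2 values, which is probably why this kind of mismatch goes unnoticed: nothing fails, the projected layer-9 frame just carries less information than the raw one.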

@leng-yue
Contributor

According to the original ContentVec author, using final_proj is incorrect, but unfortunately all of our current models use it.

@w-okada

w-okada commented Apr 3, 2023

So, does the official svc-develop-team have any plan to fix this usage of ContentVec?

@Likkkez

Likkkez commented Apr 4, 2023

Fix when? pls

@MuruganR96

@Likkkez

Likkkez commented Apr 4, 2023

Fix when? pls

https://huggingface.co/lengyue233/content-vec-best

Sorry, I'm a bit confused. What do I need to do with that to apply the fix to so-vits-svc?

@MuruganR96

MuruganR96 commented Apr 4, 2023

This is fixed in https://github.com/34j/so-vits-svc-fork.

Check out this issue: voicepaw/so-vits-svc-fork#213
Check out this PR: voicepaw/so-vits-svc-fork#197

The relevant function is get_hubert_content in utils.py: https://github.com/svc-develop-team/so-vits-svc/blob/4.0/utils.py#L225

    with torch.no_grad(), timer() as t:
        params = {"output_layer": 9} if legacy_final_proj else {}
        c: torch.Tensor = cmodel.extract_features(audio, **params)[0]
        if legacy_final_proj:
            warnings.warn("legacy_final_proj is deprecated")
            assert hasattr(cmodel, "final_proj")
            assert isinstance(cmodel.final_proj, torch.nn.Module)
            c = cmodel.final_proj(c)
        c = c.transpose(1, 2)

I haven't tried yet.
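
The snippet above depends on names defined elsewhere (`cmodel`, `audio`, `timer`, `legacy_final_proj`). As a rough, self-contained illustration of just its control flow, the sketch below swaps in a hypothetical `FakeContentModel` whose `extract_features` and `final_proj` only mimic the call shape used in the snippet. It is not the fairseq or so-vits-svc-fork API, plain lists stand in for tensors, and the timing, warning and final transpose are left out.

```python
from typing import List, Optional, Tuple

Frames = List[List[float]]  # [time][channels], a stand-in for a tensor


class FakeContentModel:
    """Hypothetical stand-in: 12 'layers', each adding 1.0 to every value."""

    num_layers = 12

    def extract_features(
        self, audio: Frames, output_layer: Optional[int] = None
    ) -> Tuple[Frames, None]:
        # No output_layer means "run all layers", mirroring the snippet's default.
        depth = self.num_layers if output_layer is None else output_layer
        return [[v + depth for v in frame] for frame in audio], None

    def final_proj(self, feats: Frames) -> Frames:
        # Toy projection: keep only the first half of each frame.
        return [frame[: len(frame) // 2] for frame in feats]


def get_content(
    model: FakeContentModel, audio: Frames, legacy_final_proj: bool
) -> Frames:
    # Legacy path: layer 9 plus final_proj. New path: last layer, no projection.
    params = {"output_layer": 9} if legacy_final_proj else {}
    feats = model.extract_features(audio, **params)[0]
    if legacy_final_proj:
        feats = model.final_proj(feats)
    return feats


audio = [[0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]]  # 2 frames, 4 channels
legacy = get_content(FakeContentModel(), audio, legacy_final_proj=True)
fixed = get_content(FakeContentModel(), audio, legacy_final_proj=False)
```

In this toy, the legacy path returns narrower frames taken from layer 9, while the other path returns full-width frames from the last layer. The two outputs differ in width, so a downstream model would presumably have to be used with whichever path it was trained on.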

@Likkkez

Likkkez commented Apr 4, 2023

Alright, I'll try. Thanks!

@Miuzarte Miuzarte removed the help wanted The issue author is asking for help label Apr 9, 2023
@MuruganR96

@Likkkez

Likkkez commented Apr 18, 2023

@Likkkez refer to this option -> https://github.com/yxlllc/DDSP-SVC/blob/master/ddsp/vocoder.py#L114

Ah yes, thanks! I think there's now also a branch here that does the same thing, right? The '4.0-Vec768-Layer12' one.
