Some minor bugs inside Hifi-Codec code #7

Open
rishikksh20 opened this issue May 6, 2023 · 3 comments

Comments

@rishikksh20
Contributor

rishikksh20 commented May 6, 2023

While analyzing the new HiFi-Codec code, I encountered three small bugs:

  1. torchaudio MelSpectrogram:
    Here:
    melspec = MelSpectrogram(sample_rate=24000, n_fft=s, hop_length=s//4, n_mels=64, wkwargs={"device": device}).to(device)

    MelSpectrogram is used without being imported. The missing import is:
    from torchaudio.transforms import MelSpectrogram

  2. modules not present inside HiFi-Codec:
    Here:

    from modules import NormConv2d

    The modules package is not present inside the HiFi-Codec folder, so it needs to be copied in from another model's implementation, or the import needs to be changed to reference that other model's modules.

  3. Shape of input tensor x, here:

    c = self.encoder(x.unsqueeze(1))

    In my testing with a 24 kHz mono wav, the shape of x before line 33 is [batch, samples, 1], and after the .unsqueeze(1) operation at line 33 it becomes [batch, 1, samples, 1], a 4D tensor where a 3D tensor is expected. The shape of x should therefore be checked before line 33: if it has 3 dimensions and the last dimension is 1, the last dimension needs to be squeezed.
    After correcting the shape of x, the code runs without error and I get the desired output.
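
The shape check for x described above could be sketched roughly as follows. This is only an illustration, not the repo's code: the helper name to_batch_samples is hypothetical, and it assumes x is a torch.Tensor that should reach the encoder as [batch, 1, samples].

```python
import torch


def to_batch_samples(x: torch.Tensor) -> torch.Tensor:
    """Return x as [batch, samples], dropping a trailing axis of size 1.

    Hypothetical helper for illustration; it is not part of the HiFi-Codec repo.
    """
    if x.dim() == 3 and x.shape[-1] == 1:
        # [batch, samples, 1] -> [batch, samples]
        x = x.squeeze(-1)
    if x.dim() != 2:
        raise ValueError(f"expected [batch, samples], got {tuple(x.shape)}")
    return x


# One second of silent 24 kHz mono audio, shaped as reported in this issue.
x = torch.zeros(2, 24000, 1)
x = to_batch_samples(x)

# The same unsqueeze as on line 33 now yields a 3D tensor: [batch, 1, samples].
encoder_input = x.unsqueeze(1)
print(encoder_input.shape)
```

Calling the helper just before the unsqueeze(1) leaves inputs that are already [batch, samples] untouched, so it should be safe for both layouts.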

Thanks @yangdongchao.

@listener17

@rishikksh20 I guess similar issues also exist for the other shared codecs?

@rishikksh20 changed the title from "Some minor missing dependency" to "Some minor bugs inside Hifi-Codec code" on May 6, 2023
@yangdongchao
Owner

Thanks for your help. I cleaned up some code before pushing it to GitHub, so some errors may exist.

@rishikksh20
Contributor Author

rishikksh20 commented May 6, 2023

As this repo is still a work in progress, having some minor bugs is understandable. My focus is currently on HiFi-Codec, since that is what I am testing.
But yes, there are some minor bugs in the other codecs' code that are simple to correct. One of them is in Encodec_16k_320:

compressed = soundstream.encode(wav)

on lines 119 and 121. It should be

compressed = model.encode(wav)
print('after compression:',compressed.shape)
out = model.decode(compressed)

as

model = SoundStream(n_filters=32, D=256, ratios=[6, 5, 4, 2])

the SoundStream instance is assigned to the variable model, not soundstream.
