Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Output mel spectrograms for python and CPP don't match #7

Open
sahi1422 opened this issue Jul 7, 2022 · 4 comments
Open

Output mel spectrograms for python and CPP don't match #7

sahi1422 opened this issue Jul 7, 2022 · 4 comments

Comments

@sahi1422
Copy link

sahi1422 commented Jul 7, 2022

Python Code

samples = np.full(16_000 * 5, 0.2)
mel_spec = librosa.feature.melspectrogram(y=samples, sr=16_000, n_fft=1024, hop_length=512, n_mels=128)

print('shape: ', mel_spec.shape)
print('sum: ', mel_spec.sum())
print('min: ', mel_spec.min())
print('max: ', mel_spec.max())

Python Output

shape:  (128, 157)
sum:  11806.58392107923
min:  7.865755998623591e-34
max:  89.1440147017634

CPP Code

    int sr = 16000;
    int n_fft = 1024;
    int n_hop = 512;
    int n_mel = 128;
    int fmin = 0.0;
    int fmax = sr / 2.0;

    std::vector<float> x(sr * 5, 0.2);

    std::vector<std::vector<float>> mels = librosa::Feature::melspectrogram(x, sr, n_fft, n_hop, "hann", true,
                                                                            "reflect",
                                                                            2.f, n_mel, fmin, fmax);

    std::cout << "shape: [" << mels.size() << "," << mels[0].size() << "]" << std::endl;

    double sums = 0;
    float maxi = -INT_MAX, mini = INT_MAX;
    for(auto &arr: mels) {
        for(auto k: arr) {
            sums += k;
            maxi = max(maxi, k);
            mini = min(mini, k);
        }
    }
    cout << "sums: " << sums << "\n";
    cout << "mini: " << mini << "\n";
    cout << "maxi: " << maxi << "\n";

CPP output

shape: [157,128]
sums: 11761.6
mini: 5.23758e-16
maxi: 74.9149
@shreks7
Copy link

shreks7 commented Sep 22, 2022

I have the same issue, the outputs don't match.

@abdr17
Copy link

abdr17 commented Mar 13, 2023

Can you tell me some resources to study basics of audio signal processing like concepts of STFT, MelSpectrogram, etc(code oriented).

@xiehengye
Copy link

Python Code

samples = np.full(16_000 * 5, 0.2)
mel_spec = librosa.feature.melspectrogram(y=samples, sr=16_000, n_fft=1024, hop_length=512, n_mels=128)

print('shape: ', mel_spec.shape)
print('sum: ', mel_spec.sum())
print('min: ', mel_spec.min())
print('max: ', mel_spec.max())

Python Output

shape:  (128, 157)
sum:  11806.58392107923
min:  7.865755998623591e-34
max:  89.1440147017634

CPP Code

    int sr = 16000;
    int n_fft = 1024;
    int n_hop = 512;
    int n_mel = 128;
    int fmin = 0.0;
    int fmax = sr / 2.0;

    std::vector<float> x(sr * 5, 0.2);

    std::vector<std::vector<float>> mels = librosa::Feature::melspectrogram(x, sr, n_fft, n_hop, "hann", true,
                                                                            "reflect",
                                                                            2.f, n_mel, fmin, fmax);

    std::cout << "shape: [" << mels.size() << "," << mels[0].size() << "]" << std::endl;

    double sums = 0;
    float maxi = -INT_MAX, mini = INT_MAX;
    for(auto &arr: mels) {
        for(auto k: arr) {
            sums += k;
            maxi = max(maxi, k);
            mini = min(mini, k);
        }
    }
    cout << "sums: " << sums << "\n";
    cout << "mini: " << mini << "\n";
    cout << "maxi: " << maxi << "\n";

CPP output

shape: [157,128]
sums: 11761.6
mini: 5.23758e-16
maxi: 74.9149

I found the same problem, did you solve it?

@ewan-xu
Copy link
Owner

ewan-xu commented Mar 24, 2023

setting pad_mode = "reflect" in python code like this:

samples = np.full(16000 * 5, 0.2)
mel_spec = librosa.feature.melspectrogram(y=samples, sr=16000, n_fft=1024, hop_length=512, n_mels=128,pad_mode='reflect')

print(mel_spec.T)
print('shape: ', mel_spec.shape)
print('sum: ', mel_spec.sum())
print('min: ', mel_spec.min())
print('max: ', mel_spec.max())

python output:

shape:  (128, 157)
sum:  11761.64503417969
min:  1.1692227111050498e-33
max:  74.91493652343752

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants