Wrong Logic for HopSize? #1099

fyuvb · 2021-12-06T12:33:12Z

Thank you for the great work in this project.
I was trying to use Meyda for MFCC feature extraction. While doing so, I went through the code in the ScriptProcessorNode onaudioprocess callback:
https://github.com/meyda/meyda/blob/main/src/meyda-wa.ts#L163-L184

    this._m.spn.onaudioprocess = (e) => {
      var buffer;
      if (this._m.inputData !== null) {
        this._m.previousInputData = this._m.inputData;
      }
      this._m.inputData = e.inputBuffer.getChannelData(this._m.channel);
      if (!this._m.previousInputData) {
        buffer = this._m.inputData;
      } else {
        buffer = new Float32Array(
          this._m.previousInputData.length +
            this._m.inputData.length -
            this._m.hopSize
        );
        buffer.set(this._m.previousInputData.slice(this._m.hopSize));
        buffer.set(
          this._m.inputData,
          this._m.previousInputData.length - this._m.hopSize
        );
      }
      var frames = utilities.frame(buffer, this._m.bufferSize, this._m.hopSize);
      frames.forEach((f) => {
        this._m.frame = f;
        var features = this._m.extract(
          this._m._featuresToExtract,
          this._m.frame,
          this._m.previousFrame
        );
        // call callback if applicable
        if (
          typeof this._m.callback === "function" &&
          this._m.EXTRACTION_STARTED
        ) {
          this._m.callback(features);
        }
        this._m.previousFrame = this._m.frame;
      });
    };

I made a drawing to illustrate:

My question is:

To compute accurate features, we might not want to pad with zeros. Instead, we would want to queue samples until there are enough to start the next frame.
We should not treat the newly received input buffer as the previous input buffer. Instead, the next window should start where Frame5 started. Otherwise the feature extraction windows will be misaligned.

I understand that the current implementation could give results similar to my proposal, but I am not sure how big the difference is. I am trying to use Meyda as the feature extraction preprocessing step for my TensorFlow MFCC model. The extracted features do not match, and my model gives bad results. Please advise if I am making a mistake here. Many thanks!
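
To make the proposal concrete, here is a minimal sketch of the queueing idea, not Meyda's actual code. The names createFramer and push are hypothetical, and the sketch assumes hopSize is at least 1 and no larger than the frame size. Leftover samples are carried over between callbacks, so every frame starts exactly hopSize samples after the previous one, regardless of where the input buffers begin and end.

```javascript
// Hypothetical helper (not part of Meyda): splits a stream of audio chunks
// into overlapping frames on a single continuous hop grid.
// Assumption: 1 <= hopSize <= frameSize.
function createFramer(frameSize, hopSize) {
  if (hopSize < 1 || hopSize > frameSize) {
    throw new RangeError("hopSize must be between 1 and frameSize");
  }
  // Samples received so far that the next frame still needs.
  let pending = new Float32Array(0);

  return function push(chunk) {
    // Append the new chunk to the samples left over from earlier calls.
    const joined = new Float32Array(pending.length + chunk.length);
    joined.set(pending);
    joined.set(chunk, pending.length);

    // Emit every complete frame; no zero padding is ever applied.
    const frames = [];
    let start = 0;
    while (start + frameSize <= joined.length) {
      frames.push(joined.slice(start, start + frameSize));
      start += hopSize;
    }

    // Keep everything from the start of the next (incomplete) frame onward.
    pending = joined.slice(start);
    return frames;
  };
}

// Usage: three 4-sample chunks, frame size 4, hop size 2.
const pushChunk = createFramer(4, 2);
for (let i = 0; i < 3; i++) {
  const chunk = Float32Array.from([4 * i, 4 * i + 1, 4 * i + 2, 4 * i + 3]);
  for (const frame of pushChunk(chunk)) {
    console.log(Array.from(frame));
  }
}
```

In the callback above, something like this could replace the previousInputData bookkeeping by passing e.inputBuffer.getChannelData(...) to push and running the extraction on each returned frame. I have not tested that against Meyda itself.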


hughrawlinson · 2021-12-06T17:01:02Z

Hey @fyuvb - Happy to help!

Good catch! Thanks for catching the bug. Definitely something we should fix. If you'd like to give it a go, I'd be happy to review a PR. Otherwise, I'll try and get to it.

fyuvb · 2021-12-07T01:21:34Z

Hi @hughrawlinson. Thank you for confirming this problem. I am not very good at JavaScript, so thank you very much for offering to fix it. Much appreciated.

fyuvb added the question label Dec 6, 2021

fyuvb mentioned this issue Dec 6, 2021

Feature Request: tensorflow.js support mozilla/DeepSpeech#2233

Open
