Streaming PCM frames to a server using pykaldi. #322

Closed
donaldos opened this issue May 16, 2024 · 1 comment

Comments

@donaldos

I want to perform frame-by-frame speech recognition using pykaldi.
The client sends voice data to the server over a WebSocket every 250 ms, and pykaldi runs speech recognition on the PCM data once the server has received it.
Could you help me find the problem in the following code?
Thank you for your review.

I am using version 0.2.2.
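
To make the 250 ms chunking concrete, here is a minimal sketch of how the server side might regroup incoming WebSocket payloads into fixed-size chunks. It assumes 16 kHz, 16-bit, mono PCM (only the 16 kHz rate appears in the code in this post; the sample width and channel count are assumptions), and `ChunkBuffer` is a hypothetical helper name, not part of pykaldi.

```python
SAMPLE_RATE = 16000   # Hz, matching rate=16000 in the post
SAMPLE_WIDTH = 2      # bytes per sample (assumption: 16-bit PCM)
CHANNELS = 1          # assumption: mono audio
CHUNK_MS = 250        # chunk duration described in the post

# 16000 samples/s * 2 bytes * 1 channel * 0.25 s = 8000 bytes per chunk
BYTES_PER_CHUNK = SAMPLE_RATE * SAMPLE_WIDTH * CHANNELS * CHUNK_MS // 1000


class ChunkBuffer:
    """Accumulates raw PCM bytes and releases complete fixed-size chunks."""

    def __init__(self, chunk_bytes=BYTES_PER_CHUNK):
        self._chunk_bytes = chunk_bytes
        self._pending = bytearray()

    def feed(self, data):
        """Add received bytes and return a list of complete chunks."""
        self._pending.extend(data)
        chunks = []
        while len(self._pending) >= self._chunk_bytes:
            chunks.append(bytes(self._pending[:self._chunk_bytes]))
            del self._pending[:self._chunk_bytes]
        return chunks
```

Each WebSocket message would be passed to `feed()`, and every returned chunk holds exactly 250 ms of audio, even if the network splits or merges messages along the way.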

  1. Convert the PCM data to a NumPy array.

  2. Run speech recognition on the array:

     def process_audio_chunk(audio_data, rate=16000):
         # Convert audio to Kaldi format
         waveform = kaldi.matrix.SubVector(audio_data)

         # Compute MFCC features
         mfcc_opts = kaldi.feat.mfcc.MfccOptions()
         mfcc_opts.frame_opts.samp_freq = 16000.0
         mfcc = kaldi.feat.mfcc.Mfcc(mfcc_opts)
         feats = kaldi.matrix.Matrix()
         mfcc.compute_features(waveform, rate, 1.0, feats)

         # Perform decoding
         result = recognizer.decode(feats)

  3. Problem:
     Could you provide a detailed example for mfcc.compute_features()?
     The 0.1.1 documentation lists four arguments, but calling it with four arguments raises an error in 0.2.2.
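
For reference, a hedged sketch of steps 1 and 2 follows. It assumes that in pykaldi 0.2.x `Mfcc.compute_features` takes three arguments (waveform, sample frequency, VTLN warp) and returns the feature matrix instead of filling a fourth output argument; this has not been verified against 0.2.2 here, so check it against the installed version. The helper names `pcm16_to_float32` and `compute_mfcc` are hypothetical, and the input is assumed to be little-endian 16-bit PCM.

```python
import numpy as np


def pcm16_to_float32(pcm_bytes):
    """Interpret little-endian 16-bit PCM bytes as a float32 array.

    Values stay in the int16 range (-32768..32767) rather than being
    scaled to [-1, 1], since Kaldi conventionally works on unscaled samples.
    """
    return np.frombuffer(pcm_bytes, dtype="<i2").astype(np.float32)


def compute_mfcc(samples, sample_rate=16000.0):
    """Return MFCC features for a float32 sample array (requires pykaldi)."""
    # Imported lazily so pcm16_to_float32 stays usable without pykaldi.
    from kaldi.feat.mfcc import Mfcc, MfccOptions
    from kaldi.matrix import SubVector

    opts = MfccOptions()
    opts.frame_opts.samp_freq = sample_rate
    mfcc = Mfcc(opts)
    # Assumption: three arguments, with the features returned as a Matrix.
    return mfcc.compute_features(SubVector(samples), sample_rate, 1.0)
```

Note that computing features independently for each 250 ms chunk drops the context at chunk boundaries, so this sketch only illustrates the call shape, not a complete online decoding pipeline.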

@bmilde
Contributor

bmilde commented May 17, 2024

I recommend checking out kaldi-model-server (https://github.com/uhh-lt/kaldi-model-server). It's using pykaldi to do online decoding and it has good latency as well. You can see it in action here: https://ltdata1.informatik.uni-hamburg.de/meetingbot/meetingbot_de.mp4 (German) and https://ltdata1.informatik.uni-hamburg.de/meetingbot/meetingbot_en.mp4 (English).

@bmilde bmilde closed this as completed May 20, 2024