Help with realtime resampling #73

Open
gabrielfreire opened this issue Feb 14, 2023 · 3 comments

Comments

gabrielfreire commented Feb 14, 2023

I am trying to resample WAV audio bytes from 44100 Hz to 16000 Hz in real time using NWaves, but the resulting audio is noisy and broken.

Could someone provide an example of how to resample audio from a stream in real time using NWaves, please?

Below is a snippet of my code

  • _continuousSpeechStream is a Stream that keeps growing with headerless WAV audio bytes for as long as the connection is open.
  • The bytes are read from the stream and resampled before the header is added and the result is processed.

This is what I have so far:

var _chunkSize = 35192 - 44 - 2; // size: 8192 could be used here
var buffer = new byte[checked((uint)Math.Min(_chunkSize, (int)_continuousSpeechStream.Length))];
int bytesRead = 0;

while (_continuousSpeechStream.CanRead &&
       !waitTask.Wait(300) && _websocketClient.IsConnected)
{
    bytesRead = await _continuousSpeechStream.ReadAsync(buffer, 0, buffer.Length);

    if (bytesRead > 0)
    {
        // ----------- START RESAMPLE
        var sizeInFloats = _chunkSize / sizeof(short);  // for Pcm 16bit

        var f = new float[1][];
        f[0] = new float[sizeInFloats];

        ByteConverter.ToFloats16Bit(buffer, f);

        var signal = new DiscreteSignal(44100, f[0]);
        var resampledSignal = Operation.Resample(signal, 16000);

        var resampledBytes = new byte[resampledSignal.Length * 2];
        ByteConverter.FromFloats16Bit(new float[][] { resampledSignal.Samples }, resampledBytes);
        // ---------- END RESAMPLE

        // .... process resampled bytes
        var tempStream = new MemoryStream();

        // Write WAVE header the first time ONLY
        if (sendRIFFHeader)
        {
            tempStream.WriteWaveHeader(channelCount.Value, sampleRate.Value, bitsPerSample.Value, 0);

            // the RIFF header is only written in the first time
            sendRIFFHeader = false;
        }

        // write the bytesRead to send to the websocket service
        await tempStream.WriteAsync(resampledBytes, 0, resampledBytes.Length);

        // DO STUFF
    }
}
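
One detail in the snippet above that may contribute to the noise: the float array is sized from _chunkSize rather than from bytesRead, and Stream.ReadAsync is allowed to return fewer bytes than requested, so a short read would also convert stale bytes left in the buffer. A read can also end in the middle of a 16-bit sample. The following is a minimal sketch of one way to handle both cases. Pcm16ChunkDecoder is a hypothetical helper written for illustration, not an NWaves type, and it assumes mono, little-endian, 16-bit PCM.

```csharp
using System;

// Hypothetical helper (not part of NWaves): decodes mono 16-bit little-endian
// PCM from reads of arbitrary size, keeping a dangling odd byte for the next call.
public sealed class Pcm16ChunkDecoder
{
    private byte _pending;     // low byte of a sample that was split across two reads
    private bool _hasPending;

    // Converts only the first bytesRead bytes of buffer to floats in [-1, 1).
    public float[] Decode(byte[] buffer, int bytesRead)
    {
        int total = bytesRead + (_hasPending ? 1 : 0);
        var samples = new float[total / 2];
        int pos = 0;

        for (int i = 0; i < samples.Length; i++)
        {
            byte lo;
            if (_hasPending)
            {
                lo = _pending;          // finish the sample started in the previous read
                _hasPending = false;
            }
            else
            {
                lo = buffer[pos++];
            }
            byte hi = buffer[pos++];
            samples[i] = (short)(lo | (hi << 8)) / 32768f;
        }

        // An unpaired trailing byte is saved until its high byte arrives.
        if (pos < bytesRead)
        {
            _pending = buffer[pos];
            _hasPending = true;
        }
        return samples;
    }
}
```

With a helper like this, the array returned by Decode(buffer, bytesRead) could be passed to the DiscreteSignal constructor in place of f[0], so that only freshly read samples are resampled.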
@ar1st0crat
Owner

Hi, realtime resampling is not supported. I've long been planning to revise and improve the resampling functions in NWaves, but unfortunately I can't find enough time for it.
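
For readers who need a streaming path in the meantime, the general idea is that a resampler fed chunk by chunk has to remember where it stopped, otherwise every chunk boundary is treated as the start of a new signal, which presumably is what produces the clicks described in the question. Below is a minimal sketch of that state-carrying idea only. StreamingLinearResampler is a hypothetical class, not part of NWaves, and it swaps in plain linear interpolation with no anti-aliasing filter, so going from 44100 Hz to 16000 Hz with it would still alias content above 8 kHz unless the input is low-pass filtered first.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch (not an NWaves type): linear-interpolation resampler that
// keeps its read position and the last input sample between calls.
public sealed class StreamingLinearResampler
{
    private readonly double _step;  // input samples advanced per output sample
    private double _pos;            // next read position; index -1 means _last
    private float _last;            // final sample of the previous chunk

    public StreamingLinearResampler(int inputRate, int outputRate)
    {
        _step = (double)inputRate / outputRate;
    }

    public float[] Process(float[] chunk)
    {
        var output = new List<float>();
        if (chunk.Length == 0) return output.ToArray();

        while (true)
        {
            int i = (int)Math.Floor(_pos);
            if (i + 1 >= chunk.Length) break;   // right neighbour has not arrived yet

            float a = i < 0 ? _last : chunk[i]; // left neighbour may be in the previous chunk
            float b = chunk[i + 1];
            float frac = (float)(_pos - i);
            output.Add(a + frac * (b - a));
            _pos += _step;
        }

        // Re-express the position relative to the start of the next chunk.
        _pos -= chunk.Length;
        _last = chunk[chunk.Length - 1];
        return output.ToArray();
    }
}
```

One instance would be created once per stream, for example new StreamingLinearResampler(44100, 16000), and Process called on each decoded chunk in order. Because the position and the last sample survive between calls, splitting the input into chunks does not change which output samples are produced.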

@gabrielfreire
Author

Ok =/. Thanks for your answer

damian-666 commented Feb 17, 2024

Voice and LLMs are the new UI (I am sick of mousing and typing, and I'm not even blind yet).

So watch out for news. The new ARM-based laptops that run Windows are fast and low-power, like Macs with M chips. They have FPGA-based chips, so you can use Hastlayer and C# to burn AI or real DSP algorithms onto them, with fixed-point filters, kernels, FFTs, TDFFT, whatever.

That gives low power and low latency, without needing processor core affinity or shader-hack effects, and without worrying about a blocking GC during voice or recording.

You can make FIR kernels and EQ and do hotword detectors on it in mid-2024.

For a friend, I just looked at the guts of SDL and XNA (MonoGame), and I think it puts effects through the DSP via COM and a C# wrapper, and/or ASIO or SD. If we succeed in getting reverb to work with realtime filters, I can see whether this can be fixed that way, because a pop or crackle on a speaker is very high-frequency noise.

I didn't know that Intel has a DSP programmable in C, and that ARM has had something similar all these years, or how bad floating-point issues can get. But now they are adopting unum types (5 or N bits) for AI biases, as well as posits and fixed point. Floating-point numbers are just not periodic or discrete, or that suitable for FFTs, but they are OK enough for images. So waiting to see what comes out of the AI and chip companies might help you decide whether it's worth the trouble, and what will be done for general-purpose use.
