ImportError: cannot import name 'OnlineSpeakerDiarization' from 'diart' #220

Open
ameer-kanaan opened this issue Nov 24, 2023 · 9 comments
Labels
question Further information is requested

Comments

@ameer-kanaan

ameer-kanaan commented Nov 24, 2023

I am trying to run your tutorial on transcription coloring, but I am getting the error in the title.

The library itself runs fine with "diart.stream microphone".

I am running on Windows 11 with Python 3.11.5, using the environment from your .yml file.

@thieugiactu

They got rid of OnlineSpeakerDiarization. Please use SpeakerDiarization and SpeakerDiarizationConfig instead.
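For scripts that have to run against both the old and the new class names, a small lookup helper is one option. This is only a sketch: the new names come from the comment above, while pairing OnlineSpeakerDiarization with PipelineConfig as the old configuration class is an assumption based on earlier diart examples, and resolve_pipeline_classes is a hypothetical helper, not part of diart.

```python
import importlib

# Candidate (pipeline, config) class names, newest naming first.
# The second pair is an ASSUMPTION about the pre-rename API.
CANDIDATE_NAMES = [
    ("SpeakerDiarization", "SpeakerDiarizationConfig"),
    ("OnlineSpeakerDiarization", "PipelineConfig"),
]


def resolve_pipeline_classes(module_name="diart"):
    """Return (pipeline_cls, config_cls) from whichever naming scheme exists.

    Raises ImportError if the module is missing or exposes neither pair.
    """
    module = importlib.import_module(module_name)
    for pipeline_name, config_name in CANDIDATE_NAMES:
        pipeline_cls = getattr(module, pipeline_name, None)
        config_cls = getattr(module, config_name, None)
        if pipeline_cls is not None and config_cls is not None:
            return pipeline_cls, config_cls
    raise ImportError(
        f"{module_name} exposes none of the expected class pairs: {CANDIDATE_NAMES}"
    )
```

With diart installed, `Pipeline, Config = resolve_pipeline_classes()` would return whichever pair the installed version provides. If you only target one version, a plain import of the two names is simpler.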

@juanmc2005
Owner

I just updated the gist accordingly.

@juanmc2005 added the "question" label on Nov 24, 2023
@ameer-kanaan
Author

Thanks a lot.
It is running now, but it is not transcribing; it just keeps "listening". Any workaround?

@juanmc2005
Owner

@ameer-kanaan some people have reported this but it could be due to many things, so I can't help without more information.

I suggest you debug line by line to find out what's going on.

You can also take a look at:

Some common problems are Whisper being too slow due to RAM and/or CPU requirements, and a sample rate mismatch.
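For file input, a sample rate mismatch can be ruled out quickly by reading the WAV header. The sketch below uses only the Python standard library; the expected rate of 16000 is taken from the --sample-rate value used later in this thread, and the helper names are hypothetical. It does not cover microphone input, where the rate depends on the audio device.

```python
import wave

# Rate used with --sample-rate in this thread; adjust it to your pipeline config.
EXPECTED_SAMPLE_RATE = 16000


def wav_sample_rate(path):
    """Return the sample rate (in Hz) stored in a WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def sample_rate_matches(path, expected=EXPECTED_SAMPLE_RATE):
    """Return True if the file's sample rate equals the expected rate."""
    return wav_sample_rate(path) == expected
```

If the rates differ, resample the file before streaming it, or configure the pipeline for the file's rate.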

@ameer-kanaan
Author

ameer-kanaan commented Nov 27, 2023

I just tried running it on my friend's MacBook Pro 2018 too, and he gets the same issue: it keeps "listening" but doesn't output anything.

Additionally, we tried it on 2 other Windows devices, one of which has an excellent GPU and CPU, but there it didn't listen at all; the script just printed the warnings and exited. We tried to make use of the issues section, but to no avail.

We are all getting the same warnings:

UserWarning: torchaudio._backend.set_audio_backend has been deprecated. With dispatcher enabled, this function is no-op. You can remove the function call.
torchaudio.set_audio_backend("soundfile")
The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.
The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.
Model was trained with pyannote.audio 0.0.1, yours is 3.1.0. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.1.1+cpu. Bad things might happen unless you revert torch to 1.x.
Model was trained with pyannote.audio 0.0.1, yours is 3.1.0. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.8.1+cu102, yours is 2.1.1+cpu. Bad things might happen unless you revert torch to 1.x.
Model was trained with pyannote.audio 0.0.1, yours is 3.1.0. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.8.1+cu102, yours is 2.1.1+cpu. Bad things might happen unless you revert torch to 1.x.

@juanmc2005
Owner

juanmc2005 commented Nov 27, 2023

@ameer-kanaan the warnings are normal, you can simply ignore them.

Could you try downgrading diart to v0.8 and v0.7 and see if the issue persists? (keep in mind you'll have to change the class names again)

Are you trying to run the pipeline on your microphone or on a specific file?

Also, please try debugging line by line to see what the audio chunks are, what the model outputs are, etc. This should give you an idea of what is happening.
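When inspecting audio chunks, a quick first check is whether they contain any signal at all, since an all-zero chunk points at the input device rather than the models. The sketch below is a hypothetical helper in plain Python, not part of diart. It assumes the chunk has been flattened to a sequence of float samples, and the silence threshold is an arbitrary illustrative value.

```python
import math


def chunk_stats(samples):
    """Summarize a flat sequence of float audio samples."""
    samples = [float(s) for s in samples]
    if not samples:
        return {"count": 0, "peak": 0.0, "rms": 0.0}
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return {"count": len(samples), "peak": peak, "rms": rms}


def looks_silent(samples, threshold=1e-4):
    """Return True if no sample exceeds the threshold (an ASSUMED cutoff)."""
    return chunk_stats(samples)["peak"] < threshold
```

Printing `chunk_stats(...)` for each chunk while speaking into the microphone should show the peak and RMS values moving. If they stay at zero, the audio source is the first thing to investigate.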

@ameer-kanaan
Author

It started working with 0.7 now. We are trying to use the microphone.

For the previous issue with 0.8, we did try to run basic debugging line by line, but we didn't catch any errors.

@juanmc2005
Owner

@ameer-kanaan OK, I'll try to take a look at this issue in the coming weeks to see if something broke the Whisper code in v0.8.

@juanmc2005
Owner

Update (copied from my comment on the gist):

I tried it out using diart 0.9, both from the mic and from an audio file, with and without GPU. Each time I was able to see colored transcriptions. However, what may be happening is that the chunk processing is too slow (due to hardware) and hence interrupts the recording of the microphone (although it should be asynchronous with MicrophoneAudioSource).

If you can get real-time diarization with diart alone (as a quick test, run diart.stream microphone and see what you get), then I suggest changing line 151 to source = WebSocketAudioSource(config.sample_rate) and running the script. Then, from another terminal, run diart.client microphone --host 127.0.0.1 --port 7007 --sample-rate 16000 --step 0.5.

This basically reads from the microphone and sends chunks to the pipeline through a websocket server. You should then see the colored captions in the pipeline script's output.
This guarantees that the mic streaming and the pipeline run in different processes, avoiding the interference problem that I mentioned.
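If you prefer to start the client from Python instead of typing it in a second terminal, the command can be built as an argument list. This is a minimal sketch: build_client_command is a hypothetical helper, and its defaults simply mirror the flags quoted in this comment.

```python
def build_client_command(host="127.0.0.1", port=7007, sample_rate=16000, step=0.5):
    """Build the argument list for the diart.client command quoted above."""
    return [
        "diart.client",
        "microphone",
        "--host", str(host),
        "--port", str(port),
        "--sample-rate", str(sample_rate),
        "--step", str(step),
    ]
```

The returned list can be passed to `subprocess.Popen` once the pipeline script is already listening, which keeps the microphone client in its own process as described.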

Let me know if that works out!

3 participants