Embeddings takes 3x the length of the audio length #1687

cdreetz · 2024-04-14T01:01:31Z

Tested versions

pyannote-audio 3.1.1

System information

Windows 10 - pyannote.audio 3.1.1 - RTX 3070

Issue description

Diarization is taking much longer than it should. Using the progress hook, I can see that computing the embeddings is the slow part. The audio file I am using is a minute long, but the embeddings step alone takes 3:14. While running, it also uses most of my VRAM, upwards of 6.5 GB. I can't find anything online that states the actual size of speaker-diarization-3.1, but I didn't think it was supposed to be that big.

Also, when I upload the WAV file to Google Colab and run the exact same code, it runs almost instantly, so I can't think of what is causing the embeddings to run so slowly locally.

ProgressHook output when run locally:
segmentation 100% 0:00:00
speaker_counting 100% 0:00:00
embeddings 100% 0:03:14
discrete_diarization 100% 0:00:00

NVIDIA GeForce RTX 3070

Minimal reproduction example (MRE)

https://colab.research.google.com/github/pyannote/pyannote-audio/blob/develop/tutorials/MRE_template.ipynb

cdreetz · 2024-04-14T03:07:25Z

Update: I just uninstalled pyannote.audio 3.1.1 and installed 3.1.0 instead, and that fixed it. Running the exact same code took 3.06 seconds with 3.1.0, but 3 minutes and 14 seconds with 3.1.1.
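To put the numbers reported above on a common scale, here is a minimal sketch that converts the progress-hook time to seconds and computes a real-time factor (processing time divided by audio duration) and the slowdown between the two releases. The helper names `parse_hms` and `real_time_factor` are hypothetical, and the audio length is assumed to be exactly 60 seconds, since the report only says "a minute long".

```python
def parse_hms(text: str) -> float:
    """Convert an 'H:MM:SS' string, as shown in the progress output, to seconds."""
    hours, minutes, seconds = (float(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; above 1 means slower than real time."""
    return processing_seconds / audio_seconds


# Figures taken from the report; the 60 s audio length is an assumption.
audio_seconds = 60.0
seconds_311 = parse_hms("0:03:14")  # reported for 3.1.1
seconds_310 = 3.06                  # reported for 3.1.0

rtf_311 = real_time_factor(seconds_311, audio_seconds)
rtf_310 = real_time_factor(seconds_310, audio_seconds)
slowdown = seconds_311 / seconds_310

print(f"3.1.1 real-time factor: {rtf_311:.2f}")
print(f"3.1.0 real-time factor: {rtf_310:.3f}")
print(f"slowdown, 3.1.1 vs 3.1.0: {slowdown:.0f}x")
```

Under these assumptions, 3.1.1 runs at roughly 3.2 times the audio length, which matches the "3x" in the title, and is about 63 times slower than 3.1.0 on the same file.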

hbredin · 2024-04-18T07:12:51Z

That's strange.

Looking at the diff between 3.1.1 and 3.1.0, I see no reason why this would happen:
3.1.0...3.1.1

JuergenFleiss · 2024-04-24T14:56:41Z

I also found very long embedding runtimes in 3.1.1 on both a Ryzen and an Apple M1 CPU: around 27 and 20 minutes, respectively, for a 22-minute audio file.

cdreetz · 2024-04-24T15:49:06Z

@hbredin Yeah, I tried to look at the diff between the versions and couldn't find anything that would cause the issue. I almost didn't even try the older version because there didn't appear to be much of a code difference, yet it still resulted in a huge change in processing time.
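Since the diff between the two releases shows no obvious cause, one thing that may be worth comparing is the set of installed dependency versions in the slow local environment and in the fast Colab one, because reinstalling a different release may also change which dependency versions get resolved. The sketch below uses only the standard library; the helper names `installed_versions` and `diff_versions` are hypothetical, and the package list is an illustrative assumption, not a confirmed list of culprits.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def diff_versions(local, remote):
    """Return {package: (local version, remote version)} for packages that differ."""
    names = sorted(set(local) | set(remote))
    return {
        name: (local.get(name), remote.get(name))
        for name in names
        if local.get(name) != remote.get(name)
    }


# Assumption: these are the packages worth checking first.
PACKAGES = ["pyannote.audio", "torch", "torchaudio"]

print(installed_versions(PACKAGES))
```

Running this in both environments and passing the two resulting dictionaries to `diff_versions` lists only the packages whose versions differ.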

cdreetz · 2024-04-24T15:51:30Z

@JuergenFleiss did you try the same processing with 3.1.0?

JuergenFleiss · 2024-04-25T13:57:00Z

Tried it, and it did not change anything. It worked fine with 3.0.0, so we will try to revert to that.

JuergenFleiss · 2024-04-26T08:46:14Z

After reading further, I think I am actually experiencing issue #1621.
