Processing batches of audio files through Essentia-Tensorflow pre-trained models #1353
Yes, you can initialize the loader and the model once and then reuse them for every file:

```python
from essentia.standard import MonoLoader, TensorflowPredictVGGish

audio_paths = ["file1.wav", "file2.wav"]

# Create the loader and the TensorFlow model once, outside the loop.
loader = MonoLoader()
model = TensorflowPredictVGGish(graphFilename="audioset-vggish-3.pb",
                                output="model/vggish/embeddings")

for path in audio_paths:
    # Reconfigure the existing loader for each file instead of recreating it.
    loader.configure(filename=path, sampleRate=16000, resampleQuality=4)
    audio = loader()
    embeddings = model(audio)
```
It seems that's the expected return value for `configure`.
OK, got it, but I still don't understand how this could work out...
Sorry @Galvo87! The loader has to be configured first and then called.
@burstMembrane, did you find a good solution for batch processing? I have 8 GPUs and want to extract a bunch of embeddings as quickly as possible. I noticed the "batch_size" argument, but it seems to control how many "patches" are processed from the input audio file, rather than being an option for batch-processing multiple audio files. Any tips appreciated.
The simplest approach would be to modify the script so that it receives a list of files to process. Then you can divide the file list you want to process into 8 chunks and launch one script per GPU.
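A minimal sketch of that chunk-and-launch approach, assuming a hypothetical worker script (`extract_embeddings.py`, not part of Essentia) that accepts file paths as command-line arguments. Each worker is pinned to a single GPU via the `CUDA_VISIBLE_DEVICES` environment variable:

```python
import os
import subprocess
import sys

def chunks(lst, n):
    """Split lst into n contiguous, roughly equal chunks."""
    k, m = divmod(len(lst), n)
    return [lst[i * k + min(i, m):(i + 1) * k + min(i + 1, m)]
            for i in range(n)]

def launch_workers(audio_paths, n_gpus, script="extract_embeddings.py"):
    """Launch one worker process per GPU, each handling its own chunk of files."""
    procs = []
    for gpu_id, chunk in enumerate(chunks(audio_paths, n_gpus)):
        # Restrict each worker to one GPU so the 8 processes don't collide.
        env = {**os.environ, "CUDA_VISIBLE_DEVICES": str(gpu_id)}
        procs.append(subprocess.Popen([sys.executable, script, *chunk], env=env))
    for p in procs:
        p.wait()

if __name__ == "__main__":
    paths = [f"file{i}.wav" for i in range(100)]
    print([len(c) for c in chunks(paths, 8)])  # → [13, 13, 13, 13, 12, 12, 12, 12]
```

Since TensorFlow is initialized once per process rather than once per file, the model-loading cost is paid only 8 times in total.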
Thanks, yes, I actually realized there was something similar I could do: chunking my data into GPU-count chunks (8) and running a separate serial process for each GPU. Works well. (I also used …)
First of all, thanks to the contributors of this library!
I'm currently trying to batch-create embeddings from the AudioSet-VGGish pre-trained model.
I am able to follow the docs to download the pretrained model and generate embeddings.
The problem is that the examples don't show any implementation for batch processing of multiple audio files. When I put the code below in a for loop, it reinitializes TensorFlow and runs really slowly on each iteration of the loop, e.g.:
I've tried it like this and it does the same thing. Is there a way to process audio in batches, or to stop TensorFlow from reinitializing on each run?