Add audio slice prediction function #26

G-haoyu · 2022-06-26T05:21:00Z

When using basic-pitch to convert audio to MIDI, prediction on longer audio consumes more memory, which may cause TensorFlow to kill the prediction process.
Slicing the audio, running prediction on each slice, and then combining the slices into a complete prediction result greatly reduces memory consumption. This makes basic-pitch more practical to adopt and use.
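As a rough illustration of the slice-then-combine idea described above, here is a minimal sketch in plain Python. It is not the code from this PR: `iter_slices`, `predict_in_slices`, and the `predict_fn` callback are hypothetical names, and the sketch assumes the per-slice outputs can simply be concatenated. A real model may need overlapping slices to avoid artifacts at slice boundaries.

```python
from typing import Callable, Iterator, List, Sequence


def iter_slices(samples: Sequence[float], slice_len: int) -> Iterator[Sequence[float]]:
    """Yield consecutive, non-overlapping slices of at most slice_len samples."""
    if slice_len <= 0:
        raise ValueError("slice_len must be a positive integer")
    for start in range(0, len(samples), slice_len):
        yield samples[start:start + slice_len]


def predict_in_slices(
    samples: Sequence[float],
    slice_len: int,
    predict_fn: Callable[[Sequence[float]], List[float]],
) -> List[float]:
    """Run predict_fn on one slice at a time and concatenate the outputs.

    predict_fn is a stand-in for the model call (an assumption of this
    sketch); only one slice is passed to it at any moment.
    """
    combined: List[float] = []
    for chunk in iter_slices(samples, slice_len):
        combined.extend(predict_fn(chunk))
    return combined
```

With a stub such as `predict_fn = lambda chunk: [x * 2 for x in chunk]`, the combined result matches what a single pass over the whole input would produce, while the stub only ever sees `slice_len` samples at once.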

drubinstein · 2022-06-27T16:12:57Z

Hi @G-haoyu

Good idea, and similar to what we do in the TypeScript version, but I think there are some changes to this implementation that would better fix the issue you are trying to solve.

You could switch the model call to use predict, which may solve your problem entirely. predict runs input through a Keras model in batches.
Instead of loading the entire audio at once with librosa.load, you could switch to librosa.stream, which does not load the whole file at once (a possible concern for files on the scale of hours).
AUDIO_SLICE_TIME might be better off as AUDIO_SLICE_FRAMES, since converting between seconds and samples can be a little dangerous at times (floating-point errors) and the frame size is known.
You are opening the file in append mode, and that makes sense, but you probably also want to open it once in write mode before the loop begins to clear out any existing debug file. We never check whether debug_file exists (unlike all the other outputs) since it's unlikely to be used for anything important, so the file may already exist.
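The last two suggestions could look roughly like the sketch below. It is only an illustration: `slice_bounds` and `write_debug` are hypothetical helpers, and the hop length is whatever value the caller passes in, not a constant taken from basic-pitch. Slice boundaries are computed entirely in integer frames and samples, and the debug file is truncated once before the loop and then appended to.

```python
from typing import List, Tuple


def slice_bounds(n_samples: int, slice_frames: int, hop_length: int) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices for each slice.

    The slice length is given in frames and converted to samples with
    integer arithmetic only, so no seconds-to-samples rounding occurs.
    """
    if slice_frames <= 0 or hop_length <= 0:
        raise ValueError("slice_frames and hop_length must be positive")
    slice_samples = slice_frames * hop_length
    return [
        (start, min(start + slice_samples, n_samples))
        for start in range(0, n_samples, slice_samples)
    ]


def write_debug(debug_file: str, bounds: List[Tuple[int, int]]) -> None:
    """Clear any stale debug file, then append one line per slice."""
    # Opening once in write mode truncates a file left over from a previous run.
    with open(debug_file, "w"):
        pass
    for index, (start, end) in enumerate(bounds):
        # Append mode inside the loop, as in the approach discussed above.
        with open(debug_file, "a") as handle:
            handle.write(f"slice {index}: samples {start}-{end}\n")
```

Because the end index is capped at `n_samples`, the final slice may be shorter than the others.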

schneiderfelipe · 2023-05-07T19:07:15Z

Any updates on this?

G-haoyu added 9 commits June 26, 2022 08:37

add function of audio slicing

52ac65c

tox

a2b5266

pass flake8

4a715e6

pass test

2a9c585

fix

f0ccd86

go

d885018

go

f618eca

add AUDIO_SLICE_TIME

fc160ff

pass test

47d5d66

rabitt mentioned this pull request Jan 27, 2023

Very long audio files crash/cause OOM #53

Closed

rabitt assigned bgenchel Feb 10, 2023
