Add new algo audio2pitch #1413

xaviliz · 2024-05-13T13:43:49Z

This pull request adds a new Essentia algorithm, Audio2Pitch. It has been designed for real-time pitch extraction.

src/algorithms/tonal/audio2pitch.cpp
src/algorithms/tonal/audio2pitch.h
test/src/unittests/tonal/test_audio2pitch.py

Before we review everything, it would be nice to have an idea of which tests could be useful, along with any other suggestions you may have. Thanks.

dbogdanov · 2024-05-14T11:52:30Z

src/algorithms/tonal/audio2pitch.h

+      declareParameter("minFrequency", "minimum frequency to detect in Hz", "[10,20000]", 60.f);
+      declareParameter("maxFrequency", "maximum frequency to detect in Hz", "[10,20000]", 2300.f);


We do not use this floating-point notation in the rest of the algorithms (use 60.0, 2300.0, etc.).

dbogdanov · 2024-05-14T11:52:39Z

src/algorithms/tonal/audio2pitch.h

+      declareParameter("pitchAlgorithm", "pitch algorithm to use", "{pyin,pyin_fft}", "pyin_fft");
+      declareParameter("loudnessAlgorithm", "loudness algorithm to use", "{loudness,rms}", "rms");
+      declareParameter("weighting", "string to assign a weighting function", "{default,A,B,C,D,Z}", "default");
+      declareParameter("tolerance", "sets tolerance for peak detection on pitch algorithm", "[0,1]", 1.0f);


dbogdanov · 2024-05-14T11:54:06Z

src/algorithms/tonal/audio2pitch.h

+      declareParameter("maxFrequency", "maximum frequency to detect in Hz", "[10,20000]", 2300.f);
+      declareParameter("pitchAlgorithm", "pitch algorithm to use", "{pyin,pyin_fft}", "pyin_fft");
+      declareParameter("loudnessAlgorithm", "loudness algorithm to use", "{loudness,rms}", "rms");
+      declareParameter("weighting", "string to assign a weighting function", "{default,A,B,C,D,Z}", "default");


Unclear what default does. No weighting? Then we can rename it.

By default PitchYinFFT applies a weighting function, defined here. We could think of another name, though. Otherwise we can add a comment like "Default PitchYinFFT weighting function"?

dbogdanov · 2024-05-14T11:55:41Z

src/algorithms/tonal/audio2pitch.h

+      declareParameter("loudnessAlgorithm", "loudness algorithm to use", "{loudness,rms}", "rms");
+      declareParameter("weighting", "string to assign a weighting function", "{default,A,B,C,D,Z}", "default");
+      declareParameter("tolerance", "sets tolerance for peak detection on pitch algorithm", "[0,1]", 1.0f);
+      declareParameter("pitchConfidenceThreshold", "level of pitch confidence above which note ON/OFF start to be considered", "[0,1]", 0.25);


More consistent description: "above/below which note ON/OFF start to be considered"

Agreed. What do you think about "level of pitch confidence used for voiced frame detection"?
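
To make the "voiced frame detection" wording concrete, here is a minimal sketch of how such a decision could look. It is not the PR's implementation: `VoicingThresholds` and `isVoicedFrame` are hypothetical names, and combining the pitch confidence check with the loudness check (the `loudnessThreshold` parameter declared in the same header) is an assumption. The default values are the ones shown in the diff.

```cpp
// Hypothetical helper, not taken from the PR: a frame counts as voiced only
// when both measurements reach their thresholds.
struct VoicingThresholds {
  double pitchConfidenceThreshold = 0.25;  // default value from the diff
  double loudnessThreshold = 0.0031;       // linear value from the diff, ~ -50 dB
};

// Returns true when the frame would be treated as voiced under the
// assumption stated above (both thresholds must be reached).
inline bool isVoicedFrame(double pitchConfidence, double loudness,
                          const VoicingThresholds& t = VoicingThresholds()) {
  return pitchConfidence >= t.pitchConfidenceThreshold &&
         loudness >= t.loudnessThreshold;
}
```

With this reading, a parameter description along the lines of "level above which a frame is considered voiced" would match the comparison directly.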

dbogdanov · 2024-05-14T11:56:34Z

src/algorithms/tonal/audio2pitch.h

+      declareParameter("weighting", "string to assign a weighting function", "{default,A,B,C,D,Z}", "default");
+      declareParameter("tolerance", "sets tolerance for peak detection on pitch algorithm", "[0,1]", 1.0f);
+      declareParameter("pitchConfidenceThreshold", "level of pitch confidence above which note ON/OFF start to be considered", "[0,1]", 0.25);
+      declareParameter("loudnessThreshold", "loudness level above which note ON/OFF start to be considered, in linear values", "[0,1]", 0.0031); // ~ -50dB


similar here

btw, would it be more intuitive for a user to be able to specify dB values instead? However, we can keep using linear values for consistency with the outputs of Loudness and RMS.

Yes, decibels are easy to handle. I think we can make the conversion; it won't be an issue. Indeed, we need dB values for the velocity conversion anyway. Let me check what the impact would be.
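
For reference, a minimal sketch of the conversion being discussed. The helper names `linearToDb` and `dbToLinear` are hypothetical and not part of the PR. The sketch assumes the amplitude convention (20·log10), which is the one consistent with the `// ~ -50dB` comment next to the 0.0031 default; whether that convention is also appropriate for the output of Loudness is an open question here.

```cpp
#include <cmath>

// Hypothetical helpers, assuming the amplitude convention dB = 20 * log10(x).

// Linear amplitude (e.g. an RMS value in (0, 1]) to decibels.
inline double linearToDb(double linear) {
  return 20.0 * std::log10(linear);
}

// Decibels back to linear amplitude.
inline double dbToLinear(double db) {
  return std::pow(10.0, db / 20.0);
}
```

Under this convention, the current default of 0.0031 corresponds to about -50.2 dB, and -50 dB corresponds to about 0.00316. If the parameter were exposed in dB, the algorithm could convert it once at configuration time and keep comparing against linear values internally.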

dbogdanov · 2024-05-14T12:02:47Z