-
Notifications
You must be signed in to change notification settings - Fork 527
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new algo audio2pitch #1413
base: master
Are you sure you want to change the base?
Add new algo audio2pitch #1413
Conversation
src/algorithms/tonal/audio2pitch.h
Outdated
declareParameter("minFrequency", "minimum frequency to detect in Hz", "[10,20000]", 60.f); | ||
declareParameter("maxFrequency", "maximum frequency to detect in Hz", "[10,20000]", 2300.f); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not use such a floating-point notation in the rest of the algorithms (use 60.0
, 2300.0
, etc.)
src/algorithms/tonal/audio2pitch.h
Outdated
declareParameter("pitchAlgorithm", "pitch algorithm to use", "{pyin,pyin_fft}", "pyin_fft"); | ||
declareParameter("loudnessAlgorithm", "loudness algorithm to use", "{loudness,rms}", "rms"); | ||
declareParameter("weighting", "string to assign a weighting function", "{default,A,B,C,D,Z}", "default"); | ||
declareParameter("tolerance", "sets tolerance for peak detection on pitch algorithm", "[0,1]", 1.0f); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
declareParameter("maxFrequency", "maximum frequency to detect in Hz", "[10,20000]", 2300.f); | ||
declareParameter("pitchAlgorithm", "pitch algorithm to use", "{pyin,pyin_fft}", "pyin_fft"); | ||
declareParameter("loudnessAlgorithm", "loudness algorithm to use", "{loudness,rms}", "rms"); | ||
declareParameter("weighting", "string to assign a weighting function", "{default,A,B,C,D,Z}", "default"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unclear what default
does. No weighting? Then we can rename it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default PitchYinFFT apply a weighting function defined here we can think in another name though. Otherwise we can add a comment like Default PitchYinFFt weighting function
?
src/algorithms/tonal/audio2pitch.h
Outdated
declareParameter("loudnessAlgorithm", "loudness algorithm to use", "{loudness,rms}", "rms"); | ||
declareParameter("weighting", "string to assign a weighting function", "{default,A,B,C,D,Z}", "default"); | ||
declareParameter("tolerance", "sets tolerance for peak detection on pitch algorithm", "[0,1]", 1.0f); | ||
declareParameter("pitchConfidenceThreshold", "level of pitch confidence above which note ON/OFF start to be considered", "[0,1]", 0.25); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More consistent description: "above/below which note ON/OFF start to be considered"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed, what do you think about level of pitch confidence above used for voiced frame detection
?
declareParameter("weighting", "string to assign a weighting function", "{default,A,B,C,D,Z}", "default"); | ||
declareParameter("tolerance", "sets tolerance for peak detection on pitch algorithm", "[0,1]", 1.0f); | ||
declareParameter("pitchConfidenceThreshold", "level of pitch confidence above which note ON/OFF start to be considered", "[0,1]", 0.25); | ||
declareParameter("loudnessThreshold", "loudness level above which note ON/OFF start to be considered, in linear values", "[0,1]", 0.0031); // ~ -50dB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, would it be more intuitive for a user to be able to specify dB values instead? However, we can keep using linear values for consistency with the outputs of Loudness and RMS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, decibels are easy to handle. I think we can make the conversion, it won't be an issue. Indeed we need dBs to for velocity conversion. Let me check which would be the impact.
src/algorithms/tonal/audio2pitch.cpp
Outdated
|
||
const char* Audio2Pitch::name = "Audio2Pitch"; | ||
const char* Audio2Pitch::category = "Pitch"; | ||
const char* Audio2Pitch::description = DOC("Extractor algorithm to compute pitch with several possible pitch algorithms, specifically targeted for real-time pitch detection on saxophone signals."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, in all algorithms this DOC string starts with a sentence "This algorithm ..." summarizing what the algorithm does, which is then auto-parsed by doc generation script to create short descriptions for the algorithm index. Let's keep it similar.
src/algorithms/tonal/audio2pitch.cpp
Outdated
throw EssentiaException("Audio2Pitch: Max frequency cannot be higher than Nyquist frequency"); | ||
} | ||
if (_maxFrequency <= _minFrequency) { | ||
throw EssentiaException("Audio2Pitch: Max frequency cannot be lower than min frequency"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"lower or equal"
} | ||
|
||
if (_pitchAlgorithmName != "pyin_fft" && _pitchAlgorithmName != "pyin") { | ||
E_INFO("Audio2Pitch: 'pitchAlgorithm' = "<<_pitchAlgorithmName<<"\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not to add the parameter value directly to the exception message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mmm... because operands are invalid in the exception message:
../src/algorithms/tonal/audio2pitch.cpp:60:69: error: invalid operands to binary expression ('const char [39]' and 'std::string' (aka 'basic_string<char, char_traits<char>, allocator<char> >')) throw EssentiaException("Audio2Pitch: Bad 'loudnessAlgorithm' ="<<_loudnessAlgorithmName<<"\n");
Can you confirm that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I have seen how you use to do that in some other algorithm like here:
throw EssentiaException(fullName(), ": Could not push 1 value, output buffer is full");
_loudnessAlgorithm = AlgorithmFactory::create("RMS"); | ||
} | ||
else { | ||
E_INFO("Audio2Pitch: 'loudnessAlgorithm' = "<<_loudnessAlgorithmName<<"\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, we can provide the parameter value directly in the exception message.
throw EssentiaException("Audio2Pitch: Bad 'loudnessAlgorithm' parameter"); | ||
} | ||
|
||
// switch between pyin and pyin_fft to propagate the weighting parameter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we are directly using parameter strings for comparison, but isSpectral
is used in other parts of the code. Is this because there could be multiple FFT-based algorithms in the future? If not, we could simplify and get rid of isSpectral
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, this was because we started including other algorithms like PitchMelodia and in the python prototype we have more algorithms based on spectral processing. I think it might be removed if we don't plan to include others.
This pull request add a new essntia algorithm, Audio2Pitch. It has been designed for real time pitch extraction.
Before we review everything it would be nice to have an idea of which test could be useful and any suggestions you could have. Kindly.