More information on multipitch evaluation? #335

Open
maxpv opened this issue Jan 14, 2021 · 3 comments

Comments


maxpv commented Jan 14, 2021

I'm trying to understand the multipitch metrics. The documentation points to two papers, but I couldn't find anything in them that explains the metrics:

OrderedDict([('Precision', 0.5),
             ('Recall', 0.5),
             ('Accuracy', 0.3333333333333333),
             ('Substitution Error', 0.5),
             ('Miss Error', 0.0),
             ('False Alarm Error', 0.0),
             ('Total Error', 0.5),
             ('Chroma Precision', 0.5),
             ('Chroma Recall', 0.5),
             ('Chroma Accuracy', 0.3333333333333333),
             ('Chroma Substitution Error', 0.5),
             ('Chroma Miss Error', 0.0),
             ('Chroma False Alarm Error', 0.0),
             ('Chroma Total Error', 0.5)])

In the first paper from 2007, the Transcription Results section has two subsections: Frame-level transcription (5.1) and Note onset detection (5.2). Given the input format of multipitch.evaluate (frequencies associated with an onset), I assumed 5.2 was the relevant one, but nothing in it explains the metrics.

What am I missing? It seems unnecessarily obscure to me.

@justinsalamon
Collaborator

@rabitt

Contributor

rabitt commented Feb 15, 2021

Hey @maxpv

> In the first paper from 2007 there are two sections in the Transcription Results section: Frame-level transcription (5.1) and Note onset detection (5.2).

In the documentation:
[image]

The paper you mentioned and a second one are cited. The equations for all the metrics are there: Equations 3-6 in the first paper and Equations 1-8 in the second. Both papers give pretty lengthy explanations of the metrics if you want more details.
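As a rough illustration of the frame-level scores discussed here, the sketch below computes precision, recall, accuracy, and the four error scores from per-frame counts. The function name frame_level_scores and its inputs are hypothetical, not mir_eval's API. The formulas follow the common frame-level definitions (TP, FP, FN, plus substitution, miss, and false-alarm errors normalized by the total number of reference pitches) and should be checked against the cited papers before relying on them. The example input is made up: two frames with one reference and one estimated pitch each, where only the first frame is matched.

```python
def frame_level_scores(n_ref, n_est, n_correct):
    """Frame-level multipitch scores from per-frame counts (illustrative sketch).

    n_ref[t]     : number of reference pitches in frame t
    n_est[t]     : number of estimated pitches in frame t
    n_correct[t] : number of estimated pitches matched to a reference pitch
    """
    tp = sum(n_correct)          # correctly estimated pitches over all frames
    fp = sum(n_est) - tp         # estimated pitches with no reference match
    fn = sum(n_ref) - tp         # reference pitches that were not matched
    total_ref = sum(n_ref)

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero on empty input.
        return num / den if den > 0 else 0.0

    frames = list(zip(n_ref, n_est, n_correct))
    e_sub = sum(min(r, e) - c for r, e, c in frames)   # wrong pitch reported
    e_miss = sum(max(0, r - e) for r, e, _ in frames)  # too few pitches reported
    e_fa = sum(max(0, e - r) for r, e, _ in frames)    # too many pitches reported
    e_tot = sum(max(r, e) - c for r, e, c in frames)   # sum of the three above

    return {
        "Precision": ratio(tp, tp + fp),
        "Recall": ratio(tp, tp + fn),
        "Accuracy": ratio(tp, tp + fp + fn),
        "Substitution Error": ratio(e_sub, total_ref),
        "Miss Error": ratio(e_miss, total_ref),
        "False Alarm Error": ratio(e_fa, total_ref),
        "Total Error": ratio(e_tot, total_ref),
    }


# Made-up input: two frames, one reference and one estimate each,
# with only the first frame's estimate matching its reference.
scores = frame_level_scores(n_ref=[1, 1], n_est=[1, 1], n_correct=[1, 0])
for name, value in scores.items():
    print(f"{name}: {value}")
```

With this made-up input, the seven values coincide with the non-chroma scores quoted in the question (precision and recall of 0.5, accuracy of 1/3, substitution and total error of 0.5), although the inputs that actually produced that output are not shown in the thread.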

> Due to the format of the input for multipitch.evaluate (frequencies associated with an onset) I suppose the 5.2 was mentioned.

There's no notion of onsets in mir_eval.multipitch.evaluate. Note-level metrics are implemented separately in mir_eval.transcription.
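To make the frame-based view concrete, the sketch below counts, for a single time stamp, how many estimated frequencies match a reference frequency, with an optional chroma mode that ignores octave differences. Everything here is an illustrative assumption rather than mir_eval's implementation: the helper names hz_to_semitones and count_matches are hypothetical, the half-semitone tolerance is an assumed setting, and the greedy nearest-neighbour matching is a simplification that may differ from an optimal matching in crowded frames.

```python
import math


def hz_to_semitones(freq_hz, ref_hz=440.0):
    """Convert a frequency in Hz to semitones relative to ref_hz."""
    return 12.0 * math.log2(freq_hz / ref_hz)


def count_matches(ref_hz, est_hz, tolerance=0.5, chroma=False):
    """Count estimated pitches matched to reference pitches in one frame.

    Greedy one-to-one matching (a simplification): each reference pitch
    takes the closest unused estimate within `tolerance` semitones.
    With chroma=True, distances are measured modulo one octave.
    """
    ref = [hz_to_semitones(f) for f in ref_hz]
    est = [hz_to_semitones(f) for f in est_hz]
    unused = list(range(len(est)))
    n_correct = 0
    for r in ref:
        best, best_dist = None, None
        for j in unused:
            dist = abs(r - est[j])
            if chroma:
                dist = dist % 12.0
                dist = min(dist, 12.0 - dist)  # circular distance within an octave
            if dist <= tolerance and (best is None or dist < best_dist):
                best, best_dist = j, dist
        if best is not None:
            unused.remove(best)  # each estimate can be matched only once
            n_correct += 1
    return n_correct


# Made-up frames: one list of active frequencies (Hz) per time stamp.
ref_freqs = [[440.0], [440.0]]
est_freqs = [[442.0], [880.0]]

pitch_matches = [count_matches(r, e) for r, e in zip(ref_freqs, est_freqs)]
chroma_matches = [count_matches(r, e, chroma=True) for r, e in zip(ref_freqs, est_freqs)]
print(pitch_matches, chroma_matches)
```

In the second frame the estimate is exactly one octave above the reference, so it counts as an error for the plain scores but as a match for the chroma variants. No onset or offset appears anywhere: each frame is scored on its own.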

Hope that helps clarify things.

Author

maxpv commented Feb 16, 2021

Thanks for your detailed answer.

I think this should be added to the documentation. It is not obvious that Section 5.1 (Frame-level transcription) is related to the metrics we try to compute with multipitch or transcription, partly because of the input format: frequencies and timestamps, as opposed to an NxT matrix.

Example from Section 5.1: TP (“true positives”) is the number of correctly transcribed voiced frames (over all notes).
OK, but what is a frame when the input is a list of intervals and pitches? To do that we need to set offset_min_tolerance, but this parameter isn't exposed in the documentation for mir_eval.transcription.evaluate.
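One possible reading of "frame" for note-style input is a sample on a uniform time grid: each frame lists the pitches of the notes sounding at that instant. The sketch below shows that conversion. The helper notes_to_frames is hypothetical and not part of mir_eval, and the hop size is an arbitrary assumption (0.5 s here to keep the example short; something like 10 ms would be more typical).

```python
def notes_to_frames(intervals, pitches_hz, hop=0.01):
    """Sample note events on a uniform time grid (illustrative sketch).

    intervals  : list of (onset, offset) pairs in seconds
    pitches_hz : one pitch in Hz per interval
    hop        : grid spacing in seconds (assumed value)

    Returns (times, freqs), where freqs[i] lists the pitches of all notes
    sounding at times[i]; a note is active for onset <= t < offset.
    """
    if not intervals:
        return [], []
    end = max(offset for _, offset in intervals)
    n_frames = int(round(end / hop)) + 1
    times = [i * hop for i in range(n_frames)]
    freqs = [
        [p for (onset, offset), p in zip(intervals, pitches_hz) if onset <= t < offset]
        for t in times
    ]
    return times, freqs


# Made-up notes: two overlapping notes, sampled every 0.5 s.
intervals = [(0.0, 1.0), (0.5, 1.5)]
pitches_hz = [440.0, 660.0]
times, freqs = notes_to_frames(intervals, pitches_hz, hop=0.5)
for t, f in zip(times, freqs):
    print(t, f)
```

The result has the frame-level shape described in this thread: a time stamp plus a list of active frequencies per frame, with an empty list for unvoiced frames.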
