More information on multipitch evaluation? #335

maxpv · 2021-01-14T14:51:31Z

I'm trying to understand the multipitch metrics. The documentation points to two papers, but I couldn't find anything in them that explains the metrics:

OrderedDict([('Precision', 0.5),
             ('Recall', 0.5),
             ('Accuracy', 0.3333333333333333),
             ('Substitution Error', 0.5),
             ('Miss Error', 0.0),
             ('False Alarm Error', 0.0),
             ('Total Error', 0.5),
             ('Chroma Precision', 0.5),
             ('Chroma Recall', 0.5),
             ('Chroma Accuracy', 0.3333333333333333),
             ('Chroma Substitution Error', 0.5),
             ('Chroma Miss Error', 0.0),
             ('Chroma False Alarm Error', 0.0),
             ('Chroma Total Error', 0.5)])

In the first paper from 2007 there are two sections in the Transcription Results section: Frame-level transcription (5.1) and Note onset detection (5.2). Due to the format of the input for multipitch.evaluate (frequencies associated with an onset) I suppose the 5.2 was mentioned. There's literally nothing in it that explains the metrics.

What am I missing? It seems unnecessarily obscure to me.


justinsalamon · 2021-01-26T17:45:01Z

@rabitt

rabitt · 2021-02-15T16:41:19Z

Hey @maxpv

> In the first paper from 2007 there are two sections in the Transcription Results section: Frame-level transcription (5.1) and Note onset detection (5.2).

In the documentation, the paper you mentioned and a second one are cited. The equations for all the metrics are there: Equations 3-6 in the first paper, and Equations 1-8 in the second. Both papers give pretty lengthy explanations of the metrics if you want more details.
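As a quick orientation, the frame-level definitions can be summarised as below. This is a sketch from memory of the standard formulation, not a copy of either paper, and the symbols are my own notation: $TP$, $FP$, $FN$ are pitch counts summed over all frames, and for each frame $t$, $N_{\mathrm{ref}}(t)$, $N_{\mathrm{sys}}(t)$ and $N_{\mathrm{corr}}(t)$ are the number of reference pitches, estimated pitches, and correctly estimated pitches.

```latex
\begin{align}
\text{Precision} &= \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN}, \qquad
\text{Accuracy} = \frac{TP}{TP + FP + FN} \\
E_{\mathrm{sub}} &= \frac{\sum_t \bigl[\min\bigl(N_{\mathrm{ref}}(t), N_{\mathrm{sys}}(t)\bigr) - N_{\mathrm{corr}}(t)\bigr]}{\sum_t N_{\mathrm{ref}}(t)} \\
E_{\mathrm{miss}} &= \frac{\sum_t \max\bigl(0,\; N_{\mathrm{ref}}(t) - N_{\mathrm{sys}}(t)\bigr)}{\sum_t N_{\mathrm{ref}}(t)} \\
E_{\mathrm{fa}} &= \frac{\sum_t \max\bigl(0,\; N_{\mathrm{sys}}(t) - N_{\mathrm{ref}}(t)\bigr)}{\sum_t N_{\mathrm{ref}}(t)} \\
E_{\mathrm{tot}} &= \frac{\sum_t \bigl[\max\bigl(N_{\mathrm{ref}}(t), N_{\mathrm{sys}}(t)\bigr) - N_{\mathrm{corr}}(t)\bigr]}{\sum_t N_{\mathrm{ref}}(t)} = E_{\mathrm{sub}} + E_{\mathrm{miss}} + E_{\mathrm{fa}}
\end{align}
```

The "Chroma" variants use the same formulas after pitches are folded into a single octave, so octave errors are not penalised.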

> Due to the format of the input for multipitch.evaluate (frequencies associated with an onset) I suppose the 5.2 was mentioned.

There's no notion of onsets in mir_eval.multipitch.evaluate. Note-level metrics are implemented separately in mir_eval.transcription.
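To make the frame-level idea concrete, here is a minimal pure-Python sketch of how per-frame pitch lists turn into the scores. It is an illustration, not mir_eval's implementation: the helper names (`hz_to_midi`, `count_matches`, `frame_metrics`) are hypothetical, it assumes both inputs are already on the same time base (one list of frequencies in Hz per frame), and it uses a simple greedy match with an assumed half-semitone tolerance.

```python
import math


def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to a (fractional) MIDI note number."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def count_matches(ref_hz, est_hz, window=0.5):
    """Greedily pair reference and estimated pitches within `window` semitones.

    Each estimated pitch can be matched to at most one reference pitch.
    """
    ref_midi = sorted(hz_to_midi(f) for f in ref_hz)
    est_midi = sorted(hz_to_midi(f) for f in est_hz)
    used = set()
    matched = 0
    for r in ref_midi:
        for i, e in enumerate(est_midi):
            if i not in used and abs(r - e) <= window:
                used.add(i)
                matched += 1
                break
    return matched


def ratio(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def frame_metrics(ref_frames, est_frames):
    """Compute frame-level scores from two aligned lists of per-frame pitches."""
    tp = n_ref = n_est = 0
    sub = miss = fa = 0
    for ref_hz, est_hz in zip(ref_frames, est_frames):
        n_corr = count_matches(ref_hz, est_hz)
        r, e = len(ref_hz), len(est_hz)
        tp += n_corr
        n_ref += r
        n_est += e
        sub += min(r, e) - n_corr   # wrong pitch in place of a reference pitch
        miss += max(0, r - e)       # reference pitches with no estimate at all
        fa += max(0, e - r)         # surplus estimated pitches
    return {
        "Precision": ratio(tp, n_est),
        "Recall": ratio(tp, n_ref),
        "Accuracy": ratio(tp, n_est + n_ref - tp),
        "Substitution Error": ratio(sub, n_ref),
        "Miss Error": ratio(miss, n_ref),
        "False Alarm Error": ratio(fa, n_ref),
        "Total Error": ratio(sub + miss + fa, n_ref),
    }


# One frame: the reference has A4 and C5, the estimate has A4 and E5.
ref_frames = [[440.0, 523.25]]
est_frames = [[440.0, 659.26]]
metrics = frame_metrics(ref_frames, est_frames)
for name, value in metrics.items():
    print(name, value)
```

With one of two pitches correct in a single frame, this gives precision and recall of 0.5, accuracy of 1/3, and a substitution (and total) error of 0.5, the same pattern as the output in the original question. In mir_eval itself the inputs are a time array plus a list of frequency arrays for both reference and estimate, i.e. `mir_eval.multipitch.evaluate(ref_time, ref_freqs, est_time, est_freqs)`, and each timestamp plays the role of a frame.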

Hope that helps clarify things.

maxpv · 2021-02-16T10:18:33Z

Thanks for your detailed answer.

I think this should be added to the documentation. It is not obvious that Section 5.1, Frame-Level Transcription, is related to the metrics we are trying to compute with multipitch or transcription, partly because of the input format: frequencies and timestamps, as opposed to an NxT matrix.

Example from Section 5.1: TP (“true positives”) is the number of correctly transcribed voiced frames (over all notes).
OK, but what is a frame when the input is a list of intervals and pitches? To work that out we would need to set offset_min_tolerance, but this parameter isn't exposed in the documentation for mir_eval.transcription.evaluate.
