
Add F1 score, precision, and recall metrics as MultilabelSegmentation default metrics #1336

Open

wants to merge 40 commits into base: develop

Conversation

FrenchKrab
Contributor

Currently the MultilabelSegmentation task has no default metric; this PR adds these three (torchmetrics) metrics as defaults.

@codecov

codecov bot commented Apr 20, 2023

Codecov Report

Patch coverage: 48.64% and project coverage change: +0.15% 🎉

Comparison is base (bbe1395) 32.98% compared to head (3107105) 33.13%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1336      +/-   ##
===========================================
+ Coverage    32.98%   33.13%   +0.15%     
===========================================
  Files           64       65       +1     
  Lines         4072     4134      +62     
===========================================
+ Hits          1343     1370      +27     
- Misses        2729     2764      +35     
Impacted Files Coverage Δ
pyannote/audio/cli/train.py 0.00% <ø> (ø)
pyannote/audio/pipelines/speaker_diarization.py 0.00% <0.00%> (ø)
pyannote/audio/pipelines/utils/oracle.py 0.00% <0.00%> (ø)
pyannote/audio/tasks/embedding/mixins.py 26.51% <0.00%> (-0.41%) ⬇️
.../tasks/segmentation/overlapped_speech_detection.py 42.50% <0.00%> (ø)
...dio/tasks/segmentation/voice_activity_detection.py 40.54% <0.00%> (ø)
pyannote/audio/utils/preview.py 0.00% <0.00%> (ø)
pyannote/audio/core/pipeline.py 21.85% <14.28%> (-0.22%) ⬇️
...te/audio/tasks/segmentation/speaker_diarization.py 47.68% <20.00%> (+0.01%) ⬆️
pyannote/audio/core/task.py 81.48% <50.00%> (-0.40%) ⬇️
... and 7 more

... and 1 file with indirect coverage changes


@FrenchKrab
Contributor Author

Before merging, I think I need to reshape what is passed to the metric during validation, so that it is compatible with more metrics (currently, I believe flat tensors are passed).

@FrenchKrab FrenchKrab marked this pull request as draft April 20, 2023 09:22
@FrenchKrab FrenchKrab marked this pull request as ready for review April 20, 2023 11:59
@@ -251,10 +254,25 @@ def validation_step(self, batch, batch_idx: int):

# mask (frame, class) index for which label is missing
mask: torch.Tensor = y_true != -1
y_pred = y_pred[mask]
y_true = y_true[mask]
y_pred = y_pred[mask].reshape(shape)
@hbredin
Member

hbredin commented Apr 20, 2023

This will break as soon as mask contains at least one -1 because the overall size of y_pred will then be smaller than shape.
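The failure mode described here can be demonstrated in a few lines of plain torch (the tensor values are made up for illustration): boolean-mask indexing flattens the tensor and drops the masked entries, so as soon as any label is `-1` the result has fewer elements than the original `shape` and the `reshape` must fail.

```python
import torch

y_true = torch.tensor([[1, 0], [-1, 1]])  # one missing label (-1)
y_pred = torch.rand(2, 2)
shape = y_pred.shape                      # torch.Size([2, 2])

mask = y_true != -1       # boolean mask with one False entry
filtered = y_pred[mask]   # boolean indexing flattens: shape (3,)

# filtered.reshape(shape) would raise here:
# 3 elements cannot fill a (2, 2) view.
```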

Still support "global" metrics, BUT they have to be of a binary type.
@FrenchKrab FrenchKrab marked this pull request as draft April 20, 2023 15:08
@FrenchKrab
Contributor Author

Writing down some things before I forget them:

  • I added "per-class" metrics, which probably need a better name: they are the same metrics, computed separately for each individual class.
  • I kept the "global" metric (computed over all outputs for all classes), BUT because we support missing targets and torchmetrics does not (to my knowledge), without further assumptions we can only treat the multilabel problem as a binary one, which limits the metrics available.
    We can still let the user set the global metric to a "multilabel" one, but they have to make sure there aren't any missing targets.

(we can discuss this tomorrow!)

@FrenchKrab FrenchKrab marked this pull request as ready for review May 2, 2023 12:02
@hbredin
Member

hbredin commented May 10, 2023

Is this ready for review?

@FrenchKrab
Contributor Author

FrenchKrab commented May 11, 2023

It should be now! (Although it changes more than anticipated.) (Sorry for the last two commits, I should have reread my whole code before committing a "fix".)

Multilabel metrics use ignore_index=-1 so that targets with missing labels can be ignored in the metrics (partially annotated data).
Classwise (binary) metrics do not need it, since we can simply filter out the missing data.

Maybe we should enforce ignore_index == -1 in the multilabel metrics and raise an exception otherwise: it is easy to miss, and the metric would either still run (but give wrong values) or crash (and the user probably won't understand why).

@hbredin
Member

hbredin commented May 15, 2023

As discussed right now, would be nice to try average="none" instead of those pesky classwise metrics :)

@hbredin
Member

hbredin commented Sep 15, 2023

Going over your PRs :) Is this mergeable?

@FrenchKrab
Contributor Author

I should test it first, but I'm done with the implementation (I don't know if it's OK for you though :) ).

In the end I didn't find a way to do away with "metric_classwise": torchmetrics has a ClasswiseWrapper, but it doesn't do what we want.
The torchmetrics wrapper is for metrics that already return a multi-dimensional (per-class) tensor; but, for example, the multilabel variant of Calibration Error doesn't exist, while the "binary" variant exists for probably every metric.


stale bot commented Mar 19, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Mar 19, 2024
@hbredin hbredin removed the wontfix label Mar 29, 2024
6 participants