Add support for sample weighting (dataset imbalance) #2738

InakiRaba91 · 2023-03-10T18:44:37Z


Motivation

Sample weighting is a common strategy for dealing with imbalanced datasets. The idea is to weight the contribution of each item in the dataset to the loss based on its frequency.

There are different strategies, such as Inverse of Number of Samples (INS), Inverse of Square Root of Number of Samples (ISNS), or Effective Number of Samples (ENS), to name a few (see a brief summary here).
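To make the three strategies concrete, here is a minimal sketch of how per-class weights could be derived from class counts. It is not part of this PR: the function name `class_weights`, the `beta` default, and the normalization (weights summing to the number of classes) are assumptions chosen for illustration.

```python
import math


def class_weights(counts, strategy="ins", beta=0.999):
    """Illustrative helper (hypothetical, not from the PR).

    counts: number of samples per class, all > 0.
    strategy: "ins", "isns" or "ens".
    beta: ENS hyperparameter in [0, 1); only used for "ens".
    """
    if strategy == "ins":
        # Inverse of Number of Samples: w_c = 1 / n_c
        raw = [1.0 / n for n in counts]
    elif strategy == "isns":
        # Inverse of Square Root of Number of Samples: w_c = 1 / sqrt(n_c)
        raw = [1.0 / math.sqrt(n) for n in counts]
    elif strategy == "ens":
        # Effective Number of Samples: E_n = (1 - beta^n) / (1 - beta), w_c = 1 / E_n
        raw = [(1.0 - beta) / (1.0 - beta ** n) for n in counts]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    # Assumed convention: rescale so the weights sum to the number of classes.
    scale = len(counts) / sum(raw)
    return [w * scale for w in raw]


weights = class_weights([100, 300], strategy="ins")
```

In every strategy the rarer class receives the larger weight; a sample would then carry the weight of the class it belongs to.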

Modification

The code already supports providing an item-wise weight to the loss. However, that weight is not exposed at the level of the head, so it cannot be injected at runtime.

There are mechanisms to weight different classes, or different losses (multi-loss scenario). But as far as I understand, it is not possible to carry out sample weighting.
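For clarity, this is what sample weighting means at the loss level, as a minimal standalone sketch. It does not reproduce the library's loss code; `weighted_mean_loss` and its arguments are hypothetical names, and a plain mean reduction is assumed.

```python
def weighted_mean_loss(per_item_losses, sample_weights=None):
    """Scale each item's loss by its own weight, then average (assumed reduction)."""
    if sample_weights is None:
        # No weights given: every item contributes equally.
        sample_weights = [1.0] * len(per_item_losses)
    if len(sample_weights) != len(per_item_losses):
        raise ValueError("expected one weight per item")
    weighted = [loss * w for loss, w in zip(per_item_losses, sample_weights)]
    return sum(weighted) / len(weighted)


# Two items in a batch; the second one is up-weighted.
batch_loss = weighted_mean_loss([1.0, 3.0], sample_weights=[1.0, 2.0])
```

This differs from class weighting, where the weight depends on the label, and from multi-loss weighting, where the weight depends on the loss term: here the weight travels with the individual item.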

BC-breaking (Optional)

Does the modification introduce changes that break the backward-compatibility of the downstream repos?

No, the changes only extend the functionality without affecting any current scenario.

Use cases (Optional)

Support added for sample weighting. To use it, the dataset just needs to add a float scalar named "weight" to the annotation info.

The loading pipeline has been updated to pick it up, and it will be injected into the loss function.

This should be quite useful when dealing with imbalanced datasets.
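As a rough illustration of the data flow, the sketch below reads an optional "weight" from a sample's annotation info and falls back to 1.0 when it is absent. Only the "weight" field name comes from the description above; the `ann_info` and `sample_weight` keys and the function name are assumptions, not the actual pipeline code.

```python
def load_sample_weight(results, default=1.0):
    """Hypothetical loading step: copy the per-sample weight into the results dict."""
    # "ann_info" is an assumed key for the annotation info of one sample.
    ann_info = results.get("ann_info", {})
    # Samples without a "weight" keep the default, so existing datasets are unaffected.
    results["sample_weight"] = float(ann_info.get("weight", default))
    return results


weighted = load_sample_weight({"ann_info": {"weight": 2.5}})
unweighted = load_sample_weight({"ann_info": {}})
```

Defaulting to 1.0 keeps the behavior backward compatible for datasets that do not define the field.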

Checklist

Pre-commit or other linting tools are used to fix the potential lint issues.
The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMDet3D.
The documentation has been modified accordingly, like docstring or example tutorials.

Issue

open-mmlab/mmdetection#9905

InakiRaba91 · 2023-03-10T18:44:54Z

Please let me know if you can think of any issues with this approach. If you believe it is on the right track, I'll update the forward_train method in point_head.py and add unit tests for the new functionality.

MeowZheng · 2023-03-16T07:59:37Z

Thanks for your contribution; we are working on reviewing this PR.


jgoodman8 · 2023-03-27T15:23:00Z

It would be great to add this feature 👏

expose the weight in the decode head and in the loading pipeline

b1712d7

mm-assistant bot assigned xiexinch Mar 10, 2023

irabanillo91 mentioned this pull request Mar 10, 2023

[Feature] Weight item loss based on image metadata? open-mmlab/mmdetection#9905

Open

InakiRaba91 marked this pull request as ready for review March 13, 2023 09:40
