ENH WIP Arbitrary target labels instead of 0/1 #1006

Open · wants to merge 9 commits into main
Conversation

@SeanMcCarren (Contributor) commented Jan 12, 2022

WIP on issue #911 (don't need to call the police)

This PR involves:

  • threshold optimizer
  • exponentiated gradient (somewhat finished)
  • grid search

Issues:

  • If all supplied labels are the same (i.e., only one of the two possible classes is present), then we can't infer self.classes_, so we should now raise an error. But this breaks the test cases test_threshold_optimization_degenerate_labels and test_single_y_class (the latter is for GridSearch). See the sketch after this list.
  • In ExponentiatedGradient, how can we know whether y is binary or continuous? If it is continuous, we should not apply the encoding. Can we do the same as in GridSearch: isinstance(self.constraints, ClassificationMoment)?
  • The ExponentiatedGradient test case depends on 0/1 predictions in the Moments, and I don't know how to fix this. For now, I'll leave the predictions as 0/1.
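
For illustration, a minimal sketch of the inference and error being described in the first issue (the helper _infer_classes is hypothetical, not code from this PR):

```python
import numpy as np

def _infer_classes(y):
    # Hypothetical helper: infer the two target classes from y and
    # return y encoded as 0/1 alongside the original class values.
    y = np.asarray(y)
    classes = np.unique(y)
    if len(classes) == 1:
        # With a single class present, self.classes_ cannot be inferred.
        raise ValueError("y contains only one class; two are required.")
    if len(classes) > 2:
        raise ValueError("y contains more than two classes.")
    return (y == classes[1]).astype(int), classes
```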

@SeanMcCarren changed the title from "WIP Arbitrary targets instead of 0/1" to "ENH WIP Arbitrary target labels instead of 0/1" on Jan 12, 2022
@romanlutz (Member)

@SeanMcCarren

In Exponentiated gradient, how can we know if y is binary or continuous? If it is continuous, we should not apply the encoding. Can we do the same as in gridsearch: isinstance(self.constraints, ClassificationMoment) ?
Yes! The Moment classes are for both GridSearch and ExponentiatedGradient.
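
For illustration, a minimal sketch of that check (assuming ClassificationMoment is importable from fairlearn.reductions; the helper name is hypothetical):

```python
from fairlearn.reductions import ClassificationMoment

def _needs_label_encoding(constraints):
    # Classification moments imply binary targets, so the 0/1 label
    # encoding should apply; loss-based moments leave y untouched.
    return isinstance(constraints, ClassificationMoment)
```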

@romanlutz (Member)

Exponentiated gradient testcase is dependent on 0/1 predictions in Moments, I don't know how to fix this. For now, I'll leave the prediction as 0/1s.

You can adjust those test cases to run multiple times, of course. Just parametrize them to have a variety of labels, e.g. (a pytest sketch of this pattern follows the list):

  • 0 and 1
  • -1 and 1
  • 0 and 2
  • (Are text labels allowed? I'm not sure, feel free to try, but it's not mandatory to have that IMO)
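
A sketch of that parametrization pattern (the test body is hypothetical; it just relabels a 0/1 target and checks the result):

```python
import numpy as np
import pytest

@pytest.mark.parametrize(
    "labels",
    [(0, 1), (-1, 1), (0, 2), ("no", "yes")],
)
def test_arbitrary_labels(labels):
    # Map a 0/1 target onto the parametrized label pair and verify
    # the relabeled target uses exactly those values.
    y = np.array([0, 1, 1, 0, 1])
    y_relabeled = np.where(y == 0, labels[0], labels[1])
    assert set(np.unique(y_relabeled)) == set(labels)
```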

@SeanMcCarren (Contributor, Author)

You can adjust those test cases to run multiple times, of course. Just parametrize it to have a variety of labels, e.g.,

Thanks for the quick response! Yes, I did parametrize, but the problem is that I added the preprocessing in ExponentiatedGradient and not in all the individual load_data functions. In test_argument_types_difference_bound there are operations on y that assume it is 0/1 and don't work for other values or strings, which I don't think should be the case. Additionally, there are operations that assume the prediction outputs are 0/1, which also shouldn't necessarily be true. I could change the test cases, but I'm wondering whether I should instead change load_data more fundamentally (though that seems wrong).
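
For illustration, one hypothetical shape the prediction-decoding step could take (not code from this PR):

```python
import numpy as np

def _decode_predictions(y_pred, classes):
    # Hypothetical helper: map internal 0/1 predictions back to the
    # original labels inferred at fit time, e.g. classes = [-1, 1].
    return np.asarray(classes)[np.asarray(y_pred, dtype=int)]
```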

@SeanMcCarren (Contributor, Author)

@adrinjalali @romanlutz Is it important that this gets done quickly? If so, you should know I'm a bit stuck; see the issues in the top post.

@adrinjalali (Member)

  • If all supplied labels are the same (i.e., only one of the two possible classes is present), then we can't infer self.classes_, so we should now raise an error. But this breaks the test cases test_threshold_optimization_degenerate_labels and test_single_y_class (the latter is for GridSearch).

I'd say that's the expected behavior. We should raise if there is only one class.

  • In ExponentiatedGradient, how can we know whether y is binary or continuous? If it is continuous, we should not apply the encoding. Can we do the same as in GridSearch: isinstance(self.constraints, ClassificationMoment)?

Generally, is_classifier should work; if it doesn't, we should change the implementation so that it does. That means you may end up having two different classes for a regressor and a classifier.
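
For reference, is_classifier comes from sklearn.base and can be checked directly:

```python
from sklearn.base import is_classifier
from sklearn.linear_model import LinearRegression, LogisticRegression

print(is_classifier(LogisticRegression()))  # True
print(is_classifier(LinearRegression()))    # False
```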

Also, this is not in the milestone, so we're not in too much of a rush for this PR.

@SeanMcCarren (Contributor, Author) commented Mar 31, 2022

Sorry, I've been quite inactive on this issue. I might pick it up in the future, but not in the coming couple of months. If you are interested in this, please feel free to contribute!
