Error when using recall as primary automl objective #1973

Closed
npapan69 opened this issue Mar 12, 2021 · 4 comments

@npapan69

Dear All,
I have a binary classification problem where false negatives (FN) cost more than false positives (FP). When I try to use recall as the objective, I get the following error:

ValueError: recall is not allowed in AutoML! Use evalml.objectives.utils.get_core_objective_names() to get all objective names allowed in automl.

I went through the allowed names and I can see recall is there.

Any suggestions?

Many thanks in advance
Nikos

@freddyaboulton
Contributor

Hello @npapan69 !

We discourage using recall as a primary objective in automl search because a trivial pipeline that always predicts the positive class will produce a perfect recall score, so automl is incentivized to find trivial pipelines. See issue #476. That's why recall is not in our list of core objectives, as the error says.
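To illustrate why recall alone rewards trivial pipelines, here is a minimal sketch in plain Python. It does not use evalml; the `recall` and `precision` helpers and the toy labels are made up for illustration only.

```python
def recall(y_true, y_pred):
    """Recall for the positive class (label 1): TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def precision(y_true, y_pred):
    """Precision for the positive class (label 1): TP / (TP + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp)


# Hypothetical toy data: 2 positives out of 10 samples.
y_true = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]

# A "trivial pipeline" that ignores its input and always predicts positive.
always_positive = [1] * len(y_true)

trivial_recall = recall(y_true, always_positive)
trivial_precision = precision(y_true, always_positive)

print("recall:", trivial_recall)
print("precision:", trivial_precision)
```

The always-positive predictor never produces a false negative, so its recall is perfect, while its precision only reflects the share of positives in the data.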

However, you can still use recall as an automl objective! Just pass it as an instance rather than a string:

from evalml.demos import load_breast_cancer
from evalml.automl import AutoMLSearch
from evalml.objectives import Recall
X, y = load_breast_cancer()
automl = AutoMLSearch(X, y, problem_type="binary", objective=Recall())

automl.search()


@dsherry changed the title from "Error" to "Error when using recall as primary automl objective" on Mar 15, 2021
@dsherry
Contributor

dsherry commented Mar 15, 2021

Thanks @freddyaboulton !

I had one more suggestion: if you want automl to know that false negatives on your problem should cost more than false positives, you could try using the CostBenefitMatrix objective. Its values are payoffs where higher is better, so costs are entered as negative numbers:

import evalml
from evalml.automl import AutoMLSearch
# Payoff per outcome: a false negative is penalized 10x more than a false positive.
obj = evalml.objectives.CostBenefitMatrix(true_positive=1.0, true_negative=1.0, false_positive=-1.0, false_negative=-10.0)
automl = AutoMLSearch(X, y, problem_type="binary", objective=obj)
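To make the idea of a cost-benefit objective concrete, here is a minimal sketch in plain Python of how such a score could be computed: count each confusion-matrix outcome and weight it by a payoff. This is not evalml's implementation. The `cost_benefit_score` helper, the toy labels, and the payoff values are hypothetical, and the payoffs are chosen for a case where a false negative costs more than a false positive.

```python
def cost_benefit_score(y_true, y_pred, payoffs):
    """Sum of per-outcome payoffs; higher is better, negative values are costs."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            counts["tp"] += 1
        elif t == 0 and p == 0:
            counts["tn"] += 1
        elif t == 0 and p == 1:
            counts["fp"] += 1
        else:
            counts["fn"] += 1
    return sum(counts[k] * payoffs[k] for k in counts)


# Hypothetical payoffs: missing a positive is 10x worse than a false alarm.
payoffs = {"tp": 1.0, "tn": 1.0, "fp": -1.0, "fn": -10.0}

# Hypothetical toy data: 2 positives out of 10 samples.
y_true = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]

always_positive = [1] * len(y_true)           # perfect recall, many false alarms
always_negative = [0] * len(y_true)           # misses every positive
careful = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]      # finds both positives, one false alarm

score_positive = cost_benefit_score(y_true, always_positive, payoffs)
score_negative = cost_benefit_score(y_true, always_negative, payoffs)
score_careful = cost_benefit_score(y_true, careful, payoffs)

print(score_positive, score_negative, score_careful)
```

Under these payoffs the always-positive predictor is no longer the best option, and missing positives is penalized hardest, which is the trade-off a recall-only objective cannot express.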

@tyler3991
Contributor

Thanks for filing the issue @npapan69 ! I'm closing it since @dsherry and @freddyaboulton chimed in. Please reach out if you have any additional questions!

@npapan69
Author

npapan69 commented Mar 18, 2021 via email
