Fix AUC metric #3935
@Yard1 @moezali1 @tvdboom @glemaitre @ogrisel @thomasjpfan @lorentzenchr @adrinjalali something is wrong with the AUC metrics, and I have no idea how to fix this pull request.
Could you please provide a minimal reproducer on synthetic data that ideally only involves scikit-learn? Working on crafting such a reproducer will likely help you understand what's going on.
@ngupta23 can you help?
@@ -115,10 +116,11 @@ def __init__(
    if scorer
    else pycaret.internal.metrics.make_scorer_with_error_score(
        score_func,
        needs_proba=target == "pred_proba",
        needs_threshold=target == "threshold",
        response_method=None,
If this is calling scikit-learn's make_scorer under the covers, then you can pass in the response_method directly here.
if target == "pred":
    response_method = "predict"
elif target == "pred_proba":
    response_method = "predict_proba"
else:  # threshold
    response_method = "decision_function"
...
    else pycaret.internal.metrics.make_scorer_with_error_score(
        score_func,
        response_method=response_method,
        greater_is_better=greater_is_better,
        error_score=0.0,
    )
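For illustration only, the same mapping can be sketched as a small lookup helper. The names `resolve_response_method` and `_RESPONSE_METHODS` are hypothetical and exist in neither pycaret nor scikit-learn. Unlike the `else` branch above, this variant assumes an unknown `target` should raise instead of silently falling back to `decision_function`.

```python
# Hypothetical helper for illustration; not part of pycaret or scikit-learn.
# Maps the `target` strings used in the snippet above to the estimator
# method names that a scorer's `response_method` argument expects.
_RESPONSE_METHODS = {
    "pred": "predict",
    "pred_proba": "predict_proba",
    "threshold": "decision_function",
}


def resolve_response_method(target: str) -> str:
    """Translate a `target` string into a response method name."""
    try:
        return _RESPONSE_METHODS[target]
    except KeyError:
        # Assumption: failing loudly is preferable to a silent fallback.
        raise ValueError(
            f"Unknown target {target!r}; expected one of "
            f"{sorted(_RESPONSE_METHODS)}"
        ) from None


if __name__ == "__main__":
    for name in ("pred", "pred_proba", "threshold"):
        print(name, "->", resolve_response_method(name))
```

The returned string could then be passed as `response_method=resolve_response_method(target)` in the call shown above, assuming the wrapper forwards that argument to scikit-learn's `make_scorer`.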
I tested this, but it is still not working.
logs.log shows:
2024-03-12 18:16:58,428:WARNING:C:\Users\celes\anaconda3\lib\site-packages\pycaret\internal\metrics.py:196: FitFailedWarning: Metric 'make_scorer(roc_auc_score, response_method=('decision_function', 'predict_proba'), average=weighted, multi_class=ovr)' failed and error score 0.0 has been returned instead. If this is a custom metric, this usually means that the error is in the metric code. Full exception below:
Traceback (most recent call last):
File "C:\Users\celes\anaconda3\lib\site-packages\pycaret\internal\metrics.py", line 188, in _score
return super()._score(
File "C:\Users\celes\anaconda3\lib\site-packages\sklearn\metrics\_scorer.py", line 345, in _score
y_pred = method_caller(
File "C:\Users\celes\anaconda3\lib\site-packages\sklearn\metrics\_scorer.py", line 87, in _cached_call
result, _ = _get_response_values(
File "C:\Users\celes\anaconda3\lib\site-packages\sklearn\utils\_response.py", line 210, in _get_response_values
y_pred = prediction_method(X)
File "C:\Users\celes\anaconda3\lib\site-packages\pycaret\internal\pipeline.py", line 341, in predict_proba
Xt = transform.transform(Xt)
File "C:\Users\celes\anaconda3\lib\site-packages\sklearn\utils\_set_output.py", line 295, in wrapped
data_to_wrap = f(self, X, *args, **kwargs)
File "C:\Users\celes\anaconda3\lib\site-packages\pycaret\internal\preprocess\transformers.py", line 233, in transform
X = to_df(X, index=getattr(y, "index", None))
File "C:\Users\celes\anaconda3\lib\site-packages\pycaret\utils\generic.py", line 103, in to_df
data = pd.DataFrame(data, index, columns)
File "C:\Users\celes\anaconda3\lib\site-packages\pandas\core\frame.py", line 822, in __init__
mgr = ndarray_to_mgr(
File "C:\Users\celes\anaconda3\lib\site-packages\pandas\core\internals\construction.py", line 319, in ndarray_to_mgr
values = _prep_ndarraylike(values, copy=copy_on_sanitize)
File "C:\Users\celes\anaconda3\lib\site-packages\pandas\core\internals\construction.py", line 575, in _prep_ndarraylike
values = np.array([convert(v) for v in values])
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.
warnings.warn(
2024-03-12 18:16:58,428:WARNING:C:\Users\celes\anaconda3\lib\site-packages\sklearn\metrics\_classification.py:1561: UserWarning: Note that pos_label (set to 'MM') is ignored when average != 'binary' (got 'weighted'). You may use labels=[pos_label] to specify a single positive class.
@thomasjpfan Can you help investigate whether the problem is in pycaret or scikit-learn? I'm running tests on my laptop, but I'm not sure where the error is or how to fix it.
I do not have the bandwidth to investigate.
I do not have the bandwidth to investigate.
But could you ask someone on the scikit-learn side for support?
You need to debug to see whether it is a pycaret bug or a scikit-learn bug. If it is a scikit-learn bug, then open an issue with a minimal reproducer that only involves scikit-learn.
#3935 (comment) is not a valid reproducer for scikit-learn because it is still using pycaret.
Does any kind soul have the time and ability to fix this problem? Ping: @Yard1 @ngupta23 @tvdboom @moezali1 @TremaMiguel @daikikatsuragawa @timho102003 @andrinbuerli @goodwanghan @drmario-gh @AJarman @reza1615 @batmanscode @sherpan @IncubatorShokuhou @wkuopt @cspartalis @ryanxjhan @jinensetpal
@Aloqeely can you help fix bugs in pycaret?
Sorry, I am not familiar with PyCaret.
Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>
some details have not been updated to support the latest scikit-learn 1.4 code
Closes #3932