Add method for regression #571
Comments
I am not opposed to it. |
closing in favor of #105 |
Hi @glemaitre am I right that currently only Thanks! |
@bluemonk482 the name of the models you mentioned ends with
Currently no, but we are interested in including an implementation of such a method. |
Thanks @chkoar ! I assume it is more complex than simply changing |
Yes, because you need to understand the problem and design a proper
resampling strategy in the context of regression, which is not really
straightforward, and there is almost no literature on this.
…On Tue, 30 Jul 2019 at 15:13, bluemonk482 ***@***.***> wrote:
Thanks @chkoar <https://github.com/chkoar> !
I assume it is more complex than simply changing class
BalancedRandomForestClassifier(RandomForestClassifier) to class
BalancedRandomForestClassifier(RandomForestRegressor) in
https://github.com/scikit-learn-contrib/imbalanced-learn/blob/c0aa81c40173bd28b863ccc1b82bbafcacb240c4/imblearn/ensemble/_forest.py
???
--
Guillaume Lemaitre
INRIA Saclay - Parietal team
Center for Data Science Paris-Saclay
https://glemaitre.github.io/
|
Understood. Thanks @glemaitre ! |
@glemaitre this thread is such a godsend for me! So, as I understand it, there is presently no way to generate synthetic data for regression problems, where the output variable Y is a continuous value. Is that correct? |
I am reopening this issue. We could make a generic tool that quantizes the target so that any sampler can be applied. We could think about a meta-estimator to do the job. This would require what is called a relevance function. |
I believe these are relevant for this issue:
|
She wrote several papers on the topic and has some of them implemented in R. |
I think the simplest way to do it without adding new methods is to discretize the target (uniformly or with k-means; quantiles won't do), then fit an oversampler, and then apply an inverse transform (assign midrange bin values instead of bin numbers). It should work through Pipeline and TargetTransformer. |
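To make the suggestion above concrete, here is a minimal sketch of the discretize / resample / inverse-transform idea, assuming uniform bins. The names `resample_regression` and `TinyRandomOverSampler` are hypothetical and not part of imbalanced-learn. The tiny sampler is only a numpy stand-in so the snippet runs without extra dependencies. Any object exposing `fit_resample(X, y)` should slot in the same way.

```python
import numpy as np


class TinyRandomOverSampler:
    """Hypothetical numpy-only stand-in for a sampler with fit_resample."""

    def __init__(self, random_state=0):
        self.random_state = random_state

    def fit_resample(self, X, y):
        rng = np.random.default_rng(self.random_state)
        classes, counts = np.unique(y, return_counts=True)
        target = counts.max()
        idx = [np.arange(len(y))]  # keep every original sample
        for cls, count in zip(classes, counts):
            if count < target:
                members = np.flatnonzero(y == cls)
                # Duplicate random members until the bin reaches the majority size.
                idx.append(rng.choice(members, size=target - count, replace=True))
        idx = np.concatenate(idx)
        return X[idx], y[idx]


def resample_regression(X, y, sampler, n_bins=5):
    """Bin y uniformly, resample on the bin labels, map labels to bin midpoints."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    edges = np.linspace(y.min(), y.max(), n_bins + 1)
    # Use interior edges only, so labels fall in 0..n_bins-1
    # and y.max() lands in the last bin.
    labels = np.digitize(y, edges[1:-1])
    X_res, labels_res = sampler.fit_resample(X, labels)
    # "Inverse transform": replace each bin number with its midrange value.
    midpoints = (edges[:-1] + edges[1:]) / 2.0
    return X_res, midpoints[labels_res]


rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))
y = rng.exponential(scale=1.0, size=200)  # skewed target: large values are rare
X_res, y_res = resample_regression(X, y, TinyRandomOverSampler(), n_bins=4)
```

Note that every returned target is a bin midpoint, so the original targets are coarsened as well; that is the price of this shortcut. A neighbour-based sampler such as SMOTE may also fail on bins that contain only a handful of samples, so the number of bins would need some care.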
I also vote for SMOTER. I don't want to have to download a different package https://pypi.org/project/smogn/ to do SMOTE with regression problems. |
As the title says. I found a method implemented in R:
https://github.com/paobranco/Pre-processingApproachesImbalanceRegression
and the corresponding paper:
https://www.semanticscholar.org/paper/SMOTE-for-Regression-Torgo-Ribeiro/43cda672b9ac0833086e19c90d42c2c0fbc361c6