valueSelector = value does not inject noise, making imputation inappropriate #21

Open
137alpha opened this issue Dec 4, 2022 · 1 comment


137alpha commented Dec 4, 2022

The valueSelector = "value" option uses the model prediction from ranger to impute the points:

    if (valueSelector == "value") {
      return(pred)
    } else {

This is easy to do but inappropriate, because it means that the imputed values will be noiseless rather than reflecting the observational error of the model.

For regression, the correct thing to do would be to add random noise to the predictions with mean zero and a standard deviation equal to the standard deviation of the OOB residuals.
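A minimal sketch of what that could look like, assuming Gaussian noise (the distribution is my assumption; only the mean and standard deviation are specified above). `add_oob_noise` is a hypothetical helper, not an existing miceRanger function, and the toy vectors stand in for a fitted model's output. It uses base R only.

```r
# Hypothetical helper (not part of miceRanger): add mean-zero noise to
# regression predictions, scaled by the spread of the OOB residuals.
add_oob_noise <- function(pred, oob_residuals) {
  # Standard deviation of the out-of-bag residuals (observed - OOB prediction)
  sigma <- sd(oob_residuals, na.rm = TRUE)
  # One mean-zero Gaussian draw per value being imputed (Gaussian is an assumption)
  pred + rnorm(length(pred), mean = 0, sd = sigma)
}

# Toy data standing in for a fitted model's output
set.seed(1)
observed <- c(2.1, 3.9, 6.2, 7.8, 10.1)
# With ranger, the OOB predictions would presumably come from model$predictions
oob_pred <- c(2.0, 4.0, 6.0, 8.0, 10.0)
oob_residuals <- observed - oob_pred

pred <- c(5.0, 5.0, 9.0)  # model predictions for the rows being imputed
imputed <- add_oob_noise(pred, oob_residuals)
```

Returning `pred` unchanged would reproduce the current valueSelector = "value" behaviour, so the two could coexist as separate options.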

@AnotherSamWilson

The imputed values will still contain some noise, since random forest training is inherently random. I do like the idea of adding noise based on the OOB residuals though; it would certainly be more appropriate for the typical use cases for MICE. I would like to keep both options, since imputing with the predicted value is useful in some cases.
