Handle numeric NA values #187
…eplacing numeric NA values)
… using the utility function replace_NA_num()
replace_num_NA <- TRUE
…esponding utility functions)
…alues in numeric predictors. Fixed by adding as.vector() when determining cue class.
Hi Hans! Just lurking here, having read only the description and not the full code. Is this PR really enabling missing-value imputation logic within FFTrees? If so, is it enabled by default or not? For both training and test data, or just one? Sorry if I missed the description in NEWS or the README, but I didn't see it. Thanks!
Hi Nathaniel, good points, of course. Some clarifications to answer your questions:
Yes, it’s replacing NA values in numeric predictors by their mean (per predictor), as commonly done in simulation studies. (Other replacement policies could easily be implemented in the same way.)
Yes, but there are several global constants that allow enabling / disabling functionality for handling NA values in data. The relevant ones here are: -
Both are currently set to If
Both training and test data are handled in the same way, and detecting and replacing NA values issues corresponding warnings for both.
Overall, and especially in combination with the default handling of NA values in categorical predictors (as distinct categories), these changes should provide fairly comprehensive options for a variety of cases. I’d be happy to discuss which of the options should become defaults and which should become user-controlled parameters (of the main
Looking forward to hearing your thoughts,
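For readers following along, the mean-replacement policy described in the reply above can be sketched in a few lines of R. This is only an illustration under stated assumptions: `impute_num_mean()` is a hypothetical helper written for this example and is not the package's `replace_NA_num()`, whose actual signature and behavior are not shown in this thread. The sketch computes the mean from whichever data set it is given; whether test data should reuse the training means is not specified here.

```r
# Hypothetical helper (NOT the package's replace_NA_num()):
# replace NA values in each numeric column by that column's mean.
impute_num_mean <- function(data) {
  stopifnot(is.data.frame(data))

  for (nm in names(data)) {
    x <- data[[nm]]

    # Only numeric predictors are touched; other columns keep their NA values.
    if (is.numeric(x) && anyNA(x)) {
      n_na <- sum(is.na(x))
      x[is.na(x)] <- mean(x, na.rm = TRUE)  # NaN if the whole column is NA
      data[[nm]] <- x

      # Mirror the behavior described above: replacing values issues a warning.
      warning(sprintf("Replaced %d NA value(s) in '%s' by its mean.", n_na, nm),
              call. = FALSE)
    }
  }

  data
}

# Toy data: one numeric and one categorical predictor, each with an NA.
train <- data.frame(age = c(20, NA, 40), sex = c("f", NA, "m"))
train_filled <- suppressWarnings(impute_num_mean(train))
```

In this toy example the missing `age` value is replaced by the mean of the observed values, while the NA in the categorical column `sex` is left alone, in line with the separate handling of categorical predictors mentioned above.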
This PR improves the handling of NA values in numeric predictors (by adding the option of replacing NA by means) and revises corresponding tests.