Automated estimation of the number of resamplings given the size of the training data #231
Labels: contributors, enhancement, good first issue
Is your feature request related to a problem? Please describe.
When defining the "cv" splitter with the Subsample class, both "n_resamplings" and "n_samples" must be provided. If "n_resamplings" is too small for the chosen "n_samples", the following warning is raised:
"WARNING: at least one point of training set belongs to every resamplings. Increase the number of resamplings"
Describe the solution you'd like
I think it would be beneficial to have an automated way to estimate "n_resamplings" given "n_samples". For instance, a user could fix "n_samples" as a fraction of the training set: n_samples = int(0.25 * gral_train_inputs.shape[0])
"n_resamplings" would then be determined automatically from the size of the training data.
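To make the request concrete, here is a rough sketch of one possible heuristic. It is not part of the library: the function name `estimate_n_resamplings` and the `failure_prob` parameter are hypothetical. It assumes that each resampling draws "n_samples" indices uniformly at random from the training set, with or without replacement, and that the warning goes away once every training point is left out of at least one resampling. A union bound over the training points then gives the number of resamplings needed for this to hold with probability at least 1 - failure_prob.

```python
import math


def estimate_n_resamplings(n_train, n_samples, replace=True, failure_prob=0.01):
    """Hypothetical helper: heuristic number of resamplings.

    Assumes each resampling draws `n_samples` indices uniformly at random
    from `n_train` training points. Returns the smallest k such that, by a
    union bound, every point is left out of at least one resampling with
    probability >= 1 - failure_prob.
    """
    if n_train < 2 or n_samples < 1:
        raise ValueError("need n_train >= 2 and n_samples >= 1")
    if not 0.0 < failure_prob < 1.0:
        raise ValueError("failure_prob must be in (0, 1)")

    if replace:
        # Probability that a given point appears at least once in one draw
        # of n_samples indices taken with replacement.
        p_in = 1.0 - (1.0 - 1.0 / n_train) ** n_samples
    else:
        if n_samples >= n_train:
            raise ValueError("without replacement, n_samples must be < n_train")
        # Probability that a given point is among n_samples distinct indices.
        p_in = n_samples / n_train

    # Union bound: n_train * p_in ** k <= failure_prob, solved for k.
    k = math.log(failure_prob / n_train) / math.log(p_in)
    return max(1, math.ceil(k))
```

With n_samples = int(0.25 * gral_train_inputs.shape[0]), the result could then be passed as "n_resamplings" when building the Subsample splitter. Since this is only a probabilistic bound, a given random seed may still occasionally trigger the warning.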
Describe alternatives you've considered
In my case, I decided to fix "n_samples" as shown above. But now I have to find, by trial and error, the minimum "n_resamplings" that avoids the warning message and ensures good statistical results.
Kind regards,
Ivan