In Chapter 2, we built the test set using stratified sampling to guarantee that it is representative of the overall population. However, in the "Better Evaluation Using Cross-Validation" section we just use Scikit-Learn's K-fold cross-validation feature, which randomly splits the training set. That means every time we train the model, we use a fold that might not be representative of the overall population. Why is this okay? Why don't we need to divide the training set into k folds using stratified sampling?
Thank you for your answer
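For context, here is a minimal sketch of what stratified fold assignment would look like, written in plain Python rather than Scikit-Learn's `StratifiedKFold` (the function name `stratified_folds` and the toy labels are illustrative assumptions, not code from the book):

```python
from collections import defaultdict

def stratified_folds(labels, k):
    """Assign each sample index to one of k folds so that every fold
    roughly preserves the overall class proportions (a simplified
    sketch of the idea behind stratified K-fold splitting)."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Round-robin within each class keeps per-class counts balanced
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds

# Toy data: 9 samples of class 0 and 3 of class 1 (a 3:1 ratio)
labels = [0] * 9 + [1] * 3
folds = stratified_folds(labels, 3)
for fold in folds:
    counts = [sum(1 for i in fold if labels[i] == c) for c in (0, 1)]
    print(counts)  # every fold keeps the 3:1 class ratio: [3, 1]
```

With plain (unstratified) K-fold, each fold's class ratio would instead fluctuate randomly around 3:1, which is the concern raised above.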