About "getting_started.ipynb" #52
Comments
Using the same training set for validation during training doesn't really make sense though. The idea is that we have to use samples the model hasn't seen before and measure the metrics on those.
Yeah, that's fair. Waiting to see their reply for that I guess now haha.
Hi guys, I fully agree with @aymuos15's reply. ;) The larger batch size is only for faster inference. We also monitor performance on the training set because metrics computed over the whole training set are more consistent than those monitored on individual training batches. I'd say it's standard practice, especially for research projects that need to report results in papers. Nevertheless, it's only a design choice with no direct impact on model performance. Feel free to customize your training pipeline, or you can also integrate with a popular training framework like PyTorch Lightning.

Jiancheng
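To illustrate the point above, here is a minimal PyTorch sketch of that design choice. The dataset, model, and sizes are hypothetical stand-ins for whatever the notebook actually uses; the pattern shown is simply a larger batch size for gradient-free evaluation, plus monitoring both the full training set and a held-out validation split.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical datasets standing in for the notebook's train/val splits.
train_dataset = TensorDataset(torch.randn(64, 3), torch.randn(64, 1))
val_dataset = TensorDataset(torch.randn(16, 3), torch.randn(16, 1))

batch_size = 8

# Training loader: activations are kept for backprop, so memory bounds batch_size.
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

# Evaluation loaders: under torch.no_grad() no activations are stored,
# so a larger batch (here 2x) fits in the same memory budget.
train_eval_loader = DataLoader(train_dataset, batch_size=batch_size * 2)
val_loader = DataLoader(val_dataset, batch_size=batch_size * 2)

model = torch.nn.Linear(3, 1)
loss_fn = torch.nn.MSELoss(reduction="sum")

@torch.no_grad()
def evaluate(loader):
    """Average loss over an entire loader, without tracking gradients."""
    model.eval()
    total, n = 0.0, 0
    for x, y in loader:
        total += loss_fn(model(x), y).item()
        n += len(x)
    return total / n

# Monitor both: full-train-set metrics are more stable than per-batch ones,
# while the val split measures performance on data the model has not seen.
print(evaluate(train_eval_loader), evaluate(val_loader))
```

Whether you evaluate on the training set at all is the design choice discussed above; the doubled batch size only affects evaluation speed and memory, never the gradients.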
Hi there,

Why did we use `batch_size * 2` when we initialize the dataloaders in the following part? Also, why did we use the `train_dataset` for evaluation during training as well? Wouldn't it be a better practice to use the `val` split of the dataset? Is there a specific reason for your choices?