
About "getting_started.ipynb" #52

Closed
KarahanS opened this issue May 12, 2024 · 4 comments

Comments

@KarahanS

Hi there,

Why do we use batch_size * 2 when initializing the dataloaders in the following part?

train_loader = data.DataLoader(dataset=train_dataset, batch_size=BATCH_SIZE, shuffle=True)
train_loader_at_eval = data.DataLoader(dataset=train_dataset, batch_size=2*BATCH_SIZE, shuffle=False)
test_loader = data.DataLoader(dataset=test_dataset, batch_size=2*BATCH_SIZE, shuffle=False)

Also, why did we use the train_dataset for evaluation during training as well? Wouldn't it be a better practice to use val split of the dataset? Is there a specific reason for your choices?

@aymuos15

  1. The bigger batch size is just for faster inference.
  2. Regarding evaluation, I imagine it's simply to gauge train vs. test performance.
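To make point 1 concrete, here is a minimal sketch, assuming a toy model and random tensors in place of the notebook's actual MedMNIST setup (the model and dataset below are made up for illustration): evaluation runs under torch.no_grad(), so no activations are stored for backprop, and a batch twice as large fits in the same memory while halving the number of iterations.

```python
import torch
from torch import nn
from torch.utils import data

# Toy stand-ins; the notebook's real model and MedMNIST dataset differ.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
dataset = data.TensorDataset(
    torch.randn(256, 1, 28, 28), torch.randint(0, 10, (256,))
)

BATCH_SIZE = 32
# Training tracks gradients, so memory per sample is higher.
train_loader = data.DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True)
# Evaluation stores no intermediate activations, so a batch twice as
# large fits comfortably and the loop finishes in half the iterations.
eval_loader = data.DataLoader(dataset, batch_size=2 * BATCH_SIZE, shuffle=False)

model.eval()
correct = 0
with torch.no_grad():
    for x, y in eval_loader:
        correct += (model(x).argmax(dim=1) == y).sum().item()
accuracy = correct / len(dataset)
```

Note the doubled batch size only changes how many forward passes the evaluation loop makes; it has no effect on the metrics themselves.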

@KarahanS
Author

KarahanS commented May 13, 2024

Using the same training set for validation during training doesn't really make sense though. The idea is that we have to use samples the model hasn't seen before and measure the metrics on them.
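One way to get such held-out metrics is to carve a validation split out of the training data with torch.utils.data.random_split (sketched here with a dummy TensorDataset standing in for the real one):

```python
import torch
from torch.utils import data

# Dummy stand-in for the real training dataset.
full_train = data.TensorDataset(
    torch.randn(100, 1, 28, 28), torch.randint(0, 10, (100,))
)

# Hold out 20% of the training samples; the model never trains on these,
# so metrics on them reflect generalization rather than memorization.
n_val = len(full_train) // 5
train_dataset, val_dataset = data.random_split(
    full_train,
    [len(full_train) - n_val, n_val],
    generator=torch.Generator().manual_seed(0),  # reproducible split
)

train_loader = data.DataLoader(train_dataset, batch_size=32, shuffle=True)
val_loader = data.DataLoader(val_dataset, batch_size=64, shuffle=False)
```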

@aymuos15

Yeah, that's fair. Waiting to see their reply for that I guess now haha.

@duducheng
Member

Hi guys,

I fully agree with @aymuos15 's reply. ;)

The larger batch size is only for faster inference. We also monitor performance on the training set because metrics computed over the whole training set are more consistent than those monitored on individual training batches. I believe it's standard practice, especially for research projects that need to report results in papers.

Nevertheless, it's only a design choice with no direct impact on model performance. Feel free to customize your training pipeline, or integrate with a popular training framework like PyTorch Lightning.
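The consistency point above can be illustrated without any model at all: per-batch accuracy is a noisy estimate that jumps around, while accuracy over the whole set is a single stable number. A stdlib-only simulation with made-up numbers:

```python
import random

random.seed(0)
# Simulate per-sample correctness for a model whose true accuracy is ~0.8.
is_correct = [1 if random.random() < 0.8 else 0 for _ in range(1024)]

BATCH = 32
# Accuracy measured batch by batch fluctuates around the true value...
batch_accs = [
    sum(is_correct[i : i + BATCH]) / BATCH
    for i in range(0, len(is_correct), BATCH)
]
# ...while accuracy over the whole set is one stable number.
full_acc = sum(is_correct) / len(is_correct)
spread = max(batch_accs) - min(batch_accs)
```

Printing `spread` shows a noticeable gap between the best and worst batch, even though every batch comes from the same underlying accuracy.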

Jiancheng
