No shuffling of examples in introduction notebook #54

Open
lamblin opened this issue Nov 6, 2020 · 2 comments
Labels
bug Something isn't working

Comments

@lamblin
Collaborator

lamblin commented Nov 6, 2020

We realized that in the introduction notebook, the usage examples given for make_multisource_episode_pipeline did not set the shuffle_buffer_size parameter, which defaults to not shuffling examples within each class.
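
For illustration, the call in the notebook looked roughly like the sketch below (the module path and the variables such as all_dataset_specs or episode_config are placeholders for objects built earlier in the notebook; the point is only that shuffle_buffer_size is absent):

```python
from meta_dataset.data import pipeline  # assumed module path

# Notebook-style call (sketch): shuffle_buffer_size is never passed, so it keeps
# its default and examples within each class are read in their stored order.
episode_dataset = pipeline.make_multisource_episode_pipeline(
    dataset_spec_list=all_dataset_specs,          # dataset specs built earlier
    use_dag_ontology_list=use_dag_ontology_list,
    use_bilevel_ontology_list=use_bilevel_ontology_list,
    episode_descr_config=episode_config,          # episode sampling configuration
    split=split,
    image_size=84)
```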

We identified two unfortunate consequences in code that does not shuffle examples:

  • Evaluation on the traffic_sign dataset was overly optimistic, since the examples were organized as 30-image sequences of pictures of the same physical sign (successive frames from the same video), so support and query examples were frequently nearly identical.
  • Training on small datasets can suffer, since the first examples of a given class would always tend to become support examples and the later ones query examples, reducing the diversity of episodes.

Code using the training loop of Meta-Dataset was not affected, since it gets its shuffle_buffer_size value from a DataConfig object set from a gin configuration that is explicitly passed to Trainer's constructor (in all.gin and imagenet.gin).
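
Concretely, the Trainer path reads a binding along these lines from the gin files (the value below is illustrative, not necessarily the one in all.gin or imagenet.gin):

```
# Illustrative gin binding: Trainer builds a DataConfig from the parsed config,
# so shuffle_buffer_size ends up being set when going through the training loop.
DataConfig.shuffle_buffer_size = 1000
```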

We have mitigated the first point by updating the dataset conversion code to shuffle the traffic_sign images once (3512a82), and updated the notebook to show better practice (c3f62a1), but existing converted datasets and code inspired by the notebook (outside this repository) are still affected.
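
The better practice shown in the updated notebook boils down to passing the parameter explicitly, e.g. (same placeholders as in the sketch above, and the buffer size here is illustrative rather than the value the notebook settled on):

```python
# Same call as above, but with a non-trivial shuffle buffer so that examples
# within each class are shuffled before support/query sets are sampled.
episode_dataset = pipeline.make_multisource_episode_pipeline(
    dataset_spec_list=all_dataset_specs,
    use_dag_ontology_list=use_dag_ontology_list,
    use_bilevel_ontology_list=use_bilevel_ontology_list,
    episode_descr_config=episode_config,
    split=split,
    image_size=84,
    shuffle_buffer_size=300)  # illustrative; any large-enough buffer enables shuffling
```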

Similarly, make_multisource_batch_pipeline does not pass a shuffle_buffer_size, but the impact seems much smaller (batch training should be less sensitive to the order of examples, and the random mixing of different classes already adds randomness).

@lamblin added the bug label Nov 6, 2020
@lamblin
Collaborator Author

lamblin commented Dec 15, 2020

Validation on unshuffled examples may also have produced biased estimates, depending on how it was carried out, which could lead to sub-optimal results.

@lamblin
Collaborator Author

lamblin commented Dec 17, 2020

To clarify further: the code in the notebook does not create the DataConfig object, either explicitly or implicitly, so setting DataConfig.shuffle_buffer_size (for instance through a gin binding) does not have any effect when calling pipeline.make_..._pipeline().
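
In other words, something like the following has no effect when episodes are built directly from the pipeline functions (the value is illustrative, and DataConfig is assumed to live in meta_dataset.data.config):

```python
import gin
from meta_dataset.data import config  # assumed location of the gin-configurable DataConfig

# Parsed, but never consumed in the notebook's code path: nothing there constructs
# a DataConfig, so this binding does not change how episodes are built.
gin.parse_config('DataConfig.shuffle_buffer_size = 1000')

# What does take effect in the notebook is the explicit keyword argument, as in the
# make_multisource_episode_pipeline sketch in the original comment above.
```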
