
Cannot reproduce the claimed 98% test accuracy after training in my_first_few_shot_classifier #134

Open
martin0258 opened this issue Jan 30, 2024 · 4 comments
Labels
question Further information is requested

Comments

martin0258 commented Jan 30, 2024

Problem
I'm running my_first_few_shot_classifier.ipynb without modifying anything, and my test accuracy after training was only 86.90%: almost no improvement over no training at all (86.44%), and far below the 98% test accuracy claimed in the notebook.

Eval before training

100/100 [00:00<00:00, 197.05it/s]
Model tested on 100 tasks. Accuracy: 86.44%

Training log (the loss fluctuated between roughly 0.2 and 0.3)

100%|████████████████████| 40000/40000 [07:58<00:00, 83.61it/s, loss=0.305]

Eval after training

100/100 [00:00<00:00, 194.86it/s]
Model tested on 100 tasks. Accuracy: 86.90%

Considered solutions
None yet; maybe it's caused by the seed or by different package versions?
The fact that the loss did not go down during training may be a big clue to the reproducibility problem.

How can we help
Any idea what I might be missing?

My environment

  • GPU: RTX 3090 (24GB RAM)
  • CUDA: nvcc --version (Cuda compilation tools, release 11.8, V11.8.89)
  • OS: Ubuntu 20.04
  • Python 3.10
  • Python package versions
easyfsl==1.5.0
torch==2.1.2
torchvision==0.16.2
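(For comparing environments across machines, a quick way to dump the installed versions is a sketch like the one below, using the standard-library `importlib.metadata`; the package names are the ones listed above.)

```python
# Sketch: print the installed versions of the packages listed above,
# so that environments can be compared across machines.
# importlib.metadata is in the standard library from Python 3.8 on.
from importlib.metadata import version, PackageNotFoundError

for pkg in ("easyfsl", "torch", "torchvision"):
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```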
@martin0258 martin0258 added the question Further information is requested label Jan 30, 2024
ebennequin (Collaborator) commented

Hi. I ran the notebook on Colab and could not reproduce the error:

  • the loss decreases during training (from ~1.2 to ~0.25)
  • 98.18% accuracy during evaluation

My versions for easyfsl, torch and torchvision match yours.

The random seed seems like an unlikely cause for a difference of this size. You can test that theory by fixing the seeds and running the training again.
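(For reference, fixing the seeds could look like the sketch below. This assumes the usual PyTorch/NumPy setup from the notebook; the cuDNN flags additionally trade speed for deterministic convolutions and are not strictly required for the seed test.)

```python
# Sketch: pin all relevant random seeds before training.
import random

import numpy as np
import torch

def set_seed(seed: int = 0) -> None:
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # no-op on CPU-only machines
    # Trade speed for determinism in cuDNN convolutions
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

# Re-seeding reproduces the same random draws:
set_seed(0)
a = torch.rand(3)
set_seed(0)
b = torch.rand(3)
print(torch.equal(a, b))  # True
```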

Did you make any change to the notebook?

martin0258 (Author) commented

@ebennequin Hi, thanks for the quick reply!

The only change I made was commenting out the following two lines to avoid using your trained weights:

# !wget https://public-sicara.s3.eu-central-1.amazonaws.com/easy-fsl/resnet18_with_pretraining.tar
# model.load_state_dict(torch.load("resnet18_with_pretraining.tar", map_location="cuda"))


martin0258 commented Jan 30, 2024

@ebennequin Update: I just ran the notebook on Colab (changing nothing) and after training on a T4 GPU I got Accuracy: 90.90%, better than in my local environment but still a large gap compared with your result:
[Screenshot: Colab evaluation output]

ebennequin (Collaborator) commented

I also didn't load the pretrained weights. I'm sorry, but I am unable to reproduce or explain this gap in the results.

I am keeping this issue open for now. Feel free to share any new findings that could help us solve the issue.
