Unable to obtain results of ResNet50 v1 #8363

Open
netw0rkf10w opened this issue Mar 31, 2024 · 1 comment

@netw0rkf10w
Contributor

🐛 Describe the bug

Hello,

I have tried using the reference classification training code to train ResNet50 on ImageNet. I would like to reproduce the results of the classic recipe (i.e., V1), with the step learning rate schedule and so on. I ran the following:

GPUs=8
BATCH=32
MODEL=resnet50
OPT=sgd
LRSCHEDULER=steplr
LR=0.1
WD=1e-2
EPOCHS=90
torchrun --nproc_per_node=${GPUs} train.py --model ${MODEL} --data-path ${DATA_PATH} --batch-size ${BATCH} --opt ${OPT} --lr ${LR} --lr-scheduler ${LRSCHEDULER} --epochs ${EPOCHS} --weight-decay ${WD} --norm-weight-decay 0.0 --model-ema

However, the results were very poor (about 1% validation accuracy after 6 epochs). Could you please tell me whether this is expected?
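
As a rough sanity check of the settings in the command above (a sketch, not a diagnosis), the snippet below computes the effective global batch size and how much the weight-decay term alone would shrink a weight over one epoch of plain SGD. The ImageNet-1k training set size (1,281,167 images) and the comparison value of 1e-4 for weight decay are assumptions not stated in this issue. The helper `decay_factor_per_epoch` is hypothetical, and the calculation ignores momentum and the loss gradient.

```python
# Values taken from the command above.
gpus = 8
batch_per_gpu = 32
lr = 0.1
weight_decay = 1e-2

# Assumption: ImageNet-1k training set size (not stated in the issue).
train_images = 1_281_167

# Samples consumed per optimizer step across all GPUs.
global_batch = gpus * batch_per_gpu
# Approximate number of optimizer steps per epoch (ignores the last partial batch).
steps_per_epoch = train_images // global_batch


def decay_factor_per_epoch(lr: float, wd: float, steps: int) -> float:
    """Hypothetical helper: shrink factor from the decay term alone.

    Plain SGD with L2 weight decay multiplies each weight by (1 - lr * wd)
    per step, if the loss gradient and momentum are ignored.
    """
    return (1.0 - lr * wd) ** steps


print(f"global batch size: {global_batch}")
print(f"steps per epoch:   {steps_per_epoch}")
print(f"decay factor, wd=1e-2: {decay_factor_per_epoch(lr, weight_decay, steps_per_epoch):.4f}")
# Assumption: 1e-4 is used here only as a comparison value.
print(f"decay factor, wd=1e-4: {decay_factor_per_epoch(lr, 1e-4, steps_per_epoch):.4f}")
```

Under these assumptions, the decay term with `WD=1e-2` shrinks weights by more than two orders of magnitude per epoch, while `1e-4` shrinks them by only a few percent. This makes the weight-decay value one candidate worth checking, although it does not by itself confirm the cause of the low accuracy.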

Thank you very much in advance!

Versions

PyTorch 2.2, NVIDIA V100.

@netw0rkf10w
Contributor Author

Cc @datumbox
