get_preds gives me non-deterministic results when using dls.train #3987

Open
damiankucharski opened this issue Dec 2, 2023 · 0 comments

Please confirm you have the latest versions of fastai, fastcore, and nbdev prior to reporting a bug (delete one): YES

Describe the bug

I am trying to run inference with a resnet18 model I trained earlier. I want predictions for both the training and validation datasets.
For the validation dataset the predictions are consistent. For the training dataset they are not: running the same code (only specifying dls.train) produces different predictions every time. This is not just the data and predictions being shuffled, because the overall loss differs between runs too.

To Reproduce

This snippet of code defines the data loader and shows the way I want to get predictions:

from fastai.vision.all import *

path = "./data/"
test_dblock = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,  # Function to get image files
    get_y=parent_label,  # Get label from parent folder name
    splitter=GrandparentSplitter(valid_name="test"),  # Split based on grandparent folder name
    item_tfms=Resize(460),  # Scale image
)
dls_test = test_dblock.dataloaders(path, bs=16, shuffle=False)

learn = vision_learner(dls_test, model_class, metrics=F1Score(), pretrained=True)
learn.load(f"models_deploy/{model_name}_fold_{fold_number}_epoch_{epoch_number}")

preds_train, y_train = learn.get_preds(dl=dls_test.train)

Expected behavior

Every time the preds_train output tensor should be identical.
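
A minimal sketch of how the expectation above could be checked independently of fastai. The helper names `max_abs_diff` and `is_deterministic` are hypothetical and not part of any library; `predict` stands for any zero-argument callable that returns a flat sequence of floats.

```python
from typing import Callable, Sequence


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def is_deterministic(
    predict: Callable[[], Sequence[float]], runs: int = 3, tol: float = 0.0
) -> bool:
    """Call `predict` several times and report whether every run matches the first.

    `tol` is the largest per-element difference still counted as "identical".
    """
    reference = list(predict())
    for _ in range(runs - 1):
        if max_abs_diff(reference, list(predict())) > tol:
            return False
    return True
```

With the snippet above, `predict` could be something like `lambda: learn.get_preds(dl=dls_test.train)[0].flatten().tolist()`. Comparing element-wise only makes sense if the item order is the same on every run.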

Error with full stack trace

Multiple runs of this code produce different results.
Additionally, if I change splitter=GrandparentSplitter(valid_name="test") to splitter=GrandparentSplitter(valid_name="train")
and call preds_train, y_train = learn.get_preds(dl=dls_test.valid) instead, the behavior is as expected.
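
The fact that the same images give stable predictions once they sit in the validation split suggests a transform that behaves differently per split. One possible explanation, which is an assumption and not confirmed in this issue, is that `Resize` with its default crop method picks a random crop position on the training split and a center crop on the validation split. The toy function below (`crop_offsets` is a hypothetical name, not fastai code) only illustrates that mechanism:

```python
import random


def crop_offsets(width, height, size, split_idx, rng=random):
    """Return the (left, top) corner of a size x size crop.

    split_idx == 0 mimics a training split: the crop position is random.
    Any other split_idx mimics a validation split: the crop is centered.
    This is a simplified illustration, not the library's implementation.
    """
    if split_idx:
        px, py = 0.5, 0.5                    # centered, same on every call
    else:
        px, py = rng.random(), rng.random()  # random, differs between calls
    left = int(px * (width - size))
    top = int(py * (height - size))
    return left, top


# Validation-style call: always the same crop.
print(crop_offsets(600, 460, 460, split_idx=1))
# Training-style call: the crop moves from call to call.
print(crop_offsets(600, 460, 460, split_idx=0))
```

If this is indeed the cause, a resize method that does not crop randomly (for example `Resize(460, method=ResizeMethod.Squish)`) would be one thing to try. Whether that is acceptable depends on how the model was trained.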
