FastViT-autodistill error #126

Closed
1 of 2 tasks
andysingal opened this issue Feb 6, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@andysingal

Search before asking

  • I have searched the Autodistill issues and found no similar bug report.

Bug

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-4-c319f567610d> in <cell line: 17>()
     15 # base_model = FastViT(None)
     16 
---> 17 predictions = base_model.predict("example.png")
     18 
     19 labels = [FASTVIT_IMAGENET_1K_CLASSES[i] for i in predictions.class_id.tolist()]

6 frames
/usr/local/lib/python3.10/dist-packages/torchvision/transforms/_functional_tensor.py in normalize(tensor, mean, std, inplace)
    926     if std.ndim == 1:
    927         std = std.view(-1, 1, 1)
--> 928     return tensor.sub_(mean).div_(std)
    929 
    930 

RuntimeError: The size of tensor a (4) must match the size of tensor b (3) at non-singleton dimension 0
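
The mismatch at dimension 0 (4 vs 3) suggests that the image tensor has four channels while the normalization mean and std have three. One way to check this before calling predict is to read the channel layout a PNG declares in its header. The following is a minimal sketch using only the standard library; `png_channel_count` and `file_channel_count` are hypothetical helper names for illustration and are not part of autodistill or autodistill-fastvit.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Channels per PNG color type, as defined by the PNG specification.
CHANNELS_BY_COLOR_TYPE = {
    0: 1,  # grayscale
    2: 3,  # RGB (truecolor)
    3: 1,  # palette index (decoders typically expand this to RGB)
    4: 2,  # grayscale + alpha
    6: 4,  # RGBA (truecolor + alpha)
}


def png_channel_count(header: bytes) -> int:
    """Return the channel count declared in a PNG's IHDR chunk.

    `header` must hold at least the first 26 bytes of the file:
    8-byte signature, 4-byte chunk length, 4-byte chunk type,
    then width (4), height (4), bit depth (1), color type (1).
    """
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG file or IHDR chunk missing")
    _width, _height, _bit_depth, color_type = struct.unpack(">IIBB", header[16:26])
    return CHANNELS_BY_COLOR_TYPE[color_type]


def file_channel_count(path: str) -> int:
    """Read just enough of a PNG file to report its declared channel count."""
    with open(path, "rb") as f:
        return png_channel_count(f.read(26))
```

If `file_channel_count("example.png")` reports 4, the file is stored as RGBA, which would be consistent with the error above.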

Environment

pip install autodistill-fastvit roboflow transformers -Uq

Minimal Reproducible Example

from autodistill_fastvit import FastViT, FASTVIT_IMAGENET_1K_CLASSES
from autodistill.detection import CaptionOntology

# zero shot with prompts from FASTVIT_IMAGENET_1K_CLASSES
base_model = FastViT(
    ontology=CaptionOntology(
        {
            "Beagle": "beagle",
            "Border Collie": "collie"
        }
    )
)

# zero shot without prompts
# base_model = FastViT(None)

predictions = base_model.predict("example.png")

labels = [FASTVIT_IMAGENET_1K_CLASSES[i] for i in predictions.class_id.tolist()]

print(labels)


Additional

No response

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!
andysingal added the bug label Feb 6, 2024
@capjamesg
Member

This error is now fixed. The issue was that the image had four channels -- RGBA -- and the module only supports RGB images. The requisite conversion code has been added to autodistill-fastvit, and we will likely incorporate similar logic in the main autodistill load_image function.

To use the updated code, run pip install --upgrade autodistill-fastvit.
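
For anyone who cannot upgrade right away, a possible workaround is to convert the image to RGB before passing it to predict. The sketch below uses Pillow and is an assumption about what such a conversion could look like, not the code that was added to autodistill-fastvit; `load_as_rgb` and `save_rgb_copy` are hypothetical helper names.

```python
from PIL import Image


def load_as_rgb(path: str) -> Image.Image:
    """Open an image and return a 3-channel RGB copy."""
    with Image.open(path) as img:
        # convert("RGB") returns a new image; for RGBA input the alpha band is dropped
        return img.convert("RGB")


def save_rgb_copy(src_path: str, dst_path: str) -> str:
    """Write an RGB version of src_path to dst_path and return dst_path."""
    load_as_rgb(src_path).save(dst_path)
    return dst_path


# Hypothetical usage with the model from the issue (file names are illustrative):
# predictions = base_model.predict(save_rgb_copy("example.png", "example_rgb.png"))
```

Note that `convert("RGB")` discards the alpha channel without compositing onto a background, so transparent pixels keep whatever RGB values they were stored with.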
