FastViT-autodistill error #126

andysingal · 2024-02-06T03:25:48Z

Search before asking

I have searched the Autodistill issues and found no similar bug report.

Bug

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-4-c319f567610d> in <cell line: 17>()
     15 # base_model = FastViT(None)
     16 
---> 17 predictions = base_model.predict("example.png")
     18 
     19 labels = [FASTVIT_IMAGENET_1K_CLASSES[i] for i in predictions.class_id.tolist()]

6 frames
/usr/local/lib/python3.10/dist-packages/torchvision/transforms/_functional_tensor.py in normalize(tensor, mean, std, inplace)
    926     if std.ndim == 1:
    927         std = std.view(-1, 1, 1)
--> 928     return tensor.sub_(mean).div_(std)
    929 
    930 

RuntimeError: The size of tensor a (4) must match the size of tensor b (3) at non-singleton dimension 0

Environment

pip install autodistill-fastvit roboflow transformers -Uq

Minimal Reproducible Example

from autodistill_fastvit import FastViT, FASTVIT_IMAGENET_1K_CLASSES
from autodistill.detection import CaptionOntology

# zero shot with prompts from FASTVIT_IMAGENET_1K_CLASSES
base_model = FastViT(
    ontology=CaptionOntology(
        {
            "Beagle": "beagle",
            "Border Collie": "collie"
        }
    )
)

# zero shot without prompts
# base_model = FastViT(None)

predictions = base_model.predict("example.png")

labels = [FASTVIT_IMAGENET_1K_CLASSES[i] for i in predictions.class_id.tolist()]

print(labels)

Additional

No response

Are you willing to submit a PR?

Yes I'd like to help by submitting a PR!


capjamesg · 2024-05-20T14:49:40Z

This error is now fixed. The issue was that the image had four channels (RGBA), while the module only supports three-channel RGB images, which is why normalization failed with a size mismatch of 4 versus 3. The requisite conversion code has been added to autodistill-fastvit, and we will likely incorporate similar logic in the main autodistill load_image function.

To use the updated code, run pip install --upgrade autodistill-fastvit.
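
For anyone who cannot upgrade yet, here is a rough sketch of a workaround: check whether the PNG declares an alpha channel and, if so, write an RGB copy to pass to predict. The helper names png_has_alpha and save_rgb_copy and the output filename example_rgb.png are made up for illustration and are not part of autodistill-fastvit. The header check only reads the PNG IHDR colour type, so it will not catch palette images that carry transparency in a separate chunk. The conversion step assumes Pillow is installed.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# PNG colour types that carry an alpha channel: 4 = greyscale+alpha, 6 = RGBA.
ALPHA_COLOR_TYPES = {4, 6}


def png_has_alpha(path):
    """Return True if the PNG header declares an alpha channel (hypothetical helper)."""
    with open(path, "rb") as f:
        # 8-byte signature + 8-byte chunk header + first 10 bytes of IHDR data
        header = f.read(26)
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} does not look like a PNG file")
    # IHDR data starts with: width (4 bytes), height (4), bit depth (1), colour type (1)
    _width, _height, _bit_depth, color_type = struct.unpack(">IIBB", header[16:26])
    return color_type in ALPHA_COLOR_TYPES


def save_rgb_copy(src_path, dst_path):
    """Write a three-channel RGB copy of src_path to dst_path (hypothetical helper)."""
    from PIL import Image  # imported lazily so the header check works without Pillow

    with Image.open(src_path) as img:
        img.convert("RGB").save(dst_path)
    return dst_path


# Possible usage with the example from this issue (assumes base_model is already built):
# path = "example.png"
# if png_has_alpha(path):
#     path = save_rgb_copy(path, "example_rgb.png")
# predictions = base_model.predict(path)
```

Note that convert("RGB") simply drops the alpha channel, so transparent regions take whatever colour is stored underneath them.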

andysingal added the bug Something isn't working label Feb 6, 2024

capjamesg closed this as completed May 20, 2024
