[BUG] Indexing multimodal with images using text-only models raises 500 #563

Open
vicilliar opened this issue Aug 4, 2023 · 0 comments
Labels
bug Something isn't working

Comments

@vicilliar
Contributor

Describe the bug
Indexing multimodal data that includes images into an index backed by a text-only model (hf/all_datasets_v4_MiniLM-L6) raises a 500 internal server error.

Marqo output:

    image_vectors = s2_inference.vectorise(
  File "/app/src/marqo/s2_inference/s2_inference.py", line 74, in vectorise
    vector_batches.append(_convert_tensor_to_numpy(available_models[model_cache_key][AvailableModelsKey.model].encode(batch, normalize=normalize_embeddings, **kwargs)))
  File "/app/src/marqo/s2_inference/hf_utils.py", line 138, in encode
    inputs = self.tokenizer(sentence, padding=True, truncation=True, max_length=self.max_seq_length,
  File "/usr/local/lib/python3.8/dist-packages/transformers/tokenization_utils_base.py", line 2548, in __call__
    encodings = self._call_one(text=text, text_pair=text_pair, **all_kwargs)
  File "/usr/local/lib/python3.8/dist-packages/transformers/tokenization_utils_base.py", line 2606, in _call_one
    raise ValueError(
ValueError: text input must of type `str` (single example), `List[str]` (batch or single pretokenized example) or `List[List[str]]` (batch of pretokenized examples).

To Reproduce
Steps to reproduce the behavior:

  1. Create a Marqo index with the model hf/all_datasets_v4_MiniLM-L6
  2. Index documents in multimodal form with images
  3. Observe the error in Marqo and in the client

Expected behavior
Marqo should return a 400 with a clear message explaining why the error happened, namely that the index's model is text-only and cannot vectorise images.
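A rough sketch of the kind of pre-check that could produce such a response. This is not Marqo's code: `BadRequestError`, `validate_batch_for_model`, and the `hf/` prefix rule for detecting text-only models are all assumptions made for illustration. The idea is to reject non-string content before it reaches the tokenizer, so the caller gets a 400-style error instead of the tokenizer's `ValueError` surfacing as a 500.

```python
from typing import Any, Sequence

# Assumption: model names with these prefixes are treated as text-only.
# A real check would consult the model registry rather than a prefix.
TEXT_ONLY_PREFIXES = ("hf/",)


class BadRequestError(Exception):
    """Hypothetical error type that an API layer would map to HTTP 400."""

    status_code = 400


def is_text_only_model(model_name: str) -> bool:
    """Return True if the model is assumed to accept text input only."""
    return model_name.startswith(TEXT_ONLY_PREFIXES)


def validate_batch_for_model(model_name: str, batch: Sequence[Any]) -> None:
    """Raise BadRequestError if a text-only model is given non-string content."""
    if not is_text_only_model(model_name):
        return
    # Collect the type names of everything that is not plain text.
    bad_types = sorted({type(item).__name__ for item in batch if not isinstance(item, str)})
    if bad_types:
        raise BadRequestError(
            f"Model '{model_name}' is text-only and cannot vectorise non-text "
            f"content (received: {', '.join(bad_types)}). Use an image-capable "
            f"model or remove the image fields."
        )


if __name__ == "__main__":
    # A bytes object stands in for a loaded image here.
    try:
        validate_batch_for_model("hf/all_datasets_v4_MiniLM-L6", ["a caption", b"\x89PNG"])
    except BadRequestError as err:
        print(err.status_code, err)
```

Running the check before vectorisation keeps the failure at the request-validation layer, where it can name the model and the offending content type.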

Screenshots
Client error:
image
Marqo error:
image

Desktop:

  • SageMaker notebook client
vicilliar added the bug label Aug 4, 2023