Can't handle vectors with dtype np.float32. np.float64 works just fine #1079

Closed
1 task done
Nitinsiwach opened this issue May 19, 2024 · 2 comments

Comments


Nitinsiwach commented May 19, 2024

How to reproduce this bug?

import numpy as np

jeopardy = client.collections.get("JeopardyQuestion")
uuid = jeopardy.data.insert(
    properties={
        "question": "This vector DB is OSS and supports automatic property type inference on import",
        "answer": "Weaviate",
    },
    # list() over a float32 array yields np.float32 scalars, not Python floats
    vector=list(np.array([0.12345] * 1536, dtype=np.float32)),
)

What is the expected behavior?

The data should insert just fine

What is the actual behavior?

  1. Inserting one object with collections.data.insert raises TypeError: Object of type float32 is not JSON serializable.
  2. Inserting with collections.data.insert_many does not use the provided vectors at all. Instead, it uses the default vectorizer provided in the yml used to create the client to vectorize the queries and then uses those vectors to store the object.
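The first symptom can be reproduced without Weaviate at all. The sketch below uses only numpy and the standard json module, with a short 4-element vector standing in for the real one. np.float64 subclasses Python's float, so the json encoder accepts it, while np.float32 does not. Converting with .tolist() is one possible workaround, not necessarily what the client fix does.

```python
import json

import numpy as np

vec32 = np.array([0.12345] * 4, dtype=np.float32)

# np.float64 is a subclass of the built-in float; np.float32 is not.
float64_is_float = isinstance(np.float64(0.1), float)
float32_is_float = isinstance(np.float32(0.1), float)

# list() keeps the NumPy scalar type, which the json encoder rejects.
try:
    json.dumps(list(vec32))
    list_serializable = True
except TypeError:
    list_serializable = False

# .tolist() converts every element to a built-in Python float.
as_python_floats = vec32.tolist()
payload = json.dumps(as_python_floats)
```

Passing vec32.tolist() as the vector instead of list(vec32) avoids the TypeError on client versions that do not convert NumPy scalars themselves.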

Supporting information

No response

Server Version

1.25.0


@Nitinsiwach Nitinsiwach added the bug Something isn't working label May 19, 2024
@tsmith023 tsmith023 self-assigned this May 20, 2024
@tsmith023 tsmith023 transferred this issue from weaviate/weaviate May 20, 2024
Contributor

tsmith023 commented May 20, 2024

Your first issue is fixed by PR #1077.

However, I think your second issue may actually be caused by the default vectorizer set in your .yml config, because you are creating your collection, as posted on SO, like so:

import weaviate.classes as wvc  # assumed alias; the original snippet does not show its imports

client.collections.create(
    name="legal_sections",
    properties = [
        wvc.config.Property(
            name="content",
            description="The actual section chunk that the answer is to be extracted from",
            data_type=wvc.config.DataType.TEXT,
            index_searchable=True,
            index_filterable=True,
            skip_vectorization=True,
            vectorize_property_name=False
        )
    ]
)

without specifying a vectorizer, so the collection falls back to the server default. Instead, you should add vectorizer_config=wvc.config.Configure.Vectorizer.none() to the client.collections.create(...) call.
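As a rough sketch of that suggestion, the helper below wraps the create call with the vectorizer explicitly disabled. The function name create_collection_without_vectorizer is hypothetical, the property list is trimmed to the essentials, and wvc is assumed to be the weaviate.classes alias used in the snippet above.

```python
def create_collection_without_vectorizer(client, name="legal_sections"):
    """Create a collection that relies only on caller-supplied vectors.

    Hypothetical helper for illustration; `client` is assumed to be an
    already connected v4 Weaviate client.
    """
    # Imported here so that defining the helper does not itself need the client library.
    import weaviate.classes as wvc

    return client.collections.create(
        name=name,
        properties=[
            wvc.config.Property(
                name="content",
                data_type=wvc.config.DataType.TEXT,
            )
        ],
        # Opt out of the server-side default vectorizer.
        vectorizer_config=wvc.config.Configure.Vectorizer.none(),
    )
```

With the vectorizer set to none, vectors passed at insert time should be the only ones stored, which is the behavior the report expected.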

v4.6.3 is now live. I would be grateful if you could validate that it works, and then the issue can be closed 😁

Collaborator

dirkkul commented May 29, 2024

Please create a new issue if the mentioned PR does not solve your problem, thanks!

@dirkkul dirkkul closed this as completed May 29, 2024