Hybrid search & normalization #64

Open
ShravanSunder opened this issue Apr 25, 2024 · 0 comments

ShravanSunder commented Apr 25, 2024

Hello! I see many articles (like Pinecone's) that combine the hybrid search results from dense vectors and SPLADE in the following way.

However, I'm a bit confused about how this would work if the dense vectors are normalized to unit length but SPLADE's output is not. Any thoughts? What is the best way to conduct hybrid search with both vectors?

I understand the ANN search is done with dot product, so would we just take the highest score and not try to normalize?

def hybrid_scale(dense, sparse, alpha: float):
    # check alpha value is in range
    if alpha < 0 or alpha > 1:
        raise ValueError("Alpha must be between 0 and 1")
    # scale sparse and dense vectors to create hybrid search vecs
    hsparse = {
        'indices': sparse['indices'],
        'values':  [v * (1 - alpha) for v in sparse['values']]
    }
    hdense = [v * alpha for v in dense]
    return hdense, hsparse
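
To make the question concrete, here is a minimal sketch with made-up vectors. It assumes the index scores a document as the dense dot product plus the sparse dot product; that is my assumption for illustration, not something confirmed for any particular database. `dense_dot`, `sparse_dot`, and `hybrid_score` are hypothetical helpers, and `hybrid_scale` is repeated only so the snippet runs on its own.

```python
import math


def hybrid_scale(dense, sparse, alpha: float):
    # Same convex weighting as the snippet above.
    if alpha < 0 or alpha > 1:
        raise ValueError("Alpha must be between 0 and 1")
    hsparse = {
        "indices": sparse["indices"],
        "values": [v * (1 - alpha) for v in sparse["values"]],
    }
    hdense = [v * alpha for v in dense]
    return hdense, hsparse


def dense_dot(a, b):
    # Plain dot product of two equal-length dense vectors.
    return sum(x * y for x, y in zip(a, b))


def sparse_dot(a, b):
    # Dot product over the indices the two sparse vectors share.
    b_lookup = dict(zip(b["indices"], b["values"]))
    return sum(v * b_lookup.get(i, 0.0) for i, v in zip(a["indices"], a["values"]))


def hybrid_score(q_dense, q_sparse, d_dense, d_sparse):
    # ASSUMPTION: the index adds the dense and sparse dot products.
    return dense_dot(q_dense, d_dense) + sparse_dot(q_sparse, d_sparse)


# Made-up data: dense vectors have unit length, sparse values do not.
q_dense = [0.6, 0.8]
d_dense = [0.8, 0.6]
q_sparse = {"indices": [3, 17], "values": [2.0, 1.5]}
d_sparse = {"indices": [3, 42], "values": [1.0, 4.0]}

alpha = 0.5
hq_dense, hq_sparse = hybrid_scale(q_dense, q_sparse, alpha)

# Score obtained by scaling the query vectors before searching.
scaled_score = hybrid_score(hq_dense, hq_sparse, d_dense, d_sparse)

# Score obtained by weighting the two raw scores after the fact.
dense_score = dense_dot(q_dense, d_dense)
sparse_score = sparse_dot(q_sparse, d_sparse)
weighted_score = alpha * dense_score + (1 - alpha) * sparse_score

print(dense_score, sparse_score, scaled_score, weighted_score)
```

Under that assumption, scaling the query is the same as weighting the two raw scores, because the dot product is linear. The dense score cannot exceed 1 when both dense vectors have unit length, while the sparse score has no such bound, so the same `alpha` can mean very different things depending on the magnitude of the sparse values. That mismatch is what I am asking about.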

I see this prior issue, #34, but it seemed inconclusive.
