[flat index] Switch from objects to vectors for rescoring BQ #4940

Open
trengrj opened this issue May 16, 2024 · 0 comments

trengrj commented May 16, 2024

The flat index currently rescores using the objects bucket. This should be switched to the dedicated vectors bucket for performance, since rescoring does not need to read any object metadata.

if h.shouldRescore() {
	ids := make([]uint64, res.Len())
	i := len(ids) - 1
	for res.Len() > 0 {
		res := res.Pop()
		ids[i] = res.ID
		i--
	}
	res.Reset()
	for _, id := range ids {
		dist, _, _ := h.distanceFromBytesToFloatNode(compressorDistancer, id)
		res.Insert(id, dist)
		if res.Len() > ef {

The vector is read through the standard TempVectorForIDThunk, which points to the objects bucket. This needs to change to read from the vectors bucket instead.

	vec, err := h.TempVectorForIDThunk(context.Background(), nodeID, slice)
	if err != nil {
		var e storobj.ErrNotFound
		if errors.As(err, &e) {
			h.handleDeletedNode(e.DocID)
			return 0, false, nil
		} else {
			// not a typed error, we can recover from, return with err
			return 0, false, errors.Wrapf(err, "get vector of docID %d", nodeID)
		}
	}
	vec = h.normalizeVec(vec)
	return concreteDistancer.DistanceToFloat(vec)
}

func (h *hnsw) distanceToFloatNode(distancer distancer.Distancer,
	nodeID uint64,
) (float32, bool, error) {
	candidateVec, err := h.vectorForID(context.Background(), nodeID)
	if err != nil {
		return 0, false, err
	}
	dist, _, err := distancer.Distance(candidateVec)
	if err != nil {
		return 0, false, errors.Wrap(err, "calculate distance between candidate and query")
	}
	return dist, true, nil
}

// the underlying object seems to have been deleted, to recover from
// this situation let's add a tombstone to the deleted object, so it
// will be cleaned up and skip this candidate in the current search
func (h *hnsw) handleDeletedNode(docID uint64) {
	if h.hasTombstone(docID) {
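
A rough, self-contained sketch of the idea, not the actual Weaviate code: if rescoring only depends on a function that maps a doc ID to a full-precision vector, then pointing that function at a vectors-only store is the whole change. Everything below is hypothetical for illustration: the `vectorReader` type, the in-memory map standing in for the vectors bucket, the little-endian float32 byte layout, and the squared L2 distance.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
	"sort"
)

// vectorReader is a hypothetical stand-in for a thunk such as
// TempVectorForIDThunk: given a doc ID it returns the uncompressed vector.
type vectorReader func(id uint64) ([]float32, error)

// encodeVector and decodeVector use little-endian float32 values. This byte
// layout is an assumption for the sketch, not the real on-disk format.
func encodeVector(vec []float32) []byte {
	raw := make([]byte, 4*len(vec))
	for i, v := range vec {
		binary.LittleEndian.PutUint32(raw[i*4:], math.Float32bits(v))
	}
	return raw
}

func decodeVector(raw []byte) []float32 {
	vec := make([]float32, len(raw)/4)
	for i := range vec {
		vec[i] = math.Float32frombits(binary.LittleEndian.Uint32(raw[i*4:]))
	}
	return vec
}

// vectorsBucketReader reads from a map that plays the role of a dedicated
// vectors bucket: the value holds only the vector bytes, no object metadata.
func vectorsBucketReader(bucket map[uint64][]byte) vectorReader {
	return func(id uint64) ([]float32, error) {
		raw, ok := bucket[id]
		if !ok {
			return nil, fmt.Errorf("vector for docID %d not found", id)
		}
		return decodeVector(raw), nil
	}
}

// l2Squared assumes both vectors have the same length.
func l2Squared(a, b []float32) float32 {
	var sum float32
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return sum
}

type scored struct {
	ID   uint64
	Dist float32
}

// rescore recomputes exact distances for the candidate IDs and keeps the
// closest `limit` results. Candidates whose vector cannot be read are skipped.
func rescore(query []float32, ids []uint64, read vectorReader, limit int) []scored {
	out := make([]scored, 0, len(ids))
	for _, id := range ids {
		vec, err := read(id)
		if err != nil {
			continue
		}
		out = append(out, scored{ID: id, Dist: l2Squared(query, vec)})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Dist < out[j].Dist })
	if len(out) > limit {
		out = out[:limit]
	}
	return out
}

func main() {
	bucket := map[uint64][]byte{
		1: encodeVector([]float32{0, 0}),
		2: encodeVector([]float32{3, 4}),
		3: encodeVector([]float32{1, 0}),
	}
	// ID 99 is not in the bucket and is skipped during rescoring.
	res := rescore([]float32{1, 1}, []uint64{1, 2, 3, 99}, vectorsBucketReader(bucket), 2)
	for _, r := range res {
		fmt.Println(r.ID, r.Dist)
	}
}
```

Because `rescore` only sees the `vectorReader`, the same loop would work with a reader backed by the objects bucket; the expected gain from the vectors bucket is that each lookup returns just the vector bytes.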
