feat: migrate IVF_PQ indices when vector column is casted #2102
base: main
ACTION NEEDED
Lance follows the Conventional Commits specification for release automation. The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification. For details on the error please inspect the "PR Title Check" action.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #2102 +/- ##
==========================================
- Coverage 81.07% 80.97% -0.10%
==========================================
Files 160 160
Lines 47328 47533 +205
Branches 47328 47533 +205
==========================================
+ Hits 38370 38490 +120
- Misses 6768 6822 +54
- Partials 2190 2221 +31
@@ -67,6 +67,8 @@ pub trait ProductQuantizer: Send + Sync + std::fmt::Debug {

    fn dimension(&self) -> usize;

    fn metric_type(&self) -> MetricType;
Would it be better to use `DistanceType` here? cc @westonpace
    .map(|(offset, length)| {
        index
            .sub_index
            .load(reader.clone(), *offset, *length as usize)
I'm working on a new index format. The semantics of `load` are now to load the index, while `load_partition` loads the index as a sub-index. The latter requires a `partition_id` so that the sub-index is loaded with the partition metadata of the given partition.
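To make the `load` / `load_partition` distinction from the comment above concrete, here is a minimal, self-contained sketch. It is not the Lance API: `IndexFile`, `PartitionMeta`, `LoadIndex`, `RawIndex`, and `example_file` are hypothetical names invented for illustration, and the real signatures in the new index format may well differ. The only point is that `load_partition` resolves its byte range from per-partition metadata keyed by `partition_id`, rather than taking an offset and length from the caller.

```rust
use std::collections::HashMap;

/// Hypothetical per-partition metadata: where a partition's bytes live in the file.
#[derive(Debug, Clone, PartialEq)]
pub struct PartitionMeta {
    pub offset: usize,
    pub length: usize,
}

/// Hypothetical in-memory stand-in for an index file (not a Lance type).
pub struct IndexFile {
    pub bytes: Vec<u8>,
    pub partitions: HashMap<usize, PartitionMeta>,
}

/// Illustrative trait separating the two loading modes.
pub trait LoadIndex: Sized {
    /// Load the index as a standalone index.
    fn load(file: &IndexFile) -> Self;

    /// Load the index as a sub-index of partition `partition_id`, using that
    /// partition's metadata. Returns `None` if the partition is unknown or
    /// its byte range falls outside the file.
    fn load_partition(file: &IndexFile, partition_id: usize) -> Option<Self>;
}

/// Toy index that just keeps the raw bytes it was loaded from.
#[derive(Debug, PartialEq)]
pub struct RawIndex(pub Vec<u8>);

impl LoadIndex for RawIndex {
    fn load(file: &IndexFile) -> Self {
        RawIndex(file.bytes.clone())
    }

    fn load_partition(file: &IndexFile, partition_id: usize) -> Option<Self> {
        // Look up the partition's metadata instead of trusting caller-supplied offsets.
        let meta = file.partitions.get(&partition_id)?;
        let end = meta.offset.checked_add(meta.length)?;
        let slice = file.bytes.get(meta.offset..end)?;
        Some(RawIndex(slice.to_vec()))
    }
}

/// Build a small example file with two partitions covering ten bytes.
pub fn example_file() -> IndexFile {
    let mut partitions = HashMap::new();
    partitions.insert(0, PartitionMeta { offset: 0, length: 4 });
    partitions.insert(1, PartitionMeta { offset: 4, length: 6 });
    IndexFile { bytes: (0u8..10).collect(), partitions }
}
```

With a shape like this, the call site in the diff would pass a partition id instead of `(*offset, *length as usize)`, and an unknown partition surfaces as `None` rather than a bad slice.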
When a user calls `alter_columns()` to change the data type of a vector column, we can attempt to migrate the vector index to the new data type as part of the same transaction. This will allow users to easily migrate from f32-based vectors to f16 ones.

Closes #1978