[Writable Warm] initialize a writeable warm index from a snapshot #13675

Open
mch2 opened this issue May 14, 2024 · 1 comment
Labels
enhancement Enhancement or improvement to existing feature or request Storage:Remote

Comments

@mch2
Member

mch2 commented May 14, 2024

Is your feature request related to a problem? Please describe

This is an optimization related to the Writeable Warm feature.

When the Writeable Warm feature is introduced, we will be able to create indices and then migrate them to a warm tier. One use case this does not cover is creating a warm index directly from a snapshot, without the expensive segment download and re-upload needed to first restore it as a remote-backed index.

Describe the solution you'd like

I think we can make this happen with a new RemoteDirectory implementation that conditionally fetches either from a blob store wired to an existing snapshot or from one wired to the remote store directory. This new dir could be injected into RemoteSegmentStoreDirectory as its data directory. All metadata continues to push to the remote store as normal; it is only when fetching a file that we would interface with the original snapshot, if necessary. In a way it's similar to a searchable snapshot dir; however, those code paths would not be reusable with the incoming Writeable Warm CompositeDirectory implementation. That dir handles block fetch above RemoteSegmentStoreDirectory.

This would look something like the diagram below, with new writes pushed to the remote store as normal and reads flowing through FilteredRemoteDirectory.

image
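
A minimal Java sketch of the read/write split described above, to make the idea concrete. It does not use any OpenSearch classes: `BlobSource`, `InMemoryBlobSource`, and `FilteredRemoteDirectorySketch` are hypothetical names, and the rule "read from the remote store if the file is there, otherwise fall back to the snapshot" is an assumption about what "conditionally fetches" would mean, not something the proposal specifies.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a blob store; NOT the OpenSearch BlobContainer API.
interface BlobSource {
    boolean exists(String name);
    byte[] read(String name);
}

// In-memory blob source, present only so the sketch can run.
class InMemoryBlobSource implements BlobSource {
    private final Map<String, byte[]> blobs = new HashMap<>();

    void put(String name, byte[] data) {
        blobs.put(name, data.clone());
    }

    @Override
    public boolean exists(String name) {
        return blobs.containsKey(name);
    }

    @Override
    public byte[] read(String name) {
        byte[] data = blobs.get(name);
        if (data == null) {
            throw new NoSuchElementException("missing blob: " + name);
        }
        return data.clone();
    }
}

// Sketch of the proposed directory: writes always go to the remote store,
// reads fall back to the original snapshot only when the remote store
// does not hold the file (assumed policy).
class FilteredRemoteDirectorySketch {
    private final InMemoryBlobSource remoteStore;
    private final BlobSource snapshot;

    FilteredRemoteDirectorySketch(InMemoryBlobSource remoteStore, BlobSource snapshot) {
        this.remoteStore = remoteStore;
        this.snapshot = snapshot;
    }

    void write(String name, byte[] data) {
        remoteStore.put(name, data); // new segments never touch the snapshot
    }

    byte[] read(String name) {
        return remoteStore.exists(name) ? remoteStore.read(name) : snapshot.read(name);
    }

    // Reports which backing store would serve a given file.
    String sourceOf(String name) {
        if (remoteStore.exists(name)) return "remote";
        if (snapshot.exists(name)) return "snapshot";
        return "missing";
    }
}

class FilteredDirectoryDemo {
    public static void main(String[] args) {
        InMemoryBlobSource snapshot = new InMemoryBlobSource();
        snapshot.put("_0.cfs", "from-snapshot".getBytes(StandardCharsets.UTF_8));
        FilteredRemoteDirectorySketch dir =
            new FilteredRemoteDirectorySketch(new InMemoryBlobSource(), snapshot);
        dir.write("_1.cfs", "from-remote".getBytes(StandardCharsets.UTF_8));
        System.out.println("_0.cfs served by " + dir.sourceOf("_0.cfs"));
        System.out.println("_1.cfs served by " + dir.sourceOf("_1.cfs"));
    }
}
```

In this sketch, segments that were never re-uploaded stay readable from the snapshot, while anything written after index creation lives only in the remote store. A real implementation would also need to handle the metadata and block-fetch concerns mentioned above, which are out of scope here.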

A requirement here would be to enforce some level of deletion protection on the original snapshot for the lifetime of the index, or at least until all segments from the original snapshot are merged away. We could do this with some new index-level settings that are validated at snapshot deletion time to ensure the snapshot is not backing any existing index, similar to searchable snapshots.
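
The deletion-time check could be sketched roughly as follows. This is plain Java rather than OpenSearch code: the setting key `index.example.backing_snapshot` and the `SnapshotDeletionGuard` class are hypothetical, and index settings are modeled as simple string maps purely for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Illustrative guard that would run when a snapshot deletion is requested.
class SnapshotDeletionGuard {
    // Hypothetical index-level setting naming the snapshot an index still reads from.
    static final String BACKING_SNAPSHOT_SETTING = "index.example.backing_snapshot";

    // Returns the names of indices whose settings still reference the snapshot, sorted.
    static List<String> indicesBackedBy(String snapshot,
                                        Map<String, Map<String, String>> settingsByIndex) {
        List<String> backed = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : settingsByIndex.entrySet()) {
            if (snapshot.equals(entry.getValue().get(BACKING_SNAPSHOT_SETTING))) {
                backed.add(entry.getKey());
            }
        }
        Collections.sort(backed);
        return backed;
    }

    // Rejects the deletion while any index is still backed by the snapshot.
    static void validateDeletion(String snapshot,
                                 Map<String, Map<String, String>> settingsByIndex) {
        List<String> backed = indicesBackedBy(snapshot, settingsByIndex);
        if (!backed.isEmpty()) {
            throw new IllegalStateException(
                "snapshot [" + snapshot + "] still backs indices " + backed);
        }
    }
}
```

Under this sketch, lifting the protection once all original segments are merged away would amount to removing the setting from the index, after which `validateDeletion` would no longer block the snapshot's removal.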

Related component

Storage:Remote

Describe alternatives you've considered

Nothing - restore from snapshot as a hot index then migrate.
Migrate data off cluster and somehow wire it up when the dir initializes. This is risky because remote store paths are determined at index creation.

Additional context

No response

@mch2 mch2 added enhancement Enhancement or improvement to existing feature or request untriaged labels May 14, 2024
@peternied
Member

[Triage - attendees 1 2 3 4 5 6 7 8]
@mch2 Thanks for creating this issue. Looking forward to seeing a pull request to add this functionality.
