[Feature request] Backup and restore using S3 compatible bucket #1673

Open
b0g3r opened this issue Apr 15, 2024 · 2 comments

@b0g3r

b0g3r commented Apr 15, 2024

Description

After adding embeddings to a medium-sized collection, you can no longer rely on resyncing from the source as disaster recovery, because regenerating the embeddings is costly and time-consuming. Backups become the rescue. Right now, backing up and restoring self-hosted Typesense in a Kubernetes environment is pretty painful:

  • You need to create a sidecar container with a shared directory for the snapshot
  • Call the snapshot API of the node
  • Pack the snapshot into an archive
  • Upload it to an external resource
  • Clean up all the files and wait for the next scheduled run
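
For illustration, the sidecar steps above can be sketched in Python using only the standard library. This is a minimal sketch under stated assumptions, not a reference implementation: the snapshot call uses Typesense's `POST /operations/snapshot?snapshot_path=...` operation with the `X-TYPESENSE-API-KEY` header, the shared directory is assumed to be mounted at the same path in both containers, the archive is assumed to live outside the snapshot directory, and `upload` is a hypothetical caller-supplied callable (for example a thin wrapper around an S3 client).

```python
import shutil
import tarfile
import urllib.parse
import urllib.request
from pathlib import Path


def build_snapshot_request(base_url, api_key, snapshot_path):
    """Build the POST request for the Typesense snapshot operation."""
    query = urllib.parse.urlencode({"snapshot_path": snapshot_path})
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/operations/snapshot?{query}",
        method="POST",
        headers={"X-TYPESENSE-API-KEY": api_key},
    )


def pack_snapshot(snapshot_dir, archive_path):
    """Pack the snapshot directory into a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname="." stores entries relative to the snapshot root.
        tar.add(snapshot_dir, arcname=".")
    return archive_path


def run_backup(base_url, api_key, snapshot_dir, archive_path, upload):
    """Snapshot, pack, upload, then clean up.

    `upload` is a hypothetical callable taking the archive path; it is an
    assumption of this sketch, not part of Typesense.
    """
    request = build_snapshot_request(base_url, api_key, snapshot_dir)
    with urllib.request.urlopen(request) as response:
        response.read()  # urlopen raises on HTTP error statuses
    try:
        pack_snapshot(snapshot_dir, archive_path)
        upload(archive_path)
    finally:
        # Remove local files whether or not the upload succeeded.
        shutil.rmtree(snapshot_dir, ignore_errors=True)
        Path(archive_path).unlink(missing_ok=True)
```

A CronJob or a loop in the sidecar would call `run_backup` on each schedule tick. Passing the uploader in as a callable keeps the sketch independent of any particular storage client.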

And restoring is also not easy:

  • Create a temporary initContainer
  • Download the archive from the external resource
  • Unpack it into the shared data directory
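
The unpack step of such an initContainer might look like the sketch below. It assumes the archive has already been downloaded to a local path and that the data directory is empty; `restore_archive` is a hypothetical helper name. The path check is a minimal guard against entries that would land outside the data directory, not a complete hardening of tar extraction.

```python
import tarfile
from pathlib import Path


def restore_archive(archive_path, data_dir):
    """Unpack a snapshot archive into the Typesense data directory."""
    target = Path(data_dir).resolve()
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar.getmembers():
            # Refuse entries whose resolved path escapes the data directory.
            dest = (target / member.name).resolve()
            if dest != target and target not in dest.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(target)
```

Typesense would then be started with its data directory pointing at `data_dir`, so that it loads the restored state on boot.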

Both operations could be greatly simplified if Typesense:

  • Adds support for S3-compatible buckets to the service itself
  • Adds an endpoint that makes a snapshot, packs it, and uploads it to S3
  • Adds a start parameter that downloads the archive from S3 and replaces the data directory with it before in-memory indexing runs

In this case, self-hosted operators would only need to implement a cron job that calls the API, plus a deployment with a special parameter to restore from a backup.

@WoodyWoodsta

Of all the services in a cluster I manage, only one actually backs up to S3 directly from the service. It's a nice idea, but I don't think it's realistic. To make it easier, have a look at restic if you haven't already: it makes backing up arbitrary directories to S3 pretty smooth.
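
As a rough sketch of the restic route, the helpers below build the `restic backup` and `restic restore` command lines and run them. The repository string and paths are placeholders, and the function names are hypothetical. restic reads S3 credentials from `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` and the repository password from `RESTIC_PASSWORD`. The sketch assumes the repository was already created with `restic init` and that the directory being backed up is a snapshot directory rather than the live data directory.

```python
import os
import subprocess


def restic_backup_cmd(repo, path):
    """Command line for backing up `path` into the restic repository."""
    return ["restic", "-r", repo, "backup", path]


def restic_restore_cmd(repo, target, snapshot="latest"):
    """Command line for restoring a restic snapshot into `target`."""
    return ["restic", "-r", repo, "restore", snapshot, "--target", target]


def run_restic(cmd, env_overrides):
    """Run restic with credentials supplied through environment variables."""
    env = {**os.environ, **env_overrides}
    subprocess.run(cmd, check=True, env=env)
```

For an S3-compatible endpoint, `repo` takes the form `s3:https://<host>/<bucket>`.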

@WoodyWoodsta

That being said, if Typesense were to support this, I'd be happy. However, Kubernetes-friendliness in Typesense is sub-par, and I would hope peer discovery stability is worked on before this.
