After adding embeddings to a medium-sized collection, you can no longer rely on a full resync for disaster recovery, because regenerating the embeddings is too expensive and time-consuming. That makes backups essential. Right now, backing up and restoring self-hosted Typesense in a Kubernetes environment is pretty painful:
You need to create a sidecar container with a shared directory for the snapshot
Call the node's snapshot API
Pack the snapshot into an archive
Upload it to external storage
Clean up all the files and wait for the next scheduled run
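The backup steps above can be sketched in Python roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the sidecar and the Typesense container share the snapshot directory, that `archive_path` lives outside that directory, and that the snapshot is triggered through Typesense's `POST /operations/snapshot` operation with the `X-TYPESENSE-API-KEY` header. The function names and the `upload` callable are hypothetical placeholders for whatever storage client you use.

```python
import shutil
import tarfile
import urllib.parse
import urllib.request
from pathlib import Path
from typing import Callable


def build_snapshot_request(base_url: str, api_key: str, snapshot_path: str) -> urllib.request.Request:
    """Build the POST request that asks a Typesense node to write a snapshot."""
    query = urllib.parse.urlencode({"snapshot_path": snapshot_path})
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/operations/snapshot?{query}",
        method="POST",
        headers={"X-TYPESENSE-API-KEY": api_key},
    )


def pack_snapshot(snapshot_dir: Path, archive_path: Path) -> Path:
    """Pack the contents of snapshot_dir into a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for child in sorted(snapshot_dir.iterdir()):
            # Store entries relative to the snapshot root.
            tar.add(child, arcname=child.name)
    return archive_path


def run_backup(
    base_url: str,
    api_key: str,
    snapshot_dir: Path,
    archive_path: Path,
    upload: Callable[[Path], None],  # hypothetical: e.g. wraps an S3 client
) -> None:
    """Snapshot, pack, upload, then clean up the shared directory."""
    request = build_snapshot_request(base_url, api_key, str(snapshot_dir))
    # urlopen raises urllib.error.HTTPError on a non-2xx response.
    with urllib.request.urlopen(request, timeout=600):
        pass
    try:
        pack_snapshot(snapshot_dir, archive_path)
        upload(archive_path)
    finally:
        # Remove local files whether or not the upload succeeded.
        shutil.rmtree(snapshot_dir, ignore_errors=True)
        archive_path.unlink(missing_ok=True)
```

Scheduling is left to the caller, for example a Kubernetes CronJob or a loop in the sidecar that invokes `run_backup` on an interval.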
And restoring is also not easy:
Create a temporary initContainer
Download the archive from external storage
Unpack it into the shared data directory
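The restore steps above could look something like this inside the initContainer. Again, this is only a sketch under assumptions: the archive is a gzip-compressed tar whose entries are relative to the data directory root, and `download` is a hypothetical callable standing in for your storage client. `restore_data_dir` and `restore_from_backup` are illustrative names, not part of Typesense.

```python
import shutil
import tarfile
from pathlib import Path
from typing import Callable


def restore_data_dir(archive_path: Path, data_dir: Path) -> None:
    """Replace the contents of data_dir with the contents of the archive."""
    data_dir.mkdir(parents=True, exist_ok=True)
    # Clear the contents but keep the directory itself, since it is
    # usually a volume mount point that cannot be removed.
    for child in data_dir.iterdir():
        if child.is_dir() and not child.is_symlink():
            shutil.rmtree(child)
        else:
            child.unlink()

    root = data_dir.resolve()
    with tarfile.open(archive_path, "r:gz") as tar:
        # Refuse entries that would be written outside the data directory.
        for member in tar.getmembers():
            target = (root / member.name).resolve()
            if target != root and root not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        if hasattr(tarfile, "data_filter"):
            # Extraction filters are available on this Python version.
            tar.extractall(root, filter="data")
        else:
            tar.extractall(root)


def restore_from_backup(
    download: Callable[[Path], None],  # hypothetical: fetches the archive to the given path
    archive_path: Path,
    data_dir: Path,
) -> None:
    """Download the archive, unpack it into data_dir, then delete the archive."""
    download(archive_path)
    try:
        restore_data_dir(archive_path, data_dir)
    finally:
        archive_path.unlink(missing_ok=True)
```

Because the initContainer runs to completion before the Typesense container starts, the server then loads the restored data directory on startup.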
Both operations could be greatly simplified if Typesense:
Added support for S3-compatible buckets to the service itself
Added an endpoint that takes a snapshot, packs it, and uploads it to S3
Added a startup parameter that downloads an archive from S3 and replaces the data directory with it before in-memory indexing begins
In that case, self-hosted operators would only need to implement a cron job that calls the API, plus a deployment with a special parameter to restore from a backup.
Of all the services in a cluster I manage, only one actually has S3 backup built directly into the service. It's a nice idea, but I don't think it's realistic. To make things easier, have a look at restic if you haven't already: it makes backing up arbitrary directories to S3 pretty smooth.
That being said, if Typesense were to support this, I'd be happy. However, Kubernetes-friendliness in Typesense is subpar, and I would hope peer discovery stability is worked on before this.