S3 Migration
This guide walks you through the migration from on-disk storage to an S3-compatible object storage.
This document refers to sections of the introduction document on the S3 Setup.
- An S3-compatible object storage to talk to, see "S3 Setup" for options
- Free disk space for migrating existing data, roughly equal to the current on-disk size
- A maintenance window for doing the actual migration
- A full backup, including the config, to enable restoring from it
We can use du(1) for calculating the current disk usage:
docker exec sharelatex \
du --human-readable --max-depth=0 /var/lib/sharelatex/data/user_files
docker exec sharelatex \
du --human-readable --max-depth=0 /var/lib/sharelatex/data/template_files
In case you do not have sufficient disk space available on the current server, try attaching another disk to the server.
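As a rough pre-flight check, the comparison between the space the data occupies and the space available at the destination can be scripted. The sketch below is not part of Overleaf: `has_enough_space` is a hypothetical helper name, and it assumes the source data is readable on the host (for example through the bind-mount) and that `du -sk` and `df -Pk` are available.

```shell
# Hypothetical helper: succeed if DEST has at least as much free space
# as SRC currently occupies. All sizes are in KiB.
has_enough_space() {
  src=$1
  dest=$2
  # du -sk prints "<size-in-KiB><TAB><path>"; keep only the size.
  needed=$(du -sk "$src" | cut -f1)
  # df -Pk prints a header line, then one line whose 4th column is the free KiB.
  available=$(df -Pk "$dest" | awk 'NR==2 {print $4}')
  echo "needed=${needed}KiB available=${available}KiB"
  [ "$available" -ge "$needed" ]
}

# Example call (both paths are assumptions, adjust them to your setup):
# has_enough_space ~/sharelatex_data/data/user_files /srv
```

The function returns a non-zero status when the destination is too small, so it can guard the later steps in a script.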
NOTE: The history directories already have the correct layout. You can upload directly from the bind-mounted source folder.
We need to make sure that all user/template files will get migrated. It is best to shut down the instance to avoid missing newly uploaded files.
Please see the wiki page on performing a backup for the shutdown procedure.
We need to rewrite the directory layout of project files for uploading them to S3.
The directory layout for local storage in filestore is <project-id>_<file-id> and the directory layout in S3 is <project-id>/<file-id>.
In the following, /srv/overleaf-s3-migration is used for storing the files in the new directory layout.
We can make use of tar(1) for rewriting the layout:
mkdir -p /srv/overleaf-s3-migration/user_files \
/srv/overleaf-s3-migration/template_files
docker exec sharelatex \
tar --create --directory /var/lib/sharelatex/data/user_files . \
| tar --extract --directory /srv/overleaf-s3-migration/user_files \
--transform=sx_x/x
docker exec sharelatex \
tar --create --directory /var/lib/sharelatex/data/template_files . \
| tar --extract --directory /srv/overleaf-s3-migration/template_files \
--transform=sx_x/xg
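The `--transform` expressions use sed-style substitution syntax with `x` as the delimiter, so their effect can be previewed with sed(1) before any files are touched. The following sketch uses made-up names (`projectA`, `fileB` and the template path components are placeholders, not real Overleaf identifiers):

```shell
# Without the g flag only the first underscore becomes a slash (user files).
printf '%s\n' './projectA_fileB' | sed 'sx_x/x'
# prints: ./projectA/fileB

# With the g flag every underscore becomes a slash (template files).
printf '%s\n' './tplA_v1_fmtB_pdf' | sed 'sx_x/xg'
# prints: ./tplA/v1/fmtB/pdf
```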
Depending on your preference, you can use the minio mc S3 client or the aws cli for uploading the files to your S3-compatible object storage.
Note: Here you should replace overleaf-user-files, overleaf-template-files, overleaf-project-blobs and overleaf-chunks with the names of your S3 buckets.
Note: Also replace /srv/overleaf-bind-mount with the local path of the /var/lib/sharelatex bind-mount. By default, this is ~/sharelatex_data in a docker-compose.yml setup and <toolkit-checkout>/data/sharelatex when using the Overleaf toolkit.
aws s3 sync /srv/overleaf-s3-migration/user_files s3://overleaf-user-files
aws s3 sync /srv/overleaf-s3-migration/template_files s3://overleaf-template-files
aws s3 sync /srv/overleaf-bind-mount/data/history/overleaf-project-blobs s3://overleaf-project-blobs
aws s3 sync /srv/overleaf-bind-mount/data/history/overleaf-chunks s3://overleaf-chunks
The mc commands below use the server alias "s3"; you may have picked another name.
mc mirror /srv/overleaf-s3-migration/user_files s3/overleaf-user-files
mc mirror /srv/overleaf-s3-migration/template_files s3/overleaf-template-files
mc mirror /srv/overleaf-bind-mount/data/history/overleaf-project-blobs s3/overleaf-project-blobs
mc mirror /srv/overleaf-bind-mount/data/history/overleaf-chunks s3/overleaf-chunks
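One way to double-check an upload with the aws cli is to repeat the sync with `--dryrun`, which lists the pending operations without performing them. An empty list suggests nothing is left to upload. This is a sketch under assumptions: `pending_uploads` is a hypothetical helper name, and it relies on each dry-run line starting with `(dryrun)`.

```shell
# Hypothetical helper: print how many operations `aws s3 sync` would still
# perform for SRC -> BUCKET. With --dryrun nothing is actually copied.
pending_uploads() {
  src=$1
  bucket=$2
  aws s3 sync --dryrun "$src" "s3://$bucket" \
    | awk '/^\(dryrun\)/ {n++} END {print n+0}'
}

# Example call, using the placeholder bucket name from this guide:
# pending_uploads /srv/overleaf-s3-migration/user_files overleaf-user-files
```

Note that sync decides what to transfer from file size and modification time, so a count of 0 is a completeness check and not a content checksum.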
Add all the S3-related variables to your config, as detailed in the "S3 Setup -> Overview of variables" section.
Docker-compose config: You can also remove the bind-mount for the data dir from the volumes section. Note: Please keep the bind-mount of a scratch disk for ephemeral files in place.
You can now start the instance and validate the migration:
- can preview binary files in the editor
- can compile a PDF with images
- can upload new files
You can roll back the migration gracefully by reversing the steps:
- shut down the instance
- mirror back the files by swapping source and destination
- write new files back into the local directory using an inverse transform
- restart the instance with the old configuration
# When using aws cli
aws s3 sync s3://overleaf-user-files /srv/overleaf-s3-migration/user_files
aws s3 sync s3://overleaf-template-files /srv/overleaf-s3-migration/template_files
aws s3 sync s3://overleaf-project-blobs /srv/overleaf-bind-mount/data/history/overleaf-project-blobs
aws s3 sync s3://overleaf-chunks /srv/overleaf-bind-mount/data/history/overleaf-chunks
# When using minio mc
mc mirror s3/overleaf-user-files /srv/overleaf-s3-migration/user_files
mc mirror s3/overleaf-template-files /srv/overleaf-s3-migration/template_files
mc mirror s3/overleaf-project-blobs /srv/overleaf-bind-mount/data/history/overleaf-project-blobs
mc mirror s3/overleaf-chunks /srv/overleaf-bind-mount/data/history/overleaf-chunks
# Write files into local Server CE/Server Pro
tar --create --directory /srv/overleaf-s3-migration/user_files . \
| docker exec --interactive sharelatex \
tar \
--extract \
--keep-old-files \
--directory /var/lib/sharelatex/data/user_files \
--transform=sx./xx --transform=sx/x_x \
--wildcards '*/*/*'
tar --create --directory /srv/overleaf-s3-migration/template_files . \
| docker exec --interactive sharelatex \
tar \
--extract \
--keep-old-files \
--directory /var/lib/sharelatex/data/template_files \
--transform=sx./xx --transform=sx/x_xg \
--wildcards '*/*/*/*/pdf-converted-cache/*' \
--wildcards '*/*/*/*/pdf' \
--wildcards '*/*/*/*/zip'
Note: The first transform removes the top-level folder. The second transform changes the directory layout to a flat one. The wildcards ensure that only files are extracted, not their parent (project) folders.
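Because the transforms are sed-style substitutions with `x` as the delimiter, and tar applies them in the order given, the inverse mapping can be previewed with sed(1). A minimal sketch with made-up paths (`projectA`, `fileB` and the template path components are placeholders):

```shell
# The first expression strips the leading "./", the second turns the
# first slash back into an underscore (user files).
printf '%s\n' './projectA/fileB' | sed -e 'sx./xx' -e 'sx/x_x'
# prints: projectA_fileB

# Template files use the g flag so that every slash is converted.
printf '%s\n' './tplA/v1/fmtB/pdf' | sed -e 'sx./xx' -e 'sx/x_xg'
# prints: tplA_v1_fmtB_pdf
```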