s3_scrubber snapshot gets inconsistent LSNs on sharded tenants #7573

Open
jcsp opened this issue May 1, 2024 · 0 comments
Labels
c/storage Component: storage t/bug Issue Type: Bug


jcsp commented May 1, 2024

The s3_scrubber snapshot test/debug tool was recently added in #7444.

This ticket tracks a limitation in the tool when it is used on sharded tenants that are being written to.

It handles sharding to the extent that it fetches all the data for all the shards, and one can start up a pageserver from it. However, because shards advance their disk_consistent_lsn independently, running an endpoint against the downloaded data has two problems:

  • Shard zero serves a basebackup by default at whatever its latest LSN is, but other shards may not have seen that LSN yet.
  • Even if we hack the tenant metadata so that every shard has the same disk_consistent_lsn, we would still end up with an un-writable tenant: a compute writing from the basebackup LSN would produce safekeeper data that some shards would not ingest, because they had already seen a higher LSN.
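To make the first problem concrete, here is a minimal sketch. It assumes LSNs are plain integers and that each shard's disk_consistent_lsn has already been read from the snapshot; the function name and the sample values are hypothetical and not part of the scrubber.

```python
def shards_behind_shard_zero(disk_consistent_lsn):
    """Return the shard numbers whose disk_consistent_lsn is below shard zero's.

    `disk_consistent_lsn` maps shard number -> LSN (hypothetical input shape).
    A basebackup taken at shard zero's latest LSN refers to an LSN that
    these shards have not reached in the snapshot.
    """
    basebackup_lsn = disk_consistent_lsn[0]
    return sorted(
        shard
        for shard, lsn in disk_consistent_lsn.items()
        if lsn < basebackup_lsn
    )


# Made-up values for a four-shard tenant.
snapshot = {0: 0x16B5A50, 1: 0x16B5A10, 2: 0x16B5A50, 3: 0x16B59F0}
print(shards_behind_shard_zero(snapshot))  # [1, 3]
```

A non-empty result means the snapshot cannot serve an endpoint at shard zero's LSN as-is.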

To solve this, we probably need to make the tenant-import command smart enough to trim imported data back to a specific LSN (the lowest disk_consistent_lsn among the shards), including trimming layer files. This could be done either in the scrubber or as a pageserver API (perhaps as part of the tenant-import flow).
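The trimming idea could be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's implementation: the `Layer` type, its half-open LSN range, and the three-way keep/rewrite/drop split are all hypothetical, and LSNs are treated as plain integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """Hypothetical layer descriptor covering LSNs in [lsn_start, lsn_end)."""
    name: str
    lsn_start: int  # inclusive
    lsn_end: int    # exclusive


def trim_target(disk_consistent_lsn):
    """Pick the trim target: the lowest disk_consistent_lsn among the shards."""
    return min(disk_consistent_lsn.values())


def plan_trim(layers, target_lsn):
    """Split layers into (keep, rewrite, drop) relative to target_lsn.

    keep:    every LSN in the layer is <= target_lsn
    drop:    every LSN in the layer is >  target_lsn
    rewrite: the layer straddles target_lsn and would need trimming
    """
    keep, rewrite, drop = [], [], []
    for layer in layers:
        if layer.lsn_end <= target_lsn + 1:
            keep.append(layer)
        elif layer.lsn_start > target_lsn:
            drop.append(layer)
        else:
            rewrite.append(layer)
    return keep, rewrite, drop


# Made-up values: shard 1 is furthest behind, so it sets the target.
target = trim_target({0: 150, 1: 100, 2: 130})
layers = [Layer("a", 10, 50), Layer("b", 50, 150), Layer("c", 120, 200)]
keep, rewrite, drop = plan_trim(layers, target)
print(target, [l.name for l in keep], [l.name for l in rewrite], [l.name for l in drop])
```

The straddling layers are the hard part: they cannot simply be deleted, which is why the trim would have to extend to rewriting layer files rather than only editing metadata.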

@jcsp jcsp added t/bug Issue Type: Bug c/storage Component: storage labels May 1, 2024