
generation creation failed due to S3 upload multipart failed #447

Open
kakarukeys opened this issue Dec 11, 2022 · 6 comments

@kakarukeys commented Dec 11, 2022

When starting litestream, I saw this message in the log:

```
litestream v0.3.8
initialized db: /data/db.sqlite3
replicating to: name="s3" type="s3" bucket="xxx" path="lb-pipeline-prod/db.sqlite3" region="fra1" endpoint="https://fra1.digitaloceanspaces.com" sync-interval=1s
/data/db.sqlite3: init: cannot determine last wal position, clearing generation; primary wal header: EOF
/data/db.sqlite3: sync: new generation "c51b0ab65d5a9c1f", no generation exists

/data/db.sqlite3(s3): monitor error: MultipartUpload: upload multipart failed
        upload id: 2~AVDf9oLvoUjwWcYb5So7CZmoZnpUguF
caused by: TotalPartsExceeded: exceeded total allowed configured MaxUploadParts (10000). Adjust PartSize to fit in this limit
```

litestream snapshots / litestream generations do not reveal anything under the new generation. Apparently the new generation creation has failed.

Is there any config I could set to tune the multipart upload?

my config is:

```yaml
access-key-id: xxx
secret-access-key: xxx

dbs:
  - path: /data/db.sqlite3
    replicas:
      - url: s3://xxx.fra1.digitaloceanspaces.com/lb-pipeline-prod/db.sqlite3
        retention: 1h
        retention-check-interval: 20m
```
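
The arithmetic behind the error: the AWS SDK for Go refuses to split an upload into more than MaxUploadParts (10,000) parts, and s3manager's default PartSize is 5 MiB, so the largest object it can upload by default is about 48.8 GiB; a snapshot of a database bigger than that trips TotalPartsExceeded. A minimal sketch of the underlying fix, using aws-sdk-go's s3manager directly (litestream wires this up internally, so this is illustrative only; bucket, key, and endpoint are taken from the config above, and credentials are assumed to come from the environment):

```go
package main

import (
	"log"
	"os"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func main() {
	sess := session.Must(session.NewSession(&aws.Config{
		Region:   aws.String("fra1"),
		Endpoint: aws.String("https://fra1.digitaloceanspaces.com"),
	}))

	// A ~350 GB object needs at least ~36 MiB per part to fit in
	// 10,000 parts; 64 MiB leaves comfortable headroom.
	uploader := s3manager.NewUploader(sess, func(u *s3manager.Uploader) {
		u.PartSize = 64 * 1024 * 1024 // bytes per part (default is 5 MiB)
		// u.MaxUploadParts defaults to 10,000, which is S3's hard limit,
		// so raising it doesn't help; PartSize is the knob to turn.
	})

	f, err := os.Open("/data/db.sqlite3")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	if _, err := uploader.Upload(&s3manager.UploadInput{
		Bucket: aws.String("xxx"),
		Key:    aws.String("lb-pipeline-prod/db.sqlite3"),
		Body:   f,
	}); err != nil {
		log.Fatal(err)
	}
}
```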

@benbjohnson (Owner)

@kakarukeys There's not currently a config option for this. @anacrolix created a PR for this a while back, but the change should be exposed as a configuration option. I'm open to a PR if you want to add the config fields.
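
For illustration, such fields might look like the following in the YAML config; the part-size key here is hypothetical and does not exist in litestream as of this thread:

```yaml
dbs:
  - path: /data/db.sqlite3
    replicas:
      - url: s3://xxx.fra1.digitaloceanspaces.com/lb-pipeline-prod/db.sqlite3
        part-size: 64MB   # hypothetical key: size of each multipart-upload part
```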

@anacrolix (Contributor)

@kakarukeys #284

@kakarukeys (Author)

I'd love to. Let me see if I can follow the code and the previous PR; my Go skills have gotten very rusty.

FYI, another note: the failure above (in the OP) does not crash the container and does not raise any alarm. This, together with the advice here to set PRAGMA wal_autocheckpoint to 0, caused the WAL file to grow huge on my production server.
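
Since the failure is silent, one stopgap is to watch the -wal file size externally and alert when it grows past a threshold. A minimal sketch, assuming the database path from the config above and an arbitrary 1 GiB threshold:

```go
package main

import (
	"log"
	"os"
	"time"
)

const (
	walPath   = "/data/db.sqlite3-wal"
	threshold = 1 << 30 // 1 GiB; tune to whatever is alarming for your workload
)

func main() {
	for range time.Tick(time.Minute) {
		fi, err := os.Stat(walPath)
		if err != nil {
			log.Printf("stat %s: %v", walPath, err)
			continue
		}
		if fi.Size() > threshold {
			// Replace with a real alert (pager, healthcheck ping, etc.).
			log.Printf("ALERT: WAL is %d bytes; replication may have stalled", fi.Size())
		}
	}
}
```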

@hifi (Collaborator) commented Dec 20, 2022

@kakarukeys We have a downstream patch that prevents the WAL from growing in some cases: beeper@cb44be6

Does that work for you? I've only seen this in some rare error conditions, and indeed got WALs that were gigabytes in size. We haven't upstreamed it yet, as we're running on a patched 0.3.9 that conflicts with the current git head.

@kakarukeys (Author)

It might work, but I won't bet on it, because I am operating SQLite at crazy scale: a 350 GB+ file with several heavy writers and frequent readers. Even after turning off litestream and re-enabling the default checkpointing, I sometimes see a 200 GB WAL file.

I read somewhere that if there is never a moment when the db is not locked for reads or writes, SQLite gets no chance to run a checkpoint. I'm placing my hope on the coming wal2 changes in SQLite (though I think they might break litestream).
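
That matches how WAL checkpointing works: a checkpoint can only copy frames up to the oldest active reader's snapshot, and resetting the WAL requires a moment with no active readers at all. A sketch of forcing a checkpoint and inspecting the result, assuming the mattn/go-sqlite3 driver:

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/mattn/go-sqlite3"
)

func main() {
	db, err := sql.Open("sqlite3", "/data/db.sqlite3")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// TRUNCATE tries to checkpoint every frame and reset the WAL to zero
	// bytes; the returned row reports whether it was blocked (busy = 1)
	// and how many WAL frames were checkpointed out of the total.
	var busy, logFrames, checkpointed int
	if err := db.QueryRow("PRAGMA wal_checkpoint(TRUNCATE)").
		Scan(&busy, &logFrames, &checkpointed); err != nil {
		log.Fatal(err)
	}
	if busy == 1 {
		log.Printf("checkpoint blocked by concurrent use: %d/%d frames done",
			checkpointed, logFrames)
	} else {
		log.Println("WAL fully checkpointed and truncated")
	}
}
```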
