wal-g backup-push hangs after get 409 from S3 #1639

Open
IgorOhrimenko opened this issue Feb 7, 2024 · 1 comment

Comments

@IgorOhrimenko

Database name

PostgreSQL 10-16

Issue description

Describe your problem

wal-g backup-push runs from cron and sometimes gets a 409 response from S3:

ERROR: 2024/02/07 07:16:19.407366 failed to upload 'postgresql/14/basebackups_005/base_0000000A00000CC20000005D/tar_partitions/part_122.tar.br' to bucket 'backup-postgresql-wal-g': MultipartUpload: upload multipart failed
        upload id: 4d82b763e5237cfbf77e51da1ddc1503
caused by: OperationAborted: A conflicting conditional operation is currently in progress against this resource. Try again.
        status code: 409, request id: 52aa525d-5813-4119-b02e-1b00d78b59d4, host id:
ERROR: 2024/02/07 07:16:19.408824 failed to upload 'postgresql/14/basebackups_005/base_0000000A00000CC20000005D/tar_partitions/part_122.tar.br' to bucket 'backup-postgresql-wal-g': MultipartUpload: upload multipart failed
        upload id: 4d82b763e5237cfbf77e51da1ddc1503
caused by: OperationAborted: A conflicting conditional operation is currently in progress against this resource. Try again.
        status code: 409, request id: 52aa525d-5813-4119-b02e-1b00d78b59d4, host id:

After that, wal-g backup-push hangs and uses a lot of CPU:
[image]
but does nothing.

Please provide steps to reproduce

Possibly by simulating a 409 response from S3?
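One way to simulate the 409 might be a small mock endpoint that answers every upload request with the same `OperationAborted` error seen in the log. The sketch below is only a starting point and has not been run against wal-g: the names `Always409`, `start_mock` and `probe` are made up for illustration, the mock speaks plain HTTP (the config in this issue uses an `https://` endpoint), and a realistic reproduction would probably need to let the multipart upload start normally and fail only a later part.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# S3-style XML error body; code and message are copied from the log in this issue.
ERROR_BODY = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b"<Error><Code>OperationAborted</Code>"
    b"<Message>A conflicting conditional operation is currently in progress "
    b"against this resource. Try again.</Message></Error>"
)


class Always409(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every PUT and POST with 409."""

    def _conflict(self):
        # Drain the request body so the client can finish sending it.
        length = int(self.headers.get("Content-Length") or 0)
        if length:
            self.rfile.read(length)
        self.send_response(409)
        self.send_header("Content-Type", "application/xml")
        self.send_header("Content-Length", str(len(ERROR_BODY)))
        self.end_headers()
        self.wfile.write(ERROR_BODY)

    do_PUT = _conflict
    do_POST = _conflict

    def log_message(self, fmt, *args):
        pass  # keep the console quiet


def start_mock(port=0):
    """Start the mock on 127.0.0.1 in a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), Always409)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def probe(port):
    """Send one PUT to the mock and return (status, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("PUT", "/bucket/key?partNumber=1&uploadId=x", body=b"x")
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()
```

With the mock running, `AWS_ENDPOINT` in a test copy of `.walg.json` could be pointed at `http://127.0.0.1:<port>`; whether that alone triggers the hang is an open question.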

Please add config and wal-g stdout/stderr logs for debug purpose

.walg.json

{
    "WALE_S3_PREFIX": "s3://backup-postgresql-wal-g/postgresql/14",
    "AWS_ACCESS_KEY_ID": "key",
    "AWS_SECRET_ACCESS_KEY": "secret",
    "AWS_ENDPOINT": "https://s3.example.com",
    "AWS_S3_FORCE_PATH_STYLE": "true",
    "AWS_REGION": "reg-0",
    "PGDATA": "/var/lib/postgresql/14/postgresql",
    "PGHOST": "/var/run/postgresql",
    "PGPORT": "5432",
    "WALG_PREFETCH_DIR": "/var/lib/postgresql/",
    "WALG_TAR_SIZE_THRESHOLD": "4294967296",
    "WALG_UPLOAD_CONCURRENCY": "1",
    "WALG_COMPRESSION_METHOD": "brotli"
}

@x4m
Collaborator

x4m commented Mar 9, 2024

This looks strange...
To see what WAL-G is doing while it uses 100% of a CPU, you can send it kill -SIGABRT; the Go runtime then prints the goroutine stack traces before exiting.
But how come you get a 409? Are you perhaps running two backups simultaneously?
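If overlapping cron runs are a possibility, one way to rule them out is to wrap the backup command in an exclusive file lock. This is a minimal sketch, assuming a Linux host and a single machine taking backups; `try_lock`, `run_exclusive` and the lock path are hypothetical names, not part of WAL-G.

```python
import fcntl
import os
import subprocess


def try_lock(path):
    """Return a descriptor holding an exclusive lock on path, or None if it is already held."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # LOCK_NB makes flock fail immediately instead of waiting for the holder.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def run_exclusive(lock_path, command):
    """Run command only if nobody else holds lock_path.

    Returns the command's exit code, or None if the run was skipped.
    """
    fd = try_lock(lock_path)
    if fd is None:
        return None
    try:
        return subprocess.run(command).returncode
    finally:
        os.close(fd)  # closing the descriptor releases the lock
```

A cron job could then call something like `run_exclusive("/tmp/walg-backup.lock", ["wal-g", "backup-push", "/var/lib/postgresql/14/postgresql"])`. A `None` result would show that a previous backup was still running when the next one started.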
