
Why does b.getTablesForUploadDiffRemote return the error "not found on remote storage"? #885

Open · hueiyuan opened this issue Apr 5, 2024 · 9 comments

hueiyuan commented Apr 5, 2024

Description

Our backup suddenly failed with this error:

{"command":"upload --diff-from-remote=\"shardshard2-increment-20240404222727\" --resumable=1 shardshard2-increment-20240405005102","status":"error","start":"2024-04-05 02:16:44","finish":"2024-04-05 02:16:45","error":"b.getTablesForUploadDiffRemote return error: \"shardshard2-increment-20240404222727\" not found on remote storage"

But we checked, and the remote storage does contain the shardshard2-increment-20240404222727 backup, according to the backup list command. The command result:

{"name":"shardshard2-increment-20240404222727","created":"2024-04-04 23:47:29","size":920897893739,"location":"remote","required":"shardshard2-increment-20240404210213","desc":"zstd, regular"}

So, do you have any ideas about this problem?

By the way, we have now rerun the watch command to recover the backup process.
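
For reference, a rough sketch of the commands involved, assuming the clickhouse-backup CLI inside the sidecar and its REST API on the default port (the invocations are illustrative, not taken from the log above):

# confirm the backup exists on remote storage
clickhouse-backup list remote

# the same check via the sidecar's REST API (listen: 0.0.0.0:7171, see config below)
curl -s http://localhost:7171/backup/list/remote

# restart the watch loop with the intervals from the config below
clickhouse-backup watch --watch-interval=30m --full-interval=24h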

print-config output:

general:
    remote_storage: s3
    max_file_size: 0
    disable_progress_bar: true
    backups_to_keep_local: 0
    backups_to_keep_remote: 50
    log_level: debug
    allow_empty_backups: false
    download_concurrency: 2
    upload_concurrency: 2
    use_resumable_state: true
    restore_schema_on_cluster: ""
    upload_by_part: true
    download_by_part: true
    restore_database_mapping: {}
    retries_on_failure: 3
    retries_pause: 30s
    watch_interval: 30m
    full_interval: 24h
    watch_backup_name_template: shard{shard}-{type}-{time:20060102150405}
    sharded_operation_mode: ""
    cpu_nice_priority: 15
    io_nice_priority: idle
    retriesduration: 30s
    watchduration: 30m0s
    fullduration: 24h0m0s
clickhouse:
    username: xxxxx
    password: xxxxxx
    host: localhost
    port: 9000
    disk_mapping: {}
    skip_tables:
        - system.*
        - INFORMATION_SCHEMA.*
        - information_schema.*
        - _temporary_and_external_tables.*
    skip_table_engines: []
    timeout: 5m
    freeze_by_part: false
    freeze_by_part_where: ""
    use_embedded_backup_restore: false
    embedded_backup_disk: ""
    backup_mutations: true
    restore_as_attach: false
    check_parts_columns: true
    secure: false
    skip_verify: false
    sync_replicated_tables: false
    log_sql_queries: true
    config_dir: /etc/clickhouse-server/
    restart_command: exec:systemctl restart clickhouse-server
    ignore_not_exists_error_during_freeze: true
    check_replicas_before_attach: true
    tls_key: ""
    tls_cert: ""
    tls_ca: ""
    max_connections: 8
    debug: false
s3:
    access_key: ""
    secret_key: ""
    bucket: ipp-clickhouse-backup-prod
    endpoint: ""
    region: us-west-2
    acl: private
    assume_role_arn: arn:aws:iam::xxxx:role/backup-role
    force_path_style: true
    path: backup/chi-shard-backup
    object_disk_path: tiered-backup
    disable_ssl: false
    compression_level: 1
    compression_format: zstd
    sse: ""
    sse_kms_key_id: ""
    sse_customer_algorithm: ""
    sse_customer_key: ""
    sse_customer_key_md5: ""
    sse_kms_encryption_context: ""
    disable_cert_verification: false
    use_custom_storage_class: false
    storage_class: STANDARD
    custom_storage_class_map: {}
    concurrency: 9
    part_size: 0
    max_parts_count: 2000
    allow_multipart_download: false
    object_labels: {}
    request_payer: ""
    check_sum_algorithm: ""
    debug: true
gcs:
    credentials_file: ""
    credentials_json: ""
    credentials_json_encoded: ""
    bucket: ""
    path: ""
    object_disk_path: ""
    compression_level: 1
    compression_format: tar
    debug: false
    force_http: false
    endpoint: ""
    storage_class: STANDARD
    object_labels: {}
    custom_storage_class_map: {}
    client_pool_size: 24
cos:
    url: ""
    timeout: 2m
    secret_id: ""
    secret_key: ""
    path: ""
    compression_format: tar
    compression_level: 1
    debug: false
api:
    listen: 0.0.0.0:7171
    enable_metrics: true
    enable_pprof: false
    username: ""
    password: ""
    secure: false
    certificate_file: ""
    private_key_file: ""
    ca_cert_file: ""
    ca_key_file: ""
    create_integration_tables: true
    integration_tables_host: ""
    allow_parallel: false
    complete_resumable_after_restart: true
ftp:
    address: ""
    timeout: 2m
    username: ""
    password: ""
    tls: false
    skip_tls_verify: false
    path: ""
    object_disk_path: ""
    compression_format: tar
    compression_level: 1
    concurrency: 24
    debug: false
sftp:
    address: ""
    port: 22
    username: ""
    password: ""
    key: ""
    path: ""
    object_disk_path: ""
    compression_format: tar
    compression_level: 1
    concurrency: 24
    debug: false
azblob:
    endpoint_schema: https
    endpoint_suffix: core.windows.net
    account_name: ""
    account_key: ""
    sas: ""
    use_managed_identity: false
    container: ""
    path: ""
    object_disk_path: ""
    compression_level: 1
    compression_format: tar
    sse_key: ""
    buffer_size: 0
    buffer_count: 3
    max_parts_count: 256
    timeout: 4h
    debug: false
custom:
    upload_command: ""
    download_command: ""
    list_command: ""
    delete_command: ""
    command_timeout: 4h
    commandtimeoutduration: 4h0m0s
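
Aside on the backup names: the doubled "shardshard2" prefix comes straight from this config, since the literal shard in watch_backup_name_template concatenates with the {shard} macro, which evidently resolves to shard2 here; the {time:...} part is Go's reference-time layout. A sketch of the expansion:

watch_backup_name_template: shard{shard}-{type}-{time:20060102150405}
# {shard} -> shard2, {type} -> full or increment, {time:...} -> e.g. 20240405005102
# result:    shardshard2-increment-20240405005102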

Slach commented Apr 5, 2024

Which clickhouse-backup version do you use?


hueiyuan commented Apr 5, 2024

@Slach the version is 2.4.32.


Slach commented Apr 5, 2024

Could you upgrade to 2.4.35?
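
A minimal way to check the running sidecar version after bumping the image tag (the container name clickhouse-backup and the pod name are placeholders for your setup):

kubectl exec <chi-pod-name> -c clickhouse-backup -- clickhouse-backup --version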


hueiyuan commented Apr 5, 2024

@Slach, does version 2.4.35 fix the corresponding problem?


Slach commented Apr 5, 2024

We need more logs from the clickhouse-backup container to understand what's wrong.
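
For example, roughly (the pod name is a placeholder; /backup/status is part of the documented clickhouse-backup REST API):

# logs from the sidecar container
kubectl logs <chi-pod-name> -c clickhouse-backup --since=24h

# status of recent backup commands via the REST API
curl -s http://localhost:7171/backup/status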


hueiyuan commented Apr 8, 2024

@Slach
I want to ask an additional question. Our backup runs as a sidecar in the clickhouse-server pod (just like this example), and we found that when we try to update the clickhouse-backup config, the StatefulSet does not apply the update. Do you have any comment on this?


Slach commented Apr 8, 2024

This is a different question; please provide more context.
Is your clickhouse-backup configuration defined as a separate ConfigMap?
Do you use clickhouse-operator, or did you install clickhouse-server some other way?

By default, Kubernetes waits some time before the kubelet refreshes a ConfigMap mounted inside a pod.
See details: https://www.perplexity.ai/search/why-kubernetes-dont-u.h.fDuVT22JOO4ZqWEfug
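
To illustrate the distinction, a minimal sketch with hypothetical names: a ConfigMap mounted as a volume is eventually refreshed in place by the kubelet (after its sync period plus cache TTL), while values injected through env/envFrom are read only at container start, so they require a pod restart:

containers:
  - name: clickhouse-backup
    # env values are fixed at container start; editing the ConfigMap
    # has no effect until the pod restarts
    envFrom:
      - configMapRef:
          name: clickhouse-backup-config   # hypothetical name
    # a volume-mounted ConfigMap is eventually updated in place by the kubelet
    volumeMounts:
      - name: backup-config
        mountPath: /etc/clickhouse-backup
volumes:
  - name: backup-config
    configMap:
      name: clickhouse-backup-config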


hueiyuan commented Apr 8, 2024

We use Altinity/clickhouse-operator to deploy ClickHouse with a clickhouse-backup sidecar, so we do not define a separate ConfigMap for it.


Slach commented Apr 8, 2024

How did you change the configuration in this case? Did you use the env section?

Could you share your kind: ClickHouseInstallation manifest without sensitive credentials?
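
For reference, a rough sketch of a manifest of that shape, with the sidecar configured via the env section; all names, image tags, and values are illustrative, not the reporter's actual manifest (clickhouse-backup reads an environment-variable equivalent for each print-config option):

apiVersion: clickhouse.altinity.com/v1
kind: ClickHouseInstallation
metadata:
  name: shard-backup                       # hypothetical name
spec:
  templates:
    podTemplates:
      - name: clickhouse-with-backup
        spec:
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:24.3
            - name: clickhouse-backup
              image: altinity/clickhouse-backup:2.4.35
              command: ["clickhouse-backup", "server"]
              env:
                - name: REMOTE_STORAGE
                  value: "s3"
                - name: S3_BUCKET
                  value: "ipp-clickhouse-backup-prod"
                - name: WATCH_INTERVAL
                  value: "30m"

With env configuration of this shape, editing a value changes the pod template, so the operator rolls the StatefulSet instead of waiting for an in-place ConfigMap refresh.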
