upload / download --resume --tables=t1,t2 after upload / download --resume --tables=t1 don't processed t2 #840

Open
amirshabanics opened this issue Feb 19, 2024 · 18 comments

@amirshabanics

Hi. Assume we have backup files a, b, c and tables z, x, y. In step 1 I restore ClickHouse from all three files for table z. Then I want to do the same for x and y, but this error is thrown:

warn 'base_backup_2024-02-15T00:30:01+03:30' doesn't contains tables for restore backup=base_backup_2024-02-15T00:30:01+03:30 operation=restore
@amirshabanics
Author

I understand what happens now. The metadata of the files stored in /backup did not update itself. I have the full metadata, and when I copy it into the directory and run restore with the --resume parameter, the new table downloads successfully.

@Slach
Collaborator

Slach commented Feb 20, 2024

Sorry, I don't understand what the issue is.
Could you provide the full sequence of backup commands?

Recently I saw someone run restore_remote --schema and then restore_remote. The first command downloaded metadata/db/table.json with metadata_only:true, which does not allow the data files to be downloaded again...

@Slach
Collaborator

Slach commented Feb 21, 2024

Any news from your side?

@amirshabanics
Author

Assume I run the command clickhouse-backup restore --tables='A,B' backup_file and then run clickhouse-backup restore --tables='A,B,C,D' backup_file.

We know that backup_file contains all of the tables A, B, C, D, but the second command raises an error that the table doesn't exist. We must remove all data in /var/lib/clickhouse/backup and run clickhouse-backup restore --resume --tables='A,B,C,D' backup_file to get the new tables.

@Slach
Collaborator

Slach commented Feb 22, 2024

Are you sure you are using the restore command and not restore_remote? The --resume option is not present in the restore command.

@amirshabanics
Author

amirshabanics commented Feb 22, 2024

Oh sorry, you are right. This is the simple code I run:

tables=(A B C D)
# Join the array elements with commas: A,B,C,D
tables_with_comma=$(IFS=,; echo "${tables[*]}")

clickhouse-backup list > /tmp/backup_list
# Pick the newest remote backup whose name contains "base"
BASE=$(grep remote /tmp/backup_list | awk '{print $1}' | grep base | sort | tail -n 1)
clickhouse-backup download --resume --tables="$tables_with_comma" "$BASE"
for table in "${tables[@]}"; do
  echo "----- Restoring table: ${table}"
  clickhouse-backup restore --drop -t "${table}" "$BASE"
done

@amirshabanics
Author

tables=(A B C D)

When you add one more table to this list, the restore command does not work well. It raises an error that the table doesn't exist in the backup file.

@Slach
Collaborator

Slach commented Feb 22, 2024

The first execution of download --resume stores the download state in
/var/lib/clickhouse/backup/backup_name/download.state

The second execution, after you change tables=,
will find backup_name/metadata.json in /var/lib/clickhouse/backup/backup_name/download.state and decide that the whole backup is already downloaded.

You could just remove /var/lib/clickhouse/backup/$BASE/download.state if you change tables=(A B C) to tables=(A B C D).

I will try to fix it.
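
A minimal sketch of that workaround as a shell helper, assuming the download.state location described above. The function name clear_download_state and its optional second argument are hypothetical, introduced only for illustration, and the helper has not been checked against a real clickhouse-backup installation.

```shell
# Sketch only: remove the resumable-download state of one local backup so that
# the next `download --resume` does not treat the backup as already complete.
# The default root is the path quoted in this thread; adjust it for your setup.
clear_download_state() {
  if [ -z "$1" ]; then
    echo "usage: clear_download_state BACKUP_NAME [BACKUP_ROOT]" >&2
    return 2
  fi
  local backup_name="$1"
  local backup_root="${2:-/var/lib/clickhouse/backup}"
  local state_file="${backup_root}/${backup_name}/download.state"

  if [ -f "$state_file" ]; then
    rm -f -- "$state_file"
    echo "removed ${state_file}"
  else
    echo "no state file at ${state_file}"
  fi
}

# Example, commented out so the snippet can be sourced safely:
# clear_download_state "$BASE"
# clickhouse-backup download --resume --tables="$tables_with_comma" "$BASE"
```

The helper only deletes the state file and leaves already downloaded data in place, so it is meant to be called right before the download command whenever the table list has changed.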

@Slach Slach changed the title from "can't add new table when download a file s3 storage" to "upload / download --resume --tables=t1,t2 after upload / download --resume --tables=t1 don't processed t2" Feb 22, 2024
@Slach Slach added this to the 2.6.0 milestone Feb 22, 2024
@Slach Slach self-assigned this Feb 22, 2024
@amirshabanics
Author

Thanks. If it needs just a simple fix, can I do it?

@Slach
Collaborator

Slach commented Feb 22, 2024

I'm not sure it will be a simple fix from our side.
We need to implement a different format for resumableState files first:
#828

@amirshabanics
Author

Why do we need to add --resume? Can't it check for itself whether it needs to download or not?

@Slach
Collaborator

Slach commented Feb 23, 2024

Your "$BASE" backup simply has not changed since the previous download,
and it contains the download.state file and /var/lib/clickhouse/backup/backup_name/metadata.json, which lists only the A B C tables.

I can't see your whole workflow and don't know your goals, since you shared them only partially,
so I can't suggest a proper command sequence.

I don't know why you need --resume, but without --resume a download of a backup that already exists locally will fail.

--resume is turned on by default after 2.2.0
via the USE_RESUMABLE_STATE config option:

general:
  use_resumable_state: true

@amirshabanics
Author

This is my whole code; it is simple. I have another ClickHouse on another server and I back it up to S3. Then I restore it to this ClickHouse every 2 hours. The ClickHouse on the other server backs up all tables, but my ClickHouse needs just some of them and may (for business reasons) need different tables over time.
If I don't pass the --resume parameter, it doesn't check whether it has downloaded all tables for a backup file. When I remove the metadata and the download state and then run it with the --resume parameter, it downloads all remaining tables.
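
For a periodic job like this, one possible way to decide when that cleanup is needed is to remember the table list used on the previous run. The sketch below is only an illustration: the function name tables_changed and the marker file are hypothetical, and whether deleting only download.state and metadata.json is sufficient is not confirmed in this thread.

```shell
# Sketch only: report whether the requested table list differs from the one
# recorded on the previous run. The marker file is a hypothetical helper file,
# not something clickhouse-backup itself reads or writes.
tables_changed() {
  local marker_file="$1"
  shift
  local current
  # Sort so that "A B" and "B A" count as the same selection.
  current=$(printf '%s\n' "$@" | sort | paste -sd, -)
  if [ -f "$marker_file" ] && [ "$(cat "$marker_file")" = "$current" ]; then
    return 1   # same tables as last time
  fi
  printf '%s\n' "$current" > "$marker_file"
  return 0     # first run, or the selection changed
}

# Example, commented out so the snippet can be sourced safely:
# if tables_changed "/tmp/${BASE}.tables" "${tables[@]}"; then
#   rm -f "/var/lib/clickhouse/backup/${BASE}/download.state" \
#         "/var/lib/clickhouse/backup/${BASE}/metadata.json"
# fi
# clickhouse-backup download --resume --tables="$tables_with_comma" "$BASE"
```

Note that the marker is updated before the download runs, so a failed download would need the marker removed by hand to trigger the cleanup again.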

@Slach
Collaborator

Slach commented Feb 27, 2024

In this case you can try:

USE_RESUMABLE_STATE=0 clickhouse-backup download --schema --tables=$tables_with_comma "$BASE"
clickhouse-backup download --resume --tables=$tables_with_comma "$BASE"

The first execution will download the $BASE/metadata folder.
The second execution will download $BASE/metadata + $BASE/shadow and will use download.state.

@Slach Slach closed this as completed Feb 27, 2024
@Slach Slach removed this from the 2.6.0 milestone Feb 27, 2024
@amirshabanics
Author

OK, but is this still correct when I change tables_with_comma?

@Slach Slach reopened this Feb 27, 2024
@Slach Slach added this to the 2.6.0 milestone Feb 27, 2024
@Slach
Collaborator

Slach commented Feb 27, 2024

=( If $BASE has not changed since the previous run, then it will fail. I am trying to fix it; please wait until 2.5.0 is released.

@Slach Slach modified the milestones: 2.6.0, 2.5.0 Feb 27, 2024
@amirshabanics
Author

Can I somehow make it work until you fix it? Maybe by deleting download.state and metadata.json?

@Slach
Collaborator

Slach commented Mar 3, 2024

@amirshabanics wait until 2.5.0 is released; after that your use case will work.

@Slach Slach modified the milestones: 2.5.0, 2.6.0 Mar 29, 2024