
Deleting a CHI resource may leave debris from replicated tables in [Zoo]Keeper that requires later cleanup #1387

Open
hodgesrm opened this issue Apr 7, 2024 · 3 comments

hodgesrm (Member) commented Apr 7, 2024

If you delete a CHI cluster that has replicated tables, [Zoo]Keeper metadata is not cleaned up when the replica(s) are deleted. This leads to errors like the following if you then re-create the CHI resource and try to add tables again:

Received exception from server (version 23.7.4):
Code: 253. DB::Exception: Received from localhost:9000. DB::Exception: There was an error on [chi-demo2-s3-0-0:9000]: Code: 253. DB::Exception: Replica /clickhouse/s3/tables/0/default/test_local/replicas/chi-demo2-s3-0-0 already exists. (REPLICA_ALREADY_EXISTS) (version 23.7.4.5 (official build)). (REPLICA_ALREADY_EXISTS)
(query: CREATE TABLE IF NOT EXISTS test_local ON CLUSTER `{cluster}`

To duplicate this problem, follow the steps shown below.

  1. Create a ClickHouse CHI using kubectl apply -f with a proper connection to Keeper.
  2. Create at least one replicated table. (See the example DDL below.)
  3. Delete the CHI using kubectl delete chi/<name>.

Now repeat steps 1 and 2. Step 2 will fail due to the existing ZooKeeper metadata.
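
To confirm that stale metadata is the cause, you can inspect the leftover znodes from the re-created cluster via the system.zookeeper table. A minimal sketch, assuming the same ZooKeeper path as in the error above (adjust for your {cluster}/{shard}/{database} values):

-- List table znodes left behind under the example database path
SELECT name
FROM system.zookeeper
WHERE path = '/clickhouse/s3/tables/0/default'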

The workaround is to remove the replica paths using SYSTEM DROP REPLICA, which is painful if there are many tables. Example:

SYSTEM DROP REPLICA 'chi-demo2-s3-0-0' FROM ZKPATH '/clickhouse/s3/tables/0/default/test_local'
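
If there are many tables, one option is to generate the cleanup statements from system.zookeeper instead of typing them by hand. A sketch, again assuming the example paths and replica name used above:

-- Emit one SYSTEM DROP REPLICA statement per stale table znode
SELECT concat(
    'SYSTEM DROP REPLICA ''chi-demo2-s3-0-0'' FROM ZKPATH ''',
    path, '/', name, ''''
) AS cleanup_statement
FROM system.zookeeper
WHERE path = '/clickhouse/s3/tables/0/default'

You can then review the generated statements and run them against the new cluster.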

Use the following DDL for step 2 above.

CREATE TABLE IF NOT EXISTS test_local ON CLUSTER `{cluster}`
(
    `A` Int64,
    `S` String,
    `D` Date
)
ENGINE = ReplicatedMergeTree('/clickhouse/{cluster}/tables/{shard}/{database}/test_local', '{replica}')
PARTITION BY D ORDER BY A;

hodgesrm (Member, Author) commented Apr 7, 2024

Note: this problem does not arise if you scale replicas down. In that case the operator properly deletes the replicated tables, which ensures ZooKeeper cleanup.

alex-zaitsev (Member) commented

This is weird; the operator deletes all replicated tables when deleting a CHI. There is a regression test for that.

alex-zaitsev (Member) commented

It could be related to #1388 -- DROP TABLE may return quickly while ClickHouse keeps deleting data in the background. Using SYNC may help here.
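
For reference, a sketch of what a synchronous drop would look like for the example table above; the SYNC modifier makes the drop wait until the table is actually removed rather than deferring the work to the background:

DROP TABLE IF EXISTS test_local ON CLUSTER `{cluster}` SYNC;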
