Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

the bytes_on_disk column in system.cnch_parts is unreasonable #1317

Closed
ArthurZou opened this issue Mar 9, 2024 · 7 comments
Closed

the bytes_on_disk column in system.cnch_parts is unreasonable #1317

ArthurZou opened this issue Mar 9, 2024 · 7 comments
Assignees
Labels
bug Something isn't working P2 minor issue resolved

Comments

@ArthurZou
Copy link

image

the rows_count is about 2kw, but the bytes_on_disk is only 730 bytes

@ArthurZou ArthurZou added the bug Something isn't working label Mar 9, 2024
@ArthurZou
Copy link
Author

I'm not sure if there's any causal relationship between this and “alter table ctg_log_share materialize index xxxx”, because I found this bug after starting the mutation task.

@dmthuc dmthuc added the P2 minor issue label Mar 11, 2024
@dmthuc
Copy link
Collaborator

dmthuc commented Mar 11, 2024

Hi Arthur Zou, thank you for reporting the issue. Let we investigate this.

@dmthuc dmthuc self-assigned this Mar 11, 2024
@dmthuc
Copy link
Collaborator

dmthuc commented Mar 11, 2024

Hi @hustnn , I wonder when alter table materialize index do we create a new delta part? Hi @ArthurZou in ByConity, a logical data part may consist of multiple physical data parts. In that case, many physical parts form a part chain with only the top part is visible. For example, when we alter the data, we just create the delta part to contains the changed data and still keeps the original part. And the bytes_on_disk is the size of physical data parts. You may want to remove visible = 1 to see if the total size of all parts sense.

@dmthuc dmthuc assigned hustnn and unassigned dmthuc Mar 11, 2024
@ArthurZou
Copy link
Author

SELECT *
FROM system.cnch_parts
WHERE (table = 'xxxxx') AND (database = 'xxxxx')
ORDER BY partition ASC

Query id: aa919567-c0bf-43f3-a824-0896a61962a5

Row 1:
──────

table_uuid: 70451b82-5a40-4cea-88c9-9c4c0fee8e84
partition: '2024-02-26 12:00:00'
name: 1708920000_447983049704997022_447994595548856481_31_448235479823286277
bytes_on_disk: 730
rows_count: 23626261
columns:
marks_count: 2885
index_granularity: []
commit_time: 2024-03-08 15:24:19
kv_commit_time: 2024-03-08 15:24:19
columns_commit_time: 2024-03-08 15:18:53
mutation_commit_time: 2024-03-08 15:24:10
previous_version: 448235411280494669
partition_id: 1708920000
bucket_number: -1
table_definition_hash: 5747785587758883571
outdated: 0
visible: 0
part_type: InvisiblePart
part_id: da98baec-f388-e820-49ce-98dd47d9f159
commit_ts: 448235479877287937
end_ts: 0

Row 2:
──────

table_uuid: 70451b82-5a40-4cea-88c9-9c4c0fee8e84
partition: '2024-02-26 12:00:00'
name: 1708920000_447983049704997022_447994595548856481_30_448235411280494669
bytes_on_disk: 466790
rows_count: 23626261
columns:
marks_count: 2885
index_granularity: []
commit_time: 2024-03-08 15:20:07
kv_commit_time: 2024-03-08 15:20:07
columns_commit_time: 2024-03-08 15:18:53
mutation_commit_time: 2024-03-08 15:18:53
previous_version: 448235400619360261
partition_id: 1708920000
bucket_number: -1
table_definition_hash: 5747785587758883571
outdated: 0
visible: 0
part_type: InvisiblePart
part_id: be542387-ccec-c296-1251-e8969433b2df
commit_ts: 448235413769289769
end_ts: 0

Row 3:
──────

table_uuid: 70451b82-5a40-4cea-88c9-9c4c0fee8e84
partition: '2024-02-26 12:00:00'
name: 1708920000_447983049704997022_447994595548856481_29_448235400619360261
bytes_on_disk: 1565590
rows_count: 23626261
columns:
marks_count: 2885
index_granularity: []
commit_time: 2024-03-08 15:19:23
kv_commit_time: 2024-03-08 15:19:23
columns_commit_time: 2024-03-08 15:18:53
mutation_commit_time: 2024-03-08 15:18:53
previous_version: 447995488509886497
partition_id: 1708920000
bucket_number: -1
table_definition_hash: 5747785587758883571
outdated: 0
visible: 0
part_type: InvisiblePart
part_id: 09c53682-df43-be7b-635d-94e637b670be
commit_ts: 448235402255138819
end_ts: 0

Row 4:
──────

table_uuid: 70451b82-5a40-4cea-88c9-9c4c0fee8e84
partition: '2024-02-26 12:00:00'
name: 1708920000_447983049704997022_447994595548856481_28_447995488509886497
bytes_on_disk: 1186021093
rows_count: 23626261
columns:
marks_count: 2885
index_granularity: []
commit_time: 2024-02-27 01:07:06
kv_commit_time: 2024-02-27 01:07:06
columns_commit_time: 2023-12-23 00:06:46
mutation_commit_time: 1970-01-01 08:00:00
previous_version: 0
partition_id: 1708920000
bucket_number: -1
table_definition_hash: 5747785587758883571
outdated: 0
visible: 0
part_type: InvisiblePart
part_id: 1617a7c6-a4d4-b143-359e-505ebe2e97b0
commit_ts: 447995504640917553
end_ts: 0

Row 5:
──────

table_uuid: 70451b82-5a40-4cea-88c9-9c4c0fee8e84
partition: '2024-02-26 12:00:00'
name: 1708920000_447983049704997022_447994595548856481_32_448235486356176964
bytes_on_disk: 730
rows_count: 23626261
columns:
marks_count: 2885
index_granularity: []
commit_time: 2024-03-08 15:24:44
kv_commit_time: 2024-03-08 15:24:44
columns_commit_time: 2024-03-08 15:18:53
mutation_commit_time: 2024-03-08 15:24:42
previous_version: 448235479823286277
partition_id: 1708920000
bucket_number: -1
table_definition_hash: 5747785587758883571
outdated: 0
visible: 1
part_type: VisiblePart
part_id: 0af22e78-48c6-4d09-6cae-7b993850f01d
commit_ts: 448235486395498535
end_ts: 0

————————————————————————————————————————————————
@dmthuc here is the cnch part data of '2024-02-26 12:00:00' partition.

as you mentioned above, Can I get the accurate bytes_on_disk by
SELECT partition, sum(bytes_on_disk)
FROM system.cnch_parts
WHERE (table = 'xxxxx') AND (database = 'xxxxx')
group by partition

@dmthuc
Copy link
Collaborator

dmthuc commented Mar 12, 2024

SELECT * FROM system.cnch_parts WHERE (table = 'xxxxx') AND (database = 'xxxxx') ORDER BY partition ASC

Query id: aa919567-c0bf-43f3-a824-0896a61962a5

Row 1: ──────

table_uuid: 70451b82-5a40-4cea-88c9-9c4c0fee8e84 partition: '2024-02-26 12:00:00' name: 1708920000_447983049704997022_447994595548856481_31_448235479823286277 bytes_on_disk: 730 rows_count: 23626261 columns: marks_count: 2885 index_granularity: [] commit_time: 2024-03-08 15:24:19 kv_commit_time: 2024-03-08 15:24:19 columns_commit_time: 2024-03-08 15:18:53 mutation_commit_time: 2024-03-08 15:24:10 previous_version: 448235411280494669 partition_id: 1708920000 bucket_number: -1 table_definition_hash: 5747785587758883571 outdated: 0 visible: 0 part_type: InvisiblePart part_id: da98baec-f388-e820-49ce-98dd47d9f159 commit_ts: 448235479877287937 end_ts: 0

Row 2: ──────

table_uuid: 70451b82-5a40-4cea-88c9-9c4c0fee8e84 partition: '2024-02-26 12:00:00' name: 1708920000_447983049704997022_447994595548856481_30_448235411280494669 bytes_on_disk: 466790 rows_count: 23626261 columns: marks_count: 2885 index_granularity: [] commit_time: 2024-03-08 15:20:07 kv_commit_time: 2024-03-08 15:20:07 columns_commit_time: 2024-03-08 15:18:53 mutation_commit_time: 2024-03-08 15:18:53 previous_version: 448235400619360261 partition_id: 1708920000 bucket_number: -1 table_definition_hash: 5747785587758883571 outdated: 0 visible: 0 part_type: InvisiblePart part_id: be542387-ccec-c296-1251-e8969433b2df commit_ts: 448235413769289769 end_ts: 0

Row 3: ──────

table_uuid: 70451b82-5a40-4cea-88c9-9c4c0fee8e84 partition: '2024-02-26 12:00:00' name: 1708920000_447983049704997022_447994595548856481_29_448235400619360261 bytes_on_disk: 1565590 rows_count: 23626261 columns: marks_count: 2885 index_granularity: [] commit_time: 2024-03-08 15:19:23 kv_commit_time: 2024-03-08 15:19:23 columns_commit_time: 2024-03-08 15:18:53 mutation_commit_time: 2024-03-08 15:18:53 previous_version: 447995488509886497 partition_id: 1708920000 bucket_number: -1 table_definition_hash: 5747785587758883571 outdated: 0 visible: 0 part_type: InvisiblePart part_id: 09c53682-df43-be7b-635d-94e637b670be commit_ts: 448235402255138819 end_ts: 0

Row 4: ──────

table_uuid: 70451b82-5a40-4cea-88c9-9c4c0fee8e84 partition: '2024-02-26 12:00:00' name: 1708920000_447983049704997022_447994595548856481_28_447995488509886497 bytes_on_disk: 1186021093 rows_count: 23626261 columns: marks_count: 2885 index_granularity: [] commit_time: 2024-02-27 01:07:06 kv_commit_time: 2024-02-27 01:07:06 columns_commit_time: 2023-12-23 00:06:46 mutation_commit_time: 1970-01-01 08:00:00 previous_version: 0 partition_id: 1708920000 bucket_number: -1 table_definition_hash: 5747785587758883571 outdated: 0 visible: 0 part_type: InvisiblePart part_id: 1617a7c6-a4d4-b143-359e-505ebe2e97b0 commit_ts: 447995504640917553 end_ts: 0

Row 5: ──────

table_uuid: 70451b82-5a40-4cea-88c9-9c4c0fee8e84 partition: '2024-02-26 12:00:00' name: 1708920000_447983049704997022_447994595548856481_32_448235486356176964 bytes_on_disk: 730 rows_count: 23626261 columns: marks_count: 2885 index_granularity: [] commit_time: 2024-03-08 15:24:44 kv_commit_time: 2024-03-08 15:24:44 columns_commit_time: 2024-03-08 15:18:53 mutation_commit_time: 2024-03-08 15:24:42 previous_version: 448235479823286277 partition_id: 1708920000 bucket_number: -1 table_definition_hash: 5747785587758883571 outdated: 0 visible: 1 part_type: VisiblePart part_id: 0af22e78-48c6-4d09-6cae-7b993850f01d commit_ts: 448235486395498535 end_ts: 0

———————————————————————————————————————————————— @dmthuc here is the cnch part data of '2024-02-26 12:00:00' partition.

as you mentioned above, Can I get the accurate bytes_on_disk by SELECT partition, sum(bytes_on_disk) FROM system.cnch_parts WHERE (table = 'xxxxx') AND (database = 'xxxxx') group by partition

Hi @ArthurZou , yeah. As you can see, those part are form a part chain:

name: 1708920000_447983049704997022_447994595548856481_31_448235479823286277
name: 1708920000_447983049704997022_447994595548856481_30_448235411280494669
name: 1708920000_447983049704997022_447994595548856481_29_448235400619360261
name: 1708920000_447983049704997022_447994595548856481_28_447995488509886497
name: 1708920000_447983049704997022_447994595548856481_32_448235486356176964

@ArthurZou
Copy link
Author

Thanks

@hustnn
Copy link
Contributor

hustnn commented Mar 18, 2024

Hi @hustnn , I wonder when alter table materialize index do we create a new delta part?

Yes, correct.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working P2 minor issue resolved
Projects
None yet
Development

No branches or pull requests

4 participants