Stale index metrics #193

Open
Jiaweihu08 opened this issue May 24, 2023 · 0 comments · May be fixed by #330
Labels
bug Something isn't working

Comments

@Jiaweihu08
Member

The current version of the index metrics is not suited for datasets that have had appends or have undergone optimization/replication.

Only cube-level metrics are shown; block- and file-level statistics are absent:

// EXAMPLE OUTPUT
OTree Index Metrics:
dimensionCount: 2
elementCount: 2879966589
depth: 9
cubeCount: 13141
desiredCubeSize: 500000
indexingColumns: ss_sold_date_sk,ss_item_sk
avgFanout: 4.0
depthOnBalance: 1.3567716601745503

Stats on cube sizes:
Quartiles:
- min: 456367
- 1stQ: 498510
- 2ndQ: 499954
- 3rdQ: 501410
- max: 536430
Stats:
- count: 3285
- l1_dev: 0.00449603896499239
- l2_dev: 1.3487574366807247E-4
Level-wise stats:
level, avgCubeSize, stdCubeSize, cubeCount, avgWeight:
- 0:	497810,		0,		1,	1.7361319627929786E-4
- 1:	494798,		3550,		4,	8.689350799817908E-4
- 2:	499781,		3488,		16,	0.003668841950401859
- 3:	500516,		4292,		64,	0.015534089088738918
- 4:	500289,		3967,		256,	0.06698862054431544
- 5:	499966,		3530,		1024,	0.287867372830027
- 6:	499962,		3040,		1792,	0.6729941083529944
- 7:	500142,		10508,		128,	0.7959112180321912

We need more statistics to describe an index that has received more than a single write.

Candidate metrics:

  • Total file/block count
  • Average files per cube
  • Average file size
  • Average records per file
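
As a rough illustration of the candidate metrics above, here is a minimal Python sketch. It assumes each data file (or block) can be described by the cube it belongs to, its size in bytes, and its record count. The `FileEntry` type, the `file_level_metrics` function, and the output key names are all hypothetical and are not part of the project's API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FileEntry:
    """Hypothetical description of one data file (or block) in the index."""
    cube_id: str
    size_bytes: int
    record_count: int


def file_level_metrics(files):
    """Compute the four candidate metrics from a list of FileEntry objects."""
    if not files:
        # An empty index has no meaningful averages; report zeros.
        return {
            "fileCount": 0,
            "avgFilesPerCube": 0.0,
            "avgFileSizeBytes": 0.0,
            "avgRecordsPerFile": 0.0,
        }

    # Count how many files each cube owns. After appends or optimization,
    # a single cube may be spread over several files.
    files_per_cube = defaultdict(int)
    for f in files:
        files_per_cube[f.cube_id] += 1

    file_count = len(files)
    return {
        "fileCount": file_count,
        "avgFilesPerCube": file_count / len(files_per_cube),
        "avgFileSizeBytes": sum(f.size_bytes for f in files) / file_count,
        "avgRecordsPerFile": sum(f.record_count for f in files) / file_count,
    }


# Usage with made-up values: the "root" cube was written twice.
example = [
    FileEntry("root", 100, 10),
    FileEntry("root", 300, 30),
    FileEntry("A", 200, 20),
]
metrics = file_level_metrics(example)
```

With a single write these averages add little beyond the cube-level output, but once a cube spans several files they diverge from it, which is the case the existing metrics do not cover.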
@Jiaweihu08 Jiaweihu08 added the bug Something isn't working label May 24, 2023
@Jiaweihu08 Jiaweihu08 linked a pull request May 31, 2024 that will close this issue