abnormal average time when querying different data volume for the same key #3871

Open
gaoboal opened this issue Apr 12, 2024 · 1 comment
Labels
storage-engine openmldb storage engine. nameserver & tablet

Comments

@gaoboal
Collaborator

gaoboal commented Apr 12, 2024

Description
During query performance testing, we found that the time cost moves in the opposite direction to the amount of data returned for the same key. Querying all rows for the key is fastest, querying the subset of rows within a given time range is slower, and querying the single row at a specific timestamp is slower still.

Details:
table: TalkingData train dataset (180+ million rows) imported into OpenMLDB
query key: ip=88

| key | other condition | rows | average time (us) |
|---|---|---|---|
| ip=88 | all | 4278 | 9774.375 |
| ip=88 | '2017-11-06 00:00:00' <= ts < '2017-11-07 00:00:00' | 183 | 11570.365 |
| ip=88 | ts = '2017-11-06 16:19:38' | 1 | 16504.145 |

More detailed results: https://qiok3h8ob4.feishu.cn/docx/YkYfdBZm9oVk0MxLFx9co8lLn1g?from=from_copylink

Expected Behavior
Querying a smaller amount of data should take less time, or at least no longer than querying a larger amount of data.

Steps to Reproduce

  1. Deploy OpenMLDB.
  2. Load the data (TalkingData train.csv) into a table.
  3. Find a key with a large enough total data volume.
  4. Execute the queries and calculate the average time cost of each.
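
Step 4 could be measured with something like the following. This is only a minimal sketch, not the harness used for the numbers above: `average_latency_us`, `run_query`, `repeats` and `warmup` are hypothetical names, and `run_query` stands in for whatever client call actually sends one query to OpenMLDB.

```python
import statistics
import time
from typing import Callable, List


def average_latency_us(
    run_query: Callable[[], object],  # hypothetical: executes one query end to end
    repeats: int = 1000,
    warmup: int = 10,
) -> float:
    """Return the mean wall-clock time of run_query() in microseconds."""
    # Warm-up calls are executed but not timed, so one-off costs
    # (connection setup, caches) do not skew the average.
    for _ in range(warmup):
        run_query()

    samples: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        run_query()
        elapsed_ns = time.perf_counter_ns() - start
        samples.append(elapsed_ns / 1_000)  # ns -> us
    return statistics.fmean(samples)
```

Running it once per condition (all rows, time range, single timestamp) with the same `repeats` keeps the three averages comparable.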
@gaoboal added the bug label (Something isn't working) on Apr 12, 2024
@aceforeverd
Collaborator

This is a storage design limitation. All records for the same key are ordered by ts value, with the largest ts first, and a seek walks them linearly, so reaching an older ts means stepping past every newer record for that key.
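
The linear seek described above can be illustrated with a toy model. This is a sketch under stated assumptions, not OpenMLDB's actual storage code: the per-key records are modelled as a plain Python list sorted by ts with the largest first, and `seek_linear` and the synthetic `records` are hypothetical names and data invented for the illustration.

```python
from typing import List, Optional, Tuple

Record = Tuple[int, str]  # (ts, payload); a toy stand-in for a stored row


def seek_linear(records: List[Record], target_ts: int) -> Tuple[Optional[int], int]:
    """Find the first record with ts <= target_ts by walking from the head.

    `records` must be sorted by ts, largest first.
    Returns (index or None if no such record, number of records visited).
    """
    visited = 0
    for index, (ts, _payload) in enumerate(records):
        visited += 1
        if ts <= target_ts:
            return index, visited
    return None, visited


# Synthetic data: 4278 records for one key, ts = 4278 down to 1.
records = [(ts, f"row-{ts}") for ts in range(4278, 0, -1)]

newest = seek_linear(records, 4278)  # newest ts: found at the head
oldest = seek_linear(records, 1)     # oldest ts: every record is visited
```

In this model the cost of locating a start position grows with the number of newer records ahead of it, independent of how many rows the query finally returns.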

@aceforeverd added the storage-engine label (openmldb storage engine. nameserver & tablet) and removed the bug label (Something isn't working) on Apr 15, 2024