-
Notifications
You must be signed in to change notification settings - Fork 6.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Limit the array index of FixedHashTable by min/max #62746
Conversation
This is an automated comment for commit d40c5a0 with description of existing statuses. It's updated for the latest CI running ❌ Click here to open a full report in a separate page
Successful checks
|
Hi @nickitat, thanks to review this PR. BTW, could you help to add the "can be tested" label? |
@@ -294,36 +296,28 @@ class FixedHashTable : private boost::noncopyable, protected Allocator, protecte | |||
|
|||
const_iterator begin() const | |||
{ | |||
if (!buf) | |||
if (!buf && min > max) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls let's clarify what min > max
means.
maybe it makes sense to extract it into a separate function with a self-explanatory name. afaiu it means that container is empty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. In that case, min > max
means the container is empty. Then outside it
is always larger than the end
and the for loop will not stop.
void ALWAYS_INLINE mergeToViaEmplace(Self & that, Func && func)
{
for (auto it = this->begin(), end = this->end(); it != end; ++it)
{
typename Self::LookupResult res_it;
bool inserted;
that.emplace(it->getKey(), res_it, inserted, it.getHash());
func(res_it->getMapped(), it->getMapped(), inserted);
}
}
@@ -294,36 +296,28 @@ class FixedHashTable : private boost::noncopyable, protected Allocator, protecte | |||
|
|||
const_iterator begin() const | |||
{ | |||
if (!buf) | |||
if (!buf && min > max) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe it should be !buf || min > max
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the late reply.
Yes, it should be !buf || min > max
. Thanks to correct that.
I'm working on the testing failures. |
Hi @nickitat , sorry for the late reply as the holiday and personal's leave for one week. Here is the latest update.
BTW, we extract these two pre-requirements |
c881cd8
to
7f76473
Compare
Update the performance comparison before and after the patch. Test on 80x2 vCPUs system with ClickBench 43 queries.
|
auto buf_end = buf + NUM_CELLS; | ||
while (ptr < buf_end && ptr->isZero(*this)) | ||
++ptr; | ||
if (!use_min_max_optimization()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it also makes sense to abstract
return use_min_max_optimization() ? but + min : buf;
and the same for end
into a separate function. so all other code won't know anything inside this logic, it will just call firstPopulatedCell()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's great. Thanks for your kind suggestion. The latest commit has packed the related code into firstPopulatedCell()
and lastPopulatedCell()
.
|
The test is passed on my local debug env. And it shows client connection error from the failing log. I will trigger the test again. |
c5015e1
to
662f05c
Compare
You need to rebase, sanitizers had been fixed in #64090 |
Thanks! |
report has complains about the new code https://s3.amazonaws.com/clickhouse-test-reports/62746/627722c285b610a5499e705bf1725af2621fccc1/stateful_tests__ubsan_/stderr.log |
4c9297a
to
b1dd2e5
Compare
0b6ca65
to
d40c5a0
Compare
If the type of key is 8 bits or 16 bits in aggregation, ClickHouse will use array of 256 or 65536 length to store the key and boost the mergeSingleLevel, rather than key comparison. However, if the key has occupied only small range of the total 65536 cells, most of the cycles are wasted on the `isZero()` to find the next cell which is not zero in iterator++. The solution is to use min/max and update min/max when emplace. Then we can set the upper searching limit to max in iterator++. And just set min as the value of `begin()`, rather than searching the first cell that not equals to 0. We have tested the patch on 2x80 vCPUs server, Query 7 of ClickBench has gained 2.1x performance improvement. Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
…rface to update min/max. If the FixedHashTable.emplace() is not used to revise the hashtable value, then we should not continue the min/max optimization.
Add comment by Nikita. Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>
Revise the method name by Nikita. Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>
Hi @nickitat, thanks to review the PR. I'm trying to fix the CI failure by rebase the master branch and check that. Currently, there are two types of testing failures.
|
I think we're fine. thanks for patience |
a7543cd
Thanks to review the PR! |
We have observed the serial mergeSingleLevel time of Query 7 in ClickBench has costed a lot. Diving into the performance issue, we have found most of the extra cycles have been spent on the
isZero()
loop condition in iterator ++ of fixedHashTable. This is a patch that fix the performance issue if the key range has only occupied small part of the total 16 bits in fixedHashTable.If the type of key is 8 bits or 16 bits in aggregation, ClickHouse will use array of 256 or 65536 length to store the key and boost the mergeSingleLevel, rather than key comparison. However, if the key has occupied only small range of the total 65536 cells, most of the cycles are wasted on the
isZero()
to find the next cell which is not zero in iterator++.The solution is to use min/max and update min/max when emplace. Then we can set the upper searching limit to max in iterator++. And just set min as the value of
begin()
, rather than searching the first cell that not equals to 0.We have tested the patch on 2x80 vCPUs server, Query 7 of ClickBench has gained 2.1x performance improvement. There is no regression for the other queries of ClickBench. The overall geomean has got 2% performance improvement.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Add min/max in fixedHashTable to limit the array index and reduce the
isZero()
loop in iterator++.Documentation entry for user-facing changes
Modify your CI run:
NOTE: If your merge the PR with modified CI you MUST KNOW what you are doing
NOTE: Checked options will be applied if set before CI RunConfig/PrepareRunConfig step
Include tests (required builds will be added automatically):
Exclude tests:
Extra options:
Only specified batches in multi-batch jobs: