Limit the array index of FixedHashTable by min/max #62746

jiebinn · 2024-04-18T08:25:54Z

We observed that the serial mergeSingleLevel step of Query 7 in ClickBench takes a lot of time. Digging into the performance issue, we found that most of the extra cycles are spent on the isZero() loop condition in the iterator's operator++ of FixedHashTable. This patch fixes the performance issue for the case where the keys occupy only a small part of the total 16-bit range in FixedHashTable.

If the aggregation key is 8 or 16 bits wide, ClickHouse uses an array of 256 or 65536 cells indexed directly by the key, which speeds up mergeSingleLevel by avoiding key comparisons. However, if the keys occupy only a small range of the 65536 cells, most of the cycles are wasted in isZero() calls that search for the next non-zero cell in iterator++.

The solution is to track min/max and update them on emplace. iterator++ can then use max as its upper search limit, and begin() can start directly at min rather than searching for the first non-zero cell.
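
The idea above can be sketched with a toy table. This is only an illustration under stated assumptions, not the ClickHouse implementation: `ToyFixedTable`, `min_key`, `max_key` and both counting functions are hypothetical names, and a cell whose counter is 0 stands in for an isZero() cell.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for a 16-bit fixed hash table: the key is the array index,
// and a cell whose counter is 0 plays the role of an isZero() cell.
// All names here are hypothetical and only illustrate the idea.
struct ToyFixedTable
{
    static constexpr std::size_t NUM_CELLS = 65536;

    std::vector<std::uint64_t> cells = std::vector<std::uint64_t>(NUM_CELLS, 0);
    // Assumed empty state: min_key > max_key until the first emplace().
    std::size_t min_key = NUM_CELLS - 1;
    std::size_t max_key = 0;

    void emplace(std::uint16_t key)
    {
        const std::size_t idx = key;
        ++cells[idx];
        if (idx < min_key)
            min_key = idx;
        if (idx > max_key)
            max_key = idx;
    }

    // Baseline: visit every cell; `probes` counts the zero checks performed.
    std::size_t countOccupiedFullScan(std::size_t & probes) const
    {
        std::size_t occupied = 0;
        for (std::size_t i = 0; i < NUM_CELLS; ++i)
        {
            ++probes;
            if (cells[i] != 0)
                ++occupied;
        }
        return occupied;
    }

    // Bounded: visit only [min_key, max_key]; an empty table does no work.
    std::size_t countOccupiedBounded(std::size_t & probes) const
    {
        if (min_key > max_key)
            return 0;
        assert(max_key < NUM_CELLS);
        std::size_t occupied = 0;
        for (std::size_t i = min_key; i <= max_key; ++i)
        {
            ++probes;
            if (cells[i] != 0)
                ++occupied;
        }
        return occupied;
    }
};
```

With keys clustered in a narrow range, the bounded scan performs one zero check per cell between the smallest and largest key, while the full scan always performs 65536.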

We tested the patch on a 2x80 vCPU server: Query 7 of ClickBench gained a 2.1x performance improvement, with no regression for the other ClickBench queries. The overall geomean improved by 2%.

Changelog category (leave one):

Performance Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Track min/max in FixedHashTable to limit the scanned array index range and reduce the isZero() checks in iterator++.

robot-clickhouse-ci-2 · 2024-04-18T09:26:10Z

This is an automated comment for commit d40c5a0 with description of existing statuses. It's updated for the latest CI running

❌ Click here to open a full report in a separate page

Check name	Description	Status
A Sync	There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS	⏳ pending
CI running	A meta-check that indicates the running CI. Normally, it's in success or pending state. The failed status indicates some problems with the PR	⏳ pending
Integration tests	The integration tests report. In parenthesis the package type is given, and in square brackets are the optional part/total tests	❌ failure
Mergeable Check	Checks if all other necessary checks are successful	❌ failure
Stateless tests	Runs stateless functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc	❌ failure
Upgrade check	Runs stress tests on server version from last release and then tries to upgrade it to the version from the PR. It checks if the new server can successfully startup without any errors, crashes or sanitizer asserts	❌ failure

Successful checks

Check name	Description	Status
AST fuzzer	Runs randomly generated queries to catch program errors. The build type is optionally given in parenthesis. If it fails, ask a maintainer for help	✅ success
ClickBench	Runs [ClickBench](https://github.com/ClickHouse/ClickBench/) with instant-attach table	✅ success
ClickHouse build check	Builds ClickHouse in various configurations for use in further steps. You have to fix the builds that fail. Build logs often has enough information to fix the error, but you might have to reproduce the failure locally. The cmake options can be found in the build log, grepping for cmake. Use these options and follow the general build process	✅ success
Compatibility check	Checks that clickhouse binary runs on distributions with old libc versions. If it fails, ask a maintainer for help	✅ success
Docker keeper image	The check to build and optionally push the mentioned image to docker hub	✅ success
Docker server image	The check to build and optionally push the mentioned image to docker hub	✅ success
Docs check	Builds and tests the documentation	✅ success
Fast test	Normally this is the first check that is ran for a PR. It builds ClickHouse and runs most of stateless functional tests, omitting some. If it fails, further checks are not started until it is fixed. Look at the report to see which tests fail, then reproduce the failure locally as described here	✅ success
Flaky tests	Checks if new added or modified tests are flaky by running them repeatedly, in parallel, with more randomization. Functional tests are run 100 times with address sanitizer, and additional randomization of thread scheduling. Integration tests are run up to 10 times. If at least once a new test has failed, or was too long, this check will be red. We don't allow flaky tests, read the doc	✅ success
Install packages	Checks that the built packages are installable in a clear environment	✅ success
PR Check	There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS	✅ success
Performance Comparison	Measure changes in query performance. The performance test report is described in detail here. In square brackets are the optional part/total tests	✅ success
Stateful tests	Runs stateful functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc	✅ success
Stress test	Runs stateless functional tests concurrently from several clients to detect concurrency-related errors	✅ success
Style check	Runs a set of checks to keep the code style clean. If some of tests failed, see the related log from the report	✅ success
Unit tests	Runs the unit tests for different release types	✅ success

jiebinn · 2024-04-18T10:01:45Z

Hi @nickitat, thanks for reviewing this PR. BTW, could you help add the "can be tested" label?

nickitat · 2024-04-18T11:02:15Z

src/Common/HashTable/FixedHashTable.h

@@ -294,36 +296,28 @@ class FixedHashTable : private boost::noncopyable, protected Allocator, protecte

    const_iterator begin() const
    {
-        if (!buf)
+        if (!buf && min > max)


pls let's clarify what min > max means.
maybe it makes sense to extract it into a separate function with a self-explanatory name. afaiu it means that container is empty.

Yes. In that case, min > max means the container is empty. Without the check, `it` would start beyond `end`, so the `it != end` condition in a loop like the following is never false and the loop will not stop:

    void ALWAYS_INLINE mergeToViaEmplace(Self & that, Func && func)
    {
        for (auto it = this->begin(), end = this->end(); it != end; ++it)
        {
            typename Self::LookupResult res_it;
            bool inserted;
            that.emplace(it->getKey(), res_it, inserted, it.getHash());
            func(res_it->getMapped(), it->getMapped(), inserted);
        }
    }

nickitat · 2024-04-18T11:02:36Z

src/Common/HashTable/FixedHashTable.h

@@ -294,36 +296,28 @@ class FixedHashTable : private boost::noncopyable, protected Allocator, protecte

    const_iterator begin() const
    {
-        if (!buf)
+        if (!buf && min > max)


maybe it should be !buf || min > max?

Sorry for the late reply.
Yes, it should be !buf || min > max. Thanks for the correction.
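
To make the difference between the two conditions concrete, here is a minimal sketch. The helper names are hypothetical; only the two boolean expressions come from the thread, and the "empty" bounds (min > max) follow the convention discussed above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers; only the two boolean expressions come from the review thread.
// `buf` is the cell array (may be nullptr), min/max are the tracked key bounds.
inline bool nothingToIterateAnd(const int * buf, std::size_t min, std::size_t max)
{
    return !buf && min > max; // first version of the patch
}

inline bool nothingToIterateOr(const int * buf, std::size_t min, std::size_t max)
{
    return !buf || min > max; // corrected version
}

// Walks through the two situations where the versions disagree.
inline void compareEmptyChecks()
{
    int storage[4] = {0, 0, 0, 0};

    // Allocated but empty table: bounds are still in the assumed state min > max.
    assert(!nothingToIterateAnd(storage, 3, 0)); // misses the empty table
    assert(nothingToIterateOr(storage, 3, 0));   // catches it

    // No buffer at all, but bounds that look valid.
    assert(!nothingToIterateAnd(nullptr, 0, 3)); // would go on to use a null buf
    assert(nothingToIterateOr(nullptr, 0, 3));   // catches it
}
```

The `&&` form only returns early when both conditions hold, so either a null buf or an empty range on its own slips through; the `||` form covers each case independently.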

jiebinn · 2024-04-22T06:50:27Z

I'm working on the test failures.

jiebinn · 2024-05-09T01:40:31Z

Hi @nickitat, sorry for the late reply; I was away for a week on holiday and personal leave. Here is the latest update.
I have added the method use_min_max_optimization() to decide whether to use this optimization or fall back. There are two cases in which we should fall back to the original code path and not use the min/max optimization:

1. The FixedHashTable is empty and min/max have not been set by emplace(). In this case min > max.
2. emplace() is the only interface that updates min/max. If the buf of FixedHashTable is updated by anything other than emplace(), the boundary (min/max) is invalid. In this case we set the flag use_emplace_to_insert_data to false.

BTW, we extracted these two prerequisites, max >= min && use_emplace_to_insert_data, into a separate function use_min_max_optimization(), as you suggested before.
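
The two fallback cases can be sketched as follows. This is a simplified illustration, not the code from the patch: `FlaggedTable`, `writeDirect()`, `scanBegin()`, `scanEnd()` and `sum()` are hypothetical, while `use_min_max_optimization()` and `use_emplace_to_insert_data` are the names mentioned above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

struct FlaggedTable
{
    static constexpr std::size_t NUM_CELLS = 256;

    std::array<std::uint32_t, NUM_CELLS> cells{};
    std::size_t min = NUM_CELLS - 1; // min > max marks "nothing emplaced yet"
    std::size_t max = 0;
    bool use_emplace_to_insert_data = true;

    void emplace(std::uint8_t key, std::uint32_t value)
    {
        const std::size_t idx = key;
        cells[idx] += value;
        if (idx < min)
            min = idx;
        if (idx > max)
            max = idx;
    }

    // Hypothetical write path that bypasses emplace(): the bounds can no longer be trusted.
    void writeDirect(std::uint8_t key, std::uint32_t value)
    {
        cells[key] = value;
        use_emplace_to_insert_data = false;
    }

    bool use_min_max_optimization() const { return max >= min && use_emplace_to_insert_data; }

    // Scan range: [min, max] when the bounds are valid, the whole array otherwise.
    std::size_t scanBegin() const { return use_min_max_optimization() ? min : 0; }
    std::size_t scanEnd() const { return use_min_max_optimization() ? max + 1 : NUM_CELLS; }

    std::uint64_t sum() const
    {
        assert(scanBegin() <= scanEnd() && scanEnd() <= NUM_CELLS);
        std::uint64_t total = 0;
        for (std::size_t i = scanBegin(); i < scanEnd(); ++i)
            total += cells[i];
        return total;
    }
};
```

Once any write bypasses emplace(), the sketch scans the whole array again, which is slower but still correct.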

jiebinn · 2024-05-13T01:50:01Z

Updated performance comparison before and after the patch, tested on a 2x80 vCPU system with the 43 ClickBench queries.
Query 7 shows a 1.84x performance gain and the overall geomean improves by 3%.

Query:	OPT/BASE
0	105.8%
1	100.2%
2	100.2%
3	99.3%
4	99.5%
5	100.1%
6	101.4%
7	184.3%
8	104.1%
9	103.9%
10	102.4%
11	103.5%
12	101.9%
13	105.4%
14	103.1%
15	103.0%
16	103.9%
17	106.8%
18	111.3%
19	100.2%
20	99.9%
21	100.0%
22	100.3%
23	100.1%
24	99.3%
25	99.1%
26	99.8%
27	100.3%
28	99.0%
29	102.1%
30	101.6%
31	102.7%
32	101.3%
33	98.8%
34	106.0%
35	100.9%
36	99.9%
37	100.6%
38	99.7%
39	100.2%
40	100.1%
41	100.5%
42	100.5%
Overall	103.0%
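
As a sanity check on the Overall row, the geometric mean of the per-query ratios can be recomputed. A minimal sketch, with hypothetical helper names and the values copied from the table above:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// OPT/BASE ratios for ClickBench queries 0..42, copied from the table above.
inline const std::vector<double> & optOverBase()
{
    static const std::vector<double> ratios = {
        1.058, 1.002, 1.002, 0.993, 0.995, 1.001, 1.014, 1.843, 1.041, 1.039, 1.024,
        1.035, 1.019, 1.054, 1.031, 1.030, 1.039, 1.068, 1.113, 1.002, 0.999, 1.000,
        1.003, 1.001, 0.993, 0.991, 0.998, 1.003, 0.990, 1.021, 1.016, 1.027, 1.013,
        0.988, 1.060, 1.009, 0.999, 1.006, 0.997, 1.002, 1.001, 1.005, 1.005};
    return ratios;
}

// Geometric mean computed through logarithms: exp(mean(log(r))).
inline double geometricMean(const std::vector<double> & ratios)
{
    assert(!ratios.empty());
    double log_sum = 0.0;
    for (double r : ratios)
        log_sum += std::log(r);
    return std::exp(log_sum / static_cast<double>(ratios.size()));
}
```

Calling `geometricMean(optOverBase())` on these 43 values should land close to the reported 103.0%.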

nickitat · 2024-05-16T21:45:56Z

src/Common/HashTable/FixedHashTable.h

-        auto buf_end = buf + NUM_CELLS;
-        while (ptr < buf_end && ptr->isZero(*this))
-            ++ptr;
+        if (!use_min_max_optimization())


I think it also makes sense to abstract

return use_min_max_optimization() ? buf + min : buf;

and the same for end into a separate function, so all other code won't need to know anything about this logic; it will just call firstPopulatedCell().

That's great. Thanks for your kind suggestion. The latest commit has packed the related code into firstPopulatedCell() and lastPopulatedCell().
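
A rough sketch of what such helpers could look like. This is an assumption-laden illustration rather than the merged code: `CellView` and `countNonZero()` are hypothetical, `lastPopulatedCell()` is taken here to return a one-past-the-end pointer, and the early return for a null buf is one possible way to avoid pointer arithmetic on nullptr.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct CellView
{
    static constexpr std::size_t NUM_CELLS = 256;

    const std::uint32_t * buf = nullptr; // cell array, nullptr when nothing is allocated
    std::size_t min = NUM_CELLS - 1;
    std::size_t max = 0;
    bool use_emplace_to_insert_data = true;

    bool use_min_max_optimization() const { return max >= min && use_emplace_to_insert_data; }

    // First cell an iteration needs to look at.
    const std::uint32_t * firstPopulatedCell() const
    {
        if (!buf)
            return nullptr; // assumption: skip pointer arithmetic on a null buf
        return use_min_max_optimization() ? buf + min : buf;
    }

    // In this sketch: one past the last cell an iteration needs to look at.
    const std::uint32_t * lastPopulatedCell() const
    {
        if (!buf)
            return nullptr;
        assert(!use_min_max_optimization() || max < NUM_CELLS);
        return use_min_max_optimization() ? buf + max + 1 : buf + NUM_CELLS;
    }
};

// Callers only see the two helpers, not min/max or the flag.
inline std::size_t countNonZero(const CellView & view)
{
    std::size_t count = 0;
    for (const std::uint32_t * p = view.firstPopulatedCell(); p != view.lastPopulatedCell(); ++p)
        if (*p != 0)
            ++count;
    return count;
}
```

The loop in `countNonZero()` stays the same whether the bounds are valid, invalid, or the buffer is missing; only the two helpers decide how much of the array is visited.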

nickitat · 2024-05-17T11:06:16Z

Stateful tests (ubsan) - failure looks related

jiebinn · 2024-05-17T15:35:24Z

Stateful tests (ubsan) - failure looks related

The test passes in my local debug environment, and the failing log shows a client connection error. I will trigger the test again.

jiebinn · 2024-05-20T06:31:47Z

Hi @nickitat, I have found that many recent PRs also produce the same msan/tsan error. I checked the error log, and the error occurs while building non-x86 binaries. I don't think it is related.
[#63942]
[#63942]
[#63985]

azat · 2024-05-20T08:16:50Z

You need to rebase, sanitizers have been fixed in #64090

jiebinn · 2024-05-20T08:31:52Z

You need to rebase, sanitizers have been fixed in #64090

Thanks!

nickitat · 2024-05-20T18:08:12Z

Hi @nickitat, I have found that many recent PRs also produce the same msan/tsan error. I checked the error log, and the error occurs while building non-x86 binaries. I don't think it is related. [#63942] [#63942] [#63985]

The report has complaints about the new code: https://s3.amazonaws.com/clickhouse-test-reports/62746/627722c285b610a5499e705bf1725af2621fccc1/stateful_tests__ubsan_/stderr.log

jiebinn · 2024-05-22T02:24:29Z

Rebased the code and fixed the UBSan warning. The current build errors for binary_riscv64 and binary_loongarch64 seem unrelated; they also occur in PRs #64175, #64128 and #64202.

If the type of key is 8 bits or 16 bits in aggregation, ClickHouse will use array of 256 or 65536 length to store the key and boost the mergeSingleLevel, rather than key comparison. However, if the key has occupied only small range of the total 65536 cells, most of the cycles are wasted on the `isZero()` to find the next cell which is not zero in iterator++. The solution is to use min/max and update min/max when emplace. Then we can set the upper searching limit to max in iterator++. And just set min as the value of `begin()`, rather than searching the first cell that not equals to 0. We have tested the patch on 2x80 vCPUs server, Query 7 of ClickBench has gained 2.1x performance improvement. Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

jiebinn · 2024-05-27T03:38:07Z

Hi @nickitat, thanks for reviewing the PR. I'm trying to fix the CI failures by rebasing onto the master branch and re-checking. Currently, there are two types of test failures:

1. Network connection errors.
2. The already reported issue "Broken upgrade check: New settings are not reflected in settings changes history" #64308.

Is there anything else I should do to help the PR along?

nickitat · 2024-05-28T12:12:11Z

Hi @nickitat, thanks for reviewing the PR. I'm trying to fix the CI failures by rebasing onto the master branch and re-checking. Currently, there are two types of test failures:

1. Network connection errors.
2. The already reported issue "Broken upgrade check: New settings are not reflected in settings changes history" #64308.

Is there anything else I should do to help the PR along?

I think we're fine. thanks for patience

jiebinn · 2024-05-28T23:36:25Z

Hi @nickitat, thanks for reviewing the PR. I'm trying to fix the CI failures by rebasing onto the master branch and re-checking. Currently, there are two types of test failures:

1. Network connection errors.
2. The already reported issue "Broken upgrade check: New settings are not reflected in settings changes history" #64308.

Is there anything else I should do to help the PR along?

I think we're fine. thanks for patience

Thanks for reviewing the PR!

nickitat self-assigned this Apr 18, 2024

robot-clickhouse-ci-2 added the pr-performance Pull request with some performance improvements label Apr 18, 2024

nickitat added the can be tested Allows running workflows for external contributors label Apr 18, 2024

nickitat reviewed Apr 18, 2024

nickitat approved these changes Apr 19, 2024

jiebinn force-pushed the FixedHashTable branch from c3262e6 to 84543a0 Compare April 22, 2024 02:08

jiebinn force-pushed the FixedHashTable branch 2 times, most recently from c881cd8 to 7f76473 Compare May 9, 2024 03:32

jiebinn requested a review from nickitat May 14, 2024 03:00

nickitat reviewed May 16, 2024

jiebinn force-pushed the FixedHashTable branch from 1dff8aa to 77d40b5 Compare May 17, 2024 08:14

jiebinn force-pushed the FixedHashTable branch from 77d40b5 to 8bc31ea Compare May 17, 2024 15:31

jiebinn force-pushed the FixedHashTable branch 3 times, most recently from c5015e1 to 662f05c Compare May 19, 2024 15:30

jiebinn force-pushed the FixedHashTable branch 4 times, most recently from 4c9297a to b1dd2e5 Compare May 21, 2024 07:37

jiebinn force-pushed the FixedHashTable branch 3 times, most recently from 0b6ca65 to d40c5a0 Compare May 24, 2024 02:55

jiebinn and others added 8 commits May 24, 2024 19:35

Fix a bug if the container is empty

69960a5

Fix a bug if data will be inserted not by emplace().

60420f2

Add the use_emplace_to_insert_data flag. emplace() is the only inte…

7f960e4

…rface to update min/max. If the FixedHashTable.emplace() is not used to revise the hashtable value, then we should not continue the min/max optimization.

Update src/Common/HashTable/FixedHashTable.h

4e6f5fb

Add comment by Nikita. Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>

Update src/Common/HashTable/FixedHashTable.h

ca88da1

Revise the method name by Nikita. Co-authored-by: Nikita Taranov <nickita.taranov@gmail.com>

Generate the seperate function firstPopulatedCell() and lastPopulated…

d1d57ca

…Cell()

Avoid UBSan warning while buf is nullptr

d40c5a0

nickitat added this pull request to the merge queue May 28, 2024

Merged via the queue into ClickHouse:master with commit a7543cd May 28, 2024
246 of 253 checks passed

robot-ch-test-poll4 added the pr-synced-to-cloud The PR is synced to the cloud repo label May 28, 2024

nikitamikhaylov mentioned this pull request May 31, 2024

Cleanup the changelog for 24.5 #64704

Merged

22 tasks
