New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added new commit statistics metrics #10993
base: main
Are you sure you want to change the base?
Conversation
Result of foundationdb-pr-clang-ide on Linux CentOS 7
|
Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x
|
Result of foundationdb-pr-clang on Linux CentOS 7
|
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
|
Result of foundationdb-pr on Linux CentOS 7
|
Result of foundationdb-pr-macos on macOS Ventura 13.x
|
03cf4dc
to
8ae2ad1
Compare
Result of foundationdb-pr-clang-ide on Linux CentOS 7
|
Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x
|
Result of foundationdb-pr-macos on macOS Ventura 13.x
|
Result of foundationdb-pr-clang on Linux CentOS 7
|
Result of foundationdb-pr on Linux CentOS 7
|
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
|
Seems two first tests faled due an environment problems:
|
|
||
TraceEventFields const& commitBatchTransactions = metrics.at("CommitBatchTransactions"); | ||
if (commitBatchTransactions.size()) { | ||
obj["commit_batch_transactions"] = addLatencyStatistics(commitBatchTransactions); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this modified status json, you need to update documentation/sphinx/source/mr-status-json-schemas.rst.inc
and fdbclient/Schemas.cpp
as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added new elements to fdbclient/Schemas.cpp
, but now there is no cluster.processes..roles objetcs in documentation/sphinx/source/mr-status-json-schemas.rst.inc
at all.
I can add the description of the new metrics, but I cannot describe all other elements of roles
.
Result of foundationdb-pr-clang-ide on Linux CentOS 7
|
Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x
|
Result of foundationdb-pr-macos on macOS Ventura 13.x
|
Result of foundationdb-pr-clang on Linux CentOS 7
|
Result of foundationdb-pr on Linux CentOS 7
|
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sbodagala can you run a correctness for this PR before merging?
Ran a correctness test (with 100000 simulation tests). The test run stopped after doing 99994 test runs, at which point 10 simulation tests have failed (not all of these failures may have been caused by this change set though).
Majority of the failures are on tests in "tests/restarting/from_7.3.0/" directory:
|
Problem statement
Now it is hard to tune a fdb cluster for a write-intensive workload.
Description
While tuning a fdb cluster with a write-intensive application often the bottleneck is the commit latency: when trying to parallel degree of transactions payload, the commit latency grows and prevents increasing the transaction throughput.
There are lots of conditions and knobs influencing the commit latency: number of commit proxies, number of resolvers, number of tlog processes, commit batching knobs: MAX_COMMIT_BATCH_INTERVAL, COMMIT_TRANSACTION_BATCH_INTERVAL_MAX, COMMIT_TRANSACTION_BATCH_INTERVAL_SMOOTHER_ALPHA and others.
But for now, there is no any information, where is the root cause of the high commit latency, so it is unclear, what is to be changed.
Proposal
PR content
This PR implements this proposal: the following new metrics are logged and exposed in
status json
:The sum of the added *_latency_mean metrics should be equal to the commit_latency_mean metrics that already exists