feat: inline begin transaction #325
Conversation
Benchmarking the current implementation gives the following results:
This indicates that especially bursty write situations would profit from including the `BeginTransaction` option in the first statement of each transaction.
Here are the same benchmarks against a real Cloud Spanner database. The total execution times are higher because the benchmark was executed from my local laptop against a Cloud Spanner instance; the network latency is therefore higher than it would have been if the benchmark had also been executed from a VM instance in the same Google Cloud project and region as the Cloud Spanner instance. The general picture is the same as when benchmarking against a mock server.

Benchmark for a one-transaction-per-second scenario, 20% writes and 80% reads:
* Using prepared sessions
* Using inlined BeginTransaction
I temporarily added 'Do not merge' as there is one scenario that I just thought of that hasn't been tested yet: if the first statement of a transaction is a streaming SQL statement, and that statement triggers a retry of the stream halfway (i.e. with a resume token), then the retry should not include a `BeginTransaction` option.
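The scenario above can be sketched with a small, self-contained model (the class and method names below are illustrative assumptions, not the actual client code): the first streaming request of a transaction inlines the begin, but once a transaction id has been returned, a mid-stream retry with a resume token must reference that id instead of inlining a second begin.

```java
// Hypothetical, simplified model of the stream-retry scenario; names are
// illustrative, not the actual Spanner client API.
public class InlineBeginStreamSketch {
  /** Minimal stand-in for a streaming request's transaction selector. */
  enum Selector { BEGIN, EXISTING_ID }

  static final class Request {
    final Selector selector;
    final String resumeToken; // null on the first attempt
    Request(Selector selector, String resumeToken) {
      this.selector = selector;
      this.resumeToken = resumeToken;
    }
  }

  private String transactionId; // set once a response carries the transaction

  /** Builds the next streaming request, inlining the begin only while no id is known. */
  Request nextRequest(String resumeToken) {
    Selector selector = (transactionId == null) ? Selector.BEGIN : Selector.EXISTING_ID;
    return new Request(selector, resumeToken);
  }

  /** Called when a partial result set carries the transaction id. */
  void onTransactionReturned(String id) {
    this.transactionId = id;
  }

  public static void main(String[] args) {
    InlineBeginStreamSketch stream = new InlineBeginStreamSketch();
    Request first = stream.nextRequest(null);
    System.out.println(first.selector); // prints BEGIN
    // The first partial result set carries the transaction id...
    stream.onTransactionReturned("txn-1");
    // ...then the stream breaks and is resumed with a resume token:
    // the retry must NOT inline another begin.
    Request retry = stream.nextRequest("resume-token-1");
    System.out.println(retry.selector); // prints EXISTING_ID
  }
}
```

If the retry inlined another begin, the backend would start a second transaction for the remainder of the stream, which is exactly the untested case the comment flags.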
@olavloite I haven't looked at the implementation yet, but there has been a lot of discussion about this recently because the C++ client was implemented this way. There are cases where inlining the begin-transaction call can lead to transactions being committed that have failed. googleapis/google-cloud-cpp#4516 is the C++ issue. They're in the process of confirming the idea for a fix, so I think we should wait for that before proceeding with this change.
I've looked into this issue and the problem is very subtle. Cloud Spanner treats constraint and non-constraint errors that occur during a read/write transaction differently. Assume the following initial state:

```sql
CREATE TABLE T (K STRING(MAX), V INT64) PRIMARY KEY (K);
BEGIN TRANSACTION;
INSERT INTO T (K, V) VALUES ('One', 1);
COMMIT;
```

Then we try to execute the following transaction in the Java client, using the normal prepared transactions from the session pool:

```java
client
    .readWriteTransaction()
    .run(
        new TransactionCallable<Long>() {
          @Override
          public Long run(TransactionContext transaction) throws Exception {
            try {
              // This fails with a unique key constraint violation exception.
              transaction.executeUpdate(Statement.of("INSERT INTO T (K, V) VALUES ('One', 1)"));
            } catch (SpannerException e) {
              assertThat(e.getErrorCode()).isEqualTo(ErrorCode.ALREADY_EXISTS);
            }
            // This statement will also fail, because the previous statement failed
            // with a constraint violation.
            return transaction.executeUpdate(Statement.of("INSERT INTO T (K, V) VALUES ('Two', 2)"));
          }
        });
```

The above transaction will fail because a constraint violation during a read/write transaction invalidates the entire transaction, even though the error is caught and ignored by the client. The following transaction will however succeed:

```java
client
    .readWriteTransaction()
    .run(
        new TransactionCallable<Long>() {
          @Override
          public Long run(TransactionContext transaction) throws Exception {
            try {
              // This fails with an invalid argument exception (VALUES keyword missing).
              transaction.executeUpdate(Statement.of("INSERT INTO T (K, V) ('One', 1)"));
            } catch (SpannerException e) {
              assertThat(e.getErrorCode()).isEqualTo(ErrorCode.INVALID_ARGUMENT);
            }
            // This statement will succeed, as an INVALID_ARGUMENT error does not fail
            // the entire transaction.
            return transaction.executeUpdate(Statement.of("INSERT INTO T (K, V) VALUES ('Two', 2)"));
          }
        });
```

If we include the `BeginTransaction` option with the first statement, both of the above transactions would succeed. The reason that a transaction whose first statement failed with a constraint violation (or another data exception) should not be allowed to commit in this case is that it would technically contain two transactions, which again could cause some very subtle problems. Consider the following (not very efficient) example:

```java
client
    .readWriteTransaction()
    .run(
        new TransactionCallable<Long>() {
          @Override
          public Long run(TransactionContext transaction) throws Exception {
            try {
              return transaction.executeUpdate(Statement.of("INSERT INTO T (K, V) VALUES ('One', 2)"));
            } catch (SpannerException e) {
              if (e.getErrorCode() == ErrorCode.ALREADY_EXISTS) {
                // The row already exists, update it to ensure it has V=2.
                return transaction.executeUpdate(Statement.of("UPDATE T SET V=2 WHERE K='One'"));
              }
              throw e;
            }
          }
        });
```

This transaction will always return an error when working with a prepared transaction. That is good. It will always commit and return successfully when including the `BeginTransaction` option with the first statement. The problem with letting this transaction succeed is that the two statements are actually executed in two different transactions (each of them includes a `BeginTransaction` option). The proposed solution in googleapis/google-cloud-cpp#4516 covers what to do when the first statement of a transaction returns with an error.
I've added an integration test that shows the subtle difference here: java-spanner/google-cloud-spanner/src/test/java/com/google/cloud/spanner/it/ITTransactionTest.java, lines 582 to 618 in 1817afa.
Thanks heaps @olavloite for the investigation and for the detailed explanation. From the last couple of commits, I see you have also added a fix. The last decision we need to make is whether we should remove the option and make this the default implementation. I want to follow up with the internal conversation to make sure this idea is reasonable, and will then review it. Sorry about the time it has taken me to get to this, but hopefully we can merge this soon and further simplify the session pool.
@olavloite based on a question the C++ team was asking: what do you return if the statement that included the `BeginTransaction` option fails?
What should also be taken into consideration in that discussion is the following: there is one scenario where this implementation is less efficient than the current implementation that uses prepared sessions from the pool, and that is when a read/write transaction only does a blind write of one or more mutations. In that case there is no query or update statement to inline the `BeginTransaction` option with.
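A minimal sketch of that trade-off (the helper below is hypothetical, not part of the Spanner client API): a transaction that consists solely of buffered mutations has no request to attach the inlined begin to, so it still pays for a separate round trip.

```java
// Hypothetical helper illustrating the blind-write trade-off described above;
// this is not part of the Spanner client API.
public class BlindWriteSketch {
  /**
   * With inlined begins, a transaction that executes no query or DML statement
   * and only buffers mutations still needs a separate BeginTransaction RPC,
   * because there is no statement to attach the begin option to.
   */
  static boolean needsExplicitBeginTransaction(int queryOrDmlStatements, boolean hasMutations) {
    return queryOrDmlStatements == 0 && hasMutations;
  }
}
```

A prepared session from the pool, by contrast, has already paid that round trip in advance, which is why this scenario is the one place the current implementation wins.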
The current implementation returns the error from the statement that included the `BeginTransaction` option.
Codecov Report
```diff
@@             Coverage Diff              @@
##             master     #325      +/-   ##
============================================
- Coverage     84.04%   84.00%    -0.05%
+ Complexity     2540     2500       -40
============================================
  Files           140      140
  Lines         13891    13726      -165
  Branches       1329     1314       -15
============================================
- Hits          11675    11530      -145
+ Misses         1672     1652       -20
  Partials        544      544
```

Continue to review the full report at Codecov.
```diff
@@ -190,36 +198,7 @@ void ensureTxn() {
   ApiFuture<Void> ensureTxnAsync() {
     final SettableApiFuture<Void> res = SettableApiFuture.create();
     if (transactionId == null || isAborted()) {
       span.addAnnotation("Creating Transaction");
```
This is moved to the `createTxnAsync` method to be reusable.
```java
      }
      latch.addListener(
```
This block of code is moved to `CommitRunnable`.
After a conversation IRL with @olavloite and reviewing the technical details, this looks good to me. We won't release this as is, though; in the next PR we should completely remove prepared sessions from the pool. This way, the implementation can be simplified.
* feat: remove session pool preparing
* fix: fix integration tests
* test: fix malformed retry loop in test case
* fix: review comments
```java
private volatile SettableApiFuture<ByteString> transactionIdFuture = null;
volatile ByteString transactionId;
```
```java
// The first PartialResultSet will be returned successfully, and then a DATA_LOSS exception will be returned.
mockSpanner.setExecuteStreamingSqlExecutionTime(
    SimulatedExecutionTime.ofStreamException(Status.DATA_LOSS.asRuntimeException(), 1));
```

Suggested change (wrap the comment to the line-length limit):

```java
// The first PartialResultSet will be returned successfully, and then a DATA_LOSS exception will
// be returned.
mockSpanner.setExecuteStreamingSqlExecutionTime(
    SimulatedExecutionTime.ofStreamException(Status.DATA_LOSS.asRuntimeException(), 1));
```
```java
public List<ByteString> getTransactionsStarted() {
  return new ArrayList<>(transactionsStarted);
}
```
Description
The Spanner client library will normally prepare a fraction of the sessions in the session pool with a read/write transaction. These sessions are used when an application requests a read/write transaction. This eliminates the need for a BeginTransaction RPC to be executed as part of a transaction, as the call has already been executed in advance by the session pool.
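As a rough illustration of the prepared-sessions idea (the helper and its parameters below are hypothetical, not the client's actual configuration API), the pool keeps a fraction of its sessions pre-prepared with a read/write transaction:

```java
// Hypothetical sketch of how a pool might decide how many sessions to prepare
// with a read/write transaction in advance; names and values are illustrative.
public class PreparedSessionsSketch {
  /** Number of sessions to keep prepared, given a pool size and a write fraction. */
  static int sessionsToPrepare(int poolSize, double writeSessionsFraction) {
    return (int) Math.ceil(poolSize * writeSessionsFraction);
  }
}
```

Under these assumptions, a pool of 100 sessions with a write fraction of 0.25 would keep 25 sessions prepared, and a transaction handed one of them starts without a `BeginTransaction` round trip.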
This can however be less efficient in some specific cases.
Cloud Spanner also supports starting a new transaction as part of a query or DML statement. This eliminates the need for a separate round-trip to the server for starting a transaction. This PR adds the possibility to instruct the Spanner client to start read/write transactions by including this option in the first query/update statement of a read/write transaction, instead of preparing the read/write transactions in the session pool. This can improve the overall read/write transaction execution performance.
Including the `BeginTransaction` option with the first query or update statement of a transaction does require the client to block any other statement on the same transaction until the first statement has finished. This means that this strategy could perform less well in case of a read/write transaction that includes multiple parallel reads and/or writes.

Technical Implementation
The change in this PR is implemented along the following lines:
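As a rough, hypothetical illustration of the core mechanism (the types below are simplified stand-ins for the request's transaction selector, not the PR's actual code): each request either references an existing transaction id or asks the backend to begin a new transaction inline, and only the first statement of a transaction does the latter.

```java
// Simplified, hypothetical stand-ins for a request's transaction selector;
// not the actual protobuf messages or client classes.
public class TransactionSelectorSketch {
  static final class Selector {
    final String id;     // non-null: reuse this existing transaction
    final boolean begin; // true: ask the backend to begin a transaction inline
    private Selector(String id, boolean begin) { this.id = id; this.begin = begin; }
    static Selector ofId(String id) { return new Selector(id, false); }
    static Selector ofBegin() { return new Selector(null, true); }
  }

  private String transactionId; // becomes known from the first statement's response

  /** The first statement inlines the begin; every later statement reuses the id. */
  Selector selectorForNextStatement() {
    return (transactionId == null) ? Selector.ofBegin() : Selector.ofId(transactionId);
  }

  /** Called when a response returns the transaction that was started inline. */
  void onTransactionReturned(String id) {
    this.transactionId = id;
  }
}
```

The need to learn the transaction id from the first response is also why the client must block concurrent statements until the first one has finished, as noted above.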