Transaction: on retry, replays should compare checksums of prior numbered statements that succeeded #34
Labels
api: spanner
Issues related to the googleapis/python-spanner API.
priority: p2
Moderately-important priority. Fix may not be included in next release.
type: feature request
‘Nice-to-have’ improvement, new feature or different behavior or design.
Coming here from a project that plans on adding Cloud Spanner as a backend for Django.
In AUTOCOMMIT=off mode, we need to hold a Transaction for perhaps an indefinitely long time.
Cloud Spanner will abort:
a) Transactions that have not been used for 10 seconds or more -- we can periodically send a SELECT 1=1 to keep them active
b) Transactions even when they are kept refreshed, because Cloud Spanner has a high abort rate
Thus we need to retry Transactions!
Current retry
The current code for retrying in this repository just re-invokes the function that was passed into *.run_in_transaction afresh with a new Transaction, per
python-spanner/google/cloud/spanner_v1/session.py
Lines 290 to 321 in 23916c5
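To make the current behaviour concrete, here is a minimal sketch of a "re-run the whole function" retry loop. It is not the code from session.py: `Aborted`, `retry_whole_function`, `begin_transaction` and `max_attempts` are hypothetical names used only for illustration, and the real implementation also handles backoff and timeouts.

```python
class Aborted(Exception):
    """Hypothetical stand-in for the error raised when a transaction aborts."""


def retry_whole_function(begin_transaction, func, max_attempts=5):
    """Re-invoke func from scratch with a brand-new transaction on each abort.

    begin_transaction is an assumed zero-argument factory that returns an
    object with a commit() method; func receives that object.
    """
    for attempt in range(max_attempts):
        txn = begin_transaction()  # fresh transaction for every attempt
        try:
            result = func(txn)     # the whole unit of work runs again
            txn.commit()
            return result
        except Aborted:
            if attempt == max_attempts - 1:
                raise              # give up after the last attempt
```

This only works when the whole unit of work is available as one callable, which is not the case in AUTOCOMMIT=off mode, where statements arrive one at a time.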
Recommended retry
However, the correct way to retry Transactions, as @bvandiver explained to me, is:
a) For every result returned by an operation on a Transaction, compute the checksum and add it to a FIFO queue
b) The point at which the prior Transaction failed marks the end of that queue
c) When retrying the Transaction from the first statement, compare each statement's checksum with the entry at the same ordinal number/index in the FIFO queue -- if any of them don't match, abort the Transaction as not retryable
This is what the Java spanner-jdbc implementation does
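The steps above could be sketched roughly as follows. This is an illustration under assumptions, not the spanner-jdbc algorithm or an existing python-spanner API: `ReplayableTransaction`, `NotRetryable`, `Aborted`, `checksum` and the `execute(statement)` method on the underlying transaction are all hypothetical, and SHA-256 over `repr(row)` is just one possible checksum.

```python
import hashlib


class Aborted(Exception):
    """Hypothetical stand-in for a transaction abort."""


class NotRetryable(Exception):
    """Raised when a replayed statement returns different results."""


def checksum(rows):
    # Hash every row in order; any changed, missing, or extra row alters the digest.
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


class ReplayableTransaction:
    def __init__(self, begin_transaction):
        # begin_transaction: assumed factory returning an object with execute(statement).
        self._begin = begin_transaction
        self._txn = begin_transaction()
        self._log = []  # (statement, checksum) pairs in execution order

    def execute(self, statement):
        try:
            rows = list(self._txn.execute(statement))
        except Aborted:
            self._replay()
            # A second abort here simply propagates in this sketch.
            rows = list(self._txn.execute(statement))
        self._log.append((statement, checksum(rows)))
        return rows

    def _replay(self):
        # Start over on a new transaction and re-run every statement that
        # previously succeeded, comparing checksums index by index.
        self._txn = self._begin()
        for statement, expected in self._log:
            rows = list(self._txn.execute(statement))
            if checksum(rows) != expected:
                raise NotRetryable(statement)
```

If every replayed checksum matches, the caller has seen nothing that differs from the new transaction's view, so the retry is invisible to it; otherwise the abort has to be surfaced.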
Suggestion
Implementing this feature outside of this package involves a whole lot of hacking: we would need to consume the raw data sent to StreamedResultSet, which in turn requires proto marshalling and wrapping StreamedResultSet. That is quite non-ideal and would still involve patches to python-spanner.
@bvandiver and I chatted again about this today and I also briefly raised this issue to @skuruppu this afternoon too.