tikv panic for "assertion failed: row.lock.is_none()" #16818
Comments
Verified the
ref #16818 Signed-off-by: cfzjywxk <cfzjywxk@gmail.com>
@cfzjywxk @seiya-annie Did you test the "follower-read=false" case? Is this caused by follower read?
This issue has also occurred in non-follower-read Jepsen test jobs. The panic occurs during the observation of the apply result process in Currently, we have added logging before the assertion. After reproducing the issue, we can examine the logs to determine which transaction's write operation caused this situation.
ref #16818 Signed-off-by: cfzjywxk <cfzjywxk@gmail.com> Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
A recent reproduction, with more logs:
"there is already on row=RowChange { write: None, lock: Some(Put(None, [83, 38, 116, 128, 0, 0, 0, 0, 0, 0, 198, 95, 114, 128, 0, 0, 0, 0, 0, 0, 3, 129, 128, 224, 206, 157, 222, 153, 159, 6, 154, 214, 2, 102, 6, 62, 102, 241, 217, 216, 0, 1, 99, 6, 62, 102, 241, 217, 216, 0, 2, 108, 6, 62, 102, 241, 118, 160, 0, 33, 2])), default: None }"
which is
Lock { lock_type: Pessimistic, primary_key: 7480000000000000C65F728000000000000003, start_ts: TimeStamp(449910201711591425), ttl: 43802, short_value: , for_update_ts: TimeStamp(449910201711591425), txn_size: 0, min_commit_ts: TimeStamp(449910201711591426), use_async_commit: false, secondaries: [], rollback_ts: [], last_change: Exist { last_change_ts: TimeStamp(449910200046977057), estimated_versions_to_last_change: 2 }, txn_source: 0, is_locked_with_conflict: false, generation: 0 }
The related query is
But the to be inserted It's still not clear why there could be duplicate keys in a single apply batch. |
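To see which row the lock in the log above belongs to, the `primary_key` can be decoded by hand. The sketch below assumes the usual TiDB record-key layout `t{table_id}_r{handle}`, with both integers stored as 8-byte big-endian values whose sign bit is flipped. The helper names (`decode_record_key`, `decode_i64`, `hex_to_bytes`) are made up for this illustration and are not TiKV APIs.

```rust
/// Decode an 8-byte big-endian integer whose sign bit was flipped on encode.
/// Callers must pass exactly 8 bytes.
fn decode_i64(bytes: &[u8]) -> i64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    (u64::from_be_bytes(buf) ^ (1u64 << 63)) as i64
}

/// Decode a key of the assumed form `t{table_id}_r{handle}` (19 bytes).
/// Returns `(table_id, handle)`, or `None` if the layout does not match.
fn decode_record_key(key: &[u8]) -> Option<(i64, i64)> {
    if key.len() != 19 || key[0] != b't' || &key[9..11] != b"_r" {
        return None;
    }
    Some((decode_i64(&key[1..9]), decode_i64(&key[11..19])))
}

/// Turn an upper- or lower-case hex string into raw bytes.
fn hex_to_bytes(hex: &str) -> Option<Vec<u8>> {
    if hex.len() % 2 != 0 {
        return None;
    }
    (0..hex.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(hex.get(i..i + 2)?, 16).ok())
        .collect()
}

fn main() {
    // primary_key copied from the Lock printed in the log above.
    let key = hex_to_bytes("7480000000000000C65F728000000000000003").unwrap();
    println!("{:?}", decode_record_key(&key)); // Some((198, 3))
}
```

Under that assumption the lock belongs to table ID 198, row handle 3. The same 19 bytes also appear inside the raw `Put` value in the log (116, 128, ..., 198, 95, 114, 128, ..., 3).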
ref #16818 Refactor logs for the unexpected path, print both the existing row and input key/value. Signed-off-by: cfzjywxk <lsswxrxr@163.com>
After some investigation, one possibility is that the pessimistic lock and the prewrite lock for the same key are applied in the same apply batch and it is observed by the Consider a leader transfer situation
In this case
When the follower peer P3 applies logs, the pessimistic lock and the prewrite lock could be consumed in the same apply batch, for example:
apply.rs:646: [INFO] [for debug] on store_id=3 cmd_batch=[CmdBatch { level: None, cdc_id: ObserveId(0), rts_id: ObserveId(0), pitr_id: ObserveId(0), region_id: 1, cmds: [Cmd { index: 10, term: 6, request: header { region_id: 1 peer { id: 1 store_id: 1 } region_epoch { conf_ver: 3 version: 1 } } requests { cmd_type: Put put { cf: "lock" key: 6B30000000000000F9 value: 53046B300A0066000000000000000A6C000000000000000001 } }, response: header { current_term: 6 } }, Cmd { index: 11, term: 6, request: header { region_id: 1 peer { id: 1 store_id: 1 } region_epoch { conf_ver: 3 version: 1 } flags: 4 } admin_request { cmd_type: TransferLeader transfer_leader { peer { id: 3 store_id: 3 } } }, response: header { current_term: 6 } admin_response { cmd_type: TransferLeader } }, Cmd { index: 12, term: 6, request: header { region_id: 1 peer { id: 1 store_id: 1 } region_epoch { conf_ver: 3 version: 1 } term: 6 } requests { cmd_type: Put put { cf: "lock" key: 6B30000000000000F9 value: 50046B300AB8177602763066000000000000000A63000000000000000B } }, response: header { current_term: 6 } }] }], thread_id: 486
It's still not clear how the
But the situation here is duplicate keys in different logs, whereas the panic happens when there are duplicate keys within a single log, which is a different case.
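The two `lock` CF values in the batch above can be told apart by their first byte. The sketch below reads that byte as a lock-type flag. The `S` to Pessimistic mapping matches the earlier log, where the value starts with 83 and decodes to `lock_type: Pessimistic`. The other flags (`P`, `D`, `L`) are an assumption about TiKV's lock encoding, and `LockKind` and `lock_kind_from_hex` are names invented for this illustration.

```rust
/// Lock kinds, keyed by the first byte of a serialized lock value.
/// Only the `S` = Pessimistic mapping is confirmed by the logs in this
/// thread; the remaining flags are assumed.
#[derive(Debug, PartialEq)]
enum LockKind {
    Put,
    Delete,
    Lock,
    Pessimistic,
    Unknown(u8),
}

/// Read the lock-type flag from the first byte of a hex-encoded lock value.
fn lock_kind_from_hex(value_hex: &str) -> Option<LockKind> {
    let first = u8::from_str_radix(value_hex.get(0..2)?, 16).ok()?;
    Some(match first {
        b'P' => LockKind::Put,
        b'D' => LockKind::Delete,
        b'L' => LockKind::Lock,
        b'S' => LockKind::Pessimistic,
        other => LockKind::Unknown(other),
    })
}

fn main() {
    // Values copied from the cmd_batch log: index 10 and index 12.
    let index_10 = "53046B300A0066000000000000000A6C000000000000000001";
    let index_12 = "50046B300AB8177602763066000000000000000A63000000000000000B";
    println!("index 10: {:?}", lock_kind_from_hex(index_10));
    println!("index 12: {:?}", lock_kind_from_hex(index_12));
}
```

Read this way, index 10 writes a pessimistic lock and index 12 writes a prewrite (`Put`) lock on the same key `6B30000000000000F9`, with the `TransferLeader` admin command at index 11 between them.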
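The difference between duplicates across logs and duplicates within one log can be shown with a small model. This is a hypothetical sketch, not the TiKV observer code: the `RowChange` field names follow the log output earlier in the thread, while `Put`, `lock_put` and `group_rows` are invented here, and the function returns an error where the real code asserts `row.lock.is_none()`.

```rust
use std::collections::HashMap;

/// Per-key change set; field names follow the RowChange printed in the logs.
#[derive(Debug, Default)]
struct RowChange {
    write: Option<Vec<u8>>,
    lock: Option<Vec<u8>>,
    default: Option<Vec<u8>>,
}

/// One put request inside a single log entry (hypothetical shape).
struct Put {
    cf: &'static str,
    key: Vec<u8>,
    value: Vec<u8>,
}

/// Group the puts of ONE log entry by key. A second put to the same column
/// family of the same key is reported as an error here, standing in for the
/// failed assertion.
fn group_rows(puts: &[Put]) -> Result<HashMap<Vec<u8>, RowChange>, String> {
    let mut rows: HashMap<Vec<u8>, RowChange> = HashMap::new();
    for put in puts {
        let row = rows.entry(put.key.clone()).or_default();
        let slot = match put.cf {
            "lock" => &mut row.lock,
            "write" => &mut row.write,
            "default" => &mut row.default,
            other => return Err(format!("unknown cf {}", other)),
        };
        if slot.is_some() {
            return Err(format!("duplicate {} change for key {:?}", put.cf, put.key));
        }
        *slot = Some(put.value.clone());
    }
    Ok(rows)
}

/// Build a lock-CF put on a fixed key with the given value.
fn lock_put(value: &[u8]) -> Put {
    Put { cf: "lock", key: b"k0".to_vec(), value: value.to_vec() }
}

fn main() {
    // Same key in two different log entries: each entry is grouped alone.
    println!("{:?}", group_rows(&[lock_put(b"S")]));
    println!("{:?}", group_rows(&[lock_put(b"P")]));
    // Same key twice inside one log entry: the model reports a duplicate.
    println!("{:?}", group_rows(&[lock_put(b"S"), lock_put(b"P")]));
}
```

In this model the batch from the leader-transfer scenario is harmless, because indexes 10 and 12 are grouped separately. Only a single entry carrying two lock changes for one key would trip the check, which is why that scenario does not yet explain the panic.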
Bug Report
What version of TiKV are you using?
["Git Commit Hash: b7103fa"] [thread_id=1]
What operating system and CPU are you using?
Steps to reproduce
bank-tbls-pessimistic-inplace-follower-neterr
txn-mode=pessimistic, follower-read=true
run the bank workload,
inject network failures
params:
--txn-mode=pessimistic --force-reinstall=true --update-in-place=true
--follower-read=true
--nemesis=schedules,partition-pd-leader,partition-half,partition-ring
--read-lock=update --os=image --time-limit=600 --version=master
--workload=bank-multitable --init-sql='set
@@tidb_enable_mutation_checker=1, @@tidb_txn_assertion_level=strict,
@@tidb_constraint_check_in_place_pessimistic=off'
What did you expect?
pass
What happened?