Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

concurrency: allow {exclusive,shared} repl lock acquisition/discovery #110145

Merged
merged 2 commits into from
Sep 7, 2023

Conversation

arulajmani
Copy link
Collaborator

@arulajmani arulajmani commented Sep 6, 2023

Now that the lock table is able to track replicated locks with lock
strengths other than lock.Intent, this patch enables lock acquisition,
re-acquisition, and discovery of replicated {exclusive, shared} locks.

Most of this patch is about adding testing. One interesting behaviour
change worth calling out is that replicated exclusive locks do not block
non-locking readers, regardless of the reader's/lock holder's timestamp.
This is in contrast to unreplicated exclusive locks, which do.

This patch leaves things in a slightly transitional stage -- for now, we
forward a replicated lock's timestamp blindly on acquisition. In an
upcoming patch, we'll limit this to lock acquisition/discovery of just
intents. This is because the timestamp tracking is meaningless for other
lock strengths. Worse yet, the current behaviour opens us up to the
possibility of infinite lock discovery loops, where a request keeps
re-discovering a write intent but never blocking on it because of this
timestamp forwarding behaviour.

Informs #109673

Release note: None

This patch pulls timestamp assignment for replicated locks behind the
call to `.acquire`. In a future patch, we'll only assign/update this
field for intents. Doing this now means there'll be a single point in
the code that'll need updating then.

Epic: none

Release note: None
@arulajmani arulajmani requested a review from a team as a code owner September 6, 2023 22:18
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r1, 3 of 3 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani)


pkg/kv/kvserver/concurrency/lock_table.go line 1313 at r2 (raw file):

// blanket timestamp from this function is a pretty sharp edge. We should split
// this function out into two -- one that returns the holder's transaction, and
// another that fits the timestamp needs at the callers.

+1, I'm looking forward to seeing how this change looks. I suspect it will lead to some clarity, where the timestamp is only ever consulted on the non-locking read paths.


pkg/kv/kvserver/concurrency/lock_table.go line 1417 at r2 (raw file):

			// 2. We could only check the timestamp if the lock is being acquired with
			// strength lock.Intent, as we disregard the timestamp for all other lock
			// lock strengths when dealing with replicated locks.

"lock lock"


pkg/kv/kvserver/concurrency/lock_table.go line 2895 at r2 (raw file):

		// check that the lock acquisition is compatible.
		m := makeLockMode(acq.Strength, &acq.Txn, acq.Txn.WriteTimestamp)
		for e := kl.holders.Front(); e != nil; e = e.Next() {

With shared locks, there are now a number of places where we'll fall into quadratic behavior if we're not careful. This kind of verification is one of them. The verification is important to catch bugs, but do we want to run in production, or only in tests? How does that fit in with the more general test-only validation framework we were planning to add?

I think in general we do want to avoid quadratic behavior around shared lock acquisition and release, though we can probably start with a goal to avoid it at the storage layer (where costs are amplified) and then make improvements to the in-memory data structures after.


pkg/kv/kvserver/concurrency/lock_table.go line 2901 at r2 (raw file):

			if holderTxn.ID != acq.Txn.ID &&
				// ... which conflicts with the lock being acquired.
				lock.Conflicts(holder.getLockMode(), m, &st.SV) {

nit: pull out a holderMode variable to avoid calling holder.getLockMode() twice.


pkg/kv/kvserver/concurrency/lock_table.go line 2957 at r2 (raw file):

		// If lock(s) are already held on this key by other transactions, sanity
		// check that the discovered lock is compatible with them.
		m := makeLockMode(foundLock.Strength, &foundLock.Txn, foundLock.Txn.WriteTimestamp)

This is almost identical to the logic above. Should we extract it?


pkg/kv/kvserver/concurrency/lock_table_test.go line 471 at r2 (raw file):

				leaseSeq := roachpb.LeaseSequence(seq)
				str := lock.Intent // default replicated locks to write intents
				if d.HasArg("strength") {

Unrelated to this PR, but I noticed in concurrency_manager_test.go, we have the logic:

if d.HasArg("str") {  
    str = concurrency.ScanLockStrength(t, d)  
}

Is that broken? We're looking for "str" and then "strength". It looks like we never try to use it in tests.

Now that the lock table is able to track replicated locks with lock
strengths other than lock.Intent, this patch enables lock acquisition,
re-acquisition, and discovery of replicated {exclusive, shared} locks.

Most of this patch is about adding testing. One interesting behaviour
change worth calling out is that replicated exclusive locks do not block
non-locking readers, regardless of the reader's/lock holder's timestamp.
This is in contrast to unreplicated exclusive locks, which do.

This patch leaves things in a slightly transitional stage -- for now, we
forward a replicated lock's timestamp blindly on acquisition. In an
upcoming patch, we'll limit this to lock acquisition/discovery of just
intents. This is because the timestamp tracking is meaningless for other
lock strengths. Worse yet, the current behaviour opens us up to the
possibility of infinite lock discovery loops, where a request keeps
re-discovering a write intent but never blocking on it because of this
timestamp forwarding behaviour.

Informs cockroachdb#109673

Release note: None
Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR; addressed all comments -- should be good for another look!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)


pkg/kv/kvserver/concurrency/lock_table.go line 1313 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

+1, I'm looking forward to seeing how this change looks. I suspect it will lead to some clarity, where the timestamp is only ever consulted on the non-locking read paths.

The non-locking read path is actually covered by the logic in getLockMode(), which reaches into the {,un}replicatedLockHolderInfo to do something smart.

The only place we use this, right now, is to check if the lock's timestamp has advanced during lock re-acquisition/lock update. We unblock non-locking readers that are lower than the (now new) lock timestamp in such cases.


pkg/kv/kvserver/concurrency/lock_table.go line 2895 at r2 (raw file):
Thanks for the nudge, I forgot our earlier conversation to pull these behind a testing build check (at the least).

How does that fit in with the more general test-only validation framework we were planning to add?

I think this will be a strict subset of the more general test-only validation described in #108843.

I've pulled this logic out into its own function and made it such that its only triggered from testing builds. I also changed the error's verbiage such that we can cargo cult this into the more general purpose verification framework once that's a thing. That way, none of the test cases we're adding here will have to change!


pkg/kv/kvserver/concurrency/lock_table.go line 2957 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

This is almost identical to the logic above. Should we extract it?

Done!


pkg/kv/kvserver/concurrency/lock_table_test.go line 471 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Unrelated to this PR, but I noticed in concurrency_manager_test.go, we have the logic:

if d.HasArg("str") {  
    str = concurrency.ScanLockStrength(t, d)  
}

Is that broken? We're looking for "str" and then "strength". It looks like we never try to use it in tests.

Yeah, that looks busted. I'd filed #109634 to add concurrency manager tests for shared locks (currently, we're only testing latches at that level). That would have caught this.

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 2 files at r3, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @arulajmani)

@arulajmani
Copy link
Collaborator Author

Thanks for the quick reviews!

bors r=nvanbenschoten

@craig
Copy link
Contributor

craig bot commented Sep 7, 2023

Build succeeded:

@craig craig bot merged commit 4e79c65 into cockroachdb:master Sep 7, 2023
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants