
Set to invalid hashmap in concurrent situation #87

Open
woodliu opened this issue Apr 23, 2024 · 6 comments
Labels
enhancement New feature or request

Comments

woodliu commented Apr 23, 2024

func (m *Map[K, V]) Delete(key K) node.Node[K, V] and func (m *Map[K, V]) Set(n node.Node[K, V]) node.Node[K, V] can run concurrently. When Delete runs into this line and Set runs into this loop, there is a chance that Set writes the node into an invalid (shrunk) table. I think this may be acceptable in this situation.

And here is another question: in the hash map's Set there is a check, if m.newerTableExists(t). Is it only used to lower the probability of writing to an invalid table? Because when multiple Sets run concurrently, a new table may still be created even after this check passes.
Thanks!

woodliu added the enhancement label Apr 23, 2024
maypok86 (Owner) commented
> When Delete runs into this line and Set runs into this loop, there is a chance that Set writes the node into an invalid (shrunk) table.

But this one has already taken a lock on the bucket in Delete, so the rest will have to wait for the operation to complete.

> Because when multiple Sets run concurrently, a new table may still be created even after this check passes.

To be honest, I didn't really understand how this is possible. There, too, a lock on the bucket is taken first. Can you explain in more detail, please?

woodliu (Author) commented Apr 24, 2024

> But this one has already taken a lock on the bucket in Delete, so the rest will have to wait for the operation to complete.

Sorry, I posted the wrong code link for "Delete runs into this line and Set runs into this loop"; please recheck the link. Here is an example of how that can happen:

  1. While no table resize is in progress, a Set operation can run this loop and add or update the node in t.buckets[1].
  2. At the same time, a Delete operation on t.buckets[2] starts a resize (the two are changing different buckets and holding different locks). Because no other resize is in progress, it proceeds to resize the table. So there is a chance that Delete creates a brand-new shrunk table for the hash map while Set adds or updates the node in the old table, as the sketch below illustrates.

> To be honest, I didn't really understand how this is possible. There, too, a lock on the bucket is taken first. Can you explain in more detail, please?

This situation is just like the one above: two Set operations changing different buckets of the table use different locks, so one Set can run the loop to add or update a node in its bucket while the other Set, seeing no resize in progress, starts one while updating another bucket. The second Set then creates a new table for the hash map while the first Set is still changing the old table, so I think if m.newerTableExists(t) is a soft limit.
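Both scenarios above hinge on the same mechanism, so here is a minimal, self-contained sketch of the interleaving (toy types and a toy hash for illustration, not otter's actual implementation): each root bucket has its own mutex, so an operation on one bucket never blocks an operation on another, and the second operation is free to kick off a resize on its own.

    package main

    import (
    	"fmt"
    	"sync"
    )

    // bucket is a toy stand-in for a root bucket with its own lock.
    type bucket struct {
    	mu      sync.Mutex
    	entries map[string]int
    }

    // table is a toy stand-in for the hash table.
    type table struct {
    	buckets []*bucket
    }

    // bucketFor picks a bucket with a toy hash: first byte mod bucket count.
    func (t *table) bucketFor(key string) *bucket {
    	return t.buckets[int(key[0])%len(t.buckets)]
    }

    func main() {
    	t := &table{buckets: []*bucket{
    		{entries: map[string]int{}},
    		{entries: map[string]int{}},
    	}}

    	var wg sync.WaitGroup
    	wg.Add(2)

    	// Goroutine A: a Set on one bucket.
    	go func() {
    		defer wg.Done()
    		b := t.bucketFor("alpha") // hashes to bucket 1
    		b.mu.Lock()
    		defer b.mu.Unlock()
    		b.entries["alpha"] = 1
    	}()

    	// Goroutine B: a Delete on a different bucket. It holds a different
    	// lock, so it is never blocked by A and is free to decide, on its
    	// own, that the table should shrink and kick off a resize.
    	go func() {
    		defer wg.Done()
    		b := t.bucketFor("bravo") // hashes to bucket 0
    		b.mu.Lock()
    		defer b.mu.Unlock()
    		delete(b.entries, "bravo")
    		// ...a real map could call resize(t) here and publish a new table...
    	}()

    	wg.Wait()
    	fmt.Println("operations on different buckets never blocked each other")
    }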

maypok86 (Owner) commented

> So there is a chance that Delete creates a brand-new shrunk table for the hash map while Set adds or updates the node in the old table.

If these operations are performed at the same time, then the lock for t.buckets[1] has already been taken, which means that when copying entries from this bucket to the new table, the resizing goroutine will have to wait for the bucket to be unlocked.

> The second Set then creates a new table for the hash map while the first Set is still changing the old table, so I think if m.newerTableExists(t) is a soft limit.

Yeah, this is the more interesting case. The main magic here is done by the resize and by locking the bucket at the beginning of the Set execution:

  1. If the Set started before the resize, the resize will wait for the Set to complete (due to the bucket lock) and will copy the final result, as the sketch below shows.
  2. If the resize started before the Set, the resizing check will delay the Set until the resize completes.
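To make case 1 concrete, here is a minimal, self-contained sketch (hypothetical names: copyBucket stands in for otter's copyBuckets and is not its real signature, and the sleeps only sequence the demonstration): the copier takes the same per-bucket mutex that writers hold, so it waits out an in-flight Set and copies the bucket's final state.

    package main

    import (
    	"fmt"
    	"sync"
    	"time"
    )

    type bucket struct {
    	mu      sync.Mutex
    	entries map[string]int
    }

    // copyBucket drains one old-table bucket into a new-table bucket while
    // holding the source bucket's lock, so it observes only settled state.
    func copyBucket(src, dst *bucket) {
    	src.mu.Lock()
    	defer src.mu.Unlock()
    	for k, v := range src.entries {
    		dst.entries[k] = v
    	}
    }

    func main() {
    	src := &bucket{entries: map[string]int{}}
    	dst := &bucket{entries: map[string]int{}}

    	var wg sync.WaitGroup
    	wg.Add(1)
    	go func() { // an in-flight, deliberately slow Set on the bucket
    		defer wg.Done()
    		src.mu.Lock()
    		time.Sleep(10 * time.Millisecond)
    		src.entries["k"] = 42
    		src.mu.Unlock()
    	}()

    	time.Sleep(time.Millisecond)  // let the Set grab the lock first
    	copyBucket(src, dst)          // blocks until the Set unlocks the bucket
    	fmt.Println(dst.entries["k"]) // 42: the copier saw the final result
    	wg.Wait()
    }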

woodliu (Author) commented Apr 25, 2024

Wow, I missed the lock in copyBuckets 👍🏻.
So if m.resizeInProgress() and if m.newerTableExists(t) are soft limits in Set, right?
Since the bucket lock already ensures that a resize and a Set won't change the same bucket, how about removing the code below from Set? Does that make sense?
		// the following two checks must go in reverse to what's
		// in the resize method.
		if m.resizeInProgress() {
			// resize is in progress. Wait, then go for another attempt.
			rootBucket.mutex.Unlock()
			m.waitForResize()
			goto RETRY
		}
		if m.newerTableExists(t) {
			// someone resized the table. Go for another attempt.
			rootBucket.mutex.Unlock()
			goto RETRY
		}

maypok86 (Owner) commented

> So if m.resizeInProgress() and if m.newerTableExists(t) are soft limits in Set, right?

What do you mean by "soft limit"?

> How about removing the code below from Set? Does that make sense?

These checks are necessary for the case when we have already started to perform a resize: the desired bucket may already have been copied to the new hash table, and without the checks we could lose the entry. It seems that we should not allow this to happen :).
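Written out step by step (a sequential sketch with toy types, not real otter code, and not an actual race), the lost-entry hazard of removing those checks looks like this:

    package main

    import (
    	"fmt"
    	"sync"
    )

    type bucket struct {
    	mu      sync.Mutex
    	entries map[string]int
    }

    func main() {
    	oldB := &bucket{entries: map[string]int{}} // bucket in the old table
    	newB := &bucket{entries: map[string]int{}} // its image in the new table

    	// Step 1: the resizer copies the (still empty) old bucket under its lock.
    	oldB.mu.Lock()
    	for k, v := range oldB.entries {
    		newB.entries[k] = v
    	}
    	oldB.mu.Unlock()

    	// Step 2: a Set that skips the resizeInProgress/newerTableExists checks
    	// takes the same lock and writes into the already-copied old bucket.
    	oldB.mu.Lock()
    	oldB.entries["k"] = 42
    	oldB.mu.Unlock()

    	// Step 3: the new table is published and the old one is discarded.
    	_, ok := newB.entries["k"]
    	fmt.Println(ok) // false: the entry was silently lost
    }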

woodliu (Author) commented Apr 26, 2024

> What do you mean by "soft limit"?

By "soft limit" I mean that the check only lowers the probability of accessing an invalid table. You have already explained this in the answer below.

> These checks are necessary for the case when we have already started to perform a resize: the desired bucket may already have been copied to the new hash table, and without the checks we could lose the entry. It seems that we should not allow this to happen :).

Awesome!
