nsqlookupd: fix write lock starvation #1208

Open
wants to merge 1 commit into base: master

Conversation

@andyxning (Member) commented Nov 1, 2019

This PR fixes registration DB write lock starvation.

We have encountered the following circumstance:

  • we have many topics with many ephemeral channels.
  • our nsqd cluster is large, with more than 600 instances.
  • some channels regularly register and unregister themselves, since consumer clients can disconnect because a timeout is reached or for various other reasons.

And, finally, the registration DB is large.

So the lookup API takes a little longer to respond. While a lookup API request holds the read lock on the registration DB, every operation that needs the write lock, for example registering/unregistering channels, is blocked. And since the read lock can be acquired simultaneously by many lookup requests, the operations that need the write lock can stay blocked for a long time, and memory and goroutines accumulate.

This PR adds a cache for the FindProducers response. The TTL is 1 minute, and expired entries are purged at a 5-minute interval.

Cons:

  • This adds some memory pressure, since additional memory is allocated for the cache.
  • Newly added nsqd instances may take a little more time to be discovered by consumers.
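
Purely as an illustration of the approach described above (not the PR's actual diff), here is a minimal runnable sketch using patrickmn/go-cache with a 1-minute TTL and a 5-minute purge interval. The names findProducersCached and expensiveLookup are hypothetical stand-ins for the real FindProducers scan:

package main

import (
	"fmt"
	"time"

	gocache "github.com/patrickmn/go-cache"
)

// Entries expire after 1 minute; expired entries are purged every 5 minutes.
var findCache = gocache.New(1*time.Minute, 5*time.Minute)

// expensiveLookup stands in for the real registration DB scan.
func expensiveLookup(category, key, subkey string) []string {
	return []string{"10.0.0.1:4150"} // placeholder result
}

func findProducersCached(category, key, subkey string) []string {
	cacheKey := fmt.Sprintf("%s:%s:%s", category, key, subkey)
	if val, ok := findCache.Get(cacheKey); ok {
		return val.([]string) // cache hit: skip the registration DB scan entirely
	}
	result := expensiveLookup(category, key, subkey)
	findCache.Set(cacheKey, result, gocache.DefaultExpiration)
	return result
}

func main() {
	fmt.Println(findProducersCached("topic", "foo", ""))
}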

return val.(Producers)
}

r.cachedMutex.RUnlock()
Member

https://github.com/patrickmn/go-cache#go-cache says "the cache can be safely used by multiple goroutines" so I don't think you need this lock at all

Member Author

This is mainly used to avoid multiple set operations happening concurrently.

Member

I see, but you still don't need the read lock for that, only the write lock (which could be a plain mutex) for the following section, where you check again if the result is in the cache, before running the expensive loop.

That expensive loop, which is the main thing you are trying to avoid, is:

	for k, producers := range r.registrationMap {
		if !k.IsMatch(category, key, subkey) {
			continue
		}
		for _, producer := range producers {

I think there is probably a better way to structure the registration map, or to maintain pre-computed results alongside it or embedded in it, updated or invalidated whenever the registration map is updated, to get better performance with huge numbers of producers / topics / channels / whatever.
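
Returning to the check-then-lock-then-check suggestion above, here is a hedged sketch of a revised findProducersCached, reusing the hypothetical findCache / expensiveLookup names from the sketch in the PR description (with "sync" added to its imports); fillMu is likewise illustrative, not code from the PR:

var fillMu sync.Mutex

func findProducersCached(category, key, subkey string) []string {
	cacheKey := fmt.Sprintf("%s:%s:%s", category, key, subkey)

	// First check: no lock needed, go-cache is internally synchronized.
	if val, ok := findCache.Get(cacheKey); ok {
		return val.([]string)
	}

	// Only one goroutine at a time fills a missing entry.
	fillMu.Lock()
	defer fillMu.Unlock()

	// Second check: another goroutine may have filled it while we waited.
	if val, ok := findCache.Get(cacheKey); ok {
		return val.([]string)
	}

	result := expensiveLookup(category, key, subkey) // the expensive loop
	findCache.Set(cacheKey, result, gocache.DefaultExpiration)
	return result
}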

@andyxning (Member Author) Nov 3, 2019

According to the sync.Map doc:

(1) when the entry for a given key is only ever written once but read many times, as in caches that only grow, or (2) when multiple goroutines read, write, and overwrite entries for disjoint sets of keys.

Item one says that sync.Map suits a situation where each entry is written only once, which is definitely not the case for this nsqlookupd registration map usage.

Member Author

but you still don't need the read lock for that, only the write lock (which could be a plain mutex) for the following section, where you check again if the result is in the cache, before running the expensive loop.

@ploxiln Thanks, I understand. Will do.

Member

Despite being less than optimal for this use-case, a sync.Map can have entries inserted and updated concurrently while a query is iterating over the entire structure. So it might resolve the initial complaint about the write lock being starved.

I consider the addition of this cache to be a hacky work-around. It adds up to a 1 minute delay to updates, for all users. It may be OK for many consumers, but probably not all, and not for nsqd or nsqadmin.

nsqlookupd is already just a big in-memory data structure. It could probably be re-structured to process your extreme case faster. That would admittedly require more time and effort to understand the current structure and queries. So a cheap quick fix, without those problems, might be a sync.Map.

@jehiah jehiah added the perf label Nov 2, 2019
@andyxning andyxning force-pushed the nsqlookupd_fix_write_lock_starvation branch 8 times, most recently from 7aa646f to 0ab3fe3 Compare November 4, 2019 10:20
@andyxning (Member Author)

@ploxiln @jehiah PTAL.

@andyxning andyxning force-pushed the nsqlookupd_fix_write_lock_starvation branch from 0ab3fe3 to 1a3e5a8 Compare November 4, 2019 13:16
	removed := false
	if _, exists := producers[id]; exists {
		removed = true
	}

	// Note: this leaves keys in the DB even if they have empty lists
	delete(producers, id)

	r.registrationMap.Store(k, producers)
Member

I don't think you need to store the producers map again after modifying it, because map is a reference type.

I think this is true for AddProducer() too (registrationMap.LoadOrStore() ensured that the producers map returned for you to operate on is the one already in the registrationMap).
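
A small standalone example of this point about map reference semantics, using hypothetical keys rather than the nsqlookupd types; the map returned by LoadOrStore shares its underlying storage with the value held in the sync.Map, so mutating it is enough:

package main

import (
	"fmt"
	"sync"
)

func main() {
	var m sync.Map

	// LoadOrStore returns the map already held in m (or stores ours).
	val, _ := m.LoadOrStore("topic:foo", map[string]int{})
	inner := val.(map[string]int)

	// Mutating the returned map mutates the stored value too, because a Go
	// map value is a small header pointing at shared underlying storage.
	inner["producer-1"] = 1

	again, _ := m.Load("topic:foo")
	fmt.Println(again.(map[string]int)["producer-1"]) // prints 1; no Store needed
}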

@ploxiln (Member) commented Nov 4, 2019

commit title typo: "startvation" -> "starvation"

@ploxiln (Member) commented Nov 4, 2019

I thought of another problem, unfortunately. The values, which are plain maps, are not safe for concurrent updates and reads.

By the way, in a few places you do val.(map[string]*Producer). We have a ProducerMap type defined already, so I think you could do val.(ProducerMap).

So anyway, while concurrently updating and iterating over the RegistrationDB is OK (now that it is a sync.Map), updating the ProducerMap in a value is technically not OK while a lookup is iterating over it concurrently. One possible fix is to add a RWMutex to every ProducerMap, changing it to:

type ProducerMap struct {
    sync.RWMutex
    m map[string]*Producer
}

Another idea is to make every ProducerMap a sync.Map instead.

Both of these ideas sound like they will add a significant amount of overhead, more than just one top-level sync.Map. Maybe making all these ProducerMap values sync.Map will still come out overall helpful for your use-case, I don't know. But it is sounding more iffy.
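
A standalone sketch of that second idea, with each registration's value as a sync.Map keyed by producer ID; the types here are simplified stand-ins, not the PR's code:

package main

import (
	"fmt"
	"sync"
)

func main() {
	var registrationMap sync.Map // registration-like key -> *sync.Map of producers

	val, _ := registrationMap.LoadOrStore("topic:foo", &sync.Map{})
	producers := val.(*sync.Map)

	// Updates need no RegistrationDB-wide lock ...
	producers.Store("producer-1", "10.0.0.1:4150")
	producers.Delete("producer-2")

	// ... and lookups can iterate concurrently with those updates.
	producers.Range(func(id, addr interface{}) bool {
		fmt.Println(id, addr)
		return true
	})
}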

If sync.Map everywhere is too much overhead, the next step is to think about the data processing that nsqlookupd is doing that is expensive. I think it's the "iterate over everything" which happens for some kinds of lookups. And I think this can be avoided with another data structure alongside. But we have to understand when and why this happens. It's for the *:

func (r *RegistrationDB) needFilter(key string, subkey string) bool {
	return key == "*" || subkey == "*"
}
channels := s.ctx.nsqlookupd.DB.FindRegistrations("channel", topicName, "*").SubKeys()
topics := s.ctx.nsqlookupd.DB.FindRegistrations("topic", "*", "").Keys()

So if we could come up with an efficient way to store information needed to serve these queries, and keep it consistent as registrations are added and removed, we could avoid ever needing to do the expensive looping ... (The other thing I might worry about is LookupRegistrations(id).)

Accelerating each one of these queries is sort of like adding another index. It is more complicated to keep all these indexes consistent as registrations are added/removed but it may be worth it so that all these queries are fast at your scale, and still accurate.

@andyxning (Member Author)

I thought of another problem, unfortunately. The values, which are plain maps, are not safe for concurrent updates and reads.

That's quite true. I encountered this while debugging this PR's code. That's why I added a lock before ProducerMap2Slice:

		r.RLock()
		defer r.RUnlock()
		return ProducerMap2Slice(val.(map[string]*Producer))

By the way, in a few places you do val.(map[string]*Producer). We have a ProducerMap type defined already, so I think you could do val.(ProducerMap).

I use map[string]*Producer because the original code uses it. I will change this to ProducerMap since either is fine.

@andyxning andyxning force-pushed the nsqlookupd_fix_write_lock_starvation branch from 1a3e5a8 to b3847f1 Compare November 5, 2019 02:38
	r.registrationMap.Range(func(k, v interface{}) bool {
		if k.(Registration).IsMatch(category, key, subkey) {
			producers := v.(ProducerMap)
			for _, producer := range producers {
Member

I think you need to r.RLock() around this

Member Author

Done.

Member

If you take the r.RLock() around the whole loop, that puts this back close to where it started. If you can take the RLock() only after matching a Registration and before iterating over the ProducerMap, and then releasing it after finishing with that ProducerMap, it could avoid starving the write lock.
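
A runnable toy illustrating this fine-grained locking, with a string-prefix check standing in for IsMatch and a plain map standing in for ProducerMap; none of these names come from the PR:

package main

import (
	"fmt"
	"strings"
	"sync"
)

// Hold the read lock only while iterating one matched value's plain map,
// not around the whole Range, so writers can acquire the write lock
// between matches.
func main() {
	var (
		mu              sync.RWMutex
		registrationMap sync.Map // key -> map[string]string, standing in for Registration -> ProducerMap
	)
	registrationMap.Store("channel:foo:bar", map[string]string{"p1": "10.0.0.1:4150"})
	registrationMap.Store("channel:quux:bar", map[string]string{"p2": "10.0.0.2:4150"})

	var results []string
	registrationMap.Range(func(k, v interface{}) bool {
		if !strings.HasPrefix(k.(string), "channel:foo:") { // stand-in for IsMatch
			return true
		}
		mu.RLock() // taken per matched entry, released before moving on
		for _, addr := range v.(map[string]string) {
			results = append(results, addr)
		}
		mu.RUnlock()
		return true
	})
	fmt.Println(results) // [10.0.0.1:4150]
}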

@andyxning (Member Author) Nov 6, 2019

Yes. Actually, after finally finishing this PR, I had thought about this. The PR now seems similar to the original one, but some of the write locks have been removed.

If you can take the RLock() only after matching a Registration and before iterating over the ProducerMap, and then releasing it after finishing with that ProducerMap, it could avoid starving the write lock.

Not quite sure about this, since the read lock can be acquired simultaneously, and acquiring and releasing the read lock frequently may use more CPU.

Member

yeah ... that's why I said this strategy is "sounding more iffy"

as I see it, it's either go back to the original, or take the read lock just around accessing each ProducerMap ("thrashing" the lock I guess), or turn all these ProducerMap into sync.Map themselves

results := Registrations{}
for k, producers := range r.registrationMap {
r.registrationMap.Range(func(k, v interface{}) bool {
producers := v.(ProducerMap)
if _, exists := producers[id]; exists {
Member

I think you also need the r.RLock() here

Member Author

Done.

@andyxning (Member Author)

@ploxiln Actually, I have tested this PR in our clusters for about 1 day. Everything is fine. The worry about concurrent read/write access to the ProducerMap values also bothered me for quite a long time. But after a second look at sync.Map's Range and the sync.Map source code, it seems that it is safe to use like this without a lock.

@andyxning andyxning force-pushed the nsqlookupd_fix_write_lock_starvation branch from b3847f1 to 071a571 Compare November 5, 2019 03:00
@andyxning andyxning force-pushed the nsqlookupd_fix_write_lock_starvation branch 2 times, most recently from 55de623 to 383fba1 Compare November 5, 2019 03:45
@ploxiln (Member) commented Nov 5, 2019

you pushed the wrong branch over this one :)

@andyxning andyxning force-pushed the nsqlookupd_fix_write_lock_starvation branch from 383fba1 to 55de623 Compare November 5, 2019 04:08
@andyxning (Member Author)

@ploxiln Done.

@mreiferson (Member) commented Jun 14, 2020

Where are we at on this one? Seems like the conversation has stalled at having to take a lock around a large critical section when iterating over registrationMap to handle * cases?

First, it doesn't look like we ever actually call FindProducers with a *, so we can simplify that code and remove the pathological loop:

	r.RLock()
	results := make(map[string]struct{})
	var retProducers Producers
	r.registrationMap.Range(func(k, v interface{}) bool {
		if k.(Registration).IsMatch(category, key, subkey) {
			producers := v.(ProducerMap)
			for _, producer := range producers {
				_, found := results[producer.peerInfo.id]
				if found == false {
					results[producer.peerInfo.id] = struct{}{}
					retProducers = append(retProducers, producer)
				}
			}
		}
		return true
	})
	r.RUnlock()

Then, for FindRegistrations, which is called with *, seems like to @ploxiln's point, the best path forward is actually to maintain two additional maps that handle Key or SubKey lookups for *. The locking can be simplified after that, since the critical section will be much smaller.
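
A hedged sketch of that idea with simplified, hypothetical types (wildcardIndex is not part of nsqlookupd): keep per-topic channel sets and a topic set alongside the registration DB, updated under their own lock on every add/remove, so the two wildcard queries become direct lookups instead of full scans:

package main

import (
	"fmt"
	"sync"
)

type wildcardIndex struct {
	sync.RWMutex
	channelsByTopic map[string]map[string]struct{} // topic -> set of channel names
	topics          map[string]struct{}            // set of topic names
}

// Called wherever a channel registration is added.
func (idx *wildcardIndex) addChannel(topic, channel string) {
	idx.Lock()
	defer idx.Unlock()
	if idx.channelsByTopic[topic] == nil {
		idx.channelsByTopic[topic] = map[string]struct{}{}
	}
	idx.channelsByTopic[topic][channel] = struct{}{}
	idx.topics[topic] = struct{}{}
}

// Answers the FindRegistrations("channel", topicName, "*").SubKeys() query
// without scanning every registration.
func (idx *wildcardIndex) channels(topic string) []string {
	idx.RLock()
	defer idx.RUnlock()
	var out []string
	for c := range idx.channelsByTopic[topic] {
		out = append(out, c)
	}
	return out
}

func main() {
	idx := &wildcardIndex{
		channelsByTopic: map[string]map[string]struct{}{},
		topics:          map[string]struct{}{},
	}
	idx.addChannel("foo", "bar")
	fmt.Println(idx.channels("foo")) // [bar]
}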

@mreiferson (Member)

see also #584, where some of that code referenced above was added.

@mreiferson (Member)

Then, for FindRegistrations, which is called with *, seems like to @ploxiln's point, the best path forward is actually to maintain two additional maps that handle Key or SubKey lookups for *. The locking can be simplified after that, since the critical section will be much smaller.

Or if we want to get fancy, something like https://github.com/derekparker/trie is probably more appropriate. We'd have to change how we generate keys and choose a separator (e.g. :, something that isn't a valid topic/channel character) to produce string keys: topic:foo:*
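
A minimal sketch of that trie idea, assuming the Add/PrefixSearch API shown in the derekparker/trie README and a ":"-separated key scheme; illustrative only:

package main

import (
	"fmt"

	"github.com/derekparker/trie"
)

func main() {
	t := trie.New()
	t.Add("topic:foo:", nil)
	t.Add("channel:foo:bar", nil)
	t.Add("channel:foo:baz", nil)

	// All channel registrations for topic "foo", without scanning every key:
	fmt.Println(t.PrefixSearch("channel:foo:"))
	// All topic registrations:
	fmt.Println(t.PrefixSearch("topic:"))
}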
