
Locking strategy for scheduler #242

Open
cahirwpz opened this issue Apr 6, 2017 · 3 comments

cahirwpz commented Apr 6, 2017

After spending a few hours reading the FreeBSD sources, I wasn't able to figure out what happens to td_lock or how the 4BSD scheduler interacts with threads.

The same parts of the NetBSD kernel seem to be implemented in a cleaner way. Additionally, I was able to find extensive in-code documentation of the locking strategy. I'm posting my findings here...

At first glance at the l_mutex and td_lock fields it is not apparent why each is a pointer. Rather shockingly, the pointer changes frequently over the life of a thread! A long comment in kern_lwp.c tries to shed some light on this.

It seems that l_mutex may point to either spc_mutex or spc_lwplock, both of which are members of schedstate_percpu. When a thread is created by lwp_create, its l_mutex field is initialized to spc_lwplock. Note that scheduler- and thread-related code is littered with KASSERT(mutex_owned(l->l_mutex)) checks. There are dedicated routines to acquire and release the l_mutex lock, namely lwp_lock, lwp_unlock, lwp_setlock and lwp_unlock_to. A minimal sketch of the pattern follows.
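
For my own understanding, here is a minimal userspace sketch of how such an acquire/handoff pair can work, with pthread_mutex_t standing in for a kernel spin mutex. The names lwp_lock, lwp_unlock and lwp_unlock_to mirror NetBSD, but this is my illustration of the pattern, not the actual kern_lwp.c code:

```c
#include <pthread.h>
#include <stdatomic.h>

struct lwp {
    pthread_mutex_t *_Atomic l_mutex;  /* lock currently covering this LWP */
};

/*
 * Acquire whichever lock currently protects the LWP.  The pointer may
 * be redirected by the lock holder between our load and our acquire,
 * so re-check after locking and retry if it moved.
 */
static void
lwp_lock(struct lwp *l)
{
    for (;;) {
        pthread_mutex_t *cur = atomic_load(&l->l_mutex);
        pthread_mutex_lock(cur);
        if (atomic_load(&l->l_mutex) == cur)
            return;                 /* still the right lock */
        pthread_mutex_unlock(cur);  /* it moved under us: retry */
    }
}

static void
lwp_unlock(struct lwp *l)
{
    pthread_mutex_unlock(atomic_load(&l->l_mutex));
}

/*
 * Hand the LWP over to a new lock and release the old one.  The caller
 * holds the old lock (in the kernel, the new one is typically held
 * too).  Publishing the new pointer before dropping the old lock is
 * what makes the retry loop in lwp_lock() correct.
 */
static void
lwp_unlock_to(struct lwp *l, pthread_mutex_t *mtx)
{
    pthread_mutex_t *old = atomic_load(&l->l_mutex);
    atomic_store(&l->l_mutex, mtx);
    pthread_mutex_unlock(old);
}
```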

According to the in-code documentation, the locking order is spc::spc_lwplock > sleeptab::st_mutex > tschain_t::tc_mutex > spc::spc_mutex.

... more to come, stay tuned.


cahirwpz commented Apr 7, 2020

I feel that p. 133 of the FreeBSD book has a comment that relates to this issue:

Historically a global scheduling lock was used, but it was a bottleneck. Now each thread uses a lock tied to its current state to protect its per-thread state. For example, when a thread is on a run queue, the lock for that run queue is used; when the thread is blocked on a turnstile, the turnstile’s lock is used; when a thread is blocked on a sleep queue, the lock for the wait channels hash chain is used.
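
To make the layout concrete, here is a minimal sketch (my own illustration, not FreeBSD code) of the structure the book describes: each container a thread can occupy owns a real lock, and the thread itself carries only a pointer to whichever one currently applies:

```c
#include <pthread.h>

/* Each place a thread can "be" owns an actual lock... */
struct runq         { pthread_mutex_t rq_lock; /* runnable threads            */ };
struct turnstile    { pthread_mutex_t ts_lock; /* threads blocked on a lock   */ };
struct sleepq_chain { pthread_mutex_t sc_lock; /* wait-channel hash chain     */ };

/* ...while the thread itself only points at whichever lock currently
 * protects its per-thread state. */
struct thread {
    pthread_mutex_t *td_lock;   /* &rq_lock, &ts_lock or &sc_lock */
    int              td_state;  /* running, runnable, blocked, sleeping, ... */
};
```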


cahirwpz commented Apr 14, 2020

https://lists.freebsd.org/pipermail/freebsd-arch/2013-September/014794.html

Think about td_lock as something lent to the thread by its current owner. If a thread is running, it's owned by the scheduler and td_lock points to the scheduler lock. If a thread is sleeping, it's owned by a sleep queue and td_lock points to the sleep queue lock. If a thread is contested, it's owned by a turnstile queue and td_lock points to the turnstile queue lock. And so on. This way an owner can work with the threads it owns safely, without a giant lock. The td_lock pointer is changed atomically, so it's safe.

Take a thread that is asleep on a sleep queue. td_lock points to the relevant SC_LOCK() for the sleep queue chain in that case, so any other thread that wants to examine that thread's state ends up locking the sleep queue while it examines that thread. In particular, the thread that is doing a wakeup() can resume all of the sleeping threads for a wait channel by holding the one SC_LOCK() for that wait channel, since that will be td_lock for all those threads.
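
A toy model of that wakeup() argument, under the same caveats as the earlier sketches (pthread stand-ins, field names of my own invention): because td_lock of every sleeper on a wait channel points at the single chain lock, holding that one lock is enough to examine and resume all of them:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct thread {
    pthread_mutex_t *td_lock;   /* lock covering this thread's state */
    const void      *td_wchan;  /* wait channel the thread sleeps on */
    struct thread   *td_next;   /* link in the sleep-queue chain */
    int              td_runnable;
};

struct sleepq_chain {
    pthread_mutex_t  sc_lock;   /* the one SC_LOCK for this hash chain */
    struct thread   *sc_head;
};

static void
wakeup_all(struct sleepq_chain *sc, const void *wchan)
{
    pthread_mutex_lock(&sc->sc_lock);   /* == td_lock of every sleeper here */
    for (struct thread *td = sc->sc_head; td != NULL; td = td->td_next) {
        assert(td->td_lock == &sc->sc_lock);  /* the invariant at work */
        if (td->td_wchan == wchan)
            td->td_runnable = 1;        /* real kernel: hand off to a run
                                           queue, retargeting td_lock */
    }
    pthread_mutex_unlock(&sc->sc_lock);
}
```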


cahirwpz commented Apr 20, 2020

Another useful source: "Solaris™ Internals: Solaris 10 and OpenSolaris Kernel Architecture", Second Edition, chapter 3.4:

The actual lock backing the thread lock depends on the thread's state. A thread in TS_ONPROC state has its lock in the CPU on which it is running. A TS_RUN thread's lock is in the dispatch queue the thread is on, and a TS_SLEEP thread's lock resides in the corresponding sleep queue. Setting the thread state with THREAD_SET_STATE sets the thread's thread lock to the appropriate place based on the new state.

A kernel thread's t_lockp may also reference the transition_lock, the stop_lock, or a sleep queue lock. The lock names give us a good indication of their use; a thread's t_lockp is set to the transition lock when the thread's state is changing. The transition lock is necessary because thread state changes often result in changes to the thread's t_lockp. For example, when a thread transitions from running (TS_ONPROC) to sleep (TS_SLEEP), the t_lockp is set to the lock associated with the sleep queue on which the thread is placed. If a thread is migrated to another processor, the address of the dispatcher lock changes (since dispatch queues are per-processor), resulting in a change to the thread's t_lockp. The transition lock provides a simple and safe mechanism for protecting thread state during such transitions.

The stop lock is used when a thread is being created, since stopped is the initial state of a thread. Threads can also be stopped when executing under the control of a debugger.
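
A sketch of the transition-lock idea, under the same userspace caveats. In Solaris the transition lock is a dispatcher lock that is kept permanently held, so thread_lock() callers simply spin until t_lockp is retargeted; here I approximate that by holding a pthread mutex across the move, which works together with a retry loop like the lwp_lock() sketch above:

```c
#include <pthread.h>
#include <stdatomic.h>

static pthread_mutex_t transition_lock = PTHREAD_MUTEX_INITIALIZER;

struct kthread {
    pthread_mutex_t *_Atomic t_lockp;
};

struct dispq {
    pthread_mutex_t dq_lock;    /* per-CPU dispatch (run) queue lock */
};

/*
 * Migrate a thread between per-CPU dispatch queues.  The caller holds
 * from->dq_lock, which is what t->t_lockp points at on entry.  While
 * the thread is in flight, t_lockp is parked on transition_lock so a
 * concurrent locker blocks (in Solaris: spins) until the move is done.
 */
static void
migrate(struct kthread *t, struct dispq *from, struct dispq *to)
{
    pthread_mutex_lock(&transition_lock);
    atomic_store(&t->t_lockp, &transition_lock);  /* park the pointer */
    pthread_mutex_unlock(&from->dq_lock);
    /* ... dequeue from 'from', enqueue on 'to' ... */
    pthread_mutex_lock(&to->dq_lock);
    atomic_store(&t->t_lockp, &to->dq_lock);      /* retarget to new queue */
    pthread_mutex_unlock(&transition_lock);
    pthread_mutex_unlock(&to->dq_lock);
}
```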
