
Feedback on v3/docs/THREADING.md #94

Open
jimmo opened this issue Jan 17, 2023 · 5 comments
jimmo commented Jan 17, 2023

Thanks for writing this guide @peterhinch !

1. Introduction:

I think it's worth adding micropython.schedule as another context, or possibly more accurately splitting (1) into:

  1. Hard ISR
  2. Soft ISR or scheduler callback

I've seen a few examples of asyncio code that was running into issues due to use of micropython.schedule (and one example where essentially the entire program ran in scheduler context). But the main reason is that Hard and Soft/Schedule have very different implications for synchronisation.

It's of course an extra challenge that there are two "schedulers" at play here, the asyncio scheduler, and micropython.schedule.

1.1 Interrupt Service Routines

"The ISR and the main program share a common Python virtual machine (VM)." -- It might be important to clarify what this means. I'll come back to this in (1.3).

"Consequently a line of code being executed when the interrupt occurs will run to completion before the ISR runs." -- The VM has basically no concept of "line of code", rather it only deals with bytecode instructions. Technically a soft ISR (or scheduled task) can only run at certain bytecode boundaries, but these do not necessarily align with lines of code.

In the case of a hard ISR, the ISR can occur anywhere, even in the middle of a single bytecode instruction. For example, if you wrote a + b where a and b are lists, then an ISR in the middle of the op could modify both a and b, but the result could be old_a + new_b.

"ISR code should not call blocking routines and should not wait on locks." -- In general, an ISR must not wait on a lock because it's impossible for something else to unlock it. The exceptions would be to wait on a lock with zero timeout (i.e. optimistically try to acquire the lock and abort otherwise), or maybe in a soft/scheduler context when the lock is set by a hard context.

1.2 Threaded code on one core

It might be worth clarifying that the GIL protects built-in data structures that are atomic at the bytecode level. However, any user-defined data structure gets no benefit from the GIL. (And ultimately, this is why it's not safe to call any asyncio methods like create_task from another context).
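The point about user-defined structures can be made concrete with a sketch. The GIL makes each individual bytecode atomic, but an object with two related fields can still be observed half-updated between bytecodes, so the object needs its own lock. The `Point` class below is hypothetical, invented for this example:

```python
import _thread

class Point:
    # A two-field structure: the GIL keeps each bytecode atomic, but
    # not the pair of assignments, so we guard them with our own lock.
    def __init__(self):
        self._lock = _thread.allocate_lock()
        self._x = 0
        self._y = 0

    def move(self, x, y):
        self._lock.acquire()
        try:
            # Writer updates both fields as one unit.
            self._x = x
            self._y = y
        finally:
            self._lock.release()

    def position(self):
        self._lock.acquire()
        try:
            # Reader snapshots both fields as one unit.
            return self._x, self._y
        finally:
            self._lock.release()
```

Without the lock, a reader running between the two assignments in `move` could see the new `_x` paired with the old `_y`.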

1.3 Threaded code on multiple cores

"There is no common VM hence no common GIL. " -- I guess it depends what you mean by "common VM" here, but I'd argue that there is a common VM, it just doesn't have a GIL.

The issue here is that in all circumstances the VM is common (i.e. they share global state), but there can be multiple concurrent stacks. A hard interrupt is as if the VM called your Python function while in the middle of running a bytecode instruction; a soft interrupt is the VM calling your Python function in between instructions (note that in both cases they share the same stack).

"In the code sample there is a risk of the uasyncio task reading the dict at the same moment as it is being written". This particular example may actually be completely fine, depending on the scenario.

If what you need is for the three read_data calls and the update of the dict to be atomic, then yes, you need a lock. i.e. if a reader must always see d["z"] that matches the same read cycle as d["x"].

But if you're happy to treat the three slots in the dictionary (and their corresponding values) as independent (maybe you're reading three temperature sensors, and the code just needs to know the latest), then there's no issue with this code. No locking required. This is because concurrently updating a dictionary entry is safe.

Where this falls apart is if you have concurrent threads that modify the dictionary in a way that isn't safe. e.g. one thread adding/removing items. (Simple example: one thread triggers a resize&rehash of the dictionary).
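The resize/rehash distinction can be demonstrated directly in CPython (the mechanism in MicroPython is analogous, though the numbers differ): overwriting the value for an existing key reuses the same hash-table slot and never grows the dict, whereas inserting new keys eventually forces a resize and rehash.

```python
import sys

# Overwriting existing keys reuses the same hash-table slots,
# so the dict never resizes mid-update.
d = {"x": 0, "y": 0, "z": 0}
size_before = sys.getsizeof(d)

for i in range(10_000):
    d["x"] = i          # value replacement: slot already exists
assert sys.getsizeof(d) == size_before

for i in range(100):
    d[f"k{i}"] = i      # key insertion: eventually forces a resize/rehash
assert sys.getsizeof(d) > size_before
```

It is the resize path that a concurrent reader must never observe in progress; pure value replacement never takes that path.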

The same argument applies here to modification of global variables (which are just dictionaries with nice syntax). Two concurrent threads can modify global variables safely, as long as no thread does del x or creates a new global.

2.1 A pool

"It is possible to share an int or bool because at machine code level writing an int is "atomic": it cannot be interrupted. In the multi core case anything more complex must be protected to ensure that concurrent access cannot take place. The consequences even of reading an object while it is being written can be unpredictable. "

I think what you're saying here is exactly correct, but I think the emphasis on the object type might be confusing for people that don't understand the details.

So for example, I can have a global variable (or dict entry) that is a complex object, and a concurrent writer that is updating that entry. It doesn't matter that it's a complex object; the "update" operation is just updating a pointer, which is atomic.

What matters is that it's not safe for me to write code that reads multiple fields from the object, i.e. d["x"].a followed by d["x"].b. If someone concurrently modifies d["x"] between the two reads, I'll read a and b from different objects.

What the locking in this example ensures is that the consumer always sees x,y,z together from the same "cycle". (And I believe this is exactly the point you're trying to make, this is a very difficult guide to write because there are so many independent concepts going on here).
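A lock-free variant of the same consistency guarantee is worth noting: because replacing a dict entry is a single atomic pointer swap, a writer can publish complete immutable records, and a reader that takes a local reference first will always see fields from one cycle. This is a sketch with invented names (`shared`, `producer`, `consumer_snapshot`), using CPython's `threading` purely to exercise the pattern:

```python
import threading

# A writer publishes complete, immutable records by swapping a single
# dict entry (one atomic pointer update). Readers take a local
# reference first, then read as many fields as they like from it.
shared = {"x": (0, 0, 0)}

def producer():
    for i in range(10_000):
        shared["x"] = (i, i + 1, i + 2)   # one atomic pointer swap

def consumer_snapshot():
    rec = shared["x"]    # local reference: immune to later swaps
    x, y, z = rec        # all three values come from the same cycle
    return x, y, z

t = threading.Thread(target=producer)
t.start()
for _ in range(10_000):
    x, y, z = consumer_snapshot()
    assert y == x + 1 and z == x + 2   # never sees a torn record
t.join()
```

The trap is re-reading `shared["x"]` for each field; the snapshot via a local reference is what provides the per-cycle consistency.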

"As stated above, if the producer is an ISR no lock is needed or advised." -- However, in the case, a consumer must disable IRQs in order to achieve the consistency that the lock provided, otherwise the producer could modify the state mid-read.

Sorry, have to stop there for now, but will come back to it soon.

@peterhinch

Thank you for that - very informative. I will push an update soon.

peterhinch commented Jan 18, 2023

@jimmo I have pushed a provisional update but I would like clarification on the following:

  1. While I understand your comments re stacks I'm unclear how these observations affect the Python coder.
  2. In the case of threaded code on a single core, does the common GIL ensure that adding or deleting dict items is safe?
  3. In the multi core case please could you clarify the implications of creating globals on different cores. Most applications will do this: firmware applications instantiate singleton objects representing hardware; uasyncio and many user classes have global state. I guess the key is to carefully control concurrency with locks or with startup sequencing?

Item 3 strikes me as quite significant and one I should emphasise in the doc.

jimmo commented Jan 19, 2023

Thanks Pete, the update looks good. (Just one comment below)

  1. While I understand your comments re stacks I'm unclear how these observations affect the Python coder.

Yes you're probably right. I think the change you've made solves the issue I was trying to highlight.

  2. In the case of threaded code on a single core, does the common GIL ensure that adding or deleting dict items is safe?

Yes, with one exception -- hard ISRs.

In your updated version you write:

> ## 1.1 Hard Interrupt Service Routines
>
> 1. The ISR and the main program share the Python GIL. This ensures that built-in Python objects (list, dict etc.) will not be corrupted if an ISR runs while the object is being modified.

This is unfortunately not the case -- the hard ISR inherits the GIL state from the code that it interrupted.

So this means that, for example, the main thread (or just the main code, if threads aren't enabled) could trigger a re-hash of a dictionary, and then if the hard ISR tried to read from that dictionary, it would see invalid state.

A particularly scary variant of this is if the hard ISR races a re-hash of the globals dict. For example:

```python
def foo():
    global x
    x = 1  # assuming that adding x to globals happens to cause a re-hash

def hard_isr():
    a = x

def main():
    ...
    # some time later, call foo for the first time
    foo()
```

So if the hard ISR is triggered just as foo is called, and on the off chance that creating the 'x' global triggers a rehash, then that would be bad. Very unlikely, but you can imagine variants of this that would make it more likely.

In general, any sort of mutation of a global data structure shared with hard ISRs must be protected with machine.disable_irq.

A simpler example is having a hard ISR consume a list that the main program is appending to. (Or the reverse, but that would generally be impossible because the hard ISR cannot allocate).

It's important to re-iterate though that this only applies to mutation of the data structure. Individual elements are fine (which is why it's ok to have a global flag).

Also worth noting that a ring buffer is not necessarily hard-ISR-safe because for example inserting the item could race with incrementing the indices, so again, disable_irq might be necessary.

  3. In the multi core case please could you clarify the implications of creating globals on different cores. Most applications will do this: firmware applications instantiate singleton objects representing hardware; uasyncio and many user classes have global state. I guess the key is to carefully control concurrency with locks or with startup sequencing?

Item 3 strikes me as quite significant and one I should emphasise in the doc.

Much of what I just wrote for hard ISRs above also applies here, except with _thread.allocate_lock rather than machine.disable_irq. Yep, it's pretty hard to write "regular Python" on the rp2 and not accidentally run into some pretty scary concurrency edge cases.

(Honestly, I'm not convinced that having GIL-free concurrency was necessarily a good idea, maybe instead we should have had a way to offload self-contained work onto the second core, e.g. DSP routines or similar. An independent VM would probably be infeasible for the overheads, and MicroPython currently isn't designed for that)


peterhinch commented Jan 19, 2023

Thanks for that, I've updated the doc. The issue with globals is something new to me!

a ring buffer is not necessarily hard-ISR-safe because for example inserting the item could race with incrementing the indices

Are you sure? I've seen a lot of code that relies on the notion that a ringbuf with a single producer and a single consumer is thread safe. I'm pretty sure there's some in MP source...

The producer controls the write pointer and the consumer controls the read pointer. The producer updates the buffer entry and then increments the write pointer. The consumer reads an entry and then increments the read pointer. The only conflict is when the producer checks for full state or the consumer checks for empty state, and these fail safe because of the post-increment. (Assuming that a pointer increment is atomic).
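The scheme described here can be sketched as a minimal single-producer/single-consumer ring buffer (the class name and layout are invented for illustration; this says nothing about whether the hard-ISR caveat above applies on real hardware):

```python
class RingBuffer:
    # SPSC ring buffer following the scheme above: each side owns one
    # index and advances it only after its data operation completes.
    # One slot is kept empty so "full" and "empty" are distinguishable.
    def __init__(self, size):
        self.buf = [None] * size
        self.size = size
        self.wi = 0   # written only by the producer
        self.ri = 0   # written only by the consumer

    def put(self, item):                      # producer side
        nxt = (self.wi + 1) % self.size
        if nxt == self.ri:                    # full check: fails safe
            return False
        self.buf[self.wi] = item              # write the entry first...
        self.wi = nxt                         # ...then publish the index
        return True

    def get(self):                            # consumer side
        if self.ri == self.wi:                # empty check: fails safe
            return None
        item = self.buf[self.ri]              # read the entry first...
        self.ri = (self.ri + 1) % self.size   # ...then release the slot
        return item
```

The "data first, index after" ordering is what makes the full/empty checks conservative: the worst a stale index read can do is report full/empty slightly too early.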

Please could you explain how a race can occur.

@peterhinch

@jimmo I feel like I've been told that De Morgan's Theorem is wrong. The thread-safety of a ringbuf* is axiomatic; if not I'm responsible for a heap of broken code. Please put me out of my misery and explain the above! :)

\* Subject to simple constraints.
