Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[tracking] fixing stdio #20361

Open
1 of 14 tasks
maribu opened this issue Feb 8, 2024 · 10 comments
Open
1 of 14 tasks

[tracking] fixing stdio #20361

maribu opened this issue Feb 8, 2024 · 10 comments
Assignees

Comments

@maribu
Copy link
Member

maribu commented Feb 8, 2024

From the matrix: "stdio in RIOT is a hot mess". So let's fix this. This lists the tasks needed for sane stdio

@maribu
Copy link
Member Author

maribu commented Mar 27, 2024

@maribu
Copy link
Member Author

maribu commented May 10, 2024

Context

It seems that #20067 was only the tip of the iceberg. Even after stdio initialization done, there can still be races in newlib that can corrupt memory and, therefore, cause "interesting surprises" at runtime (e.g. the hardfault I observed in #19926 (comment)).

The Issue

  • newlib's stdio, when not providing the locking infrastructure needed for newlib's reentrant functions, is not thread safe. There still are data races, memory corruptions and faults when concurrently using printf() and puts()
  • stdio backends such as UART are not thread safe (I'm almost certain that this is intentional and the intent was to let upper layers deal with that)

Possible Solutions

Implement Reentrancy in Newlib

The most straight forward implementation would be to just implement the locking interface newlib requires for reentrancy. This has some downsides, though:

  • a per thread data structure is needed, increasing RAM consumption noticably
  • an ongoing write to stdio will not be interruptible, a low prio thread doing lots of stdio could increase latency for high prio threads, even if the only write a single char to stdio
  • no more DEBUG(), puts() or printf() from IRQ context (as locking will not work)

Use a Non-Blocking and Thread-Safe stdio + Thread-Safe Backends

  • We could use stdio implementations that are lock-free and thread-safe
  • However, e.g. UART still is not thread-safe
    • We would need to fix every periph_uart implementation
    • For DEBUG() / printf() / puts() etc. we would need non-blocking thread-safe versions
      • this can be a bit tricky to implement when using DMA or automatic power-on and power-off of the UART peripheral in TX only mode (no RX callback set)

Use a Non-Blocking and Thread-Safe stdio + a stdio Thread

  • We could use a non-blocking and thread-safe stdio implementation that just writes to a ringbuffer
  • An event handler thread could be used to drain the ringbuffer and write to UART / ...
  • DEBUG() from IRQ would mostly work (when there is space in the ringbuffer without having to block for the even thandler to drain something)
  • Would pull in an event handler thread even from hello-world and a ring buffer

@chrysn
Copy link
Member

chrysn commented May 10, 2024

no more DEBUG(), puts() or printf() from IRQ context (as locking will not work)

As I understood, that was always the guidance, accompanied by "if you know very precisely what your building blocks are, it may still work", which is a caveat people chose to ignore.

@maribu
Copy link
Member Author

maribu commented May 10, 2024

I am aware. I would be in favour to forbid use of stdio from IRQ outright.

Maybe we now have a critical mass of people to push for that again, though.

@chrysn
Copy link
Member

chrysn commented May 10, 2024

So far there was always someone who said that "yeah but it works in my cases"; IIRC from the semantics and fixes PR, last time that was @benpicco.

As I understand this issue, here it's not even about RIOT's stdio (stdio_write etc) having weird properties, but on top of that, the C library functions have non-reentrancy that makes things worse even if the individual stdio impl is good. So maybe we should clarify whether we talk about libc provided functions (printf etc) or RIOT provided functions (stdio_write) in this issue here. Then we're in a better situation to decide whether those cases are relevant for this issue.

@maribu
Copy link
Member Author

maribu commented May 10, 2024

So far there was always someone who said that "yeah but it works in my cases"; IIRC from the semantics and fixes PR, last time that was @benpicco.

Also @kaspar030 was "over my dead body" regarding no stdio from IRQ context. But I think he may no longer care.

So maybe we should clarify whether we talk about libc provided functions (printf etc) or RIOT provided functions (stdio_write)

We should, but we also need to keep both in mind. E.g. newlib can provide reentrancy (and will work fine with that - at the cost of higher latency and higher memory consumption). But without reentrancy, newlib's stdio is racy and will occasionally crash.

On the other hand stdio_uart is not thread safe and would need something like newlib's reentrancy feature to prevent concurrent calls to stdio_write().

I think it would absolute be possible to write C lib stdio (printf(), puts(), ...) that is thread safe but could allow another thread to call puts() without blocking even if it just preempted a thread in the middle of puts(). But that would require stdio_write() to also be thread-safe and non-exlusive to work as we would like (e.g. it would mix the output from the threads, but not loose chars). (I think most non-DMA UART drivers will be almost up for the task anyway and may only need a sync before writing the first byte added in case the hardware is still busy shifting out a byte from the thread that got just preempted.) Then we would have the current behavior, except for the occasional crashes fixed.

@benpicco
Copy link
Contributor

I still think we don't need to worry about stdio too much.
It's mainly a debug tool - in 99% of all RIOT deployments, nobody is going to see stdio ever again.
And for debugging, stdio from ISR can be useful, e.g. just to print that some code path has bean reached.

@maribu
Copy link
Member Author

maribu commented May 10, 2024

I agree that stdio is mostly a debug tool. Still, random crashes due to stdio is a pita. (Admittedly, the chance of a crash due to a race is low, but it is still something we need to fix. And there are firmwares that get the timing just right for a reliable crash on startup.)

@chrysn
Copy link
Member

chrysn commented May 11, 2024

It may be a debug tool, but it's also one of the earliest things people are introduced to with RIOT; judging from the various telnet-ish and Bluetooth-ish transports that are around, apparently the shell is pretty widely used. (And race conditions are not something we should have in our code base)

@maribu
Copy link
Member Author

maribu commented May 11, 2024

It is also not only newlib's stdio that is not thread-safe, but also avrlibc's. E.g.

int
printf_P(const char *fmt, ...)
{
	va_list ap;
	int i;

	va_start(ap, fmt);
	stdout->flags |= __SPGM;
	i = vfprintf(stdout, fmt, ap);
	stdout->flags &= ~__SPGM;
	va_end(ap);

	return i;
}

So when thread A is doing printf_P() to print from PROGMEM and thread B preempts thread A, the address of the format string in a call to printf() from A will be treated as referring to PROGMEM rather than as referring to RAM.

I now imagine in thousand of years archeologists debating the reasoning behind the design choice to set a flag in a FILE * to encode something completely unrelated such as the address space of a format string.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants