ztimer problem statement and design document

Introduction

This document describes the reasons for introducing ztimer and why I think it should replace xtimer.

Problem statement

RIOT needs an easy to use, efficient and flexible timer system that allows precise high-frequency (microsecond scale) timings alongside low-power timers.

For example, an application might include a driver requiring high-frequency (microsecond scale) timings, but at the same time needs to run on batteries and thus needs to make use of low-power timers that can wake up the device from deep sleep.

Requirements

Some time ago, the timer task force collected the following requirements:

General requirements:

- very efficient timers for use in time-critical drivers
- easy-to-use interface (unified interface)
- work with varying MCU timer widths (16, 24, 32-bit timers)
- adaptable to varying configurations of timers, RTTs, RTCs (use RTC if available for super-long-time timers)
  - this means that applications and / or system modules need to be able to set timers on different timer hardware (configurations) simultaneously

API (necessary functionality):

- wait for a time period (e.g. timer_usleep(howlong))
- receive a message after a given period of time (e.g. timer_msg(howlong))
- receive a message periodically
- await a point in time
- wait for a (past) time + x

Current state

RIOT's current high-level timer system is implemented in the xtimer module.

All API requirements are met by xtimer. There is no known application whose needs the API could not fulfil.

Regarding the general requirements, xtimer in its current form is not "adaptable to varying configurations of timers, RTTs, RTCs". xtimer can be configured to use low-power timer hardware instead of the (default) high-precision timer hardware, but the choice is system wide: an application gets either low-power, low-precision timers or high-precision timers that prevent the system from sleeping. If xtimer is configured to use a low-power timer, all timer values are quantized to that timer's frequency, which is usually low (typically 32kHz). Worse, much code in RIOT uses xtimer expecting it to be high-precision, and breaks in unknown ways when it suddenly runs on a low-frequency timer through the microsecond-based API. The API does not reflect this distinction.
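
To illustrate the quantization described above, the following minimal sketch converts a microsecond request into ticks of a low-frequency timer and back. The function names are made up for illustration and are not xtimer code; the 32768Hz value used in the note below is an assumed, typical low-power timer frequency.

```c
#include <assert.h>
#include <stdint.h>

#define US_PER_SEC 1000000u

/* Convert microseconds to ticks of a timer running at `hz`, rounding down. */
uint32_t us_to_ticks(uint32_t us, uint32_t hz)
{
    assert(hz > 0);
    return (uint32_t)(((uint64_t)us * hz) / US_PER_SEC);
}

/* Convert ticks of a timer running at `hz` back to microseconds. */
uint32_t ticks_to_us(uint32_t ticks, uint32_t hz)
{
    assert(hz > 0);
    return (uint32_t)(((uint64_t)ticks * US_PER_SEC) / hz);
}

/* Delay (in us) that a request of `us` microseconds is quantized to. */
uint32_t quantized_us(uint32_t us, uint32_t hz)
{
    return ticks_to_us(us_to_ticks(us, hz), hz);
}
```

With hz = 32768, a 100us request becomes 3 ticks (about 91us), and a 10us request rounds down to 0 ticks, which is the kind of silent precision loss that microsecond-based callers do not expect.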

Apart from that, xtimer has some issues, some of which have PRs trying to fix them:

- it is not ISR safe (#5428, #9530, #11087)
- it sets target timers using absolute target times, making it prone to underflows (#9530 fixes this?)
- it doesn't have unittests (#10321)
- it has some underflow bugs and race conditions (#7116, #9595, #11087)
- it cannot use multiple backend drivers (there's a PR trying to add this, #9308)
- it forces all timers to 1us ticks at the API level, which in turn forces internal 64bit arithmetic
- its optional non-us tick support has undefined and fixed (generally, inflexible) arithmetic for conversion
- its monolithic design does not allow an application to use timer hardware with custom configuration alongside the default configuration
- there's no clear path for implementing higher-level functionality while keeping the API, e.g., for implementing a network-wide synchronized clock
- it depends on correct per-platform configuration values (XTIMER_BACKOFF, XTIMER_ISR_BACKOFF), which are hard to understand
- subjectively, its code is a mess of microsecond and tick types and complex interactions

Even if #9530, #10321 and #9308 (the PRs tackling xtimer's ISR safety and some of its proneness to underflows, adding unittests, and adding multiple backends) were reviewed, tested and merged, there would still be conceptual issues (repeated here):

- it forces all timers to 1us ticks at the API level, which in turn forces internal 64bit arithmetic
- subjectively, its code is a mess of microsecond and tick types and complex interactions. This would be multiplied by the number of used timers.
- there's no clear path for implementing higher-level functionality while keeping the API, e.g., for implementing a network-wide synchronized clock

With API additions, the forced 1us timer base could be removed, making the 64bit arithmetic unnecessary if done right. This would require a tremendous effort and would probably touch every line in xtimer.

Intermediate conclusion

xtimer has both architectural flaws and serious implementation issues. Fixing them would take a tremendous effort, exceeding the cost of starting from scratch with all (updated) requirements in mind.

ztimer

ztimer solves all of xtimer's identified issues.

- it has a modular architecture allowing
  - arbitrary timer configurations
  - clear separation between backend hardware code and ztimer core code
  - clear separation between ztimer core code and modules like tick conversion, network synchronization
  - application specific timer configuration alongside default system configuration
- it has a mock clock that is used for unittesting
- it only ever sets relative timeouts, thus underflows are mostly avoided
- it only uses 32bit ticks, saving memory and cycles
- it correctly handles timers with arbitrary limits in an interrupt safe way

Concretely, it already fixes the following xtimer issues:

- it is ISR safe
- it doesn't underflow
- it allows multiple backends, and it allows using them at the same time
- it allows the system to sleep
- it supports arbitrary backend clock frequencies
- its code has a much better layout and is (subjectively) much cleaner
- it allows periph_timer, periph_rtt and periph_rtc to be used as backend
- it offers a virtual (mock) clock for unittesting

It achieves this with ~600 SLOC (compared to xtimer's ~900 SLOC) (but xtimer has some features that ztimer doesn't have yet). When compiled, ztimer's base configuration (1MHz periph timer, no conversion) uses ~25% less code than xtimer (measured using direct port of tests/xtimer_msg to ztimer).

ztimer design trade-offs:

- its API only allows 32bit timeouts, vs. full 64bit in xtimer. This allows much simpler (and more efficient) arithmetic and uses less memory. It has been observed that no known hardware can provide both 64bit range and precision, and more importantly, no application needs it. By offering multiple time bases (e.g. milliseconds for long timeouts and microseconds for short ones) with full 32bit range each, all applications' actual needs should be met. If not, a 64bit implementation could be added on top of ztimer, only causing overhead for applications that actually use it.
- ztimer stores a timer's offset relative to the previous timer. This means the time left until a timer expires cannot be calculated as absolute target time - now, but needs to be summed up over the list. Essentially, a hypothetical ztimer_left(clock, timer) becomes an O(n) operation.
- ztimer disables IRQs while manipulating its timer linked lists, as does xtimer. Thus the time spent with IRQs disabled depends on the number of timers in the list, making ztimer real-time unsafe. (There are ideas to fix this, see the Outlook section below.)
- other than the implicit guarantees given by the clock backends, ztimer does not handle more specific power management.
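
The O(n) cost of the hypothetical ztimer_left() mentioned above can be sketched as follows. The types and the function name are simplified stand-ins invented for this example, not ztimer's actual internals; the sketch only assumes that each entry stores its offset to the previous entry and that the list keeps a base time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for ztimer's internal types. */
typedef struct delta_timer {
    struct delta_timer *next;
    uint32_t offset;          /* ticks after the previous entry (or the base) */
} delta_timer_t;

typedef struct {
    delta_timer_t *head;
    uint32_t base;            /* absolute time the first offset refers to */
} delta_list_t;

/* Hypothetical ztimer_left(): has to walk the list, hence O(n).
 * Returns the ticks left until `timer` expires, or 0 if it is not queued
 * or already due. */
uint32_t delta_list_left(const delta_list_t *list, const delta_timer_t *timer,
                         uint32_t now)
{
    assert(list != NULL && timer != NULL);
    uint32_t sum = 0;
    for (const delta_timer_t *t = list->head; t != NULL; t = t->next) {
        sum += t->offset;     /* accumulate offsets up to and including `timer` */
        if (t == timer) {
            uint32_t elapsed = now - list->base;   /* wrap-safe subtraction */
            return (sum > elapsed) ? (sum - elapsed) : 0;
        }
    }
    return 0;
}
```

With absolute target times the same query would be a single subtraction; the relative layout trades that for entries that can each use the full 32bit range.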

Development history

ztimer has been designed and implemented by me (the original xtimer developer and its maintainer) and @gebart (who did the most substantial changes to xtimer, e.g., arbitrary frequency support).

High level design

- ztimer's API is the same as xtimer's, apart from every function having a new first parameter specifying the clock to work on, e.g., xtimer_now(); -> ztimer_now(ZTIMER_USEC); xtimer_set(t, interval); -> ztimer_set(ZTIMER_USEC, t, interval);
- every ztimer_clock_t handles multiplexing (allowing multiple timers to be set on it)
- every ztimer_clock_t handles extension to 32bit in an ISR safe way
- every ztimer_clock_t shares basic ISR handling, multiplexing and 32bit extension using shared code
- every ztimer_clock_t needs to implement three functions (now(), set(), cancel())
- frequency conversion is done using a "stacked" conversion module, e.g., a ztimer_rtt_t that runs at 32768Hz would be the "lower" or "parent" timer of an instance of "ztimer_convert_frac_t", which would internally convert all now() or set() calls to a desired target frequency (e.g., 1000Hz)
- in order to shield application developers from the multitude of timer configurations, ztimer provides by convention a couple of default clocks (ZTIMER_USEC, ZTIMER_MSEC, ZTIMER_SEC, ...), which are configured as precise or as low-power as needed. E.g., ZTIMER_USEC will use a high-frequency periph_timer on most platforms, while ZTIMER_MSEC will use an RTT (and convert its frequency to 1kHz if necessary). If no RTT is available, ztimer falls back to converting ZTIMER_USEC to 1kHz and provides that as ZTIMER_MSEC.

Power management considerations

- currently, ztimer is pm_layered agnostic. If a timer is set on a periph_timer, this would probably not prevent sleep (the timer would not trigger), whereas if a timer is set on an RTT, it would behave as expected (timer hardware keeps running in sleep, timer ISR wakes up the MCU).
- (TODO) if a timeout has been set (e.g., ztimer_set(clock, timeout)), the backend device blocks sleeping if necessary. IMO this is the minimum requirement, but it still needs to be implemented.
- Idea: we specify that by convention, ZTIMER_MSEC (and ZTIMER_SEC) keep running in sleep mode, whereas ZTIMER_USEC stops when the MCU enters sleep (unless a timeout is scheduled). This is the current behaviour if ZTIMER_USEC is using periph_timer as backend and ZTIMER_MSEC is using RTT/RTC.
  - This would mean that before = ztimer_now(clock); do_something(); diff = ztimer_now(clock) - before; only works if either do_something() does not schedule away the thread (causing sleep) or a clock is used that runs in sleep mode.
- the behaviour could be accessible either through defines (ZTIMER_USEC_LPM_MODE or ZTIMER_USEC_DEEPSLEEP ...), or be made part of the ztimer API
- in addition, we could add functions to explicitly tell the clocks to stay available until released, e.g., ztimer_acquire(clock); before = ztimer_now(clock); do_something(); diff = ztimer_now(clock) - before; ztimer_release(clock);. Once the "if timer is scheduled, don't sleep" is implemented, this could also be worked around by: ztimer_set(clock, dummy, 0xffffffff); ...; ztimer_remove(clock, dummy);
- if something like ztimer_acquire(clock) and ztimer_release(clock) were added, timer hardware could be turned off when not used.
- currently ztimer does not handle the cases where an MCU is not fully ready after wakeup from deep sleep at timer callback execution time. E.g., puts(".") as the first thing in an RTC alarm callback might fail if the clocks are still spinning up.
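
One possible shape for the proposed acquire/release pair is a per-clock user counter, sketched below. This is purely illustrative: the names clock_pm_t, clock_acquire() and clock_release() are invented here, nothing like this is stated to exist in ztimer, and a real implementation would additionally need to protect the counter against concurrent access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-clock power state; not part of the ztimer API. */
typedef struct {
    unsigned users;      /* number of outstanding acquire calls */
    bool hw_running;     /* whether the backend timer must keep running */
} clock_pm_t;

void clock_acquire(clock_pm_t *pm)
{
    if (pm->users++ == 0) {
        pm->hw_running = true;   /* first user: keep the hardware running */
    }
}

void clock_release(clock_pm_t *pm)
{
    assert(pm->users > 0);       /* release without acquire is a bug */
    if (--pm->users == 0) {
        pm->hw_running = false;  /* last user gone: hardware may be stopped */
    }
}

/* Elapsed ticks between two readings of the same clock. Unsigned
 * subtraction stays correct across a single 32bit wrap-around. */
uint32_t ticks_elapsed(uint32_t before, uint32_t after)
{
    return after - before;
}
```

The before/after measurement pattern from the list above then only needs the clock to be acquired for the duration of the measurement.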

Outlook

- if deemed necessary, a 64bit implementation can be added
- if deemed necessary, ztimer_set_absolute() (setting a timeout to an absolute target time) can be added, though it would be an O(n) operation
- it should be possible to make ztimer realtime safe, by
  - using a mutex for list protection
  - providing per-priority virtual clocks and callback handler threads

Implementation details

Pasting the documentation from ztimer.h here, which contains more implementation details:

Introduction

ztimer provides a high level abstraction of hardware timers for application timing needs.

The basic functions of the ztimer module are ztimer_now(), ztimer_sleep(), ztimer_set() and ztimer_remove().

They all take a pointer to a clock device (or virtual timer device) as first parameter. RIOT provides ZTIMER_USEC, ZTIMER_MSEC, ZTIMER_SEC by default. These clocks allow multiple timeouts to be scheduled. They all provide 32bit range.
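
As a rough guide to what "32bit range" means for each default clock, the wrap-around period can be computed from the tick frequency. The helper name below is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Whole seconds until a 32bit tick counter running at `hz` wraps around. */
uint64_t wrap_period_seconds(uint32_t hz)
{
    assert(hz > 0);
    return (UINT64_C(1) << 32) / hz;
}
```

For 1MHz ticks this gives 4294 seconds (about 71.6 minutes), for 1kHz ticks 4294967 seconds (about 49.7 days), and for 1Hz ticks 4294967296 seconds (roughly 136 years).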

ztimer_now() returns the current clock tick count as uint32_t.

ztimer_sleep() pauses the current thread for the passed amount of clock ticks. E.g., ztimer_sleep(ZTIMER_SEC, 5); will suspend the currently running thread for five seconds.

ztimer_set() takes a ztimer_t object (containing a function pointer and void* argument) and an interval as arguments. After at least the interval (in number of ticks for the corresponding clock) has passed, the callback will be called in interrupt context. A timer can be cancelled using ztimer_remove().

Example:

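#include <stdio.h>

#include "ztimer.h"

/* Runs in interrupt context once the timeout has expired. */
static void callback(void *arg)
{
    puts(arg);
}

int main(void)
{
    ztimer_t timeout = { .callback = callback, .arg = "Hello ztimer!" };

    /* call callback() after (at least) two seconds */
    ztimer_set(ZTIMER_SEC, &timeout, 2);

    /* sleep long enough for the timeout to trigger */
    ztimer_sleep(ZTIMER_SEC, 5);

    return 0;
}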


# Design

## clocks, virtual timers, chaining

The system is composed of clocks (virtual ztimer devices) which can be
chained to create an abstract view of a hardware timer/counter device. Each
ztimer clock acts as a filter on the next clock in the chain. At the end of
each ztimer chain there is always some kind of counter device object.

Each clock device handles multiplexing (allowing multiple timers to be set)
and extension to full 32bit.

Hardware interface submodules:

- @ref ztimer_rtt_init "ztimer_rtt" interface for periph_rtt
- @ref ztimer_rtc_init "ztimer_rtc" interface for periph_rtc
- @ref ztimer_periph_init "ztimer_periph" interface for periph_timer

Filter submodules:

- @ref ztimer_convert_frac_init "ztimer_convert_frac" for fast frequency
  conversion using the frac library
- @ref ztimer_convert_muldiv64_init "ztimer_convert_muldiv64" for accurate
  but slow frequency conversion using 64bit division


A common chain could be:

1. ztimer_periph (e.g., on top of a 1024Hz 16bit hardware timer)
2. ztimer_convert_frac (to convert 1024 to 1000Hz)

This is how e.g., the clock ZTIMER_MSEC might be configured on a specific
system.

Every clock in the chain can always be used on its own. E.g. in the example
above, the ztimer_periph object can be used as ztimer clock with 1024Hz
ticks in addition to the ztimer_convert_frac with 1000Hz.
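
A minimal sketch of such a two-element chain is shown below, assuming a plain 64bit multiply/divide for the conversion (in the spirit of ztimer_convert_muldiv64, not the frac-based variant). The struct and function names are hypothetical, and the "hardware" is reduced to a struct holding a counter value.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the counter device at the end of the chain. */
typedef struct {
    uint32_t now;        /* current tick count of the simulated hardware */
    uint32_t last_set;   /* last interval programmed, in lower ticks */
} lower_clock_t;

/* Conversion layer stacked on top of the lower clock. */
typedef struct {
    lower_clock_t *lower;
    uint32_t lower_hz;   /* frequency of the lower clock */
    uint32_t upper_hz;   /* frequency presented to users of this clock */
} convert_clock_t;

/* Scale a tick count between frequencies, rounding down. The result is
 * truncated to 32 bits, so very large inputs are out of scope here. */
static uint32_t scale(uint32_t ticks, uint32_t from_hz, uint32_t to_hz)
{
    assert(from_hz > 0);
    return (uint32_t)(((uint64_t)ticks * to_hz) / from_hz);
}

/* now(): read the lower clock and convert up to the target frequency. */
uint32_t convert_now(const convert_clock_t *c)
{
    return scale(c->lower->now, c->lower_hz, c->upper_hz);
}

/* set(): convert the interval down and hand it to the lower clock. */
void convert_set(convert_clock_t *c, uint32_t interval)
{
    c->lower->last_set = scale(interval, c->upper_hz, c->lower_hz);
}
```

With a 1024Hz lower clock and a 1000Hz target, a lower reading of 2048 ticks is reported as 2000, and a 500 tick timeout is programmed as 512 lower ticks.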


## Timer handling

Timers in ztimer are stored in a linked list in which each entry stores the
difference to the previous entry in the list (T[n]). The list also stores
the absolute time on which the relative offsets are based (B), effectively
storing the absolute target time for each entry (as B + sum(T[0..n])).
Storing the entries in this way allows all entries to use the full width of
the used uint32_t, compared to storing the absolute time.

In order to prevent timer processing offsets from adding up, whenever a timer
triggers, the list's absolute base time is set to the *expected* trigger
time (B + T[0]). The underlying clock is then set to alarm after the
remaining relative interval T[1] - (now() - B), i.e., at the absolute time
B + T[1], with B being the updated base. Thus even though the list is
keeping relative offsets, the time keeping is done by keeping track of the
absolute times.
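
The list handling described in this section can be sketched as follows. This is a simplified model, not ztimer's code: the names are invented, intervals passed to the insert function are taken relative to the list base (rather than to "now"), and the sketch assumes that the interval to program for the next entry is its offset minus the time already elapsed since the updated base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct entry {
    struct entry *next;
    uint32_t offset;       /* T[n]: ticks after the previous entry */
} entry_t;

typedef struct {
    entry_t *head;
    uint32_t base;         /* B: absolute time the first offset refers to */
} timer_list_t;

/* Insert `e` so that it expires `interval` ticks after the base, keeping
 * the absolute target times of all other entries unchanged. */
void list_insert(timer_list_t *list, entry_t *e, uint32_t interval)
{
    assert(list != NULL && e != NULL);
    entry_t **pos = &list->head;
    while (*pos != NULL && (*pos)->offset <= interval) {
        interval -= (*pos)->offset;
        pos = &(*pos)->next;
    }
    e->offset = interval;
    e->next = *pos;
    if (e->next != NULL) {
        e->next->offset -= interval;   /* successor is now relative to e */
    }
    *pos = e;
}

/* Handle expiry of the head at time `now`: move the base to the expected
 * trigger time and return the interval to program for the next entry
 * (0 if the list is empty or the next entry is already due). */
uint32_t list_pop_and_next_interval(timer_list_t *list, uint32_t now)
{
    entry_t *fired = list->head;
    assert(fired != NULL);
    list->base += fired->offset;       /* B := B + T[0], not `now` */
    list->head = fired->next;
    if (list->head == NULL) {
        return 0;
    }
    uint32_t lag = now - list->base;   /* processing delay so far */
    return (list->head->offset > lag) ? (list->head->offset - lag) : 0;
}
```

Because the base follows the expected trigger time, a late interrupt shortens the next programmed interval instead of shifting all later timers.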


## Clock extension

The API always allows setting full 32bit relative offsets for every clock.

In some cases (e.g., a hardware timer only allowing getting/setting smaller
values or a conversion which would overflow uint32_t for large intervals),
ztimer takes care of extending timers.
This is enabled automatically for every ztimer clock that has a "max_value"
setting smaller than 2**32-1. If a ztimer_set() would overflow that value,
intermediate intervals of length (max_value / 2) are set until the remaining
interval fits into max_value.
If extension is enabled for a clock, ztimer_now() uses interval
checkpointing, storing the current time and corresponding clock tick value on
each call and using that information to calculate the current time.
This ensures correct ztimer_now() values if ztimer_now() is called at least
once every "max_value" ticks. This is ensured by scheduling intermediate
callbacks every (max_value / 2) ticks (even if no timeout is configured).
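
The checkpointing and interval splitting described above might be modelled like this. The struct layout and function names are assumptions made for the example, and the masking trick requires max_value to be of the form 2^n - 1 (e.g., 0xFFFF for a 16bit counter).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical extension state for a backend narrower than 32 bits. */
typedef struct {
    uint32_t max_value;    /* largest hardware counter value, 2^n - 1 */
    uint32_t lower_last;   /* hardware counter value at the last checkpoint */
    uint32_t checkpoint;   /* extended 32bit time at the last checkpoint */
} extend_t;

/* Return the extended time for the hardware reading `lower_now` and store a
 * new checkpoint. Only correct if called at least once per counter period. */
uint32_t extend_now(extend_t *ext, uint32_t lower_now)
{
    assert(lower_now <= ext->max_value);
    /* ticks since the last checkpoint, correct across one counter wrap */
    uint32_t delta = (lower_now - ext->lower_last) & ext->max_value;
    ext->checkpoint += delta;
    ext->lower_last = lower_now;
    return ext->checkpoint;
}

/* Length of the next interval to program for a request of `remaining`
 * ticks: chunks of max_value / 2 until the rest fits into max_value. */
uint32_t extend_next_chunk(const extend_t *ext, uint32_t remaining)
{
    return (remaining > ext->max_value) ? (ext->max_value / 2) : remaining;
}
```

For a 16bit counter, a reading of 0x0010 following a checkpoint at 0x8000 is recognised as a wrap and yields the extended time 0x10010.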


## Reliability

Care has been taken to avoid any unexpected behaviour of ztimer. In
particular, ztimer tries hard to avoid underflows (setting a backend timer
to a value at or behind the current time, causing the timer interrupt to
trigger one whole timer period too late).
This is done by always setting relative timeouts to backend timers, with
interrupts disabled and ensuring that very small values don't cause
underflows.
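
The underflow problem and the relative-set remedy can be illustrated on a simulated compare-match counter. Everything in this sketch is hypothetical, including the `min_ticks` safety margin, which stands in for whatever minimum a given backend needs; it assumes a counter mask of the form 2^n - 1 that is smaller than 2^32 - 1.

```c
#include <assert.h>
#include <stdint.h>

/* Ticks until a compare match fires on a free-running counter. A target at
 * or just behind the counter is only reached after (almost) a full period. */
uint32_t ticks_until_match(uint32_t counter, uint32_t target,
                           uint32_t max_value)
{
    uint32_t delta = (target - counter) & max_value;
    return (delta != 0) ? delta : max_value + 1;
}

/* Relative set: derive the compare value from the counter as read right
 * before programming, never closer than `min_ticks` ahead. */
uint32_t relative_target(uint32_t counter, uint32_t interval,
                         uint32_t min_ticks, uint32_t max_value)
{
    assert(min_ticks > 0 && interval <= max_value);
    if (interval < min_ticks) {
        interval = min_ticks;
    }
    return (counter + interval) & max_value;
}
```

On a 16bit counter at 1000, an absolute target of 999 fires only after 65535 ticks, whereas a relative request of 0 ticks with a margin of 2 fires after 2 ticks.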


## Configuration and convention

As timer hardware and capabilities are diverse and ztimer allows configuring
and using arbitrary clock backends and conversions, it is envisioned to
provide default configurations that application developers can assume to be
available.

These are implemented by using pointers to ztimer clocks using default names.

For now, there are:

ZTIMER_USEC: clock providing microsecond ticks

ZTIMER_MSEC: clock providing millisecond ticks, using a low power timer if
             available on the platform

ZTIMER_SEC:  clock providing second ticks, possibly using epoch semantics

These pointers are defined in `ztimer.h` and can be used like this:

    ztimer_now(ZTIMER_USEC);


## Differences to xtimer

- the addition of a "clock" parameter makes it possible to work with
  multiple differently clocked timers and backends

- the API is 32bit only

- much of ztimer can be unittested on a mock clock

- the internal timer list uses relative timers and thus doesn't need 64bit
  arithmetic or storage

- ztimer_extend extends <32bit timers in an ISR safe way

- there are far fewer configuration tunables (no XTIMER_BACKOFF,
  XTIMER_ISR_BACKOFF)

- ztimer always executes callbacks in ISR context, whereas xtimer might
  execute the callback of a very short timer directly in the caller's
  context
