[RFC] Introduce Clock Management Subsystem #70467
The clock management subsystem requires some additional devicetree macros be generated, in order to enable SOC clock drivers to access clock setpoint data. The following new macros are generated:
- DT_N{node_path}_CLOCK_STATE_{idx}_{clk_id}_EXISTS: defined to 1 if a clock node with the clock-id property "clk_id" is referenced in the setpoint index "idx" for the node at "node_path". Used by SOC clock management code.
- DT_N{node_path}_CLOCK_STATE_{idx}_{clk_id}_IDX: set to the phandle index within the setpoint property at index "idx" for the node at "node_path", if DT_N{node_path}_CLOCK_STATE_{idx}_{clk_id}_EXISTS is defined for the given "clk_id". Used by SOC clock management code.
- DT_CLOCK_ID_{clk_id}_USED: defined to 1 if some node in the DTS references a clock node with the clock-id property "clk_id" in its "clocks" property. Used by SOC clock management code.
- DT_FOREACH_CLOCK_ID(fn): expands to an invocation of "fn" with each "clock-id" present in the devicetree. Used by generic clock callback code.
Signed-off-by: Daniel DeGrasse <daniel.degrasse@nxp.com>
Add script to generate clock management code based on SOC clock management implementation. This script has the following steps:
1. Identifies clock setpoint and clock subsystem handler implementations within the SOC clock management file set during the build
2. Iterates through each node in the devicetree, and:
   a) for each clock referenced in the "clocks" property, generates a function based on the clock subsystem handler template
   b) for each clock setpoint on the node, generates a function based on the clock setpoint handler template
3. Writes the file "clock_mgmt_soc_generated.c", which contains the original contents of the SOC clock management file, but replaces the clock setpoint and clock subsystem handler templates with generated functions
If the clock setpoint and subsystem handlers were instead replaced with macros that expanded to functions, the GCC preprocessor could fill the role of this script. The reason for implementing it is to simplify debugging errors with SOC clock handler implementations, as diagnosing issues with large functions defined in macros can be very difficult.
Signed-off-by: Daniel DeGrasse <daniel.degrasse@nxp.com>
Introduce clock management driver class. Clock management is intended to abstract clocks via "setpoints", which correspond to a clock configuration for a given power state (usually only for a specific peripheral clock). The clock management driver system supports setting clock setpoints, querying clock rates, and subscribing to callbacks on clock rate changes. Signed-off-by: Daniel DeGrasse <daniel.degrasse@nxp.com>
Add common CMake and KConfig infrastructure for clock management, as well as code for handling clock management callbacks. Clock management callbacks are handled via a linker registration system. A singly linked list is defined for each clock output node in the clock_callbacks.c file. However, the linker will discard all these linked lists, unless a driver references one within the clock callback registration function. Therefore, only the singly linked lists needed for clock callbacks will actually be included in the final image. Signed-off-by: Daniel DeGrasse <daniel.degrasse@nxp.com>
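As a rough illustration of the per-clock callback lists described in this commit, the sketch below keeps one singly linked list per clock output and walks it when the rate changes. It is a standalone approximation: `clk_callback`, `clk_callback_list`, and the three helper functions are hypothetical names, a plain `next` pointer stands in for Zephyr's `sys_slist_t`, and the linker-section trick that discards unused lists is not modeled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback record; the PR's real structure is not shown here. */
struct clk_callback {
	struct clk_callback *next;
	void (*handler)(uint32_t new_rate, void *user_data);
	void *user_data;
};

/* One list head per clock output (stand-in for a per-clock sys_slist_t). */
struct clk_callback_list {
	struct clk_callback *head;
};

/* Prepend a callback to a clock output's list. */
void clk_callback_add(struct clk_callback_list *list, struct clk_callback *cb)
{
	cb->next = list->head;
	list->head = cb;
}

/* Unlink a callback; returns 0 on success, -1 if it was not registered. */
int clk_callback_remove(struct clk_callback_list *list, struct clk_callback *cb)
{
	struct clk_callback **link = &list->head;

	while (*link != NULL) {
		if (*link == cb) {
			*link = cb->next;
			cb->next = NULL;
			return 0;
		}
		link = &(*link)->next;
	}
	return -1;
}

/* Notify every registered callback of a rate change; returns how many ran. */
int clk_callback_fire(const struct clk_callback_list *list, uint32_t new_rate)
{
	int count = 0;

	for (struct clk_callback *cb = list->head; cb != NULL; cb = cb->next) {
		cb->handler(new_rate, cb->user_data);
		count++;
	}
	return count;
}

/* Example handler: stores the latest rate in the uint32_t user_data points to. */
void store_rate(uint32_t new_rate, void *user_data)
{
	*(uint32_t *)user_data = new_rate;
}
```

A driver would own one `clk_callback` per clock it cares about, so registration needs no dynamic allocation.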
Add generic clock node bindings for common clock controller elements. These bindings include the following:
- clock-div: for clock dividers
- clock-gate: for clock gates
- clock-multiplier: for clock multipliers
- clock-mux: for clock multiplexers capable of selecting from multiple inputs to generate an output
- clock-output: clock output, a clock node which will have a defined output frequency and can drive clocks used within the system
- clock-source: a fixed or configurable clock source output (such as an internal 1MHz oscillator)
- clock-device: similar to pinctrl-device, defines the clock setpoint properties used by nodes that leverage the clock management subsystem
Signed-off-by: Daniel DeGrasse <daniel.degrasse@nxp.com>
Add documentation for the clock management subsystem, which describes how to create a clock management SOC implementation and utilize clock management within drivers. Signed-off-by: Daniel DeGrasse <daniel.degrasse@nxp.com>
Add devicetree clock definitions for the LPC55S6x series. These clock definitions describe the full clock tree for the SOC, and reuse generic compatibles where possible. A NXP specific PLL compatible was added since the PLL has non-generic configuration parameters. Signed-off-by: Daniel DeGrasse <daniel.degrasse@nxp.com>
Add code to handle clock setpoints and subsystems on the LPC55xxx series SOCs. The clock setpoint implementation handles all clocks in the clock tree, but should optimize to a few function calls for most setpoints, as should the clock subsystem handler. Note that the clock subsystem handler currently only implements support for the Flexcomm clock frequencies, as this will be the initial driver ported to the clock management framework. Signed-off-by: Daniel DeGrasse <daniel.degrasse@nxp.com>
Convert flexcomm driver to use clock management API. This conversion changes the "clock" properties for the flexcomm nodes, as they no longer query clock rate using the clock control API. Signed-off-by: Daniel DeGrasse <daniel.degrasse@nxp.com>
Enable clock management for the LPCxpresso55s69 board. Signed-off-by: Daniel DeGrasse <daniel.degrasse@nxp.com>
# Helper macro to set clock management driver
macro(zephyr_clock_mgmt_driver_ifdef toggle driver)
  if(${toggle})
    set(clock_mgmt_driver ${CMAKE_CURRENT_SOURCE_DIR}/${driver})
Bit confused by this, it seems it's only possible to have 1 clock mgmt driver?
Yes, I should have made that clearer in the RFC body. Like pin control, this subsystem is designed to be single instance. The idea is that an SOC clock control driver is capable of handling all SOC clocking configurations and clock subsystems.
How will this work for complex systems involving external clock generators? Or FPGAs with multiple clocking subsystems?
Both good questions. For an FPGA with multiple clocking subsystems, how would this differ from an SOC? My understanding is that once a given bitstream is loaded to an FPGA, you could then load a firmware that handled the clock tree implemented within that bitstream.
The external clock generator issue is a bit more complicated, and a good case to motivate multiple clock devices. The core reason I wanted to avoid this (as mentioned above) was that I want to use the phandle references within a node setpoint to refer to clock nodes, not the clock device. I guess a general question would be does this approach make sense? I like the simplicity it lends to most configurations, but it comes with tradeoffs in terms of new tooling (as I hope this PR makes clear)
Since we are already changing the devicetree tooling for this PR, perhaps I could add code to find the "clock controller" device for a given clock node (which should be above the clock node in the device tree hierarchy), and generate a reference to that node?
> How will this work for complex systems involving external clock generators? Or FPGAs with multiple clocking subsystems?
My interpretation of this RFC is that it is an SOC clock management framework. What would need to be addressed regarding external clock sources in this system? Wouldn't the external devices still be configured by some driver?
Do I understand correctly that a "setpoint" introduced in this RFC describes the full configuration of a given clock, meaning its frequency, accuracy, and precision? If that's the case I don't think the proposed solution is flexible enough.
Let's discuss an example: In an MCU we have peripheral X and peripheral Y, both clocked by the same clock source (CS). CS can be configured in modes: A, B, and C. Modes A and C result in good accuracy, modes B and C in high frequency, while A and B in low power. During one operation, the device driver X requires good accuracy, while the device driver Y requires high frequency. If I understand this proposal correctly, the device driver X would call
I think instead of using the predefined clock setpoints, the device drivers should report their requirements regarding their clock sources:
Hi @hubertmis, thanks for the feedback here. Yes, you are understanding the idea behind setpoints correctly. This is a good point: conflicts are easily possible with this solution. However, one of the core ideas behind setpoints (and this solution as a whole) is to avoid runtime calculation wherever possible. This is why the API doesn't include something like
For a case like you mentioned, it is the responsibility of the user to determine that both devices ought to use clock mode "C", and set that in the devicetree setpoints. Of course this puts some complexity on the user that otherwise could be handled by software, but I expect that tooling can be written to do these types of "clock solver" problems for the user, and generate logical setpoints for their devicetree.
In the example I sketched device driver X should use mode "A" whenever possible, device driver Y should use mode "B" whenever possible, and both should use mode "C" only while both are active. Considering the separation of concerns, device driver X should not have visibility in device driver Y activity, so they should never explicitly request mode "C". I think the API should operate on clock properties, not setpoints. In simple devices, to minimize processing overhead, the implementation hidden below the API could be as simple as
While in more complicated devices like the one I sketched, the calculations in the clock_mgmt driver would need to be correspondingly more complex.
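To make the three-mode example concrete, here is a minimal, hypothetical sketch of how a clock driver could map the combined requirements of its consumers to a mode. The flag names, the `clk_select_mode` function, and the rule of preferring low-power modes are assumptions made for illustration only; nothing like this exists in the PR.

```c
#include <stdint.h>

/* Hypothetical property flags; names are illustrative only. */
#define CLK_PROP_ACCURACY  (1u << 0)
#define CLK_PROP_HIGH_FREQ (1u << 1)
#define CLK_PROP_LOW_POWER (1u << 2)

enum clk_mode {
	CLK_MODE_NONE = -1,
	CLK_MODE_A,
	CLK_MODE_B,
	CLK_MODE_C,
	CLK_MODE_COUNT
};

/* Properties of each mode, taken from the example in the discussion:
 * A and C are accurate, B and C are fast, A and B are low power.
 */
static const uint32_t clk_mode_props[CLK_MODE_COUNT] = {
	[CLK_MODE_A] = CLK_PROP_ACCURACY | CLK_PROP_LOW_POWER,
	[CLK_MODE_B] = CLK_PROP_HIGH_FREQ | CLK_PROP_LOW_POWER,
	[CLK_MODE_C] = CLK_PROP_ACCURACY | CLK_PROP_HIGH_FREQ,
};

/* Return a mode that provides every required property, preferring a
 * low-power mode when one qualifies. Returns CLK_MODE_NONE if no mode fits.
 */
enum clk_mode clk_select_mode(uint32_t required)
{
	enum clk_mode fallback = CLK_MODE_NONE;

	for (int m = 0; m < CLK_MODE_COUNT; m++) {
		if ((clk_mode_props[m] & required) != required) {
			continue; /* mode lacks a required property */
		}
		if (clk_mode_props[m] & CLK_PROP_LOW_POWER) {
			return (enum clk_mode)m;
		}
		if (fallback == CLK_MODE_NONE) {
			fallback = (enum clk_mode)m;
		}
	}
	return fallback;
}
```

With driver X asking for accuracy and driver Y asking for high frequency, the driver would OR the two requests together and land on mode C only while both are active.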
I agree that drivers should not have visibility into each other's operation/activity. In my view, an application enabling device X and device Y would configure the setpoint for each device to request clock mode C. The base board devicetree would likely set up device X with a setpoint to use mode A, and device Y with a setpoint to use mode B. Then, an application enabling device X and device Y at the same time would provide an overlay to select clock mode C.
I want to emphasize that keeping this framework very simple was an intentional choice: I want to avoid complicated clock algorithms, because I feel that implementations like that will use up excessive flash space on embedded devices. Do you feel the tradeoff there is worth it? If so, I'm happy to look at expanding this framework to support managing clock dependencies with runtime code.
I understand your proposal, but I see a few obstacles we should fix:
I think it depends. We have simple devices without power constraints, with limited processing power, and a simple clock tree. In these devices, we should focus on performance and create low-overhead clock drivers, avoiding unnecessary algorithms occupying memory and CPU cycles. But Zephyr also runs in more complex devices with greater memory size, complicated clock trees, and run-time quality requirements that cannot be hardcoded in the device tree. In these devices, the clock tree must be reconfigured based on a run-time algorithm. The clock API in Zephyr must be suitable for all types of supported devices to allow portability of the device drivers and applications. The consequence is that it won't be optimal for either, but achieving portability is worth this price. We should find a solution that is as close to optimal as possible for all use cases supported by Zephyr.
This feels like a case where device power management could be used, i.e. if device X goes into a sleep state, it selects a setpoint that uses clock mode B (since device Y is now the only consumer). Device Y could have a sleep setpoint that uses clock mode A, and both devices could select a setpoint with clock mode C when they wake from sleep. This way when one of the devices becomes inactive, the lower power clock mode is selected.
Personally, I think this is totally normal when building a devicetree for a new application- devicetree is intended to describe the hardware, so you need to understand the hardware to write your devicetree. If you want to enable SPI or I2C on your board, you need to determine which pins the hardware will use for SPI/I2C based on your board design. Do you feel that clock setpoints would place a more significant burden than understanding pin control for a given device does?
This is a good point- Zephyr runs on cores capable of running Linux, all the way down to embedded devices under 100MHz. What I have yet to determine is how to create a clock framework that can optimize effectively on smaller devices, while still accommodating larger ones. I guess a few key questions here:
Right, but it does not scale for a system in which we have more, e.g. 3 devices clocked from the same source: X and Z requiring high accuracy (modes A or C), while Y requires high frequency (mode B). Also, it does not cover the case when all of the devices are disabled and their clock source can get disabled as well.
I see creating a device as an activity separated from designing application logic. A device tree should be created by a hardware expert who knows the SoC internals well. The application logic can be designed by another department, or by another company. Or it can be integrated using modules from multiple software vendors. So the device tree creator should not need to know if device drivers X, Y, and Z are going to be used simultaneously or not. This information should not leak to the device tree.
Yes. If in the SOC I use peripheral X and Y in separate application states, I would need to configure clock setpoints like "sleep = disabled; active = A". If I use X and Y simultaneously I would set "sleep = B; active = C" (which I think is still incorrect). So the device tree would not reflect only the device, but also the potential application states.
This idea is one of the valid solutions for the smaller systems. But I don't see a good way to scale it up for the complex systems. That's why I think we should find another good solution for small systems that scales up well.
We need a framework capable of a runtime component. In small systems, in which compile-time setpoints configuration is enough, the framework would need to create zero or minimal overhead for performance and memory consumption.
This is a good point, thank you for raising it. I think we could solve this with some form of reference counting on each clock- what are your thoughts there? If we stuck with the existing implementation this would be implemented in the SOC clock driver level.
I see your point here. What would your thoughts be on a system like the following?
The above is only a rough idea, just wanted to get your feedback before I investigated how we could implement it.
Yes, we need to count requests for this kind of implementation. However, my point is that one size does not fit all. We should create a flexible API/framework with the possibility to implement the clock drivers differently for different MCUs or families. Some implementations should be as simple as setting the matching setpoint (if there is only one requestor of the given clock in the system); others, more complicated, need to count requests for a given setpoint; and others, more complicated still, need to switch setpoints at runtime depending on multiple requests.
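The request counting discussed here could be sketched roughly as below: the clock source is switched on by its first requester and switched off when the last one releases it. This is a standalone illustration under assumed names (`clk_source`, `clk_source_request`, `clk_source_release`), with a boolean standing in for the hardware gate; it is not code from the PR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared clock source with a consumer count. */
struct clk_source {
	uint32_t requests; /* number of active consumers */
	bool enabled;      /* stand-in for the hardware gate state */
};

/* Take a reference; the first requester switches the source on. */
void clk_source_request(struct clk_source *clk)
{
	if (clk->requests++ == 0) {
		clk->enabled = true; /* a real driver would ungate the clock here */
	}
}

/* Drop a reference; the last release switches the source off.
 * Returns 0 on success, -1 if there was no outstanding request.
 */
int clk_source_release(struct clk_source *clk)
{
	if (clk->requests == 0) {
		return -1;
	}
	if (--clk->requests == 0) {
		clk->enabled = false; /* a real driver would gate the clock here */
	}
	return 0;
}
```

A real implementation would also need locking if requests can arrive from multiple contexts, which this sketch leaves out.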
Could you post a sketch of the device tree nodes and the API illustrating the points you described? I want to make sure I understood your proposal before I comment.
Sure- I've tried to write out the changes in more detail below. The TL;DR of the change is that we essentially do the following:
If we want to implement this, I will likely do so in a separate RFC, as I think the underlying implementation is different enough to justify creating another PR.
Driver API
The sole change to the driver API is that the clock management callback handlers now accept a user data pointer as well. This is done to make setup of callbacks within the SOC level clock implementation (described below) simpler. Drivers still use setpoints to configure clocks, and all clock management is abstracted from the driver (all the driver is aware of is the power state the setpoint targets).
Application devicetree
Application devicetrees will still use setpoints, but clock outputs will now have a single clock cell, which allows a frequency to be specified. Therefore, an application devicetree may look like the following:
Setting frequency like this will require a developer to enable
Alternatively, an application looking to optimize for reduced resource utilization could do the following:
The second setting here would configure the "uart_clk_mux" clock node directly, to select a different input source.
SOC devicetree
The SOC devicetree will have the following changes:
For example, a section of the LPC55S69 devicetree might look like so:
Internal Clock Management Infrastructure
Rather than using the templated setpoint functions like are currently present in this RFC, the implementation will use a
The "struct clk" definition will look similar to the following:
Note that many common fields in the Linux clock framework are absent; this was intentional. I want to avoid common fields
SOCs will supply clock drivers, which can use devicetree to define clock structures. This will look similar to the
The clock management framework will expose a set of APIs to clock drivers only, which should not be used by
The clock management framework will then define structures like the following to enable wrapping internal clock APIs:
Initializing these setpoints will require some SOC specific implementation. I think the easiest way to do this will be
And the node is defined in the clock tree like so:
Then the SOC clock driver would need to define a macro like the following:
SOC implementation
SOC implementation will include definitions for
Such an implementation for a mux might look like the following:
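As a rough illustration of the idea (not the code from the proposal), a clock structure with a per-clock operations table, plus a mux that forwards rate queries to its selected parent, could be sketched as below. Every name here (`struct clk_api`, `hw_data`, `fixed_clk_api`, `mux_clk_api`, and so on) is an assumption, the mux selection is kept in memory rather than in a register, and notification of rate changes is left out.

```c
#include <stddef.h>
#include <stdint.h>

struct clk;

/* Hypothetical per-clock operations table. */
struct clk_api {
	uint32_t (*get_rate)(const struct clk *clk);
	int (*configure)(struct clk *clk, uint32_t setting);
};

/* Hypothetical generic clock node: operations plus driver-private data. */
struct clk {
	const struct clk_api *api;
	void *hw_data;
};

uint32_t clk_get_rate(const struct clk *clk)
{
	return clk->api->get_rate(clk);
}

int clk_configure(struct clk *clk, uint32_t setting)
{
	return clk->api->configure(clk, setting);
}

/* Fixed-rate source: reports a constant rate and cannot be configured. */
struct fixed_clk_data {
	uint32_t rate;
};

static uint32_t fixed_get_rate(const struct clk *clk)
{
	const struct fixed_clk_data *data = clk->hw_data;

	return data->rate;
}

static int fixed_configure(struct clk *clk, uint32_t setting)
{
	(void)clk;
	(void)setting;
	return -1; /* nothing to configure on a fixed source */
}

const struct clk_api fixed_clk_api = {
	.get_rate = fixed_get_rate,
	.configure = fixed_configure,
};

/* Mux: selects one of several parents and reports that parent's rate. */
struct mux_clk_data {
	const struct clk *const *parents;
	uint32_t parent_count;
	uint32_t selected; /* stand-in for the mux select register */
};

static uint32_t mux_get_rate(const struct clk *clk)
{
	const struct mux_clk_data *data = clk->hw_data;

	return clk_get_rate(data->parents[data->selected]);
}

static int mux_configure(struct clk *clk, uint32_t selection)
{
	struct mux_clk_data *data = clk->hw_data;

	if (selection >= data->parent_count) {
		return -1; /* reject out-of-range mux inputs */
	}
	data->selected = selection;
	return 0;
}

const struct clk_api mux_clk_api = {
	.get_rate = mux_get_rate,
	.configure = mux_configure,
};
```

Keeping only an operations pointer and private data in the common structure leaves parent tracking to the clock types that actually need it.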
Thank you. The new proposal looks much more scalable, including most of the needs of the complex systems. The last remaining limitation I see is that the solution focuses mostly on the clock frequency and ignores other clock properties like accuracy or precision. I'm curious if the device tree properties you proposed could be extended to something like
Then the driver could select the "precise" clock for fast baud rates, "default" for slower operations, and "sleep" while inactive.
The accuracy control makes sense in some systems, but in others, it does not. So the list of clock properties described in
Then the functions would differ a little because they would be not only about setting and getting clock rate, and notifications on parent rate changes, but also about other properties that would need to be included preferably in generic function calls. Could other clock properties than the frequency be included in the framework you propose?
Yes, I think this would be possible. Essentially, the
For example, I imagine we will have a generic "clock-output" compatible for most cases, with one "frequency" cell. The "clock-output" driver will look something like the following:
Then there would be a definition in the clock framework like so:
This would be the generic driver for clock outputs, but a vendor could easily provide a different compatible for their clock outputs, and a different driver. Such a driver's implementation for the
If a vendor needs to pass the accuracy or precision parameter to parent clocks, they could instead call
|             |                                     | power or sleep modes    |
+-------------+-------------------------------------+-------------------------+

Devicetree Representation
Looks like this is out of sync with the actual dts changes and bindings you are introducing.
Trying to figure out if you're suggesting we go away from `clocks` properties or not.
Closing this version in favor of #72102, as that seems like a more logical direction.
Introduction
This PR proposes a clock management subsystem. The eventual goal would be to replace the existing clock control drivers with implementations using the clock management subsystem. The subsystem abstracts clock management via "setpoints" and "subsystems".
A "setpoint" is defined as a clock tree state, which will typically contain clock settings for a given peripheral but may configure clock generators such as PLLs as well.
A "subsystem" is defined as a clock output, i.e. a clock source that can be used directly by peripherals on the system.
Setpoints are defined in terms of power states, with a `DEFAULT` and `SLEEP` state defined by default. Drivers can define additional custom states if needed for their use case.
Drivers can both query clock rates from subsystems, and register for a callback when the state of a subsystem changes. This allows a driver to respond to clock rate changes that occur as a consequence of another driver or subsystem applying a clock setpoint.
Problem description
The core goal of this change is to provide a more device-agnostic way to manage clocks. Although the clock control subsystem does define clocks as an opaque type, drivers themselves still often need to be aware of the underlying type behind this opaque definition, and there is no standard for how many cells will be present on a given clock controller, so implementation details of the clock driver are prone to leaking into drivers. This presents a problem for vendors that reuse IP blocks across SOC lines with different clock controller drivers.
Beyond this, the clock control subsystem doesn't provide a simple way to configure clocks. `clock_control_configure` and `clock_control_set_rate` are both available, but `clock_control_configure` is ripe for leaking implementation details to the driver, and `clock_control_set_rate` will likely require runtime calculations to achieve requested clock rates that aren't realistic for small embedded devices (or leak implementation details, if `clock_control_subsys_rate_t` isn't an integer).
Proposed change
This proposal provides the initial infrastructure for clock management, as well as an implementation on the LPC55S69 and an initial driver conversion for the Flexcomm serial driver (mostly for demonstration purposes). Long term, the goal would be to transition all SOCs to this subsystem, and deprecate the clock control API. The subsystem has been designed so it can exist alongside clock control (much like pinmux and pinctrl) in order to make this transition smoother.
The application is expected to assign clock setpoints within devicetree, so the driver should have no awareness of the contents of a clock setpoint, only the target power state. Subsystems are also assigned within the SOC devicetree, so drivers do not see the details of these either.
Detailed RFC
Driver/Application Side
From the driver side, clocks are fully abstracted to a set of subsystems and setpoints. The driver has no awareness of the underlying clock device.
The clock management subsystem provides the following functions for drivers to interact with clocks:
- `clock_mgmt_get_rate`: read the rate of a clock
- `clock_mgmt_apply_setpoint`: apply a clock setpoint
- `clock_mgmt_init_callback`: initialize a clock callback structure
- `clock_mgmt_add_callback`: add a clock callback structure, so callbacks will be issued for a given subsystem
- `clock_mgmt_remove_callback`: remove a clock callback structure
Beyond this, the driver should only interact with the clock subsystem to define data structures for it, by calling the `CLOCK_MGMT_DEFINE` and `CLOCK_MGMT_INIT` macros within its initialization macro (this is intentionally very similar to pinctrl).
The devicetree definitions for clocks are where hardware specific details are described. For example, a node might have the following properties:
Note that the cells for each clock setpoint node are intentionally device specific. It is expected that these values will be used to configure settings like multiplexer selections or divider values directly.
The driver could then interact with the clocks defined in devicetree like so:
SOC Implementation Side
Note: much like pinctrl, SOCs can only have one clock management driver. This design choice was intentional. By implementing clock management like this, nodes do not need to reference a device driver phandle when configuring clocks but can instead reference clock nodes.
Clock nodes are heavily based around the concept of a "clock-id". Every clock node must have this property. It is used in a similar manner to how a node label might be used by an application, but is specific to the clock subsystem.
The SOC clock tree might look like so:
Note that the number of specifier cells is node specific, as is the significance. For example, the 6 clock cells on the PLL will configure standard features such as the PLL multiplier and divider, as well as parameters specific to this PLL implementation.
In order to minimize flash usage, both setpoint and subsystem handlers are defined as functions. These functions are implemented by the SOC clock management driver, and are expected to heavily utilize devicetree to result in optimized function implementations. For example, a function for setpoints might look like so:
For a setpoint that only configures the `CLOCK_DIV` clock ID to `3`, this would optimize to the following:
By leveraging devicetree macros and GCC optimization, most setpoint functions should come out to a few function calls, rather than parsing a large C structure to determine how the clock tree should be configured.
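To illustrate how compile-time constants let the compiler prune a setpoint function, here is a minimal standalone sketch. The `SETPOINT_0_*` macros are simplified stand-ins for the generated devicetree macros, and `fake_clock_regs`, `set_clock_div`, and `set_clock_mux` are hypothetical helpers rather than anything from this PR.

```c
#include <stdint.h>

/* Stand-ins for generated devicetree macros (hypothetical, simplified names).
 * Setpoint 0 of an imaginary node configures only CLOCK_DIV, to the value 3.
 */
#define SETPOINT_0_CLOCK_DIV_EXISTS 1
#define SETPOINT_0_CLOCK_DIV_VALUE  3
#define SETPOINT_0_CLOCK_MUX_EXISTS 0
#define SETPOINT_0_CLOCK_MUX_VALUE  0

/* In-memory stand-in for the clock controller registers. */
struct fake_clock_regs {
	uint32_t div;
	uint32_t mux;
};

static inline void set_clock_div(struct fake_clock_regs *regs, uint32_t value)
{
	regs->div = value;
}

static inline void set_clock_mux(struct fake_clock_regs *regs, uint32_t value)
{
	regs->mux = value;
}

/* Handles every clock in the tree, but each condition is a compile-time
 * constant, so an optimizing compiler can drop the branches whose clock
 * is not part of this setpoint.
 */
void apply_setpoint_0(struct fake_clock_regs *regs)
{
	if (SETPOINT_0_CLOCK_MUX_EXISTS) {
		set_clock_mux(regs, SETPOINT_0_CLOCK_MUX_VALUE);
	}
	if (SETPOINT_0_CLOCK_DIV_EXISTS) {
		set_clock_div(regs, SETPOINT_0_CLOCK_DIV_VALUE);
	}
}
```

With optimization enabled, `apply_setpoint_0` would be expected to reduce to the single divider write, since the mux branch is statically dead.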
A similar concept can be used for the subsystem handlers:
At this point, you may have noticed the "function" definitions look a bit strange. This is because these functions are not truly functions that will be compiled into the final image, but something I describe as "templates". These templates are the same concept as defining a function within a macro, like so:
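A minimal sketch of that macro pattern, with hypothetical names (`DEFINE_SETPOINT_FN`, `setpoint_active`, `setpoint_sleep`) and a plain divider standing in for real clock configuration:

```c
#include <stdint.h>

/* Hypothetical macro that expands to one complete function per setpoint.
 * The generated function returns the output rate for a given input rate.
 */
#define DEFINE_SETPOINT_FN(name, div_value)            \
	uint32_t setpoint_##name(uint32_t input_hz)    \
	{                                              \
		return input_hz / (div_value);         \
	}

/* Each invocation emits a separate function definition. */
DEFINE_SETPOINT_FN(active, 3)
DEFINE_SETPOINT_FN(sleep, 12)
```

A compile error inside the macro body is typically reported against the single line where the macro is invoked, which is what makes larger functions written this way hard to debug.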
The reason for these "templates" is that macros like the above are awful to debug. Therefore, the build system will automatically take the function templates defined for an SOC, and create per-setpoint and per-subsystem function implementations, which will then be expanded by the GCC preprocessor and optimized by the compiler to produce the functions in the final image.
Proposed change (Detailed)
Here I wanted to dive a bit deeper into how the clock management code works on the SOC side. There are two Python scripting changes within this PR:
- `scripts/dts/gen_defines.py`
- `scripts/build/gen_clock_mgmt.py`
Changes to `gen_defines`
The changes to the `gen_defines` scripting are required to enable the concept of "clock-ids" covered above. The key issue here is that SOC clock drivers need a way to check if a given clock node is configured by a setpoint, and take action if so. Crucially, this information must be available at compile time to enable optimizing each setpoint function. By providing new devicetree macros to check for the presence of the clock node with a given clock ID in a setpoint, and extract properties from it if such a node exists, setpoint functions can be written that will be optimized to one or two function calls by the compiler.
The other change to the devicetree tooling is to generate a macro to iterate over every clock ID in use as a subsystem. This is needed to enable callback handling, which I describe in more detail below.
`gen_clock_mgmt`
The new `gen_clock_mgmt` script is a bit strange. While implementing this support, I initially tried to define clock management handlers as macros, which expand to functions. While this is possible to implement, debugging something as simple as a missed parenthesis becomes a long process of staring at code, since the compiler can't effectively report errors on a line by line basis. What `gen_clock_mgmt` does is take a clock driver template like the following:
`clock_mgmt_lpc55xxx.c`:
And expand it into function definitions like the following:
`clock_mgmt_soc_generated.c`:
The function definitions are placed inline within `clock_mgmt_soc_generated.c`, at the same location where the templates previously were in the file. Although this script effectively duplicates part of the functionality of the GCC preprocessor, I think it is justified in order to make the lives of SOC implementers much easier.
Clock Callbacks
Finally, I wanted to touch on how clock callbacks are implemented. The goal of the implementation was to enable clock callbacks to be on a per-subsystem basis, without needing to define data structures for clock subsystems that will never be used. Therefore, we use the linker to "register" clock management structures (which are simply `sys_slist_t` linked lists). The `clock_callbacks.c` common code defines a `sys_slist_t` structure for every clock ID present in the clock tree. Then, the `CLOCK_MGMT_DEFINE` macro will reference these `sys_slist_t` structures for a given clock subsystem. What this means in effect is that the linker will discard any `sys_slist_t` structure for a clock ID that is not referenced by a driver. The `sys_slist_t` structures are placed in named sections, so the macro to fire a clock callback for a given ID can access the correct `sys_slist_t` structure without actually referencing it by name (and therefore pulling it into the build).
Dependencies
This is of course a large change. I'm opening the RFC early for review, but if we choose this route for clock management we will need to create a tracking issue and follow a transition process similar to how we did for pin control.
Concerns and Unresolved Questions
There are a few things I'm not sure of with this PR's implementation:
- Is the `gen_clock_mgmt` script worth the tradeoffs? I have concerns about parsing C code from Python. Although the script needs to do very little (essentially locate the bounds of two functions, and then run a find and replace), we still may run into nuances in the differences between how the GCC preprocessor handles token pasting, versus the naive find and replace used by the scripting.
Alternatives
I have considered extending the existing clock control API with setpoints, and in fact have a draft PR to do so here: #66732. I think this approach makes more sense because while implementing that RFC, I realized that clocks were still exposed to the driver within the clock control subsystem, so even if setpoints abstracted clock configuration, drivers would still have some dependencies on the clock control driver they were written to use.
I have also considered something like Linux's common clock framework: https://www.kernel.org/doc/Documentation/clk.txt. I think with enough devicetree trickery (to optimize out clock structures that weren't needed) we could efficiently implement support for reading clock rates, but support for setting clock rates seems more challenging. Needing each clock element to keep a record of its parents and children will eat up flash space fast. The other issue with the common clock framework is that we really don't want to be calculating "best possible" rates, or anything similar. The specifiers on clock nodes within setpoints are vendor specific by design. The idea is that setpoints will directly set multiplexer and divider values for a clock state, rather than having this calculated at runtime.