Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] abstract clock subsystem: counter syntonization #60400

Closed

Conversation

fgrandel
Copy link
Contributor

@fgrandel fgrandel commented Jul 14, 2023

Proposes a well-defined intermediate concept of scalar syntonized nanosecond resolution time with overflow protection above low-level counters/cycles/ticks and below higher level time abstractions (timescales, calenders, etc.). Details see the discussion below.

Currently this happens in the net subsystem but in principle this can and may be promoted to something more generic as it is an implementation of layers 1 (counter) and 2 (syntonization) of the proposed clock subsystem RFC which is itself relevant to the POSIX roadmap.

The infrastructure discussed in this RFC will first be used for TSCH TDMA.

@fgrandel
Copy link
Contributor Author

@rlubos @jciupis @edmont relevant to you in the context of #59245

@fgrandel fgrandel force-pushed the feat/clock-subsys-layer-one branch 3 times, most recently from dc8294d to a95beef Compare July 14, 2023 23:30
@fgrandel
Copy link
Contributor Author

@cfriedt relevant to you in the context of POSIX clocks

include/zephyr/sys/ktime.h Outdated Show resolved Hide resolved
include/zephyr/sys/ktime.h Outdated Show resolved Hide resolved
include/zephyr/sys/ktime.h Outdated Show resolved Hide resolved
Copy link
Contributor

@andyross andyross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No real objections to the typedef, but it seems a little nonorthogonal? Right now kernel time is in ticks, because a "tick" is what kernel time is measured in by definition. Conversion of ticks to ns (both numerically and via the existing timeout mechanism) is already provided for anyone who needs it, and the utilities to do so are uniformly 64 bit (there are 32 bit converters for smaller units, but not ns).

Having a separate "kind" of time seems more confusing than helpful to me, honestly. There aren't any APIs that use this yet. What's the end goal here?

@cfriedt
Copy link
Member

cfriedt commented Jul 17, 2023

@andyross - I would almost suggest a 1:1 with Florian. I had a few broader picture discussions with him about some hi-res timer / clock things and was sold on it. Aside, those old tickets might actually get closed soon!

@andyross
Copy link
Contributor

Sure. There's just a long graveyard of attempts in this space, and I'd much rather we see an implementation/prototype arrive before making any changes to the public API. In particular an attempt to tie kernel timekeeping to a real unit is something we spilled a lot of blood trying to get away from, given that so many of our platforms use counters that don't convert cleanly into decimal units.

@fgrandel fgrandel force-pushed the feat/clock-subsys-layer-one branch 3 times, most recently from ed134fe to 838ad07 Compare July 18, 2023 18:11
@fgrandel
Copy link
Contributor Author

fgrandel commented Jul 18, 2023

I'd much rather we see an implementation/prototype arrive before making any changes to the public API

@andyross : 100% agreed - the onus is on me to prove that it makes sense to decouple certain uses of time from existing kernel concepts (ticks/cycles) and that ktime_t is the right candidate to do so. I agree with @cfriedt that a F2F would be helpful.

In any case I provide some more thoughts in writing - depending on your communication preference no need to read the following fine print, though, if you prefer F2F anyway...

Conversion of ticks to ns (both numerically and via the existing timeout mechanism) is already provided for anyone who needs it

The new type is not meant to represent kernel time "on the fly" but to represent time in structures and variables independently of the underlying clock source which more often than not will not be the kernel cycle counter.

given that so many of our platforms use counters that don't convert cleanly into decimal units.

That's correct, rounding from/to the ktime_t domain must be dealt with. Clock sources will have to provide precise cycle values, similarly to what the kernel timer does (and the kernel timer is the default clock source). Rounding results in +/- 0.5 ns jitter or alarm errors but error does not accumulate over time.

There aren't any APIs that use this yet.

@cfriedt , @andyross : Do you agree that the waiting PR justifies ktime_t at least for the network subsystem? Would you prefer that we "hide" ktime_t somewhere in the network subsystem until we provide PRs in the POSIX/libc/TSCH area using that type in a broader sense?

IMO the ktime_t concept has been very useful to refactor time across the network subsystem (including but not limited to the IEEE 802.15.4 subsystem):

  • Attempts to use generic types to represent time in the network subsystem failed in the past as a precise common specification of time was never provided. Fields representing time were used inconsistently within the same subsystem and across drivers due to lack of a well-defined concept of high-resolution scalar time.
  • Standardizing around kernel cycles in net_pkt or the radio API wouldn't work as kernel cycles are not guaranteed to be high resolution enough to represent network time plus the clock source is almost never kernel time (but comparison to / synchronization with kernel time is still needed).
  • Other existing notions of time would be inadequate to represent time in the network subsystem as well. The rationale is given in the description of ktime_t for all such types representing time in Zephyr so far.

What's the end goal here?

The ultimate end goal is defined in an RFC.

As soon as this PR is accepted the waiting PR will be added to the code base immediately introducing ktime_t as network time.

But that alone would not justify introducing ktime_t in such an exposed place. We all need to be 100% convinced that at least the following features make sense in Zephyr in the short to mid term, should be decoupled from kernel time and use ktime_t-based intermediate clock sources instead:

  1. high-resolution POSIX clocks and libc time sharing a common adjusted and drift compensated low-power (tickless) hybrid RTC/high-res counter clock source which may or may not be (partially) based on the kernel timer and can be configured in DT to use other peripherals instead.
  2. hybrid nanosecond resolution clock sources for TDMA / timed TX/RX protocols in the IEEE 802.15.4 stack (namely CSL and TSCH) based on the combination of a low-power, always-on RTC and a radio timer providing high resolution during RX/TX windows which will usually not involve kernel time at all
  3. synchronization/compensation of the main internal clock source (and thereby POSIX/libc clocks) with low-frequency, high-precision "pulse-per-second" devices like always-on RTCs, NTP, PTP, GPS, clocks derived from TDMA radio or industrial automation protocols or even atomic clock modules.

These use cases have in common:

  • They all rely on "hybrid" clock sources combining at least two hardware peripherals or drivers of which the kernel timer is only one (if used at all).
  • They share a couple of algorithms (timer multiplexing, drift compensation, leap second support, epoch offsets, etc.) which should be re-usable across all of them as a shared clock subsystem and which cannot sensibly be based on ticks/cycles for sufficient precision and configurability.

Next steps that follow immediately will be:

  • I'll provide PRs for the basics of a clock subsystem with support for generic hybrid clock sources, drift adjustment + well-defined timescales (this is when ktime_t would have to be moved from the network subsystem to something shared).
  • @cfriedt , if I understood you correctly you'll then be able to use that clock subsystem to solve the current layering problem in POSIX and libc plus introduce a DT-based configuration approach for the hybrid POSIX/libc clock source.
  • At the same time I'll start to base TSCH timing on the same common clock subsystem.

@fgrandel fgrandel force-pushed the feat/clock-subsys-layer-one branch from 838ad07 to bc4409b Compare July 18, 2023 21:02
@fgrandel
Copy link
Contributor Author

@kartben The build suddenly fails with "API_DEFINE: do not specify a non-Zephyr API for libc File:include/zephyr/sys/ktime.h". Any idea what that means? The only thing that changed is the comments.

@kartben
Copy link
Collaborator

kartben commented Jul 18, 2023

@kartben The build suddenly fails with "API_DEFINE: do not specify a non-Zephyr API for libc File:include/zephyr/sys/ktime.h". Any idea what that means? The only thing that changed is the comments.

@keith-packard would know :)
See #60487

@keith-packard
Copy link
Collaborator

keith-packard commented Jul 18, 2023

@kartben The build suddenly fails with "API_DEFINE: do not specify a non-Zephyr API for libc File:include/zephyr/sys/ktime.h". Any idea what that means? The only thing that changed is the comments.

@keith-packard would know :) See #60487

Whoa. Looks like that new check is busted. PR #60551 should help here.

Adds the elapsed() and set_timeout() operations to the timeout api and
decouples all calls to these operations from the system clock.

This prepares for future pluggability of these operation.

Signed-off-by: Florian Grandel <[email protected]>
Decouples the activation of the system clock from the activation of the
generic timeout API so that the latter can be enabled independently from
the former.

Signed-off-by: Florian Grandel <[email protected]>
Exposes the newly created multi-instance timeout API via timeout_q.h
header file.

Signed-off-by: Florian Grandel <[email protected]>
The recent refactoring of time conversion functions to macros left a few
references orphaned in the docs.

Signed-off-by: Florian Grandel <[email protected]>
Now that "clocks" can mean system clocks or abstract clocks, the "clock"
concept has become ambivalent. Also there is an inconsistency between
system clock header files (that use the sys_ prefix) and docs (that do
not use the prefix).

Both, the ambivalence and the inconsistency, are being fixed by
prefixing the system clock doc group with "sys".

The change also renames "timeutil_unit_apis" which is also inconsistent
and not really readable anyway.

Signed-off-by: Florian Grandel <[email protected]>
Updates the existing net time docs to refer to the recently introduced
timepoint concept to further clarify what is meant by tick counter
values.

Signed-off-by: Florian Grandel <[email protected]>
The header files `zephyr/sys/time_units.h` and `zephyr/sys_clock.h` had
both, generic time conversion and system clock specific utilities. This
change creates a header file that will provide generic utilities to
represent and convert time to be used independently from the system
clock.

The newly created `zephyr/sys_time.h` header must not depend on any
other header files except for low level generic type declarations, the
toolchain and low level utilities so that it can be included from
`time_units.h` and `sys_clock.h` without include loops.

The header is named `zephyr/sys_time.h` to avoid confusion with the
POSIX `sys/time.h` header. The chosen name is consistent with existing
sys_* headers.

Signed-off-by: Florian Grandel <[email protected]>
Fixes a minor bug that prevented doxygen from correctly recognizing the
function to which the doc comment was attached.

Signed-off-by: Florian Grandel <[email protected]>
Moves generic time to timeout conversion utils to the general time
header.

Signed-off-by: Florian Grandel <[email protected]>
Moves generic time converter generator macros to the general time
header.

Signed-off-by: Florian Grandel <[email protected]>
@fgrandel fgrandel force-pushed the feat/clock-subsys-layer-one branch 2 times, most recently from 000a9e3 to 31ea073 Compare October 21, 2023 00:00
Slightly improves generic time converter macro documentation.

Signed-off-by: Florian Grandel <[email protected]>
The TIME_CONSTEXPR definition was obsoleted by
d8b276e and can be removed now.

Signed-off-by: Florian Grandel <[email protected]>
Improve readability of inline documentation by re-ordering related
comments.

Signed-off-by: Florian Grandel <[email protected]>
Moves public generic time constants that are not specific to the system
clock into the generic time header.

Signed-off-by: Florian Grandel <[email protected]>
Moves private generic time constants that are not specific to the system
clock into the generic time header.

Signed-off-by: Florian Grandel <[email protected]>
Moves all time point related utilities to the generic time utilities as
these are not specifically bound to the system clock.

Signed-off-by: Florian Grandel <[email protected]>
Now that the content of the sys_clock header is reduced to what is
really specific to the system clock, the condition macros and
documentation structure can be simplified and cleaned up.

Signed-off-by: Florian Grandel <[email protected]>
Introduces definitions for special timepoints corresponding to a
timepoint that can "never" occur and a "zero" timepoint.

Signed-off-by: Florian Grandel <[email protected]>
Introduces a dependency to a specific timout API into the timer
structure. This allows multiplexed timers to be defined against
arbitrary tick counter sources.

Also fixes a minor locking issue that would allow concurrent read/write
access on timeout nodes due to insufficient locking.

Signed-off-by: Florian Grandel <[email protected]>
Introduces the API for a syntonized network uptime reference including
the underlying hybrid (sleep/high-resolution) counter.

Also prepares the IEEE 802.15.4 driver API as a client of this API.

Signed-off-by: Florian Grandel <[email protected]>
Optionally introduces the network uptime API into the IEEE 802.15.4
driver API.

Signed-off-by: Florian Grandel <[email protected]>
Padding section alignment for 64bit-aligned timeout struct.

Signed-off-by: Florian Grandel <[email protected]>
Copy link

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Base OS Base OS Library (lib/os) area: IEEE 802.15.4 area: Kernel area: Networking Enhancement Changes/Updates/Additions to existing features platform: TI SimpleLink Texas Instruments SimpleLink MCU RFC Request For Comments: want input from the community Stale
Projects
Status: Done
Archived in project
Development

Successfully merging this pull request may close these issues.

7 participants