
lib: posix: nanosleep: fix nanosleep for sub microsecond durations #28542

Merged 2 commits into zephyrproject-rtos:master on Oct 2, 2020
Conversation

@cfriedt (Member) commented on Sep 20, 2020

We must round up to the nearest microsecond in order to fulfill the nanosleep(2) API requirement of sleeping for at least that many nanoseconds.

Also, switch to k_usleep() rather than k_busy_wait(), as instrumentation using k_cycle_get_32() seems to be less reliable at very small time scales.

Fixes #28483
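(For illustration only: a minimal sketch of the kind of ceiling conversion described above. The helper and constant names are hypothetical, and this is not the merged code.)

```c
#include <stdint.h>
#include <time.h>

#define NS_PER_US  1000ULL
#define US_PER_SEC 1000000ULL

/* Hypothetical sketch: convert a timespec request into whole microseconds,
 * rounding the nanosecond part up so the resulting sleep is never shorter
 * than requested. */
static uint64_t timespec_to_us_ceil(const struct timespec *rqtp)
{
	uint64_t us = (uint64_t)rqtp->tv_sec * US_PER_SEC;

	/* Ceiling division: 1 ns rounds to 1 us, 1000001 ns rounds to 1001 us. */
	us += ((uint64_t)rqtp->tv_nsec + NS_PER_US - 1) / NS_PER_US;

	return us;
}
```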

@github-actions bot added the area: POSIX and area: Tests labels on Sep 20, 2020
@pabigot (Collaborator) left a review comment


Code is unnecessarily complex; please use alternative solutions as suggested.

I'm not entirely happy about the TODO notes; it seems pretty unlikely Zephyr will ever evolve to having sub-microsecond timers at the system level. But I won't block on those if others believe they're useful.

Review threads on lib/posix/nanosleep.c and tests/posix/common/src/nanosleep.c (outdated, resolved)
@MaureenHelm added the bug label on Sep 21, 2020
@cfriedt (Member, Author) commented on Sep 21, 2020

I opted to not use k_busy_wait() because it does not seem that instrumenting with k_cycle_get_32() is reliable enough at small time scales. Specifically, k_cycle_get_32() occasionally returns the same hardware cycle count before and after nanosleep(2) is called at small time scales.

Calling k_usleep() with a non-zero value at least indicates that the thread is suspended and scheduled again, which both avoids an insufficient delay and seems to scale things up enough that k_cycle_get_32() is more reliable.

Sadly, nanosleep(2) will not be sleeping for 1ns anytime soon.

The normative spec is satisfied, which is really the main goal.
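(For illustration: roughly the measurement pattern under discussion, simplified from the test. Header paths assume a Zephyr 2.x tree with CONFIG_POSIX_API enabled; this is not the actual test code.)

```c
#include <time.h>
#include <zephyr.h>

/* Sketch only: time a very short nanosleep() with the kernel cycle counter.
 * Returns the apparent duration in nanoseconds, which may be 0 if the cycle
 * counter did not advance during the sleep. */
static uint64_t measure_short_sleep(void)
{
	struct timespec req = { .tv_sec = 0, .tv_nsec = 1 };
	uint32_t then = k_cycle_get_32();

	nanosleep(&req, NULL);

	uint32_t now = k_cycle_get_32();

	return k_cyc_to_ns_ceil64(now - then);
}
```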

@pabigot (Collaborator) commented on Sep 21, 2020

> I opted to not use k_busy_wait() because it does not seem that instrumenting with k_cycle_get_32() is reliable enough at small time scales. Specifically, k_cycle_get_32() occasionally returns the same hardware cycle count before and after nanosleep(2) is called at small time scales.

Of course. Various platforms might have cycle rates much less than 1 MHz; Nordic in particular is 32 KiHz, and that clock is completely asynchronous to the clock used by k_busy_wait(), with a 13% relative error. Which is part of why this whole thing of measuring one clock against another is fraught.

> Calling k_usleep() with a non-zero value at least indicates that the thread is suspended and scheduled again, which both avoids an insufficient delay and seems to scale things up enough that k_cycle_get_32() is more reliable.

Could be. Of course the farther this diverges from what was mostly working before, the more likely we'll find that various platforms no longer pass.

It's pretty much guaranteed that removing the tolerance check is going to be a showstopper for Nordic. If you want to go that route, we'll have to do a huge amount of testing.

@cfriedt (Member, Author) commented on Sep 21, 2020

> > I opted to not use k_busy_wait() because it does not seem that instrumenting with k_cycle_get_32() is reliable enough at small time scales. Specifically, k_cycle_get_32() occasionally returns the same hardware cycle count before and after nanosleep(2) is called at small time scales.

> Of course. Various platforms might have cycle rates much less than 1 MHz; Nordic in particular is 32 KiHz, and that clock is completely asynchronous to the clock used by k_busy_wait(), with a 13% relative error. Which is part of why this whole thing of measuring one clock against another is fraught.

All the more reason we should implement clock_getres(2). At least that's a vendor-independent way of describing clock resolution, and then everyone wins.
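(For illustration: a rough sketch of what a tick-based clock_getres() could look like. Deriving the resolution from CONFIG_SYS_CLOCK_TICKS_PER_SEC is an assumption of this sketch; it is not code from this PR.)

```c
#include <errno.h>
#include <time.h>

#ifndef NSEC_PER_SEC
#define NSEC_PER_SEC 1000000000L
#endif

/* Hypothetical sketch: report the kernel tick period as the resolution of
 * the supported clocks. */
int clock_getres(clockid_t clock_id, struct timespec *res)
{
	if (clock_id != CLOCK_MONOTONIC && clock_id != CLOCK_REALTIME) {
		errno = EINVAL;
		return -1;
	}

	if (res != NULL) {
		res->tv_sec = 0;
		res->tv_nsec = NSEC_PER_SEC / CONFIG_SYS_CLOCK_TICKS_PER_SEC;
	}

	return 0;
}
```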

> > Calling k_usleep() with a non-zero value at least indicates that the thread is suspended and scheduled again, which both avoids an insufficient delay and seems to scale things up enough that k_cycle_get_32() is more reliable.

> Could be. Of course the farther this diverges from what was mostly working before, the more likely we'll find that various platforms no longer pass.

All platforms are currently passing tests.

> It's pretty much guaranteed that removing the tolerance check is going to be a showstopper for Nordic. If you want to go that route, we'll have to do a huge amount of testing.

Sorry, I didn't realize this was that important to Nordic. I did not see any upstream code using it yet, but I had planned to use it to make implementing std::thread and std::this_thread easier, because libc++ / gthr-posix.h could be used as-is (#25569).

I'll see if I can do another rework that is more of a generic solution, and will try to take the tolerance into account.

@pabigot (Collaborator) commented on Sep 22, 2020

> All platforms are currently passing tests.

It isn't sanitycheck I'm so concerned with; it's running on real hardware, especially odd boards.

But I did just try one Nordic, one STM, and one TI board, and they all passed. So I've approved this because I really never want to have to look at this test again.

@cfriedt marked this pull request as draft on September 22, 2020

Commit message (1 of 2):
We must round up to the nearest microsecond in order to fulfill the
nanosleep(2) API requirement of sleeping for *at least* that many
nanoseconds.

The only platform with an upper-bound check right now is Nordic.

Fixes #28483

Signed-off-by: Christopher Friedt <[email protected]>
Commit message (2 of 2):

Here, we include some additional tests for durations that have
sub-microsecond components.

1ns => k_busy_wait(0). Round to 1us.
1us + 1ns => k_busy_wait(1us). Round to 2us.
1s + 1ns => k_busy_wait(1000000us). Round to 1000001us.
1s + 1us + 1ns => k_busy_wait(1000001us). Round to 1000002us.

Fixes #28483

Signed-off-by: Christopher Friedt <[email protected]>
@cfriedt (Member, Author) commented on Sep 22, 2020

> All platforms are currently passing tests.

> It isn't sanitycheck I'm so concerned with; it's running on real hardware, especially odd boards.

> But I did just try one Nordic, one STM, and one TI board, and they all passed. So I've approved this because I really never want to have to look at this test again.

Same.

I looked around for a few hours to see if there was something better than k_busy_wait() but found nothing. If higher resolution timers are not brought out via an API, and if there is no calibrated delay loop, then putting any more work into this is very much a path of diminishing returns, and I am already well beyond the amount of time I wanted to put in for basic API compatibility.

The best I can say is that it complies with the normative spec by delaying for at least the required number of nanoseconds, but that's it for now.

@cfriedt marked this pull request as ready for review on September 22, 2020
```
@@ -199,23 +178,30 @@ static void common(const uint32_t s, uint32_t ns)
	zassert_equal(rem.tv_nsec, 0, "actual: %d expected: %d",
		      rem.tv_nsec, 0);

	uint64_t actual_ns = k_cyc_to_ns_ceil64((now - then));
```
Collaborator commented:

I'm still approving, but this is where it's going to break: when we have a system where the cycle clock is slower than the CPU clock, we'll have to add a fudge factor for lower bounds that aren't met. That'll generally be board-specific; I think one of my reel_boards has exhibited that behavior in the past.

@cfriedt (Member, Author) replied on Sep 22, 2020:


Well, "break" is debatable. From the spec:

> The suspension time may be longer than requested because the argument value is rounded up to an integer multiple of the sleep resolution or because of the scheduling of other activity by the system.

Which clearly applies here because the scheduler is triggered at least once.

The board I'm currently working on is also in the 32k club, so it's also nowhere near accurate.

I considered adding a negative value in rmtp to reflect the overrun w.r.t. rqtp, but Linux and macOS do not do that, and it seems that rmtp is only modified when nanosleep(2) is interrupted by a signal.

Ideally, a solution to #6498 would be rolled out soon-ish, and that should help to fix this issue.

Collaborator replied:

> Well, "break" is debatable. From the spec:

> > The suspension time may be longer than requested because the argument value is rounded up to an integer multiple of the sleep resolution or because of the scheduling of other activity by the system.

Let me try again:

Assume the CPU clock that's used for the nanosleep busy-wait runs at 10 MHz and is precise: there are 10^7 ticks in one second.

There is no access to this as a clock, so a different clock is used for the system clock that counts cycles. Assume it runs at 10 kHz and has a -500 ppm error: it ticks 9995 times in one second.

Then the duration measured for a 1 s sleep taking 10^7 CPU cycles will be 999500 us, which will appear to violate the spec because it's shorter than expected. The test will fail, even though the code does exactly what it's supposed to do. That's what I mean by "break".

This mostly shows up in Nordic because other platforms have synchronous CPU and cycle clocks, but in theory it can apply to any platform. The frequency tolerance for the Nordic LFRC is 5% and for the HFRC is 8%, so a 13% error between the CPU and cycle clocks can be observed. It's pretty likely that at least one board in some test environment somewhere will display a negative ppm error and fail the test.
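(For illustration: the arithmetic from the example above, using the hypothetical 10 kHz / -500 ppm numbers rather than anything measured.)

```c
#include <stdio.h>

int main(void)
{
	/* Hypothetical numbers from the example above, not measurements. */
	const double nominal_hz = 10000.0; /* nominal cycle clock rate        */
	const double actual_hz = 9995.0;   /* -500 ppm: real ticks per second */

	/* A true 1 s busy-wait advances the cycle counter only 9995 ticks,
	 * but the test converts those ticks at the nominal 10 kHz rate. */
	double measured_us = (actual_hz / nominal_hz) * 1000000.0;

	printf("apparent duration: %.0f us for a true 1000000 us sleep\n",
	       measured_us);

	return 0;
}
```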

@cfriedt (Member, Author) replied on Sep 23, 2020:


> This mostly shows up in Nordic because other platforms have synchronous CPU and cycle clocks, but in theory it can apply to any platform. The frequency tolerance for the Nordic LFRC is 5% and for the HFRC is 8%, so a 13% error between the CPU and cycle clocks can be observed. It's pretty likely that at least one board in some test environment somewhere will display a negative ppm error and fail the test.

Thanks for clarifying. I think that is a definite concern when using one clock domain to try to synchronize against a separate clock domain. That is not what is happening here: in the implementation, I took out the busy wait and opted to use k_usleep() instead, which synchronises within the same (albeit slow as heck) clock domain.

Assumption 1: k_usleep() does not use the cycle counter that k_busy_wait() uses.

Assumption 1 seems to be valid but please correct me if I'm wrong.

Assumption 2: calling k_usleep() for any non-zero duration would incur at least 1 expiration of the cycle clock

My assumption was that, by doing so, there would be a guarantee that the cycle clock expires at least once (and likely more than that).

It would appear that Assumption 2 also holds.

So I am skeptical that (now - then) will ever be a negative value in this test.

I realize that nanosleep is incredibly inaccurate this way, but it is compliant with the spec and the tests do not break.

Given that Zephyr is not relying on nanosleep anywhere yet, I feel that the inaccuracy is a lesser evil than requiring all platforms to tune their clock tolerances and do something hackish in the tests. I like keeping things simple-ish.

I think the issue here is that Nordic put a lot of work into trying to tune nanosleep based on clock tolerances, but it was difficult to get tests to pass consistently. Tests also did not pass consistently in qemu or on other platforms I was using. Using clock tolerances is a bit of a slippery slope.

The POSIXly-correct way to fix this would be to provide access to a collection of clocks and their resolutions via clock_getres(), and then call clock_nanosleep() using the clock with the best resolution.

Since we don't (yet) have clock_getres(), and I believe we do not have the ability to use an arbitrary clock to un-suspend a thread, and since we also do not have a calibrated per-CPU delay loop, I'm OK with using k_usleep() until other more reliable and more accurate options are available.

I realize it's a lot to ask, but can you be ok with that as well for the next release?
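(For illustration: a sketch of the portable pattern being described, i.e. query the clock's resolution and round the request up to a multiple of it before sleeping. It assumes clock_getres() and clock_nanosleep() are available, which was not the case in Zephyr at the time; EINTR/retry handling is omitted.)

```c
#include <time.h>

/* Hypothetical sketch: round a sleep request up to the clock's advertised
 * resolution so the kernel never truncates it to something shorter than
 * requested. */
static int sleep_at_least(clockid_t clk, const struct timespec *req)
{
	struct timespec res;
	struct timespec rounded = *req;

	if (clock_getres(clk, &res) == 0 && res.tv_sec == 0 && res.tv_nsec > 0) {
		long rem = rounded.tv_nsec % res.tv_nsec;

		if (rem != 0) {
			rounded.tv_nsec += res.tv_nsec - rem;
			if (rounded.tv_nsec >= 1000000000L) {
				rounded.tv_nsec -= 1000000000L;
				rounded.tv_sec += 1;
			}
		}
	}

	return clock_nanosleep(clk, 0, &rounded, NULL);
}
```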

Collaborator replied:

I'd missed that the implementation had also changed away from busy-wait. I haven't seen anything here that makes me remove my existing approval, but we need @pfalcon to approve this.

For the other stuff, I just want to point to #19030 and its parent #19282. Clocks and time are very important to me.

@pfalcon (Contributor) left a review comment


> I'd missed that the implementation had also changed away from busy-wait. I haven't seen anything here that makes me remove my existing approval, but we need @pfalcon to approve this.

I wonder if ditching k_busy_wait() completely was a bit, well, premature. I'd personally consider it a good idea to use k_busy_wait() for very small delays (below some threshold, e.g. a few microseconds), to let people do funky things, e.g. bit-banging in pure POSIX. That said, I don't have such a use case up my sleeve, this PR has gone through detailed discussion already, and @cfriedt is the original author of this nanosleep() implementation, so let me just approve.

@carlescufi merged commit 75b9292 into zephyrproject-rtos:master on Oct 2, 2020
@cfriedt deleted the issue/28483/fix-nanosleep-2-for-sub-microsecond-durations branch on December 8, 2020