
WIP kernel: define and use a formal scheduling API #29668

Closed

Conversation

@andrewboie (Contributor) commented Oct 30, 2020

The intention of this header is to provide a supported interface for building IPC objects. Callers can use these functions to implement IPC without worrying about safety with respect to sched_spinlock or thread objects unexpectedly changing state.

Previously, implementations of IPC objects were using various ksched.h APIs which are kernel-private and can present thread safety issues. This is a formal set of APIs which should always do the right thing.
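
To illustrate the race being closed, here is a rough sketch of the legacy pattern (illustrative only, not literal code from the tree; "legacy_obj" is a hypothetical IPC object and the calls come from the kernel-private ksched.h): an IPC "give" unpends and readies the waiter as two separate steps.

struct legacy_obj {
	struct k_spinlock lock;
	_wait_q_t wait_q;
};

static void legacy_give(struct legacy_obj *obj)
{
	k_spinlock_key_t key = k_spin_lock(&obj->lock);
	struct k_thread *thread = z_unpend_first_thread(&obj->wait_q);

	if (thread != NULL) {
		arch_thread_return_value_set(thread, 0);
		/* On SMP the thread's state can change between the unpend
		 * above and the ready below, since only obj->lock (not
		 * sched_spinlock) is held here.
		 */
		z_ready_thread(thread);
	}
	z_reschedule(&obj->lock, key);
}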

Variants using irq locking instead of spinlocks have been intentionally omitted; anything still doing this should be considered legacy and needs to be converted ASAP.

A warning is added to ksched.h to convey how perilous it can be to use the APIs within until further cleanup and scoping can be done.

This PR addresses one possible cause of #28105 but does not completely resolve it; occasional "Attempt to resume un-suspended thread object" faults can still be observed.

Some of the more complex IPC objects have not been converted yet; this PR is a draft.

A problem with a recursive spinlock (sched_spinlock) when freeing memory on thread exit has been worked around; once #28611 lands I can work up a better solution.

TODO:

  • Convert k_mutex
  • Convert k_pipe
  • Convert k_timer
  • Convert k_mailbox
  • Convert k_poll
  • Analyze footprint changes
  • Fix failing test_poll_wait test in tests/kernel/poll; the issue is likely with the k_queue conversion somewhere
  • Dedicated tests for these new APIs with full code coverage
  • File enhancements to get POSIX and CMSIS converted
  • Re-work z_thread_free() after "Deprecate k_mem_pool API, remove sys_mem_pool allocator" #28611 lands

Andrew Boie added 9 commits October 30, 2020 13:25
To match z_thread_malloc() calls. Does not invoke the scheduler.

Signed-off-by: Andrew Boie <[email protected]>
The intention of this header is to provide a supported interface
for building IPC objects. Callers can use these functions to
implement IPC without worrying about safety with respect to
sched_spinlock or thread objects unexpectedly changing state.

Previously, implementations of IPC objects were using various
ksched.h APIs which are kernel-private and can present thread
safety issues. This is a formal set of APIs which should always
do the right thing.

Variants using irq locking instead of spinlocks have been
intentionally omitted; anything still doing this should be
considered legacy and unsupported.

A warning is added to ksched.h to convey how perilous it can be
to use the APIs within; we're not even using these in a completely
safe way within the core kernel, much less users outside of it,
although there are no known issues on uniprocessor systems.

Signed-off-by: Andrew Boie <[email protected]>
Closes races where thread state could change in between the
z_unpend_first_thread() and z_ready_thread() calls on SMP.

Signed-off-by: Andrew Boie <[email protected]>
Closes races where 'pending_thread' could unexpectedly
change state in between ksched.h calls, which are no longer
used.

Signed-off-by: Andrew Boie <[email protected]>
Closes races where woken up thread could unexpectedly
change state in between ksched.h calls, which are no longer
used.

Signed-off-by: Andrew Boie <[email protected]>
Closes races where woken up thread could unexpectedly
change state in between ksched.h calls, which are no longer
used.

Signed-off-by: Andrew Boie <[email protected]>
Closes races where woken up thread could unexpectedly
change state in between ksched.h calls, which are no longer
used.

Signed-off-by: Andrew Boie <[email protected]>
Closes races where woken up thread could unexpectedly
change state in between ksched.h calls, which are no longer
used.

Signed-off-by: Andrew Boie <[email protected]>
Closes races where woken up thread could unexpectedly
change state in between ksched.h calls, which are no longer
used.

Signed-off-by: Andrew Boie <[email protected]>
@andyross (Contributor) left a comment:

Quick notes on the API. This all looks pretty good to me.

k_sched_wake_cb_t cb, void *obj, void *context);

/**
* Wake up a thread pending on the provided wait queue
@andyross (Contributor):

Should document as "wake up the highest priority thread" I think. Strictly a wait queue will present the highest priority thread that has been in the queue the longest for wakeup at the head of the list. We never had to detail that expressly anywhere because it was an internal API, but it deserves a callout here I think.

@andrewboie (Contributor Author):

Yes agreed, will add this.
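
Something along these lines (illustrative wording, not the final patch):

/**
 * Wake up the highest priority thread pending on the provided wait queue.
 *
 * Among pending threads of equal priority, the thread that has been
 * waiting the longest is woken first.
 */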


while (k_sched_wake(wait_q, swap_retval, swap_data)) {
	woken = true;
}
@andyross (Contributor):

This (releasing the lock between each thread) is the way z_unpend_all() works currently, but it probably deserves some thought as to whether we want this to be an atomic operation (hold the lock while you release all threads simultaneously). The latter can't be emulated by the caller, whereas they could write this loop themselves.

The question is do we want that? I dunno, but it would be more cleanly specified, less surprising, and easy to do right here.

@pabigot (Collaborator):

I had to do something like that in #29618, where I had a list of waiting threads and wanted to wake a subset of them. I did it by iterating over the set and extracting the ones I needed to a separate list under lock, before releasing each thread (because the per-thread operation was a reschedule point; see #29610). A waitq has different semantics, so maybe it isn't appropriate for this interface.

@andrewboie (Contributor Author) commented Nov 4, 2020:

The idea with these APIs is that there are two locks involved:

  • sched_spinlock, which is held internally for individual calls to k_sched_wake(); it protects the integrity of the scheduling data structures and serializes the state of the individual threads being woken up. This keeps the scheduler consistent, and is all internal to these APIs.
  • Another, IPC-specific lock, which establishes atomicity for whatever IPC operations are being done. This second lock is expected to be held by the caller across all k_sched_wake() and k_sched_wake_all() calls, until it's time to call k_sched_invoke() with that lock (see the sketch below).

I put in the documentation for k_sched_wake():

 * It is up to the caller to implement locking such that the return value of
 * this function does not immediately become stale. Calls to wait and wake on
 * the same wait_q object must have synchronization. Calling this without
 * holding any spinlock is a sign that this API is not being used properly.

@andyross so in other words, we actually do have:

hold the lock while you release all threads simultaneously

as the second lock covers this case with respect to users of whatever IPC object is being built.
I don't think we need to hold sched_spinlock across these calls; the second lock should be enough.

Let me know if this approach is workable.
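
To make this concrete, here is a rough sketch of a minimal IPC object built on the proposed API (the struct and its fields are illustrative, and the return conventions of k_sched_wait() and k_sched_wake() are assumed, not confirmed by this excerpt):

struct my_event {
	struct k_spinlock lock;	/* the IPC-specific "second lock" */
	_wait_q_t wait_q;
	bool signaled;
};

void my_event_signal(struct my_event *evt)
{
	k_spinlock_key_t key = k_spin_lock(&evt->lock);

	evt->signaled = true;

	/* sched_spinlock is taken internally for each call; evt->lock
	 * keeps the whole wake-up sequence atomic with respect to other
	 * users of this object.
	 */
	while (k_sched_wake(&evt->wait_q, 0, NULL)) {
		/* keep waking until the wait queue is empty */
	}

	/* Releases evt->lock; this is the reschedule point. */
	k_sched_invoke(&evt->lock, key);
}

int my_event_wait(struct my_event *evt, k_timeout_t timeout)
{
	k_spinlock_key_t key = k_spin_lock(&evt->lock);

	if (evt->signaled) {
		k_spin_unlock(&evt->lock, key);
		return 0;
	}

	/* Assumed to atomically release evt->lock and pend, returning 0
	 * on wakeup and a negative value if the timeout expired.
	 */
	return k_sched_wait(&evt->lock, key, &evt->wait_q, timeout, NULL);
}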

* expired without being woken up.
*/
int k_sched_wait(struct k_spinlock *lock, k_spinlock_key_t key,
                 _wait_q_t *wait_q, k_timeout_t timeout, void **data);
@andyross (Contributor):

Random note, but we probably want to promote wait queues to a proper "struct k_wait_q" definition instead of the internal type.

@andrewboie (Contributor Author):

Agreed, will add a patch to this series.

* another CPU) when the callback is modifying the thread's state.
*
* It is only inside these callbacks that it is safe to inspect or modify
* the thread that was woken up.
@pabigot (Collaborator):

It would be good to document the intended use of obj and context here.

Also to document what can and cannot be done within the callback. From what I can tell from #29610, if the callback invokes an operation that's a reschedule point, spinlocks held by the caller might be broken.

@andrewboie (Contributor Author):

The callback holds sched_spinlock, so the true limitation is that no other API that takes sched_spinlock can be called. Simply stating this would be extremely unclear to users; I'll try to express it in a better way.
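
For illustration, a callback that stays within that limitation might look like this (the exact k_sched_wake_cb_t signature isn't shown in this excerpt, so the thread/obj/context parameters, "struct my_ipc_obj", and its num_waiters field are all assumptions):

struct my_ipc_obj {
	int num_waiters;
};

static void my_wake_cb(struct k_thread *thread, void *obj, void *context)
{
	/* Runs with sched_spinlock held: only touch the woken thread and
	 * the IPC object's own bookkeeping here.
	 */
	struct my_ipc_obj *ipc = obj;

	thread->base.swap_data = context;	/* plain data handoff */
	ipc->num_waiters--;

	/* NOT safe here: anything that itself takes sched_spinlock, e.g.
	 * k_sched_wake(), k_sem_give(), k_thread_priority_set(); such a
	 * call would deadlock here.
	 */
}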

@pabigot (Collaborator):

Thanks. So in this case the spinlock wouldn't be broken because the callback would block forever waiting to take sched_spinlock.

Could you take a look at #29610 and correct my understanding if it isn't true in general that uniprocessor spinlocks (like irq_lock) are broken if the current thread is preempted as a result of making a higher priority thread ready and rescheduling? Is it true in SMP?

If either is true I'll work on a documentation update, because currently we warn about that only for irq_lock.



* @param lock Address of spinlock to release when we swap out
* @param key Key to the provided spinlock when it was locked
*/
void k_sched_invoke(struct k_spinlock *lock, k_spinlock_key_t key);
@pabigot (Collaborator):

This would be nice: I think it could be the thing that gets invoked at a reschedule point.

@andrewboie (Contributor Author):

This particular API is the reschedule point; it is currently a thin wrapper around z_reschedule().
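
i.e. roughly (a sketch of that thin wrapper, assuming it forwards directly to the internal call):

void k_sched_invoke(struct k_spinlock *lock, k_spinlock_key_t key)
{
	z_reschedule(lock, key);
}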

pabigot added a commit to pabigot/zephyr that referenced this pull request Dec 20, 2020
This uses the cherry-picked scheduler API from zephyrproject-rtos#29668 to avoid certain
race conditions in SMP between unpending and readying threads.

Signed-off-by: Peter Bigot <[email protected]>
github-actions bot commented Jan 4, 2021

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed; otherwise this pull request will automatically be closed in 14 days. Note that you can always re-open a closed pull request at any time.

github-actions bot added the Stale label Jan 4, 2021
@andrewboie removed the Stale label Jan 5, 2021
pabigot added a commit to pabigot/zephyr that referenced this pull request Jan 19, 2021
These functions are a subset of proposed public APIs to clean up
several issues related to safely handling waking of threads.  They
have been made private as their interface may change, but their use
will simplify the reimplementation of the k_work functionality.

See: zephyrproject-rtos#29668

Signed-off-by: Andrew Boie <[email protected]>
Signed-off-by: Peter Bigot <[email protected]>
@pabigot self-assigned this Feb 10, 2021
pabigot added a commit to pabigot/zephyr that referenced this pull request Mar 1, 2021
These functions are a subset of proposed public APIs to clean up
several issues related to safely handling waking of threads.  They
have been made private as their interface may change, but their use
will simplify the reimplementation of the k_work functionality.

See: zephyrproject-rtos#29668

Signed-off-by: Andrew Boie <[email protected]>
Signed-off-by: Peter Bigot <[email protected]>
pabigot added a commit to pabigot/zephyr that referenced this pull request Mar 2, 2021
These functions are a subset of proposed public APIs to clean up
several issues related to safely handling waking of threads.  They
have been made private as their interface may change, but their use
will simplify the reimplementation of the k_work functionality.

See: zephyrproject-rtos#29668

Signed-off-by: Andrew Boie <[email protected]>
Signed-off-by: Peter Bigot <[email protected]>
nashif pushed a commit that referenced this pull request Mar 4, 2021
These functions are a subset of proposed public APIs to clean up
several issues related to safely handling waking of threads.  They
have been made private as their interface may change, but their use
will simplify the reimplementation of the k_work functionality.

See: #29668

Signed-off-by: Andrew Boie <[email protected]>
Signed-off-by: Peter Bigot <[email protected]>
@pabigot removed their assignment Mar 22, 2021
coreboot-org-bot pushed a commit to coreboot/zephyr-cros that referenced this pull request Jul 7, 2021
These functions are a subset of proposed public APIs to clean up
several issues related to safely handling waking of threads.  They
have been made private as their interface may change, but their use
will simplify the reimplementation of the k_work functionality.

See: zephyrproject-rtos/zephyr#29668

BUG=none
TEST=none

Signed-off-by: Andrew Boie <[email protected]>
Signed-off-by: Peter Bigot <[email protected]>
(cherry picked from commit 0259c86)
Change-Id: I9e7288cecafb05dcd126ba311407670c0ac6215d
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/zephyr/+/3004965
Reviewed-by: Jack Rosenthal <[email protected]>
Reviewed-by: Denis Brockus <[email protected]>
Commit-Queue: Denis Brockus <[email protected]>
Tested-by: Denis Brockus <[email protected]>
github-actions bot commented Jul 14, 2021

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed; otherwise this pull request will automatically be closed in 14 days. Note that you can always re-open a closed pull request at any time.

github-actions bot added the Stale label Jul 14, 2021
github-actions bot closed this Jul 29, 2021