WIP kernel: define and use a formal scheduling API #29668

Closed
171 changes: 171 additions & 0 deletions include/sys/scheduler.h
@@ -0,0 +1,171 @@
/*
* Copyright (c) 2020 Intel Corporation
*
* SPDX-License-Identifier: Apache-2.0
*/

#ifndef ZEPHYR_INCLUDE_SYS_SCHEDULER_H
#define ZEPHYR_INCLUDE_SYS_SCHEDULER_H

#include <kernel.h>
#include <sys_clock.h>
#include <wait_q.h>

/*
 * APIs for working with the Zephyr kernel scheduler. Intended for use in
 * the management of IPC objects, either in the core kernel or in IPC
 * mechanisms implemented by OS compatibility layers, providing basic
 * wait/wake operations with spinlocks used for synchronization.
 *
 * These APIs are public and will be treated as a contract, even if the
 * underlying scheduler implementation changes.
 */

/**
* Callback function prototype for k_sched_wake_one/wake_all
*
* If a thread is woken up, the callback function is called with the thread
* as an argument. The sched_spinlock is held the entire time, ensuring that
* thread state doesn't suddenly change (from some nested IRQ or activity on
* another CPU) when the callback is modifying the thread's state.
*
* It is only inside these callbacks that it is safe to inspect or modify
* the thread that was woken up.
Collaborator:
It would be good to document the intended use of obj and context here.

Also to document what can and cannot be done within the callback. From what I can tell from #29610, if the callback invokes an operation that is a reschedule point, spinlocks held by the caller might be broken.

Contributor Author:
The callback holds sched_spinlock, so the true limitation is that no other API that takes sched_spinlock can be called. Simply stating this would be extremely unclear to users; I'll try to express it in a better way.

Collaborator:
Thanks. So in this case the spinlock wouldn't be broken, because the callback would block forever waiting to take sched_spinlock.

Could you take a look at #29610 and correct my understanding if it isn't true in general that uniprocessor spinlocks (like irq_lock) are broken if the current thread is preempted as a result of making a higher-priority thread ready and rescheduling? Is it true in SMP?

If either is true I'll work on a documentation update, because currently we warn about that only for irq_lock.

*/
typedef void (*k_sched_wake_cb_t)(struct k_thread *thread, void *obj,
void *context);

/**
* Wake up a thread pending on the provided wait queue, with callback
*
* Given a wait_q, wake up the highest priority thread on the queue. If the
* queue is empty, just return false.
*
* Otherwise, do the following, in order, holding sched_spinlock the entire
* time so that the thread state is guaranteed not to change:
* - if the callback function is provided, run the callback function with
* the provided object and context pointers
* - Set the thread's swap return values to swap_retval and swap_data
* - un-pend and ready the thread, but do not invoke the scheduler.
*
* It is up to the caller to implement locking such that the return value of
* this function (whether a thread was woken up or not) does not immediately
* become stale. Calls to wait and wake on the same wait_q object must be
* synchronized. Calling this without holding any spinlock is a sign that
* this API is not being used properly.
*
* Repeatedly calling this function until it returns false is a suitable
* way to wake all threads on the queue.
*
* @param wait_q Wait queue to wake up the highest prio thread
* @param swap_retval Swap return value for woken thread
* @param swap_data Data return value to supplement swap_retval. May be NULL.
* @param cb Callback function, or NULL in which case no callback is invoked.
* @param obj Kernel object associated with this wake operation, passed to cb.
* Only used by the callback. May be NULL.
* @param context Additional context pointer associated with this wake
* operation. Only used by the callback. May be NULL.
* @retval true If a thread was woken up
* @retval false If the wait_q was empty
*/
bool k_sched_callback_wake(_wait_q_t *wait_q, int swap_retval, void *swap_data,
k_sched_wake_cb_t cb, void *obj, void *context);
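
As a rough illustration of how the obj and context parameters might be used (the IPC object, callback, and usage below are hypothetical and not part of this patch), a callback can update per-object bookkeeping while sched_spinlock is still held:

/* Hypothetical example only -- not part of this patch */
struct my_ipc_obj {
	_wait_q_t wait_q;
	struct k_spinlock lock;
	struct k_thread *last_owner;
};

static void my_wake_cb(struct k_thread *thread, void *obj, void *context)
{
	struct my_ipc_obj *ipc = obj;

	ARG_UNUSED(context);

	/* Safe: sched_spinlock is held, so 'thread' cannot change state here */
	ipc->last_owner = thread;
}

/* Waker, with ipc->lock already held:
 *	k_sched_callback_wake(&ipc->wait_q, 0, NULL, my_wake_cb, ipc, NULL);
 */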

/**
* Wake up a thread pending on the provided wait queue
Contributor:
Should document as "wake up the highest priority thread" I think. Strictly, a wait queue will present the highest priority thread that has been in the queue the longest for wakeup at the head of the list. We never had to detail that expressly anywhere because it was an internal API, but it deserves a callout here I think.

Contributor Author:
Yes, agreed, will add this.

*
* Convenience wrapper around k_sched_callback_wake(), for the very common case
* where no callback is required.
*
* @param wait_q Wait queue to wake up the highest prio thread
* @param swap_retval Swap return value for woken thread
* @param swap_data Data return value to supplement swap_retval. May be NULL.
* @retval true If a thread was woken up
* @retval false If the wait_q was empty
*/
static inline bool k_sched_wake(_wait_q_t *wait_q, int swap_retval,
void *swap_data)
{
return k_sched_callback_wake(wait_q, swap_retval, swap_data,
NULL, NULL, NULL);
}

/**
* Wake up all threads pending on the provided wait queue
*
* Convenience function that invokes k_sched_wake() repeatedly until there are
* no more threads to wake up.
*
* @param wait_q Wait queue from which to wake all pending threads
* @param swap_retval Swap return value for woken thread
* @param swap_data Data return value to supplement swap_retval. May be NULL.
* @retval true If any threads were woken up
* @retval false If the wait_q was empty
*/
static inline bool k_sched_wake_all(_wait_q_t *wait_q, int swap_retval,
void *swap_data)
{
bool woken = false;

while (k_sched_wake(wait_q, swap_retval, swap_data)) {
woken = true;
}
Contributor:
This (releasing the lock between each thread) is the way z_unpend_all() works currently, but this probably deserves some thought as to whether we want this to be an atomic operation (hold the lock while you release all threads simultaneously). The latter is something that can't be emulated by the caller, whereas they could write the loop themselves.

The question is: do we want that? I dunno, but it would be more cleanly specified, less surprising, and easy to do right here.

Collaborator:
I had to do something like that in #29618, where I had a list of waiting threads and wanted to wake a subset of them; this was done by iterating over the set and extracting the ones I needed to a separate list under lock before releasing each thread (because the per-thread operation was a reschedule point; see #29610). A waitq has different semantics, so maybe it isn't appropriate for this interface.

Contributor Author (@andrewboie, Nov 4, 2020):
The idea with these APIs is that there are two locks involved:

  • sched_spinlock, which is held for individual calls to k_sched_wake() and which protects the integrity of the scheduling data structures and serializes the state of individual threads that were woken up. This keeps the scheduler consistent. This is all internal to these APIs.
  • Another, IPC-specific lock which establishes atomicity for whatever IPC operations are being done. It's expected that this second lock is held by the caller across all k_sched_wake() and k_sched_wake_all() calls, until it's time to call k_sched_invoke() with that lock.

I put in the documentation for k_sched_wake():

 * It is up to the caller to implement locking such that the return value of
 * this function does not immediately become stale. Calls to wait and wake on
 * the same wait_q object must have synchronization. Calling this without
 * holding any spinlock is a sign that this API is not being used properly.

@andyross so in other words, we actually do have:

    hold the lock while you release all threads simultaneously

as the second lock will cover this case with respect to users of whatever IPC object is being built. I don't think we need to hold sched_spinlock across these calls, as the second lock should be enough.

Let me know if this approach is workable.


/* True if we woke at least one thread up */
return woken;
}

/**
* Atomically put the current thread to sleep on a wait queue, with timeout
*
* The thread will be added to the provided waitqueue. The lock, which should
* be held by the caller with the provided key, will be released once this is
* completely done and we have swapped out.
*
* The return value and data pointer are set by whoever woke us up via
* k_sched_wake.
*
* @param lock Address of spinlock to release when we swap out
* @param key Key to the provided spinlock when it was locked
* @param wait_q Wait queue to go to sleep on
* @param timeout Waiting period to be woken up, or K_FOREVER to wait
* indefinitely.
* @param data Storage location for data pointer set when thread was woken up.
* May be NULL if not used.
* @retval Return value set by whatever woke us up, or -EAGAIN if the timeout
* expired without being woken up.
*/
int k_sched_wait(struct k_spinlock *lock, k_spinlock_key_t key,
_wait_q_t *wait_q, k_timeout_t timeout, void **data);
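
A minimal waiter-side sketch of how this might be used (illustrative only; my_ipc_obj and the resource helpers are hypothetical, not part of this patch):

/* Hypothetical example only -- not part of this patch */
static int my_ipc_take(struct my_ipc_obj *ipc, k_timeout_t timeout)
{
	k_spinlock_key_t key = k_spin_lock(&ipc->lock);

	if (my_resource_available(ipc)) {	/* hypothetical predicate */
		my_resource_claim(ipc);		/* hypothetical helper */
		k_spin_unlock(&ipc->lock, key);
		return 0;
	}

	/* Pends on the wait_q and releases ipc->lock once swapped out;
	 * returns the swap_retval set by the waker, or -EAGAIN on timeout.
	 */
	return k_sched_wait(&ipc->lock, key, &ipc->wait_q, timeout, NULL);
}
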
Contributor:
Random note, but we probably want to promote wait queues to a proper "struct k_wait_q" definition instead of the internal type.

Contributor Author:
Agreed, will add a patch to this series.


/**
* Atomically invoke a scheduling decision
*
* Waking up a thread with APIs like k_sched_wake() un-pends and readies
* the woken thread, but does not implicitly invoke the scheduler to
* immediately schedule it. This allows other tasks to be performed in
* between.
*
* When called, the kernel may context switch to some other thread,
* releasing the lock once we have switched out. The lock will never
* be held when this returns.
*
* If the caller is a co-operative thread, or the next thread to run on this
* CPU is otherwise still the current thread, this just unlocks the spinlock
* and returns.
*
* Most users will use the same lock here that synchronizes k_sched_wait()
* and k_sched_wake() calls.
*
* It is safe (albeit very slightly wasteful) to call this even if no thread
* was woken up.
*
* @param lock Address of spinlock to release when we swap out
* @param key Key to the provided spinlock when it was locked
*/
void k_sched_invoke(struct k_spinlock *lock, k_spinlock_key_t key);
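
A matching waker-side sketch, holding the same IPC-specific lock across the wake and the reschedule, in the same shape as the k_heap_free() and k_mem_slab_free() changes later in this patch (names again hypothetical):

/* Hypothetical example only -- not part of this patch */
static void my_ipc_give(struct my_ipc_obj *ipc)
{
	k_spinlock_key_t key = k_spin_lock(&ipc->lock);

	if (k_sched_wake(&ipc->wait_q, 0, NULL)) {
		/* A waiter was readied; let the scheduler run. The lock is
		 * released by k_sched_invoke() once we have switched out.
		 */
		k_sched_invoke(&ipc->lock, key);
	} else {
		my_resource_release(ipc);	/* hypothetical helper */
		k_spin_unlock(&ipc->lock, key);
	}
}
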
Collaborator:
This would be nice: I think it could be the thing that gets invoked at a reschedule point.

Contributor Author:
This particular API is the reschedule point. It is currently a thin wrapper around z_reschedule().


#endif /* ZEPHYR_INCLUDE_SYS_SCHEDULER_H */
23 changes: 9 additions & 14 deletions kernel/futex.c
@@ -5,12 +5,10 @@
*/

#include <kernel.h>
#include <kernel_structs.h>
#include <spinlock.h>
#include <kswap.h>
#include <syscall_handler.h>
#include <init.h>
#include <ksched.h>
#include <wait_q.h>
#include <sys/scheduler.h>

static struct z_futex_data *k_futex_find_data(struct k_futex *futex)
{
@@ -28,7 +26,6 @@ int z_impl_k_futex_wake(struct k_futex *futex, bool wake_all)
{
k_spinlock_key_t key;
unsigned int woken = 0;
struct k_thread *thread;
struct z_futex_data *futex_data;

futex_data = k_futex_find_data(futex);
@@ -37,17 +34,15 @@
}

key = k_spin_lock(&futex_data->lock);

do {
thread = z_unpend_first_thread(&futex_data->wait_q);
if (thread) {
z_ready_thread(thread);
arch_thread_return_value_set(thread, 0);
if (k_sched_wake(&futex_data->wait_q, 0, NULL)) {
woken++;
} else {
break;
}
} while (thread && wake_all);
} while (wake_all);

z_reschedule(&futex_data->lock, key);
k_sched_invoke(&futex_data->lock, key);

return woken;
}
@@ -81,8 +76,8 @@ int z_impl_k_futex_wait(struct k_futex *futex, int expected,
return -EAGAIN;
}

ret = z_pend_curr(&futex_data->lock,
key, &futex_data->wait_q, timeout);
ret = k_sched_wait(&futex_data->lock, key, &futex_data->wait_q,
timeout, NULL);
if (ret == -EAGAIN) {
ret = -ETIMEDOUT;
}
9 changes: 8 additions & 1 deletion kernel/include/kernel_internal.h
@@ -51,7 +51,7 @@ extern char *z_setup_new_thread(struct k_thread *new_thread,
*
* Threads may be assigned a resource pool, which will be used to allocate
* memory on behalf of certain kernel and driver APIs. Memory reserved
* in this way should be freed with k_free().
* in this way should be freed with z_thread_free().
*
* If called from an ISR, the k_malloc() system heap will be used if it exists.
*
@@ -61,6 +61,13 @@ extern char *z_setup_new_thread(struct k_thread *new_thread,
*/
void *z_thread_malloc(size_t size);

/**
* @brief Free memory allocated from a resource pool
*
* @param ptr Memory to free, or NULL (in which case this is a no-op)
*/
void z_thread_free(void *ptr);
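
A minimal usage sketch of the z_thread_malloc()/z_thread_free() pair for kernel-internal code (illustrative only; the function and length variable are hypothetical):

/* Illustrative sketch, not from this patch */
static int my_internal_op(size_t len)
{
	void *buf = z_thread_malloc(len);

	if (buf == NULL) {
		return -ENOMEM;
	}

	/* ... use buf ... */

	z_thread_free(buf);
	return 0;
}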

/* set and clear essential thread flag */

extern void z_thread_essential_set(void);
10 changes: 10 additions & 0 deletions kernel/include/ksched.h
@@ -13,6 +13,16 @@
#include <tracing/tracing.h>
#include <stdbool.h>

/* WARNING WARNING WARNING HERE THERE BE DRAGONS
*
* Many of the APIs here are unsafe to call without sched_spinlock held.
* This is particularly the case with most of the APIs here which take a
* struct k_thread pointer argument.
*
* The contents of this file should be treated as private to the core kernel
* and may change at any time.
*/

BUILD_ASSERT(K_LOWEST_APPLICATION_THREAD_PRIO
>= K_HIGHEST_APPLICATION_THREAD_PRIO);

11 changes: 6 additions & 5 deletions kernel/kheap.c
@@ -5,9 +5,10 @@
*/

#include <kernel.h>
#include <ksched.h>
#include <wait_q.h>
#include <init.h>
#include <sys/scheduler.h>
#include <kernel_arch_interface.h>

void k_heap_init(struct k_heap *h, void *mem, size_t bytes)
{
@@ -42,8 +43,8 @@ void *k_heap_alloc(struct k_heap *h, size_t bytes, k_timeout_t timeout)
break;
}

(void) z_pend_curr(&h->lock, key, &h->wait_q,
K_TICKS(end - now));
(void)k_sched_wait(&h->lock, key, &h->wait_q,
K_TICKS(end - now), NULL);
key = k_spin_lock(&h->lock);
}

@@ -57,8 +58,8 @@ void k_heap_free(struct k_heap *h, void *mem)

sys_heap_free(&h->heap, mem);

if (z_unpend_all(&h->wait_q) != 0) {
z_reschedule(&h->lock, key);
if (k_sched_wake_all(&h->wait_q, 0, NULL)) {
k_sched_invoke(&h->lock, key);
} else {
k_spin_unlock(&h->lock, key);
}
14 changes: 4 additions & 10 deletions kernel/mem_slab.c
@@ -14,6 +14,7 @@
#include <ksched.h>
#include <init.h>
#include <sys/check.h>
#include <sys/scheduler.h>

static struct k_spinlock lock;

@@ -118,11 +119,7 @@ int k_mem_slab_alloc(struct k_mem_slab *slab, void **mem, k_timeout_t timeout)
result = -ENOMEM;
} else {
/* wait for a free block or timeout */
result = z_pend_curr(&lock, key, &slab->wait_q, timeout);
if (result == 0) {
*mem = _current->base.swap_data;
}
return result;
return k_sched_wait(&lock, key, &slab->wait_q, timeout, mem);
}

k_spin_unlock(&lock, key);
@@ -133,12 +130,9 @@ int k_mem_slab_alloc(struct k_mem_slab *slab, void **mem, k_timeout_t timeout)
void k_mem_slab_free(struct k_mem_slab *slab, void **mem)
{
k_spinlock_key_t key = k_spin_lock(&lock);
struct k_thread *pending_thread = z_unpend_first_thread(&slab->wait_q);

if (pending_thread != NULL) {
z_thread_return_value_set_with_data(pending_thread, 0, *mem);
z_ready_thread(pending_thread);
z_reschedule(&lock, key);
if (k_sched_wake(&slab->wait_q, 0, *mem)) {
k_sched_invoke(&lock, key);
} else {
**(char ***)mem = slab->free_list;
slab->free_list = *(char **)mem;
22 changes: 22 additions & 0 deletions kernel/mempool.c
@@ -108,3 +108,25 @@ void *z_thread_malloc(size_t size)

return ret;
}

void z_thread_free(void *ptr)
{
#ifdef CONFIG_MEM_POOL_HEAP_BACKEND
/* Basically k_heap_free() without scheduling hooks; we never set a
* timeout for z_thread_malloc() anyway
*/
if (ptr != NULL) {
k_spinlock_key_t key;
struct k_mem_block_id *id;

/* point to hidden block descriptor at start of block */
id = (struct k_mem_block_id *)((char *)ptr -
WB_UP(sizeof(struct k_mem_block_id)));

/* return block to the heap memory pool */
key = k_spin_lock(&id->heap->lock);
sys_heap_free(&id->heap->heap, id->data);
k_spin_unlock(&id->heap->lock, key);
}
#endif
}
Loading