Optimize TaskPools for use in static variables. #12990
Draft
+727
−108
Objective
Fixes #11849. async_executor internally locks a Mutex each time it spawns and finishes a task. This adds a significant amount of overhead for any operation interacting with the task pools, even in single-threaded cases. For more information, see smol-rs/async-executor#112.
Solution
Create a const-constructible version of TaskPool that uses StaticExecutor instead of Executor, and use it to back the static TaskPools. This also eliminates the OnceLock check when fetching the task pool, as a raw 'static borrow of the static variable can be returned without issue. This solution is loosely based on the work in #4740 to specialize a fork of async_executor for Bevy.
This is blocked on async_executor and concurrent_queue publishing new releases with the necessary changes.
TODO: This PR currently duplicates a large amount of code because of the change in the lifetime requirements on self needed to use StaticExecutor's API.
TODO: The docs are still copied over from TaskPool and need to be rewritten.
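As a rough illustration of the access-pattern change described above, here is a minimal std-only sketch. It does not use async_executor or Bevy's actual types: Pool, lazy_pool, and static_pool are hypothetical names, and the atomic counter merely stands in for an executor. It only shows why a const-constructible type lets the accessor return a plain 'static borrow instead of going through a OnceLock.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for a task pool; not Bevy's actual TaskPool.
pub struct Pool {
    spawned: AtomicUsize,
}

impl Pool {
    /// A `const fn` constructor, so the pool can be built directly in a
    /// `static` initializer.
    pub const fn new() -> Self {
        Self {
            spawned: AtomicUsize::new(0),
        }
    }

    /// Takes `&'static self`, mirroring the stricter lifetime requirement
    /// the description mentions. Returns the number of spawns so far.
    pub fn spawn(&'static self) -> usize {
        self.spawned.fetch_add(1, Ordering::Relaxed) + 1
    }
}

// Before (assumed shape): lazily initialized, so every access has to check
// whether the OnceLock has been filled in yet.
static LAZY_POOL: OnceLock<Pool> = OnceLock::new();

pub fn lazy_pool() -> &'static Pool {
    LAZY_POOL.get_or_init(Pool::new)
}

// After (assumed shape): constructed at compile time, so access is just a
// borrow of the static variable.
static STATIC_POOL: Pool = Pool::new();

pub fn static_pool() -> &'static Pool {
    &STATIC_POOL
}
```

Callers use both accessors the same way, for example static_pool().spawn(); only the initialization path differs.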
Performance
See the benchmarks in smol-rs/async-executor#112. For both single-threaded and multithreaded use cases, the overhead of launching, ticking, and finishing tasks is lower, potentially upwards of 9x lower in multithreaded cases under contention.
Future Work
To lower the overhead of other use cases such as scopes, the LocalExecutors and ThreadExecutors may also benefit from being replaced in this fashion.
Changelog
TODO
Migration Guide
TODO