single vs multi threaded as a builtin compile time option #1764

Closed
andrewrk opened this issue Nov 20, 2018 · 12 comments
Labels
accepted This proposal is planned. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Comments

@andrewrk
Member

andrewrk commented Nov 20, 2018

There are many places where it matters whether code is single- or multi-threaded. For example:

  • whether to use a mutex to protect std.debug.warn
  • whether to use a mutex to protect panic
  • whether to call exit() or exit_group()
  • std.event.Loop.initSingleThreaded() vs std.event.Loop.initMultiThreaded()
  • whether coroutine await and coroutine return need to use atomic xchg operations to coordinate
  • whether to select a thread-safe allocator or non-thread-safe allocator by default

This proposal is to add --single-threaded as a build option that is exposed as @import("builtin").single_threaded. Libraries can then use this to write code that is optimal for both cases. Most libraries can ignore the boolean and instead offer functions that operate only on their input parameters or are simply not documented to be thread-safe. This value becomes useful when a library has thread-safe functions whose implementation can be optimized when it is known that there will only ever be a single thread. For example, std.atomic.Queue can simply omit all the mutex locking code, and its functions remain "thread safe" because there will only ever be one thread.
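
A minimal sketch of how a library might act on this flag. It assumes present-day Zig syntax and std.Thread.Mutex, neither of which is confirmed by this thread; Counter and Lock are hypothetical names used only for illustration.

```zig
const std = @import("std");
const builtin = @import("builtin");

// Pick a lock type at compile time: a real mutex when threads may exist,
// a zero-size stand-in with empty methods when the build is single-threaded.
const Lock = if (builtin.single_threaded)
    struct {
        pub fn lock(_: *@This()) void {}
        pub fn unlock(_: *@This()) void {}
    }
else
    std.Thread.Mutex;

// Hypothetical library type; not part of the standard library.
const Counter = struct {
    guard: Lock = .{},
    value: u64 = 0,

    // The call sites are identical in both modes; only the lock type differs.
    pub fn increment(self: *Counter) void {
        self.guard.lock();
        defer self.guard.unlock();
        self.value += 1;
    }
};

pub fn main() void {
    var counter = Counter{};
    counter.increment();
    counter.increment();
    std.debug.print("value = {d}\n", .{counter.value});
}
```

The public API stays the same in both modes, which is the "optimal for both cases" property described above: callers never check the flag themselves.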

In the same way we plan to be able to set the build mode in a scope (#978) it should be possible to set the multithreaded-ness in a scope.

It's important to note that coroutines, in particular, become extremely low overhead when compiling with --single-threaded. A --release-fast --single-threaded build which uses coroutines and always stack-allocates the frames would in theory generate the exact same code as if it used normal functions.

@andrewrk andrewrk added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Nov 20, 2018
@andrewrk andrewrk added this to the 0.5.0 milestone Nov 20, 2018
@andrewrk andrewrk added the accepted This proposal is planned. label Nov 21, 2018
@andrewrk
Member Author

As mentioned in the above linked issue, another thing this option should do is turn thread local variables into global variables.
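
For illustration, a small sketch of the kind of declaration this would affect, assuming present-day Zig syntax. The names request_count and recordRequest are hypothetical. Under the proposal, a --single-threaded build could treat the variable as an ordinary global, since there is only one thread to own it.

```zig
const std = @import("std");

// Normally each thread gets its own copy of this variable.
// With only one thread, a single plain global would behave identically.
threadlocal var request_count: u32 = 0;

// Bump the calling thread's counter and return the new value.
fn recordRequest() u32 {
    request_count += 1;
    return request_count;
}

pub fn main() void {
    _ = recordRequest();
    std.debug.print("count = {d}\n", .{recordRequest()});
}
```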

@andrewrk andrewrk modified the milestones: 0.5.0, 0.4.0 Nov 21, 2018
@Sahnvour
Contributor

This is an interesting idea. Regarding only userland code, do you see advantages other than providing "API sugar" to libraries, i.e. having mirrored single/multi-threaded APIs for free?

In the same way we plan to be able to set the build mode in a scope (#978) it should be possible to set the multithreaded-ness in a scope.

I think this is very much needed, as a global flag would be far too coarse-grained. But how would the scope-based instruction and the global compilation flag interact?

For example, you compile your project in --single-threaded because it is, mostly, but you have one constrained area of code that does some multithreading, for example a single function.
Imagine this function (with @SetMultithreaded()) and others (which are compiled with single_threaded being true) share some state -- say a collection -- that supports both single- and multi-threadedness, and this collection is instantiated in the outer scope, i.e. where we assume single-threaded code. The implementation of the struct has been compiled with the assumption that it will only be single-threaded, possibly ignoring some members and some code. And then we use it in a scope that is multithreaded.

Is this possible, and if so what happens? Do we forbid state sharing across multithreaded-ness boundaries when it would make a difference (annoying)? Do we recompile the struct with single_threaded being false to be conservative (suboptimal)?

@euantorano
Contributor

It may also be worth noting that Nim provides a similar option (--threads:on) and currently defaults to having threading support disabled. I believe they plan to enable it by default in the future, as they get a lot of questions about it.

@daurnimator
Contributor

Beware that this is the source of many glibc bugs. See http://lua-users.org/lists/lua-l/2015-04/msg00010.html and then https://sourceware.org/bugzilla/show_bug.cgi?id=18192
The solution to that problem is to always link with libpthread, even if you're not using threads. This would be equivalent to banning the single-threaded mode you propose.

@andrewrk
Member Author

Sure, it's quite possible that you get an error if you try to use --single-threaded and --library c at the same time. But that's one of the beautiful things about Zig not depending on libc: for applications which do not depend on it, we have the power to do these things. There are real motivations and rewards for abandoning libc.

@daurnimator
Contributor

If you're creating a library you can't control who might use you: what if you are used in a multi-threaded program?

If you're creating an executable that loads any dynamic libraries, then you need to be multi-thread ready just in case one of the libraries you load (or even their dependencies) is multi-threaded.

If you proceed with this issue, I think you need some sort of marker injected into the binary/library saying "single-threaded", and make it an error to load a library without that flag.

@andrewrk
Member Author

andrewrk commented Feb 1, 2019

This is an interesting idea. Regarding only userland code, do you see advantages other than providing "API sugar" to libraries, i.e. having mirrored single/multi-threaded APIs for free?

All the userland benefits I can think of can be categorized as mirrored single/multi-threaded APIs for free. It seems like a worthwhile benefit to me.

I think this is very much needed, as a global flag would be far too coarse-grained. But how would the scope-based instruction and the global compilation flag interact?

This (and your following description) is a good question. When I originally talked about supporting this flag at any scope, I was thinking about language differences, such as whether coroutines are emitted with atomic operations. You could do this, for example, if you could guarantee that a particular coroutine or set of coroutines would always be created, suspended, resumed, and awaited from the same thread. However, it's not clear how this would work in general.

For userland code, which will be checking @import("builtin").single_threaded, it does not make sense for this value to differ from scope to scope. It might make sense, instead, to change this at a package level. However, the ability to change build parameters at package level is something I feel comfortable leaving until the package manager (#943) is further along. So I think this option can be a global build option for now.

If you're creating a library you can't control who might use you: what if you are used in a multi-threaded program?

Then they'll be building with the single-threaded flag off, and everything will work correctly. In order to mess things up the library user would have to:

  • build your library as a .a/.lib or .so/.dll rather than the usual way of building against source
  • make the conscious choice to enable single-threaded mode
  • then proceed to violate the contract and use your API with more than one thread.

Note that even multithreaded applications can use code built in single-threaded mode, as long as the single-threaded code only ever sees a single thread for the entire lifetime of the application.

If you proceed with this issue, I think you need some sort of marker injected into the binary/library saying "single-threaded", and make it an error to load a library without that flag.

This is a good idea, and would be further enhanced by #1535.

@andrewrk
Member Author

andrewrk commented Feb 1, 2019

One problem I just ran into is that even in single-threaded mode, std.event.Loop still wants to make a background thread on Linux for blocking file system operations, to make them async. @bnoordhuis are you aware of a reasonable API for non-blocking file system operations that would work with no threads?

Otherwise, one of these things must happen:

  • building in --single-threaded mode would make it impossible to use the std.event.Loop API.
  • when using an event loop in --single-threaded mode, file system operations would be blocking and not async.
  • the ability to set multithreaded-ness in a scope (see discussion above), somehow, so that std.event.Loop can make an exception for its file system background thread. This exception would be safe because all the code that could get called from that extra thread would set the override. One tricky problem to solve would be data structures. For example, std.atomic.Queue would have to get instantiated 2 different ways. Zig would have to notice that it depends on the single-threaded flag, and make that a secret comptime parameter. That could be especially confusing because two instances of this data structure would have types that do not compare equal if they were instantiated from scopes with differing single-threaded flags. Essentially, this boils down to a fundamental conflict between a global "it is guaranteed there are no threads in the entire application" and "ok actually std.event.Loop needs a thread". Alternatively, std.atomic.Queue would have to explicitly accept a comptime boolean parameter allowing override of single-threaded/multi-threaded behavior.
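
A rough sketch of the last alternative, an explicit comptime boolean on the container type. It assumes present-day Zig syntax and std.Thread.Mutex. BoundedStack and its thread_safe parameter are hypothetical stand-ins for a queue-like type, not an existing std API.

```zig
const std = @import("std");
const builtin = @import("builtin");

// Hypothetical container that takes its threading mode as an explicit
// comptime parameter instead of reading the global flag itself.
fn BoundedStack(comptime T: type, comptime thread_safe: bool) type {
    return struct {
        const Self = @This();

        // Real mutex when requested, zero-size no-op lock otherwise.
        const Lock = if (thread_safe) std.Thread.Mutex else struct {
            pub fn lock(_: *@This()) void {}
            pub fn unlock(_: *@This()) void {}
        };

        guard: Lock = .{},
        items: [8]T = undefined,
        len: usize = 0,

        // Returns false when the fixed-size buffer is full.
        pub fn push(self: *Self, item: T) bool {
            self.guard.lock();
            defer self.guard.unlock();
            if (self.len == self.items.len) return false;
            self.items[self.len] = item;
            self.len += 1;
            return true;
        }

        // Returns null when the stack is empty.
        pub fn pop(self: *Self) ?T {
            self.guard.lock();
            defer self.guard.unlock();
            if (self.len == 0) return null;
            self.len -= 1;
            return self.items[self.len];
        }
    };
}

pub fn main() void {
    // Follow the global build flag by default...
    var local = BoundedStack(u32, !builtin.single_threaded){};
    // ...but force locking for state shared with a background thread.
    var shared = BoundedStack(u32, true){};
    _ = local.push(1);
    _ = shared.push(7);
    std.debug.print("{d} {d}\n", .{ local.pop().?, shared.pop().? });
}
```

With the parameter spelled out, BoundedStack(u32, true) and BoundedStack(u32, false) are visibly different types, so the type inequality mentioned above is at least explicit at the call site rather than hidden.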

@andrewrk
Member Author

andrewrk commented Feb 1, 2019

io_submit could be an interesting solution, but it depends on a relatively new Linux kernel version. (Related: #1907)

@daurnimator
Contributor

build your library as a .a/.lib or .so/.dll rather than the usual way of building against source

To me this is the biggest selling point of Zig: being able to generate C ABI libraries that can be used from other languages. I think saying "rather than the usual way" takes far too Zig-centric a view.

@andrewrk
Member Author

andrewrk commented Feb 1, 2019

That's fair enough, but the other 2 points still stand, and, for the case where you are generating a C ABI library, you are in control of your own build process. So don't override the default by passing --single-threaded and you're golden.

@bnoordhuis
Contributor

are you aware of a reasonable API for non-blocking file system operations that would work with no threads?

@andrewrk The short answer, unfortunately, is 'no'. Native Linux AIO is unreliable: it can still block or fail.

I'm unclear on whether FreeBSD's and Solaris's AIO APIs are reliable. They don't seem to be in widespread use; make of that what you will.

I'm working on IOCB_CMD_POLL support in libuv for network I/O but regular file I/O will keep on using the thread pool.
