Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid spawning the async-io thread if possible #40

Open
Kestrer opened this issue Oct 30, 2020 · 8 comments
Open

Avoid spawning the async-io thread if possible #40

Kestrer opened this issue Oct 30, 2020 · 8 comments

Comments

@Kestrer
Copy link

Kestrer commented Oct 30, 2020

In most programs that use async-io, the user will simply run block_on at the top level. However, this will still spawn the async-io thread despite it not being necessary. Would it be possible to avoid that?

danieldg added a commit to danieldg/async-io that referenced this issue Jul 25, 2021
This passes the responsibility for calling the Reactor between threads
instead of relying on main_thread to pick up the abandoned Reactor.
Threads that are waiting in block_on are likely to acquire the reactor
on their second wait, and the main thread is only woken if there are no
block_on threads polling.

This also adds support for disabling the async-io thread which allows
applications that only process I/O from within block_on to avoid the
needless thread creation.  It also supports designating the async-io
thread as the sole owner of the Reactor to avoid this pass-the-reactor
behavior, if that is undesired.

fixes smol-rs#40
danieldg added a commit to danieldg/async-io that referenced this issue Jul 25, 2021
This passes the responsibility for calling the Reactor between threads
instead of relying on the async-io thread to pick up the abandoned
Reactor.  Threads that are waiting in block_on are likely to acquire the
reactor on their second wait, and the main thread is only woken if there
are no block_on threads polling.

This also adds support for disabling the async-io thread which allows
applications that only process I/O from within block_on to avoid the
needless thread creation.  It also supports designating the async-io
thread as the sole owner of the Reactor to avoid this pass-the-reactor
behavior, if that is undesired.

fixes smol-rs#40
danieldg added a commit to danieldg/async-io that referenced this issue Jul 25, 2021
This passes the responsibility for calling the Reactor between threads
instead of relying on the async-io thread to pick up the abandoned
Reactor.  Threads that are waiting in block_on are likely to acquire the
reactor on their second wait, and the main thread is only woken if there
are no block_on threads polling.

This also adds support for disabling the async-io thread, which allows
applications that only process I/O from within block_on to avoid the
needless thread creation.  It also supports designating the async-io
thread as the sole owner of the Reactor to avoid this pass-the-reactor
behavior, if that is undesired.

fixes smol-rs#40
@notgull
Copy link
Member

notgull commented Aug 19, 2022

I don't know, this feels like it would be an easy footgun to accidentally set off. See the issues with #59

@taiki-e
Copy link
Collaborator

taiki-e commented Aug 19, 2022

Yeah, both of the currently submitted PRs seem to cause problems with the existing code when the threads are disabled.

@notgull
Copy link
Member

notgull commented Aug 21, 2022

@taiki-e Do you mind if I close this as well as #59? At worst, all the async-io thread will do is idle on the reactor lock and occasionally pick up slack. Even in those cases, it can prevent programs from grinding to a halt. The only reason I can think of to plan this is if we wanted to port smol to no_std for some reason.

@taiki-e
Copy link
Collaborator

taiki-e commented Aug 21, 2022

Yeah, theoretically it would be more efficient to have no threads, but I don't see this as some kind of bug or problem that needs to be fixed in the near future.

@notgull notgull closed this as not planned Won't fix, can't repro, duplicate, stale Aug 21, 2022
@notgull
Copy link
Member

notgull commented Sep 22, 2023

Reopening this as apparently this is a sticking point for some users

@notgull notgull reopened this Sep 22, 2023
notgull added a commit to smol-rs/async-process that referenced this issue Sep 23, 2023
I know I said that I wouldn't add any more features, but I
think this is important enough.

Right now, a thread called "async-process" is responsible for listening
for SIGCHLD and reaping zombie processes. This listens for the SIGCHLD
signal in Unix and uses a channel connected to the waitable handle on
Windows. While this works, we can do better. Through async-signal, the
signal was already asynchronous on Unix; we were already just using
async_io::block_on to wait on the signal. After swapping out the channel
used on Windows with async-channel, the process reaping function "reap"
can be reimplemented as a fully asynchronous future.

From here we must make sure this future is being polled at all times. To
facilitate this, a function named "driver()" is added to the public API.
This future acquires a lock on the reaper structure and calls the
"reap()" future indefinitely. Multiple drivers can be created at once;
they will just wait forever on this lock. This future is intended to be
spawned onto an executor and left to run forever, making sure all child
processes are signalled whenever necessary. If no tasks are running the
driver future, the "async-process" thread is spawned and runs the
"reap()" future itself.

I've added the following controls to make sure that this system is
robust:

- If a "driver" task is dropped, another "driver" task will acquire the
  lock and keep the reaper active.
- Before being dropped, the task checks to see if it is the last driver.
  If it is, it will spawn the "async-process" thread to be the driver.
- When a Child is being created, it checks if there are any active
  drivers. If there are none, it spawns the "async-process" thread
  itself.
- One concern is that the driver future wil try to spawn the
  "async-process" thread as the application exits and the task is being
  dropped, which will be unnecessary and lead to slower shutdowns. To
  prevent this, the future checks to see if there are any extant `Child`
  instances (a new refcount is added to Reaper to facilitate this). If
  there are none, and if there are no zombie processes, it does not
  spawn the additional thread.
- Someone can still `mem::forget()` the driver thread. This does not
  lead to undefined behavior and just leads to processes being left
  dangling. At this point they're asking for wacky behavior.

This strategy might also be viable for `async-io`, if we want to try to
avoid needing to spawn the additional thread there as well.

Closes #7
cc smol-rs/async-io#40

Signed-off-by: John Nunley <[email protected]>
notgull added a commit to smol-rs/async-process that referenced this issue Oct 11, 2023
I know I said that I wouldn't add any more features, but I
think this is important enough.

Right now, a thread called "async-process" is responsible for listening
for SIGCHLD and reaping zombie processes. This listens for the SIGCHLD
signal in Unix and uses a channel connected to the waitable handle on
Windows. While this works, we can do better. Through async-signal, the
signal was already asynchronous on Unix; we were already just using
async_io::block_on to wait on the signal. After swapping out the channel
used on Windows with async-channel, the process reaping function "reap"
can be reimplemented as a fully asynchronous future.

From here we must make sure this future is being polled at all times. To
facilitate this, a function named "driver()" is added to the public API.
This future acquires a lock on the reaper structure and calls the
"reap()" future indefinitely. Multiple drivers can be created at once;
they will just wait forever on this lock. This future is intended to be
spawned onto an executor and left to run forever, making sure all child
processes are signalled whenever necessary. If no tasks are running the
driver future, the "async-process" thread is spawned and runs the
"reap()" future itself.

I've added the following controls to make sure that this system is
robust:

- If a "driver" task is dropped, another "driver" task will acquire the
  lock and keep the reaper active.
- Before being dropped, the task checks to see if it is the last driver.
  If it is, it will spawn the "async-process" thread to be the driver.
- When a Child is being created, it checks if there are any active
  drivers. If there are none, it spawns the "async-process" thread
  itself.
- One concern is that the driver future wil try to spawn the
  "async-process" thread as the application exits and the task is being
  dropped, which will be unnecessary and lead to slower shutdowns. To
  prevent this, the future checks to see if there are any extant `Child`
  instances (a new refcount is added to Reaper to facilitate this). If
  there are none, and if there are no zombie processes, it does not
  spawn the additional thread.
- Someone can still `mem::forget()` the driver thread. This does not
  lead to undefined behavior and just leads to processes being left
  dangling. At this point they're asking for wacky behavior.

This strategy might also be viable for `async-io`, if we want to try to
avoid needing to spawn the additional thread there as well.

Closes #7
cc smol-rs/async-io#40

Signed-off-by: John Nunley <[email protected]>
@notgull
Copy link
Member

notgull commented Nov 26, 2023

For the single thread use case, we can get around the pre-emption deadlock by having the block_on call just refuse to give up the reactor lock if it sees that BLOCK_ON_COUNT is equal to 1. Then we only spawn the reactor thread once BLOCK_ON_COUNT is bumped to 2 or if an asynchronous resource is used with no block on threads.

For the multi threaded use cases, this definitely requires more thought and synchronization. Possibly so much synchronization that it stops being worth it. This requires more thought and benchmarking... especially since the current reactor is especially stable and I don't want to rock that boat too hard.

I'd accept a PR for the single threaded use case.

@ajwerner
Copy link

I'd be interested in a way to shut down the spawned thread if there is one. I have a use case where I want to dl_close a DLL, and I need to make sure all the threads using the code are gone. I know it's a bit different from what this is asking for, but it's related.

@notgull
Copy link
Member

notgull commented Sep 6, 2024

I have a use case where I want to dl_close a DLL, and I need to make sure all the threads using the code are gone. I know it's a bit different from what this is asking for, but it's related.

Maybe a shutdown_reactor function would be nice here, for people who know what they are doing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

4 participants