Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

on_cleanup / on_shutdown are called after active tasks on the event loop are canceled #3593

Closed
terrisgit opened this issue Feb 3, 2019 · 11 comments · Fixed by #3805
Closed

Comments

@terrisgit
Copy link

terrisgit commented Feb 3, 2019

I am running Python 3.7.2 with aiohttp 3.5.4.

web.run_app() is called. Then I press control-c.

  1. web.py executes _cancel_all_tasks() which sends cancel() to all incomplete tasks on the event loop
  2. web.py then executes "await runner.cleanup()" (eventually) which then executes on_shutdown and on_cleanup handlers

This causes my asyncio tasks (started via on_startup) to get cancelled before on_cleanup can gracefully stop them using a method other than .cancel(). This is new behavior from the previous version of aiohttp I was using.

Handling CancelledError is problematic given that it can be thrown anywhere 'await' and similar are used (I guess I could use asyncio.shield in literally hundreds of places).

@terrisgit terrisgit changed the title Now getting concurrent.futures._base.CancelledError on control-C All asyncio tasks receive concurrent.futures._base.CancelledError on control-C Feb 3, 2019
@terrisgit terrisgit changed the title All asyncio tasks receive concurrent.futures._base.CancelledError on control-C Server: All asyncio tasks receive concurrent.futures._base.CancelledError on control-C Feb 3, 2019
@aio-libs-bot
Copy link

GitMate.io thinks the contributor most likely able to help you is @asvetlov.

Possibly related issues are #1207 (Handle all concurrent.futures._base.TimeoutError and throw a aiohttp.errors.TimeoutError instead.), #2168 (server often throw "concurrent.futures._base.CancelledError"), #2374 (cancellederror still exist.), #930 (aiohttp.Timeout causes CancelledError), and #3549 (Concurrent GET requests lead to ClientConnectorError(8, 'nodename nor servname provided, or not known')).

@asvetlov
Copy link
Member

asvetlov commented Feb 3, 2019

This is intended change in aiohttp.web.run_app behavior.
Earlier the server left unfinished tasks after the loop closing, which did lead to unpredictable behavior.

Now all unfinished tasks are canceled, run_app works pretty close to standard asyncio.run() function.
You still handle the cancellation error by try/except asyncio.CancelledError or try/finally to graceful shutdown, isn't it?

@asvetlov
Copy link
Member

asvetlov commented Feb 3, 2019

on_shutdown / on_cleanup are called before tasks canceling.
The cancellation is the last step before the loop stopping and destroying

@terrisgit terrisgit reopened this Feb 3, 2019
@terrisgit terrisgit changed the title Server: All asyncio tasks receive concurrent.futures._base.CancelledError on control-C Need equivalent of on_cleanup / on_shutdown to be async Feb 3, 2019
@terrisgit terrisgit changed the title Need equivalent of on_cleanup / on_shutdown to be async Need equivalent of on_cleanup / on_shutdown to await an async function Feb 3, 2019
@terrisgit
Copy link
Author

terrisgit commented Feb 4, 2019

I did more research in the code (including on the master branch). The description has been updated. The cancellation does not appear to be the last step.

@terrisgit terrisgit changed the title Need equivalent of on_cleanup / on_shutdown to await an async function on_cleanup / on_shutdown execute after _cancel_all_tasks() is called in web.py Feb 4, 2019
@terrisgit terrisgit changed the title on_cleanup / on_shutdown execute after _cancel_all_tasks() is called in web.py on_cleanup / on_shutdown are called after _cancel_all_tasks() is called in web.py Feb 4, 2019
@terrisgit terrisgit changed the title on_cleanup / on_shutdown are called after _cancel_all_tasks() is called in web.py on_cleanup / on_shutdown are called after all active tasks on the event loop are canceled Feb 4, 2019
@terrisgit terrisgit changed the title on_cleanup / on_shutdown are called after all active tasks on the event loop are canceled on_cleanup / on_shutdown are called after active tasks on the event loop are canceled Feb 4, 2019
@terrisgit
Copy link
Author

terrisgit commented Feb 4, 2019

My workaround is as follows. Because asyncio signal handlers must be synchronous, there are some challenges.

I do not propose altering the behavior of on_shutdown and on_cleanup. I would instead like an additional handler that would be called before canceling all tasks on the event loop. "some_stop_function" could be passed via, say a new method on Application such as on_signal().

loop = asyncio.get_event_loop()
for sig in (signal.SIGTERM, signal.SIGINT):
  loop.add_signal_handler(sig, lambda: asyncio.ensure_future(stop_handler1()))

web.run_app( handle_signals=False, ... )
...

async def stop_handler1():
  # This function runs asynchronously. While this code
  # runs, asyncio is running as if SIGTERM/SIGINT were
  # never caught
  await some_stop_function() #  Do shutdown stuff

  # Now send TERM to asyncio and handle it in stop_handler2
  loop = asyncio.get_event_loop()
  loop.remove_signal_handler(signal.SIGINT)
  sig = signal.SIGTERM
  loop.remove_signal_handler(sig)
  loop.add_signal_handler(sig, stop_handler2)
  os.kill(os.getpid(), sig)
...

def stop_handler2():
  # This function must run synchronously
  loop = asyncio.get_event_loop()
  loop.remove_signal_handler(signal.SIGTERM)
  raise web.GracefulExit()

@jasonmichalski
Copy link

on_shutdown / on_cleanup are called before tasks canceling.
The cancellation is the last step before the loop stopping and destroying

@asvetlov - Not sure if I'm doing something wrong, but I seem to be running into the same problem where long running tasks are cancelled prior to the on_shutdown / on_cleanup signals firing... sample app and results below:

import asyncio
from aiohttp import web

my_long_task = None

async def long_task():
    print("starting long task")
    try:
        await asyncio.sleep(3600)
    except asyncio.CancelledError:
        print("long_task cancelled")

async def shutdown(app):
    print("on_shutdown")
    print(f"my_long_task.done() = {my_long_task.done()}")

async def cleanup(app):
    print("on_cleanup")

async def startup(app):
    global my_long_task
    loop = asyncio.get_event_loop()
    my_long_task = loop.create_task(long_task())

app = web.Application()

app.on_startup.append(startup)
app.on_shutdown.append(shutdown)
app.on_cleanup.append(cleanup)

web.run_app(app)

Running this app 10 times - 50/50 split - half the time the task is cancelled first, the other half on_shutdown / on_cleanup are executed first:

for i in $(seq 1 10); do echo "************"; pipenv run python3 test_app.py; done
************
starting long task
======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)
^Con_shutdown
my_long_task.done() = False
on_cleanup
long_task cancelled
************
starting long task
======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)
^Clong_task cancelled
on_shutdown
my_long_task.done() = True
on_cleanup
************
starting long task
======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)
^Clong_task cancelled
on_shutdown
my_long_task.done() = True
on_cleanup
************
starting long task
======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)
^Clong_task cancelled
on_shutdown
my_long_task.done() = True
on_cleanup
************
starting long task
======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)
^Con_shutdown
my_long_task.done() = False
on_cleanup
long_task cancelled
************
starting long task
======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)
^Con_shutdown
my_long_task.done() = False
on_cleanup
long_task cancelled
************
starting long task
======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)
^Clong_task cancelled
on_shutdown
my_long_task.done() = True
on_cleanup
************
starting long task
======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)
^Con_shutdown
my_long_task.done() = False
on_cleanup
long_task cancelled
************
starting long task
======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)
^Clong_task cancelled
on_shutdown
my_long_task.done() = True
on_cleanup
************
starting long task
======== Running on http://0.0.0.0:8080 ========
(Press CTRL+C to quit)
^Con_shutdown
my_long_task.done() = False
on_cleanup
long_task cancelled

I'm running:

aiohttp 3.5.4
Python 3.7.1

@pjknkda
Copy link
Contributor

pjknkda commented Mar 12, 2019

I have written a small code snippet to mimic current (aiohttp 3.5.4) run_app behavior,

import asyncio
import signal


class GracefulExit(SystemExit):
    code = 1


def _raise_graceful_exit():
    print('_raise_graceful_exit')
    raise GracefulExit()


async def _runner_setup():
    loop = asyncio.get_event_loop()

    loop.add_signal_handler(signal.SIGINT, _raise_graceful_exit)
    loop.add_signal_handler(signal.SIGTERM, _raise_graceful_exit)


async def _run_app():
    try:
        await _runner_setup()
        while True:
            print('running....')
            await asyncio.sleep(3600)
    finally:
        print('call runner.cleanup()')


def run_app():
    loop = asyncio.get_event_loop()

    try:
        loop.run_until_complete(_run_app())
    except (GracefulExit, KeyboardInterrupt) as ex:
        print('GracefulExit?', type(ex))
    finally:
        print('Cancel all tasks')


run_app()

The result shows that finally clause in _run_app has not been reached.

pjknkda@anne:~/test/asyncio_signal$ python3.7 main.py
running....
^C_raise_graceful_exit
GracefulExit? <class '__main__.GracefulExit'>
Cancel all tasks

This behavior is somewhat unexpected to me and I think this may cause the described issue.

@varungujjar
Copy link

async def stop_handler1():

This function runs asynchronously. While this code

runs, asyncio is running as if SIGTERM/SIGINT were

never caught

await some_stop_function() # Do shutdown stuff

Now send TERM to asyncio and handle it in stop_handler2

loop = asyncio.get_event_loop()
loop.remove_signal_handler(signal.SIGINT)
sig = signal.SIGTERM
loop.remove_signal_handler(sig)
loop.add_signal_handler(sig, stop_handler2)
os.kill(os.getpid(), sig)
...

def stop_handler2():

This function must run synchronously

loop = asyncio.get_event_loop()
loop.remove_signal_handler(signal.SIGTERM)
raise web.GracefulExit()

this worked :)

multun added a commit to multun/aiohttp that referenced this issue May 27, 2019
Otherwise, some tasks might be cancelled before cleanup hooks run. Fixes aio-libs#3593
@helpr helpr bot added the pr-available label May 27, 2019
multun added a commit to multun/aiohttp that referenced this issue May 27, 2019
Otherwise, some tasks might be cancelled before cleanup hooks run. Fixes aio-libs#3593
@multun
Copy link
Contributor

multun commented May 27, 2019

I had a similar issue where a cleanup hook would depend on some background task running. The task would get cancelled before the cleanup hook ran, leading to some unexpected behavior.

Cancelling the run_app task first makes the cleanup hooks run first, thus hopefully fixing the bug.

multun added a commit to multun/aiohttp that referenced this issue Sep 6, 2019
Otherwise, some tasks might be cancelled before cleanup hooks run. Fixes aio-libs#3593
multun added a commit to multun/aiohttp that referenced this issue Sep 6, 2019
Otherwise, some tasks might be cancelled before cleanup hooks run. Fixes aio-libs#3593
multun added a commit to multun/aiohttp that referenced this issue Sep 6, 2019
Otherwise, some tasks might be cancelled before cleanup hooks run. Fixes aio-libs#3593
@david-julien
Copy link

david-julien commented Oct 21, 2019

@asvetlov per your comment:

on_shutdown / on_cleanup are called before tasks canceling.

Using python 3.7.2 and aiohttp version 3.6.2, after locally adding a print debug statement to _cancel_all_tasks() in web.py and BaseRunner.cleanup() in web_runner.py, it looks like _cancel_all_tasks() is always called before BaseRunner.cleanup() once a SIGINT is sent to the app. This suggests on_shutdown / on_cleanup is called after tasks are cancelled. I was wondering if this is expected behavior?

asvetlov pushed a commit that referenced this issue Jan 27, 2020
Otherwise, some tasks might be cancelled before cleanup hooks run. Fixes #3593
@helpr helpr bot added pr-merged and removed pr-available labels Jan 27, 2020
asvetlov pushed a commit that referenced this issue Jan 27, 2020
Otherwise, some tasks might be cancelled before cleanup hooks run. Fixes #3593
(cherry picked from commit c32101d)

Co-authored-by: multun <[email protected]>
@helpr helpr bot added the pr-available label Jan 27, 2020
asvetlov added a commit that referenced this issue Jan 27, 2020
Otherwise, some tasks might be cancelled before cleanup hooks run. Fixes #3593
(cherry picked from commit c32101d)

Co-authored-by: multun <[email protected]>

Co-authored-by: multun <[email protected]>
@helpr helpr bot removed the pr-available label Jan 27, 2020
@janpascal
Copy link

Just got bitten by this, too. Could aiohttp 3.7 containing the fix for this bug be released, please?

asvetlov pushed a commit that referenced this issue Oct 16, 2020
Otherwise, some tasks might be cancelled before cleanup hooks run. Fixes #3593
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants