Client timing out when coroutines making requests in async_timeout block are run together #3457

Closed
owensearls opened this issue Dec 19, 2018 · 7 comments
Labels
client, invalid, wontfix

Comments

@owensearls

Long story short

I have a coroutine which makes an async request inside a timeout block and then performs a slow synchronous process on the returned data. When I run this process twice in parallel, the request times out after the first process completes and the context switches to the other request.

This issue only happens about half the time I run the test script included in the Steps to Reproduce section, and I haven't been able to figure out why.

Expected behaviour

I would expect both processes to finish successfully.

Actual behaviour

In the script below, both processes enter the async with session.get(url) as response: block and execute the line await response.text(). The first process downloads the data and then spends longer than the timeout processing it. When the context switches back to the other request, it attempts to download its data but raises a TimeoutError (async_timeout cancels the pending read and re-raises the resulting CancelledError as asyncio.TimeoutError on exit, which is why both exceptions appear in the traceback below).

output: 48921
Traceback (most recent call last):
  File "aiohttptest.py", line 12, in fetch
    return await response.text()
  File "/home/dev/.virtualenvs/water_buffalo/lib/python3.6/site-packages/aiohttp/client_reqrep.py", line 916, in text
    await self.read()
  File "/home/dev/.virtualenvs/water_buffalo/lib/python3.6/site-packages/aiohttp/client_reqrep.py", line 880, in read
    self._body = await self.content.read()
  File "/home/dev/.virtualenvs/water_buffalo/lib/python3.6/site-packages/aiohttp/streams.py", line 322, in read
    block = await self.readany()
  File "/home/dev/.virtualenvs/water_buffalo/lib/python3.6/site-packages/aiohttp/streams.py", line 338, in readany
    await self._wait('readany')
  File "/home/dev/.virtualenvs/water_buffalo/lib/python3.6/site-packages/aiohttp/streams.py", line 260, in _wait
    await waiter
concurrent.futures._base.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "aiohttptest.py", line 28, in <module>
    loop.run_until_complete(main())
  File "/usr/lib/python3.6/asyncio/base_events.py", line 473, in run_until_complete
    return future.result()
  File "aiohttptest.py", line 24, in main
    fetch_and_process(session, 'http://python.org'))
  File "aiohttptest.py", line 16, in fetch_and_process
    html = await fetch(session, 'http://python.org')
  File "aiohttptest.py", line 12, in fetch
    return await response.text()
  File "/home/dev/.virtualenvs/water_buffalo/lib/python3.6/site-packages/async_timeout/__init__.py", line 55, in __aexit__
    self._do_exit(exc_type)
  File "/home/dev/.virtualenvs/water_buffalo/lib/python3.6/site-packages/async_timeout/__init__.py", line 92, in _do_exit
    raise asyncio.TimeoutError
concurrent.futures._base.TimeoutError

Steps to reproduce

import asyncio
import time

import aiohttp
import async_timeout


async def fetch(session, url):
    async with async_timeout.timeout(5):
        async with session.get(url) as response:
            return await response.text()


async def fetch_and_process(session, url):
    html = await fetch(session, 'http://python.org')
    print('output:', len(html))
    time.sleep(10)


async def main():
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(fetch_and_process(session, 'http://python.org'),
                             fetch_and_process(session, 'http://python.org'),)


loop = asyncio.get_event_loop()
loop.run_until_complete(main())

Your environment

Using aiohttp client

Python 3.6.5
aiohttp==3.4.4

Ubuntu 18.04.1 LTS

Note

In my actual code I'm using one session per request with a single connector per application. This seems to make the issue occur every time. I'm aware that the documentation recommends one session per application, but the justification that this is so that all requests use the same connection pool makes me think that what I am doing is fine. Let me know if I'm missing something.

Test script using multiple sessions

This script results in the exact same output as seen in the Actual behaviour section, again only about half the time.

import asyncio
import time

import aiohttp
from aiohttp.connector import TCPConnector
import async_timeout

connector = TCPConnector()


async def fetch(url):
    async with aiohttp.ClientSession(connector=connector,
                                     connector_owner=False,) as session:
        async with async_timeout.timeout(5):
            async with session.get(url) as response:
                return await response.text()


async def fetch_and_process(url):
    html = await fetch('http://python.org')
    print('output:', len(html))
    time.sleep(10)


async def main():
    await asyncio.gather(fetch_and_process('http://python.org'),
                         fetch_and_process('http://python.org'),)


try:
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
finally:
    connector.close()
@aio-libs-bot

GitMate.io thinks the contributor most likely able to help you is @asvetlov.

Possibly related issues are #3291 (Websocket client hanging/blocking?), #1142 (Can't make 40k requests from client to server), #3251 (Aiohttp Client request limit), #1180 (Client request timeout should be optional), and #22 (Make client tracebacks more useful.).

@jordanbray

This doesn't look like a bug to me. time.sleep(10) blocks the event loop, which causes the timeout to be hit. If time.sleep(10) is replaced with await asyncio.sleep(10), I would expect this to work.
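
A minimal sketch of that change, applied to fetch_and_process from the reproduction script above (asyncio.sleep suspends the task without blocking the event loop):

async def fetch_and_process(session, url):
    html = await fetch(session, 'http://python.org')
    print('output:', len(html))
    await asyncio.sleep(10)  # suspends this task; other tasks keep running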

@Dreamsorcerer
Member

Dreamsorcerer commented Jul 6, 2021

and then performs a slow synchronous process with the returned data.

I assume the time.sleep() is just an example to simulate the workload.

From what I can understand, the confusion is that the request hits the timeout due to the heavy workload in another task, rather than anything to do with the network request.

I think this is just the way the timeout is implemented, and I'm not sure if it should behave differently. A workaround might be to use await asyncio.sleep(0) in the heavy workload to ensure the code yields back to the event loop and has a chance to complete the other request before the timeout occurs.
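
For instance, a sketch against the original script, where split_work and process are hypothetical stand-ins for slicing up the real workload:

async def fetch_and_process(session, url):
    html = await fetch(session, 'http://python.org')
    print('output:', len(html))
    for chunk in split_work(html):  # hypothetical: one slice of the workload
        process(chunk)              # hypothetical synchronous step
        await asyncio.sleep(0)      # yield so other tasks can run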

@skavans

skavans commented Dec 5, 2021

@Dreamsorcerer,

I'm seeing the same problem. I believe the timeouts work simply by checking how much time has passed since the task (the request, in our case) started. If another blocking (CPU-consuming) task is running in parallel, it can trigger the timeout in the first one even though the request itself was fast.

I don't know whether there is any way to measure the real sock-read time, though. Working on it...

@skavans

skavans commented Dec 5, 2021

import asyncio
import time
from aiohttp import web, ClientSession, ClientTimeout


routes = web.RouteTableDef()

@routes.get('/')
async def hello(request):
    await asyncio.sleep(1)
    return web.Response(text="Hello, world")

app = web.Application()
app.add_routes(routes)


async def start_server():
    runner = web.AppRunner(app)
    await runner.setup()
    site = web.TCPSite(runner, 'localhost', 9090)
    await site.start()


async def make_request():
    print('making request...')
    timeout = ClientTimeout(sock_connect=3, sock_read=3)
    async with ClientSession(timeout=timeout) as sess:
        async with sess.get('http://localhost:9090/') as resp:
            await resp.text()
            print('got response!')


async def blocking():
    time.sleep(3)

async def main():
    asyncio.create_task(start_server())
    asyncio.create_task(make_request())

    await asyncio.sleep(1)

    asyncio.create_task(blocking())

    await asyncio.sleep(5)


asyncio.run(main())

PoC above. The original poster used just the async_timeout context manager, and in that case everything actually works as expected.
In my case, I use aiohttp's built-in sock_read timeout, which I would expect to measure actual socket read time. But it seems to work simply by checking the time elapsed since the request started, which confuses me.

asvetlov added the invalid and wontfix labels and removed the bug label on Dec 5, 2021
@asvetlov
Member

asvetlov commented Dec 5, 2021

Blocking the event loop for a relatively long time (e.g., more than 10-100 milliseconds) violates asyncio usage rules.
aiohttp doesn't function well in this mode, by design.

Please put such code in run_in_executor() calls.
Run with python -X dev ... to see the cases where a coroutine blocks the loop for longer than loop.slow_callback_duration.
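
A minimal sketch of the run_in_executor() approach, applied to the reproduction script above (assuming Python 3.7+ for asyncio.get_running_loop(); heavy_workload is a hypothetical stand-in for the slow synchronous step):

def heavy_workload(html):
    time.sleep(10)  # stand-in for the slow synchronous processing

async def fetch_and_process(session, url):
    html = await fetch(session, 'http://python.org')
    print('output:', len(html))
    loop = asyncio.get_running_loop()
    # Run the blocking work in the default ThreadPoolExecutor so the
    # event loop stays free to service the other request and its timeout.
    await loop.run_in_executor(None, heavy_workload, html)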

asvetlov closed this as completed on Dec 5, 2021
@Dreamsorcerer
Member

Blocking the event loop for a relatively long time (e.g., more than 10-100 milliseconds) violates asyncio usage rules. aiohttp doesn't function well in this mode, by design.

Indeed, this was what I was trying to explain in my previous comment. Another option is to break up the workload using asyncio.sleep(0) as I mentioned previously. For example, your demo uses time.sleep(3) to simulate the workload, so if you can break it up into smaller pieces, you can do something like:

async def blocking():
    for i in range(30):
        time.sleep(0.1)  # One step of your workload
        await asyncio.sleep(0)  # Yield back to event loop
