DPR request times out when using gunicorn preload option #2409
Hey @aruroyc, I'm not too familiar with the
I have the impression that:
Is any of this true? If you modified the REST API, could you share your changes? A possible solution would be to upgrade Haystack and the REST API to 1.3.0. There are plenty of improvements there, including a refactoring of the REST API so that it no longer loads the pipeline at import time. However, coming from 1.0.0 you might have to migrate a few things: proceed with care and skim through the changelog before jumping to the next version. As an alternative, I believe you have to find a way to defer the creation of embeddings to some point after import time: this should allow the workers to boot and fork without hitting any timeout. I hope some of this helps! In case this didn't sort things out for you, please be a bit more specific about your setup and I'll see if I can help you further.
@ZanSara @brandenchan Thanks for looking into this quickly! @ZanSara I created a FastAPI REST endpoint that calculates the query embeddings at runtime, not at import time. It then uses an ElasticsearchDocumentStore to query and retrieve the top_n similar passages. For this I am using:
While looking up possible solutions on the internet, I saw that this problem could in fact be related to how PyTorch does intra-op parallelism. Following a couple of discussion threads I finally landed here: Following those solutions, the approach below of setting the number of threads for intra-op parallelism to 1 using torch.set_num_threads(1) works, but I am unsure whether this could result in performance issues when the REST API is called in parallel at high concurrency. Might the issue have something to do with how the DPR model is run in PyTorch and whether it uses multiprocessing? I can draw up prototype code on GitHub to reproduce this, in case this does not help.
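One way to apply that `torch.set_num_threads(1)` workaround without touching application code is a gunicorn server hook. A minimal sketch of a `gunicorn.conf.py` fragment follows; `post_fork` is a standard gunicorn hook, while importing torch inside it is an assumption about the app environment.

```python
# gunicorn.conf.py (sketch)

def post_fork(server, worker):
    """Called by gunicorn in each worker process right after the fork.

    Limiting intra-op parallelism here, inside the worker, means no
    thread-pool state is inherited from the preloaded master process.
    """
    import torch  # imported lazily so the config file stays importable without torch
    torch.set_num_threads(1)
```

Setting the thread count per worker trades single-request latency for fork safety; with many gunicorn workers the overall throughput may still be acceptable, but that is worth benchmarking under the expected concurrency.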
Thanks for the detailed explanation! I see you have a very different setup than I imagined. I don't know how much I can help you here: it seems like this is an issue with PyTorch. What I can tell you is that Haystack's DPRetriever implementation is not complex and does not use multiprocessing: we generally defer all the optimization and parallelization work to PyTorch. However, if you eventually figure out that this is a bug caused by DPR, by all means let us know so that we can fix it. In that case a small reproducible example would be a fantastic help. Good luck with your bug hunt!
Hey @aruroyc, do you still need help with this issue? Otherwise we should close it 🙂
Closing this as the issue appears to be within the PyTorch module and how it handles intra-op parallelism.
Question
When using gunicorn to start the app server with the --preload flag, the request for passage retrieval for a query times out after reaching the embed_queries() function (line 214 in haystack/nodes/retriever/dense.py).
The same code works perfectly when not using the preload flag.
Additional context
Startup command: gunicorn --preload -c gunicorn.conf.py main:app -k uvicorn.workers.UvicornWorker
Retriever used: DensePassageRetriever
haystack version: 1.0.0