Run long tasks in background #57

Open
clemux opened this issue Aug 26, 2020 · 11 comments

Comments

@clemux
Contributor

clemux commented Aug 26, 2020

Some of the work archivy does should run asynchronously from the web server:

  • importing links from external services (currently pocket, more in the future)
  • retrieving a web page's content

Some possible solutions:

Celery

This would add a dependency on either RabbitMQ or Redis, which might not match your vision for archivy as a simple app. On the other hand, Redis might be useful for other things (like a cache for a text-search system that is easier to install than ES).

Python-RQ

A (much) more minimalist design, using Redis (see the official website).

@clemux
Contributor Author

clemux commented Aug 26, 2020

Addendum: with Celery, it would be easy to make the RabbitMQ/Redis dependency optional and run tasks in the Flask process when it is not available.

@Uzay-G
Member

Uzay-G commented Aug 26, 2020

I'd rather not have to use redis, do you know of any more lightweight alternatives? Maybe we could use threads... 🤔

@cktang88
Contributor

cktang88 commented Aug 26, 2020

Agree with keeping it simple for now. We could also use something built in, e.g. Python 3's asyncio.

@Uzay-G
Member

Uzay-G commented Aug 26, 2020

Yes, I think using asyncio is a good idea.

@clemux
Contributor Author

clemux commented Aug 26, 2020

I agree!

@clemux
Contributor Author

clemux commented Aug 27, 2020

It seems that Flask and Werkzeug don't play nice with asyncio, because Werkzeug is blocking by design.

That doesn't mean we cannot use custom coroutines/threading/multiprocessing, though.
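To make the threading option concrete, here is a minimal sketch of a hand-rolled background worker using only the standard library. The names `tasks`, `results`, `worker`, and `import_links` are hypothetical and not taken from the archivy codebase; a real version would need error handling and a way to report progress.

```python
import queue
import threading

# Work items are (function, args) tuples; both names are illustrative only.
tasks = queue.Queue()
results = []

def worker():
    """Pull jobs off the queue forever and run them outside the request thread."""
    while True:
        func, args = tasks.get()
        try:
            results.append(func(*args))
        finally:
            # Always mark the job finished so tasks.join() cannot hang.
            tasks.task_done()

# A daemon thread does not keep the process alive when the server exits.
threading.Thread(target=worker, daemon=True).start()

def import_links(source):
    """Hypothetical stand-in for a slow import from an external service."""
    return f"imported links from {source}"

# A request handler would only enqueue the job and return right away.
tasks.put((import_links, ("pocket",)))

# Block until the queue is drained (only needed here to inspect the result).
tasks.join()
```

The web process stays responsive because the handler only calls `tasks.put(...)`; the trade-off is that queued jobs are lost if the process restarts.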

@clemux
Contributor Author

clemux commented Aug 27, 2020

https://docs.python.org/3/library/concurrent.futures.html#module-concurrent.futures seems to provide what we need here, any opinions?

https://flask-executor.readthedocs.io/en/latest/ shows how it could be implemented (not necessarily by using that extension directly)
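As a rough illustration of the `concurrent.futures` idea, the sketch below submits a slow job to a `ThreadPoolExecutor` and returns immediately with a future. `fetch_page_content` and `handle_new_bookmark` are hypothetical names, the pool size is arbitrary, and this does not use the Flask-Executor extension.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# One shared pool for the whole process; max_workers=2 is an arbitrary choice.
executor = ThreadPoolExecutor(max_workers=2)

def fetch_page_content(url):
    """Hypothetical stand-in for downloading and parsing a web page."""
    time.sleep(0.1)  # simulate network latency
    return f"content of {url}"

def handle_new_bookmark(url):
    """What a request handler could do: queue the work and return at once."""
    return executor.submit(fetch_page_content, url)

future = handle_new_bookmark("https://example.com")

# Elsewhere (or later), the result can be collected; result() blocks until done.
content = future.result(timeout=5)

# Wait for any remaining jobs and release the worker threads.
executor.shutdown(wait=True)
```

In a web app the handler would typically not call `future.result()` itself; it would store the outcome from within the task or attach a callback with `future.add_done_callback(...)`.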

@pirate

pirate commented Feb 2, 2021

We're about to do this refactor on ArchiveBox too, we're looking at these 2 queue systems primarily:

There are also adapters that link them to Flask I think (I know there are adapters for Django, should be easy to adapt if no flask-specific ones).
The reason we didn't end up going with asyncio is that it's still single-threaded, and there's a decent amount of blocking Python that still needs to be run while archiving each link. Archivy's architecture / access patterns may be different though, idk.
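The single-threaded point can be sketched as follows: a blocking call made directly inside a coroutine would stall the event loop, so one common workaround is to hand it to a thread via `loop.run_in_executor`. `blocking_archive` and `heartbeat` are hypothetical names used only for this illustration.

```python
import asyncio
import time

def blocking_archive(url):
    """Hypothetical blocking step, e.g. a call into a synchronous library."""
    time.sleep(0.2)
    return f"archived {url}"

async def heartbeat(ticks):
    """Record three ticks, 50 ms apart, to show the loop keeps running."""
    for _ in range(3):
        await asyncio.sleep(0.05)
        ticks.append(time.monotonic())

async def main():
    loop = asyncio.get_running_loop()
    ticks = []
    # Passing None selects the loop's default thread pool, so the blocking
    # call runs in a worker thread while heartbeat() runs on the event loop.
    archived, _ = await asyncio.gather(
        loop.run_in_executor(None, blocking_archive, "https://example.com"),
        heartbeat(ticks),
    )
    return archived, len(ticks)

archived, tick_count = asyncio.run(main())
```

This still relies on threads for the blocking part, which is why a plain thread pool or a task queue can end up simpler than asyncio when most of the work is synchronous.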

I'm rooting hard for Archivy, it looks like you've managed to avoid a lot of the early mistakes that plagued the ArchiveBox codebase, the UI is gorgeous, and your plugin system is awesome. I'd love to share notes/lessons we learned from ours so that you can avoid those pitfalls.

@Uzay-G
Member

Uzay-G commented Feb 2, 2021

Thanks for the suggestions!

> I'm rooting hard for Archivy, it looks like you've managed to avoid a lot of the early mistakes that plagued the ArchiveBox codebase, the UI is gorgeous, and your plugin system is awesome. I'd love to share notes/lessons we learned from ours so that you can avoid those pitfalls.

Yes, I'd definitely love to collaborate, and I remember your comments on the post I made about Archivy on Hacker News, back in August.

Your work with ArchiveBox is really cool :)

@pirate

pirate commented Feb 2, 2021

Ah yeah sorry I forgot to follow up after I initially commented on HN, got swamped with work. I'll join your discord and we can continue the convo there :)

@Uzay-G
Member

Uzay-G commented Feb 2, 2021

Cool!
