Run long tasks in background #57

clemux · 2020-08-26T16:15:33Z

Some of the work archivy does should run asynchronously from the web server:

importing links from external services (currently pocket, more in the future)
retrieving a web page's content

Some possible solutions:

Celery

This would add a dependency on either RabbitMQ or Redis, which might not match your vision for archivy as a simple app. On the other hand, Redis might be useful for other things (like a cache for a text-search system that is easier to install than ES).

Python-RQ

(much) More minimalist design, using redis: official website

clemux · 2020-08-26T18:51:28Z

Addendum: with Celery it would be easy to make the RabbitMQ/Redis dependency optional and run tasks in the Flask process when it is not available.

Uzay-G · 2020-08-26T19:04:37Z

I'd rather not have to use redis, do you know of any more lightweight alternatives? Maybe we could use threads... 🤔

cktang88 · 2020-08-26T21:44:49Z

Agree with keeping it simple for now. We could also use Python 3's built-in asyncio, for example.

Uzay-G · 2020-08-26T22:22:48Z

Yes I think using asyncio is a good idea

clemux · 2020-08-26T22:28:52Z

I agree!

clemux · 2020-08-27T14:13:25Z

It seems that flask and werkzeug don't play nice with asyncio, because werkzeug is blocking by design.

That doesn't mean we cannot use custom coroutines/threading/multiprocessing, though.

clemux · 2020-08-27T15:38:51Z

https://docs.python.org/3/library/concurrent.futures.html#module-concurrent.futures seems to provide what we need here, any opinions?

https://flask-executor.readthedocs.io/en/latest/ shows how it could be implemented (not necessarily by using that extension directly)
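For illustration, here is a minimal sketch of the concurrent.futures approach using only the standard library. The names `fetch_page_content` and `handle_request` are hypothetical stand-ins, not archivy functions, and the pool size is an arbitrary assumption. The idea is that a request handler submits the slow job to a shared pool and returns immediately, while the job finishes in a worker thread.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Shared pool created once at startup; max_workers=2 is an arbitrary choice.
executor = ThreadPoolExecutor(max_workers=2)


def fetch_page_content(url):
    """Hypothetical stand-in for a slow job such as downloading a page."""
    time.sleep(0.1)  # simulate network latency
    return f"content of {url}"


def handle_request(url):
    """Hypothetical request handler: queue the job and return right away."""
    return executor.submit(fetch_page_content, url)


# The handler returns a Future without waiting for the job to finish.
future = handle_request("https://example.com")

# Elsewhere (or in a callback), the result can be collected once ready.
result = future.result(timeout=5)
executor.shutdown()
```

In a real app the Future would more likely get a callback via `add_done_callback` than a blocking `result()` call, and jobs queued this way are lost if the process restarts, which is one thing a persistent queue would add.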

pirate · 2021-02-02T16:00:26Z

We're about to do this refactor on ArchiveBox too, we're looking at these 2 queue systems primarily:

https://github.com/coleifer/huey (supports SQLite as the backing store)
https://github.com/Bogdanp/dramatiq (requires Redis/RabbitMQ)

There are also adapters that link them to Flask I think (I know there are adapters for Django, should be easy to adapt if no flask-specific ones).
The reason we didn't end up going with asyncio is that it's still single-threaded, and there's a decent amount of blocking Python that still needs to be run while archiving each link. Archivy's architecture / access patterns may be different though, idk.
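A small sketch of the single-threaded point, using only the standard library. `blocking_archive` is a hypothetical placeholder for any synchronous step (not ArchiveBox or Archivy code), and the sleep duration is arbitrary. Blocking calls made directly inside coroutines run one after another, whereas handing them to a thread pool with `run_in_executor` lets them overlap.

```python
import asyncio
import time


def blocking_archive(url):
    """Hypothetical blocking step (e.g. a synchronous HTTP call)."""
    time.sleep(0.2)
    return url


async def archive_blocking(url):
    # Calling blocking code directly stalls the whole event loop.
    return blocking_archive(url)


async def archive_offloaded(url):
    # Hand the blocking call to the loop's default thread pool instead.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_archive, url)


async def timed(worker, urls):
    """Run worker over all urls concurrently and measure the wall time."""
    start = time.monotonic()
    results = await asyncio.gather(*(worker(u) for u in urls))
    return results, time.monotonic() - start


urls = ["a", "b", "c"]
serial_results, serial_time = asyncio.run(timed(archive_blocking, urls))
pooled_results, pooled_time = asyncio.run(timed(archive_offloaded, urls))
```

Here `serial_time` comes out at roughly three times the sleep duration, while `pooled_time` stays close to a single sleep, since the three calls run in separate threads.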

I'm rooting hard for Archivy, it looks like you've managed to avoid a lot of the early mistakes that plagued the ArchiveBox codebase, the UI is gorgeous, and your plugin system is awesome. I'd love to share notes/lessons we learned from ours so that you can avoid those pitfalls.

Uzay-G · 2021-02-02T17:21:21Z

Thanks for the suggestions!

> I'm rooting hard for Archivy, it looks like you've managed to avoid a lot of the early mistakes that plagued the ArchiveBox codebase, the UI is gorgeous, and your plugin system is awesome. I'd love to share notes/lessons we learned from ours so that you can avoid those pitfalls.

Yes, I'd definitely love to collaborate, and I remember your comments on the post I made about Archivy on Hacker News, back in August.

Your work with ArchiveBox is really cool :)

pirate · 2021-02-02T19:11:43Z

Ah yeah sorry I forgot to follow up after I initially commented on HN, got swamped with work. I'll join your discord and we can continue the convo there :)

Uzay-G · 2021-02-02T20:22:49Z

Cool!
