Configure options to optimize the crawling and extraction process #232

kvasilopoulos · 2022-07-15T13:27:45Z

Hi, I would like to know how to configure news-please to optimize crawling and extraction. For example, suppose we have a machine with 4 CPUs (2 threads per CPU) and 20 websites to crawl. What are the optimal values for number_of_parallel_daemons, number_of_parallel_crawlers, and CONCURRENT_REQUESTS_PER_DOMAIN?

Versions (please complete the following information):

OS: Debian 20.04
Python Version: python3.8
news-please Version: latest

Intent (optional; we'll use this info to prioritize upcoming tasks to work on)

personal
business
