Many images on page slows down request when using page.images.with_size #242
Comments
You can disable image downloading entirely if you want: https://github.com/jaimeiniesta/metainspector#image-downloading
Downloading images can take a long time if there are many of them and each one is slow to fetch. It is better to do this in a background job so that the server does not time out.
To improve this, I can think of:
a) Introduce a new option to limit the number of images to download. Or maybe we could change it to be an integer, instead of a boolean. For example, instead of
b) We could try downloading the images in parallel.
What do you think?
Oh wow, this speeds things up drastically and fixed my issue for now! Thanks so much. I feel this can indeed be improved even more by adding an option to limit the number of images to download. Loading in parallel might also reduce the load time, but it still downloads all the images, which might be redundant if you don't need, say, the last 70 images. Right?
That's right, if you're interested in downloading images to get better results for their dimensions, then this should definitely go into a background job. Well, as any external request can take a long time, maybe everything related to scraping should go in background jobs as a general rule. You're right that we could do both a) and b), that is, let us specify the number of images to download, and also try to download them in parallel.
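To make a) and b) concrete, here is a minimal sketch of limiting the number of images and looking up their sizes in parallel. It is not MetaInspector's implementation: `sizes_for`, `limit`, `concurrency` and `size_lookup` are hypothetical names used only for illustration. `size_lookup` is assumed to be any callable that takes an image URL and returns `[width, height]` or `nil`; in a real app it might wrap something like `FastImage.size`.

```ruby
# Hypothetical helper, not part of MetaInspector's API.
# Looks up sizes for at most `limit` URLs using a small pool of threads.
# `size_lookup` is any callable: url -> [width, height] or nil.
def sizes_for(urls, limit:, size_lookup:, concurrency: 4)
  queue = Queue.new
  urls.first(limit).each_with_index { |url, index| queue << [index, url] }

  results = []
  mutex = Mutex.new

  workers = Array.new(concurrency) do
    Thread.new do
      loop do
        begin
          # Non-blocking pop raises ThreadError once the queue is empty.
          index, url = queue.pop(true)
        rescue ThreadError
          break
        end

        # A failed lookup should not abort the other downloads.
        size = begin
          size_lookup.call(url)
        rescue StandardError
          nil
        end

        # Store by index so the output keeps the original page order.
        mutex.synchronize { results[index] = [url, size] }
      end
    end
  end

  workers.each(&:join)
  results
end

# Usage with a stubbed lookup, so no network access is needed:
urls = (1..80).map { |i| "http://example.com/img#{i}.png" }
stub_lookup = ->(url) { [url.length, 10] }
first_ten = sizes_for(urls, limit: 10, size_lookup: stub_lookup)
```

Only the first `limit` URLs are ever passed to `size_lookup`, so the remaining images cost nothing, and the thread pool bounds how many requests run at once.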
I think the idea of making "download_images" an integer instead of a boolean is a good one too. I will make a PR. That, or since there are more and more image options, maybe we just add a hash: image_options: { download: true, max_downloaded_images: 300 }
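As a rough sketch of how the two proposals could coexist, the following shows one way to normalize a boolean, an integer, or a hash into a single download limit. None of this exists in MetaInspector: `image_download_limit` and `images_to_download` are hypothetical helpers, and the hash keys are simply the ones suggested in the comment above.

```ruby
# Hypothetical helper: converts an image option into a download limit.
# Returns nil for "no limit" and 0 for "do not download".
def image_download_limit(option)
  case option
  when true    then nil
  when false   then 0
  when Integer then [option, 0].max
  when Hash
    # Keys follow the proposed image_options hash; both are assumptions.
    return 0 unless option.fetch(:download, true)
    option.fetch(:max_downloaded_images, nil)
  else
    raise ArgumentError, "unsupported image option: #{option.inspect}"
  end
end

# Hypothetical helper: picks the image URLs that would be downloaded.
def images_to_download(urls, option)
  limit = image_download_limit(option)
  limit.nil? ? urls : urls.first(limit)
end

# Usage:
urls = (1..80).map { |i| "http://example.com/img#{i}.png" }
images_to_download(urls, 10).size                               # 10 URLs
images_to_download(urls, { download: true,
                           max_downloaded_images: 300 }).size   # all 80 URLs
```

Accepting all three shapes would keep existing `true`/`false` callers working while allowing the integer or hash form, at the cost of a slightly more complicated option to document.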
Hey, when you request a page with many images (e.g. 80) and use page.images.with_size, the request is so slow that our application times out (>30s). Is there a way to limit the number of images it calculates the size for? I would love to be able to fetch the size of only the first 10 images, for example.