Generate scroll-down event to force images to load #8

Open
rvernica opened this issue Aug 20, 2014 · 5 comments
Comments

@rvernica
Contributor

Hello,

I am trying to use this to scrape some images from a website. The problem is that the image URLs are only generated once the user scrolls down the page and the images enter the viewport. Is it possible to generate a Page Down event (or Ctrl + End, or simply jump to the end of the page) inside the DynamicScraper?

Thanks!

@ruipgil
Owner

ruipgil commented Aug 20, 2014

Loading images is not intended, since it slows down scraping. Instead, just get their URLs and then download them.

@rvernica
Contributor Author

I don't want to load them; I just need their URLs. The problem is that the URLs are not generated until the images enter the viewport. When the page initially loads, the images at the bottom of the page have their URL set to "1px.png" or something like that. If I scroll down and the images enter the viewport, their URL is generated and points to the right image.

So, I need to somehow scroll down the page so that the JavaScript code runs and generates the URLs for these images.

Initially, even the top images don't have their URLs generated, but because they are in the viewport, the URLs get generated and I can scrape them correctly.

@ruipgil
Owner

ruipgil commented Aug 23, 2014

You can always generate DOM events with the DynamicScraper.
But try to inspect the code and see where the information about the images is stored, and get that.
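
One way the "see where the information is stored" suggestion could play out: many lazy-loading scripts keep the real URL in a data attribute until the image scrolls into view. The sketch below assumes that pattern. The helper name `collectImageUrls` and the attribute names `data-src`, `data-original`, and `data-lazy-src` are assumptions for illustration, not something confirmed for the site in question or provided by the scraper.

```javascript
// Hypothetical helper, not part of any library: returns the best-guess URL for
// each image, preferring common lazy-load data attributes over `src`, and
// skipping anything that still looks like the placeholder.
function collectImageUrls(imgs, placeholderPattern) {
  // Assumed attribute names; inspect the target page to find the real one.
  var candidates = ['data-src', 'data-original', 'data-lazy-src'];
  var urls = [];
  for (var i = 0; i < imgs.length; i++) {
    var url = null;
    for (var j = 0; j < candidates.length && !url; j++) {
      url = imgs[i].getAttribute(candidates[j]);
    }
    if (!url) {
      url = imgs[i].getAttribute('src'); // fall back to the regular source
    }
    if (url && !placeholderPattern.test(url)) {
      urls.push(url);
    }
  }
  return urls;
}
```

Inside the page this might be called as `collectImageUrls(document.querySelectorAll('img'), /1px\.png$/)`. If the site computes the URLs purely in script and stores nothing in the markup, this approach will not help.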

@rvernica
Contributor Author

rvernica commented Nov 9, 2014

The JavaScript for generating the image URLs is pretty complex and very hard to figure out.

Could you provide a small example on how to generate a DOM event?
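
A minimal sketch of what such an example might look like, using only standard DOM calls. It assumes the scraping function is evaluated inside the page, so it is written to be self-contained. The name `scrollToBottom` is hypothetical, and whether a synthetic scroll is enough to trigger a given site's lazy loader is not guaranteed.

```javascript
// Hypothetical helper, not part of any library: scrolls a window-like object
// to the bottom of its document and fires a synthetic "scroll" event.
function scrollToBottom(win) {
  var doc = win.document;
  var height = Math.max(
    doc.body.scrollHeight,
    doc.documentElement.scrollHeight
  );
  win.scrollTo(0, height);
  // createEvent/initEvent is the legacy form; it is used here because older
  // headless engines may lack the Event constructor.
  var evt = doc.createEvent('Event');
  evt.initEvent('scroll', true, true);
  win.dispatchEvent(evt);
  return height;
}
```

In a page this would be called as `scrollToBottom(window)`. Lazy loaders usually react asynchronously, so the image URLs may only be updated some time after the call returns.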

Additionally, is it possible to specify a pixel height for the viewport? I assume there is some default width and height at which the page is loaded by the DynamicScraper.

@ruipgil
Owner

ruipgil commented Nov 9, 2014

There's this thread that might be useful, but not right now, since there's no proper way to set the viewport size.

@ruipgil ruipgil reopened this Nov 9, 2014