
Handle timeout exception from selenium #59

Open
wants to merge 5 commits into base: develop

Conversation


@michelts commented Mar 3, 2020

Hi @clemfromspace

I implemented the necessary steps to address issue #58. There wasn't any test covering the wait_time and wait_until usage, so I added one.

I decided to always ignore the timeout exception and return the content to Scrapy, but I can surely add a config option to preserve backwards compatibility, if you prefer.
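
Roughly, the idea inside the middleware's process_request looks like this (a condensed sketch, not the literal diff; method body only, names follow the upstream SeleniumMiddleware):

from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from scrapy.http import HtmlResponse

def process_request(self, request, spider):
    # Render the page with the shared selenium driver.
    self.driver.get(request.url)

    if request.wait_until:
        try:
            WebDriverWait(self.driver, request.wait_time).until(
                request.wait_until
            )
        except TimeoutException:
            # Swallow the timeout: return whatever rendered so far
            # to Scrapy instead of dropping the request.
            pass

    # Hand the (possibly partial) page back to Scrapy.
    return HtmlResponse(
        self.driver.current_url,
        body=str.encode(self.driver.page_source),
        encoding='utf-8',
        request=request,
    )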

@manikandanraji

can you tell me how can I use your fork using pipenv?

@michelts (Author) commented Jun 2, 2020

Hi @manikandanraji

I am using git URLs in requirements.txt, something similar to:

git+git://github.com/michelts/scrapy-selenium.git@prod#egg=scrapy-selenium

I don't use pipenv, but maybe you can start from here ;)
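
That said, pipenv should accept a git dependency too; a sketch I haven't verified myself (the prod ref just mirrors the requirements.txt line above):

pipenv install "git+https://github.com/michelts/scrapy-selenium.git@prod#egg=scrapy-selenium"

or, as a Pipfile entry:

[packages]
scrapy-selenium = {git = "https://github.com/michelts/scrapy-selenium.git", ref = "prod"}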

@manikandanraji commented Jun 2, 2020 via email

@michelts (Author) commented Jun 2, 2020

It is possible to use several comma-separated CSS selectors for the same condition when using, for instance, element_to_be_clickable. You want the page to be loaded, but sometimes it renders differently from what you expect, and a combined selector matches whichever element actually appears.

This works for me:

from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

wait_until = ".element-i-want-to-be-present, .not-found-warning"
EC.element_to_be_clickable((By.CSS_SELECTOR, wait_until))
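
For reference, with scrapy-selenium the condition goes on the request itself; a short usage sketch (spider name, URL, and selectors are placeholders):

import scrapy
from scrapy_selenium import SeleniumRequest
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

class ExampleSpider(scrapy.Spider):
    name = "example"

    def start_requests(self):
        yield SeleniumRequest(
            url="https://example.com",
            wait_time=10,
            # The comma works as an OR: the wait succeeds as soon as
            # either element becomes clickable.
            wait_until=EC.element_to_be_clickable((
                By.CSS_SELECTOR,
                ".element-i-want-to-be-present, .not-found-warning",
            )),
            callback=self.parse,
        )

    def parse(self, response):
        self.logger.info("Rendered page: %s", response.url)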

@manikandanraji

woah, that fixed the problem I have been trying to solve for the past couple of hours. once again, thank you man.

@michelts (Author) commented Jun 2, 2020

You are welcome ;)

@dustinmichels

I like this pull request! It behaves more in line with my expected/needed behavior, i.e. if you get a timeout error because the HTML element never loaded, proceed to scrape what you can instead of skipping the page.
