Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow spider attr #15

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

pawelmhm
Copy link

This is about #14 and
#11

opening this for discussion, needs tests and improvements, would be cool to get some feedback if it's in good direction


slot_policy = splash_options.get('slot_policy', self.slot_policy)
self._set_download_slot(request, meta, slot_policy)

args = splash_options.setdefault('args', {})
args.setdefault('url', request.url)
args['url'] = request.url

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kmike are you okay with this change. What was usecase of .setdefault() here? Does it have any sense to create request with one url and specify another url in meta?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I was thinking of allowing user not to use 'url' argument at all. This argument is not required for Splash scripts - e.g. you can render a chunk of HTML using splash:set_content. But the meta syntax doesn't make much sense in this case; something in lines of #12 could be a better fit.

* removed process_html (render response in html feature) from this PR
* removed remote request tests with extra spider
* refactors tests from functions to objects inheriting
from unittest.TestCase
* adds tests for enabling middleware with spider attribute
@pawelmhm
Copy link
Author

updated after discussions

@chekunkov
Copy link

@pawelmhm tests are failing

@@ -6,6 +6,7 @@
from scrapy.exceptions import NotConfigured

from scrapy import log
from scrapy.http.response.html import HtmlResponse

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like unused import

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, this is fixed now

@pawelmhm
Copy link
Author

TODO

  • docs

something else?

@Gallaecio
Copy link
Contributor

I know it has been a rather long time, but it looks like this is still a valid pull request, and a nice once at that. It seems to me like it’s only missing renaming dont_proxy to something else and covering the changes in the documentation.

@pawelmhm Do you have plans to resume work on it at some point?

Gallaecio pushed a commit to Gallaecio/scrapy-splash that referenced this pull request Aug 12, 2019
@Gallaecio Gallaecio mentioned this pull request Aug 12, 2019
@Gallaecio
Copy link
Contributor

Continued at #235

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants