Requests with ExecuteJavascriptMiddleware are sent twice #239

Open
Fadarrizz opened this issue Apr 9, 2024 · 1 comment · May be fixed by #240
Labels: bug (Something isn't working)

Fadarrizz commented Apr 9, 2024

Describe the bug
When using the ExecuteJavascriptMiddleware, two requests are sent for each page: one by Browsershot, the other by Guzzle.

Reproduction
My spider only has the ExecuteJavascriptMiddleware registered as downloader middleware.

I placed dumps with the request object id in the following places, just before requests are sent:

  • RoachPHP\Http\Client [screenshot]
  • RoachPHP\Downloader\Middleware\ExecuteJavascriptMiddleware [screenshot]

When the spider runs, both dumps are shown: [screenshot]

Expected behavior
I expected only one request to be sent: the one from Browsershot, without an additional one from Guzzle.

Package versions:

  • core: v3.0.1

Fadarrizz added the bug label Apr 9, 2024

Fadarrizz commented Apr 9, 2024

Since the ExecuteJavascriptMiddleware is a response middleware, it is only called once a response has been received. By the time it runs, the HTTP client has already sent the request.
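
To make the ordering concrete, here is a minimal, framework-free sketch in Python. It is not Roach's actual code: `http_client_send`, `execute_javascript_response_middleware` and `download` are hypothetical stand-ins for the Guzzle-backed client, the Browsershot-backed middleware and the downloader. It only illustrates why a hook that runs on the response cannot stop the first request from going out.

```python
# Hypothetical, simplified model of the flow described above.
# None of these names are part of Roach's real API.
sent_by = []  # records which component performed a network fetch


def http_client_send(url):
    """Stands in for the HTTP client (Guzzle in the issue)."""
    sent_by.append("guzzle")
    return {"url": url, "body": "<html>static</html>"}


def execute_javascript_response_middleware(response):
    """Stands in for a response hook that re-fetches the page in a browser."""
    sent_by.append("browsershot")
    return {**response, "body": "<html>rendered</html>"}


def download(url):
    # The client sends the request first ...
    response = http_client_send(url)
    # ... and only then does the response hook get a chance to run.
    return execute_javascript_response_middleware(response)


result = download("https://example.com")
print(sent_by)
```

In this model the page is fetched twice, and the body of the first fetch is discarded.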

I think the middleware should be able to handle the request before anything else does.

In Scrapy, a downloader middleware can process requests and return a response. When a middleware returns a response, no further request processing is done, only response processing.
https://docs.scrapy.org/en/latest/topics/downloader-middleware.html#scrapy.downloadermiddlewares.DownloaderMiddleware.process_request
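
A rough sketch of that short-circuit idea, written as plain Python without importing Scrapy or Roach. Every name here (`Request`, `Response`, `browser_middleware`, `passthrough_middleware`, `download`, `http_fetch`) is hypothetical and only models the contract: a request hook may return a response, and if it does, the HTTP fetch is skipped.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Request:
    url: str


@dataclass
class Response:
    url: str
    body: str
    source: str  # which component produced the response


RequestHook = Callable[[Request], Optional[Response]]


def passthrough_middleware(request: Request) -> Optional[Response]:
    # Returning None means "keep processing the request as usual".
    return None


def browser_middleware(request: Request) -> Optional[Response]:
    # Hypothetical stand-in for rendering the page in a headless browser.
    return Response(request.url, "<html>rendered</html>", source="browser")


def download(
    request: Request,
    request_hooks: List[RequestHook],
    fetch: Callable[[Request], Response],
) -> Response:
    for hook in request_hooks:
        response = hook(request)
        if response is not None:
            # Short-circuit: a hook answered, so the HTTP client is never called.
            return response
    return fetch(request)


http_calls: List[str] = []


def http_fetch(request: Request) -> Response:
    http_calls.append(request.url)
    return Response(request.url, "<html>static</html>", source="http")


rendered = download(
    Request("https://example.com"),
    [passthrough_middleware, browser_middleware],
    http_fetch,
)
plain = download(Request("https://example.org"), [passthrough_middleware], http_fetch)
```

With this shape, only the request that no hook answered reaches `http_fetch`, which is the behavior the issue asks for.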

@ksassnowski What do you think is a suitable solution for this?
