New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
httpcompression middleware: warn of unexpected and prevent manual unupported encodings #6145
base: master
Are you sure you want to change the base?
httpcompression middleware: warn of unexpected and prevent manual unupported encodings #6145
Conversation
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## master #6145 +/- ##
==========================================
- Coverage 88.59% 88.57% -0.02%
==========================================
Files 159 159
Lines 11582 11599 +17
Branches 1885 1891 +6
==========================================
+ Hits 10261 10274 +13
- Misses 994 996 +2
- Partials 327 329 +2
|
7dc4408
to
62f4c2d
Compare
62f4c2d
to
fcbc7fa
Compare
@@ -17,6 +18,9 @@ | |||
from typing_extensions import Self | |||
|
|||
ACCEPTED_ENCODINGS: List[bytes] = [b"gzip", b"deflate"] | |||
logger = logging.getLogger(__name__) | |||
|
|||
ACCEPTED_ENCODINGS = [b"gzip", b"deflate"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a merge error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup. I'm working on some more changes. Many tests got broken, fixed that already. Polishing next commit now.
@@ -85,6 +91,19 @@ def process_response( | |||
|
|||
return response | |||
|
|||
def _raise_unsupported_compressors(self, request: Request): | |||
encodings = request.headers.getlist("Accept-Encoding") | |||
unsupported = [key for key in encodings if key not in ACCEPTED_ENCODINGS] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to split the header value.
if b"br" in ACCEPTED_ENCODINGS: | ||
body = brotli.decompress(body) | ||
else: | ||
body = bytes() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
body = bytes() | |
body = b'' |
Alternatively, keep it compressed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, was still thinking what to return. Will change.
So, this raises exceptions for requests with unsupported encodings declared in Accept-Encoding. I'm not sure if this is the best course of action. In the previous issues/PRs there were also such options suggested as logging a warning and removing the unsupported encodings from the header. In any case, the warning/error logged/raised should clearly explain what is the problem in trhe case of brotli and zstd: that the user should install a module to get it supported. |
@wRAR we can change raising an error to logging error message, not a problem. I just followed @Gallaecio 's issue description. I've changed PR already, so it fires exception only if it's b"br" manually added by user and not supported by machine. |
This change is made for solving issue #4697.
PR #4697 is used in terms of fast understanding where the code base must be changed. Please, inform me if this PR must be to that branch first and not directly to master of scrapy/scrapy.
Please, not that in initial version of this PR, the body becomes empty bytes in case of b"br" is used in response but not supported by executing machine. Please, inform me if it's not correct approach.
Will be happy to get review.
Fixes #4697, closes #5480