In practice, this function has been observed to be very slow when it encounters image files or other large binary files. It is recommended to change it to the following, or something similar:
def __looks_like_html(response):
    """Guesses entity type when Content-Type header is missing.
    Since Content-Type is not strictly required, some servers leave it out.
    """
    #text = response.text.lstrip().lower()
    #return text.startswith('<html') or text.startswith('<!doctype')
    return re.search(br'<html|<!doctype', response.content[:200]) is not None
The suggestion looks good, but I think you need to add re.IGNORECASE as the third argument to re.search so that it also matches uppercase tags. Can you turn this into a proper pull request?
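For reference, here is a minimal sketch of what the suggestion might look like with re.IGNORECASE applied. It is not the actual MechanicalSoup code: the name `looks_like_html`, the `limit` parameter, and the choice to take raw bytes instead of a response object are assumptions made so the snippet runs on its own.

```python
import re

# Hypothetical standalone helper, not the library's implementation.
# The pattern is compiled once; re.IGNORECASE makes it match <HTML> and
# <!DOCTYPE> as well as their lowercase forms.
_HTML_START = re.compile(br'<html|<!doctype', re.IGNORECASE)


def looks_like_html(content, limit=200):
    """Guess whether a raw byte payload is HTML.

    Only the first `limit` bytes are inspected, so large binary bodies
    (images, archives) are never decoded or scanned in full.
    """
    return _HTML_START.search(content[:limit]) is not None


if __name__ == "__main__":
    print(looks_like_html(b"<!DOCTYPE html><html><body></body></html>"))
    print(looks_like_html(b"\x89PNG\r\n\x1a\n" + b"\x00" * 100000))
```

Note that this searches anywhere within the first `limit` bytes, whereas the commented-out original only accepted a tag at the very start of the text after stripping leading whitespace. If the stricter check matters, `_HTML_START.match(content[:limit].lstrip())` would be one way to keep it.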
MechanicalSoup/mechanicalsoup/browser.py
Line 62 in 91b1207