I want to make a GET request, receive a chunked response, and then parse each chunk's body to extract information from it. But at the end of the GET request the following chain is invoked: add_soup(response) -> Browser.__looks_like_html(response) -> response.text, which forces all chunks to be loaded into memory. Is the add_soup method required when the request has "stream=True"?
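For reference, the chunk-by-chunk processing described above can be sketched with plain `requests`. `parse_lines` is a hypothetical helper (not part of requests or MechanicalSoup): it consumes an iterable of byte chunks, as `response.iter_content()` yields them, and emits complete lines without ever buffering the whole body.

```python
def parse_lines(chunks):
    """Yield complete newline-terminated lines from an iterable of
    byte chunks, keeping at most one partial line in memory."""
    buf = b""
    for chunk in chunks:
        buf += chunk
        # Emit every complete line currently sitting in the buffer.
        while b"\n" in buf:
            line, buf = buf.split(b"\n", 1)
            yield line
    # Emit any trailing data that was not newline-terminated.
    if buf:
        yield buf

# With a live response it would be driven like this (not run here):
#   import requests
#   response = requests.get(url, stream=True)
#   for line in parse_lines(response.iter_content(chunk_size=8192)):
#       handle(line)
```

This is exactly the pattern that `add_soup`'s call to `response.text` defeats, since `.text` consumes the whole stream up front.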
We never really tested this use case, so you should probably expect issues if you need to stream content that does not fit in memory. Clearly, you won't be able to use BeautifulSoup-related features if you can't load the page into memory, but other potential issues should be fixable, so patches are welcome. On my side, I won't have time to implement it myself any time soon, but I can help if you want to work on a patch.
I think I've run into this while trying to POST a form, follow the 302 redirect, and then GET the result.
It's not too bad to dip into Browser's internals with:
form = browser.get_current_form().form
# Here we would like to call browser.submit_selected(update_state=False, stream=True)
# but MechanicalSoup will not allow us to stream the response.
# <https://github.com/MechanicalSoup/MechanicalSoup/issues/288>
response = browser._request(form, browser.get_url(), stream=True)
... and then streaming the content by iterating over response.iter_content().
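Put together, the workaround might look like the sketch below. `save_stream` is a hypothetical helper name, and the commented driver assumes a browser already positioned on a page with a selected form; treat it as an illustration under those assumptions, not an official MechanicalSoup API.

```python
def save_stream(chunks, path):
    """Write an iterable of byte chunks to disk without holding the
    full body in memory, as response.iter_content() would supply them."""
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)

# Combined with the internals shown above (not run here):
#   form = browser.get_current_form().form
#   response = browser._request(form, browser.get_url(), stream=True)
#   save_stream(response.iter_content(chunk_size=8192), "download.bin")
```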
Thanks for the workaround. It would be really helpful if this were documented somewhere. I think there are many cases where submitting a form returns a file to download that we do not want to load into memory.