JSON API response's content-type is text/html;charset=UTF-8, but I want application/json #122

Open
workji opened this issue Aug 10, 2022 · 0 comments

workji commented Aug 10, 2022

I am crawling the background JSON data API of a Vue.js site.

  1. The request:
yield SeleniumRequest(
    url=json_api_url,
    wait_time=3,
    callback=self.parse_api)
  2. The original response:
{"data":{"list":[{"title":"adidas originals Yeezy 450 &quot;Cloud White&quot; H68038"},{"title":"adidas &quot;Have A Good Game&quot; H68038"}],"next":true,"total":2000},"result":1}
  3. The response I actually get:
    def parse_api(self, response):
        json_str = response.xpath('//body/text()').get()
        json_obj = json.loads(json_str)
{"data":{"list":[{"title":"adidas originals Yeezy 450 "Cloud White" H68038"},{"title":"adidas "Have A Good Game" H68038"}],"next":true,"total":2000},"result":1}
  4. The problem:
json_obj = json.loads(json_str)              # <- raises the error
json.decoder.JSONDecodeError: Expecting ',' delimiter: line
  5. The root cause:
    When the response's content-type is text/html, the HTML character entities ( &quot; ) are decoded to ( " ), which destroys the JSON format.
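
The root cause described above can be reproduced without a browser. The following is a minimal sketch using only the Python standard library; the shortened `raw_body` string is an assumed stand-in for what the API sends, and `html.unescape` is used here only to imitate what an HTML parser does to the text of a text/html page.

```python
import html
import json

# Assumed raw API body: quotes inside titles arrive as the entity &quot;,
# which is just ordinary text as far as a JSON parser is concerned.
raw_body = '{"data":{"list":[{"title":"adidas &quot;Have A Good Game&quot; H68038"}],"next":true,"total":2000},"result":1}'

# Parsing the raw body succeeds; the entity stays inside the string value.
parsed = json.loads(raw_body)

# Imitate HTML parsing of the body: every &quot; becomes a bare ".
decoded_body = html.unescape(raw_body)

# The bare quotes now terminate the JSON string early, so parsing fails.
try:
    json.loads(decoded_body)
    decode_failed = False
    error_message = ""
except json.JSONDecodeError as exc:
    decode_failed = True
    error_message = str(exc)

print(decode_failed, error_message)
```

The failure comes from the order of operations: entities are decoded before the JSON is parsed, so the decoded quotes are read as string delimiters.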

So my question is: how can I change the content-type from text/html to application/json, or how can I prevent ( &quot; ) from being converted to ( " )?
Thank you very much!
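
One possible direction, offered as a sketch rather than a confirmed fix: the content-type is chosen by the server, so a client generally cannot change it, but the entity decoding can be avoided by parsing the raw body before anything decodes it. The code below assumes the raw, unrendered body is available as a string; the helper names `unescape_strings` and `parse_api_body` are hypothetical and not part of any library.

```python
import html
import json


def unescape_strings(value):
    """Recursively decode HTML entities in every string of a parsed JSON value."""
    if isinstance(value, str):
        return html.unescape(value)
    if isinstance(value, list):
        return [unescape_strings(item) for item in value]
    if isinstance(value, dict):
        return {key: unescape_strings(item) for key, item in value.items()}
    return value  # numbers, booleans and None pass through unchanged


def parse_api_body(raw_body):
    """Parse the JSON first, then decode entities inside the parsed strings."""
    return unescape_strings(json.loads(raw_body))


# Assumed raw body, shortened from the response shown in the issue.
raw_body = '{"data":{"list":[{"title":"adidas &quot;Have A Good Game&quot; H68038"}],"next":true,"total":2000},"result":1}'
result = parse_api_body(raw_body)
print(result["data"]["list"][0]["title"])
```

In a spider, the raw body might be obtained by requesting the API URL with a plain `scrapy.Request` instead of `SeleniumRequest` and passing `response.text` to `parse_api_body`. Whether this API answers without a browser session (cookies, headers) is an assumption that would need to be checked.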
