Add support for user defined user-agent string #74

Open
yamamushi opened this issue Jun 26, 2017 · 3 comments
yamamushi commented Jun 26, 2017

Expected behavior

Parsing https://www.reddit.com/r/games/.rss should work with an appropriate delay in making requests (Reddit asks for 2 seconds between bot requests).

This could be resolved by an option to define our own User-Agent string (or any headers, for that matter), either when calling gofeed.ParseURL(url string) or when constructing the parser with gofeed.NewParser().

Actual behavior

Returns 429 Too Many Requests, as Reddit filters requests that do not have user-agent strings.

The first request will work, after which Reddit will block all new requests for a period of time.

Steps to reproduce the behavior

package main

import (
	"fmt"
	"time"

	"github.com/mmcdole/gofeed"
)

func main() {
	fp := gofeed.NewParser()
	feed, err := fp.ParseURL("https://www.reddit.com/r/games/.rss")
	if err != nil {
		fmt.Println(err.Error())
		return
	}
	// This first request will work
	fmt.Println(feed.Title)

	time.Sleep(5 * time.Second)

	// This second request will fail because no user-agent string is defined for the request
	secondfeed, err := fp.ParseURL("https://www.reddit.com/r/games/.rss")
	if err != nil {
		fmt.Println(err.Error())
		return
	}
	fmt.Println(secondfeed.Title)
}


@bogatuadrian

As a workaround you could use your own transport by implementing the RoundTripper interface to set the User-Agent header, like:

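package main

import (
	"fmt"
	"net/http"

	"github.com/mmcdole/gofeed"
)

// UserAgentTransport wraps another RoundTripper and sets the User-Agent header on each request.
type UserAgentTransport struct {
	http.RoundTripper
}

func (c *UserAgentTransport) RoundTrip(r *http.Request) (*http.Response, error) {
	// Clone first: a RoundTripper should not modify the caller's request.
	req := r.Clone(r.Context())
	req.Header.Set("User-Agent", "<platform>:<app ID>:<version string> (by /u/<reddit username>)")
	return c.RoundTripper.RoundTrip(req)
}

func main() {
	fp := gofeed.NewParser()
	fp.Client = &http.Client{
		Transport: &UserAgentTransport{http.DefaultTransport},
	}
	if _, err := fp.ParseURL("https://www.reddit.com/r/games/.rss"); err != nil {
		fmt.Println(err)
	}
}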

The format <platform>:<app ID>:<version string> (by /u/<reddit username>) is the one suggested by the Reddit API documentation.

carthics commented Mar 9, 2018

@bogatuadrian Thank you very much. This was really useful!

@GaruGaru

#108 should resolve this.
