get_news()['url'] returns a URL about Google Consent Mode in GNews #53

Open
sohaibrahman64 opened this issue Feb 15, 2023 · 4 comments

@sohaibrahman64

Is there any way to bypass Google Consent Mode in the GNews get_news() function? Accessing get_news()['url'] returns a URL like the following:

https://consent.google.com/m?continue=https://news.google.com/rss/articles/CBMijQFodHRwczovL3d3dy5idXNpbmVzc3RvZGF5LmluL21hZ2F6aW5lL2NvcnBvcmF0ZS9zdG9yeS9ib2xseXdvb2RzLXNtYWxsLWJ1ZGdldC1maWxtcy1hcmUtaW4tY3Jpc2lzLW1vZGUtaGVyZXMtd2hhdHMtZ29pbmctb24tMzYxMTE4LTIwMjMtMDEtMTnSAZEBaHR0cHM6Ly93d3cuYnVzaW5lc3N0b2RheS5pbi9hbXAvbWFnYXppbmUvY29ycG9yYXRlL3N0b3J5L2JvbGx5d29vZHMtc21hbGwtYnVkZ2V0LWZpbG1zLWFyZS1pbi1jcmlzaXMtbW9kZS1oZXJlcy13aGF0cy1nb2luZy1vbi0zNjExMTgtMjAyMy0wMS0xOQ?oc%3D5&gl=DE&m=0&pc=n&hl=en-US&src=1

Opening the above URL shows a Google consent form to accept or reject cookies. The get_news() function should return a URL that leads directly to the news article, without the Google consent form.

@mirandacross

I'm also having this issue! I think it may have to do with EU GDPR requirements. I tried removing the consent substring from the URL, but that didn't work. Any help appreciated!
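
For reference, one reading of "removing the consent substring" is to pull the `continue` query parameter out of the consent URL. Below is a minimal sketch using only the standard library. The function name `unwrap_consent_url` is hypothetical and not part of GNews. It only recovers the wrapped news.google.com link, which may itself lead back to the consent page, consistent with the report above.

```python
from urllib.parse import parse_qs, urlparse


def unwrap_consent_url(url: str) -> str:
    """Return the URL wrapped in a consent.google.com link, if there is one.

    Hypothetical helper for illustration; URLs on other hosts, or consent
    URLs without a `continue` parameter, are returned unchanged.
    """
    parsed = urlparse(url)
    if parsed.hostname != "consent.google.com":
        return url
    # parse_qs percent-decodes values, so "oc%3D5" becomes "oc=5".
    targets = parse_qs(parsed.query).get("continue")
    return targets[0] if targets else url
```

Note that parameters such as `gl` and `hl` belong to the consent URL itself and are dropped by this sketch.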

@ranahaani
Owner

@mirandacross @sohaibrahman64 Sorry for the late response. You can get the original link using the requests library:

import requests

requests.head(get_news()['url']).headers['location']

@dpujol04

Hello all!
For me, your solution didn't work. So, I ended up implementing this one:
orig_url = requests.get(get_news()['url']).url

@murnanedaniel

> Hello all! For me, your solution didn't work. So, I ended up implementing this one: orig_url = requests.get(get_news()['url']).url

This solution works, but it is much slower than fetching just the headers. Unfortunately, fetching just the headers doesn't work, as it returns the Google RSS feed URL.
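
One possible way to reduce the cost of the GET approach, sketched under assumptions: with `stream=True`, requests still follows redirects but defers downloading the body of the final response, so the article page itself is not fetched. The name `resolve_final_url` and its parameters are hypothetical. Whether the final URL is the publisher's article depends on how Google serves the redirect, which this thread does not confirm.

```python
def resolve_final_url(url, session=None, timeout=10.0):
    """Follow redirects with GET and return the final URL without reading the body.

    Hypothetical helper for illustration. `session` may be any object with a
    requests-style `get` method; if omitted, a requests.Session is created.
    """
    owns_session = session is None
    if owns_session:
        import requests  # third-party; only needed when no session is passed in

        session = requests.Session()
    try:
        # stream=True defers downloading the final response body.
        response = session.get(
            url, allow_redirects=True, stream=True, timeout=timeout
        )
        try:
            return response.url
        finally:
            # Release the connection, since the body is never consumed.
            response.close()
    finally:
        if owns_session:
            session.close()
```

Usage would look like `orig_url = resolve_final_url(get_news()['url'])`.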
