The number of crawled paper is very small #88

Open
Ricksanchez000 opened this issue Mar 11, 2024 · 2 comments
Comments

@Ricksanchez000

Hi, could someone please help me with this?

I used GNews to crawl news about the "Red Sea Crisis" from 2023.10.1 to 2024.3.10, but only got about 80 articles.
When I search the same keyword in Factiva for the same period, it returns about 3000 articles. I am doing NLP analysis, so the volume of articles is essential.

Is the number of articles being limited by GNews, or does Google News simply not have that many articles?

@MonikaBarget

We have the same issue. I previously tried scraping Google News with different code and got at most 100 results per query. It seems that we need pagination, but I am not sure how to implement it here. One option would be to work with the start and end dates, stepping through very small time windows to collect more results for consecutive days.
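
A minimal sketch of that small-window idea, in case it helps. It is not a tested fix for this library: `date_windows` and `collect` are hypothetical helper names, and `fetch` is any function you supply that returns a list of article dicts for one window. The commented-out wiring at the bottom assumes GNews accepts `start_date`/`end_date` tuples and that each result has a `url` key, so please check that against the README. Whether Google News treats the end date as inclusive is also unknown to me, which is why results are de-duplicated by URL.

```python
from datetime import date, timedelta


def date_windows(start, end, days=1):
    """Yield consecutive (window_start, window_end) pairs covering start..end."""
    step = timedelta(days=days)
    current = start
    while current < end:
        nxt = min(current + step, end)  # clip the last window to `end`
        yield current, nxt
        current = nxt


def collect(fetch, start, end, days=1):
    """Call fetch(window_start, window_end) per window and merge the results.

    Articles are de-duplicated by their "url" value, since adjacent
    windows may return the same article twice.
    """
    seen = set()
    articles = []
    for w_start, w_end in date_windows(start, end, days):
        for item in fetch(w_start, w_end):
            url = item.get("url")
            if url is not None:
                if url in seen:
                    continue
                seen.add(url)
            articles.append(item)
    return articles


# Hypothetical wiring to GNews (assumed API, verify before use):
#
# from gnews import GNews
#
# def fetch(w_start, w_end):
#     g = GNews(language="en", max_results=100,
#               start_date=(w_start.year, w_start.month, w_start.day),
#               end_date=(w_end.year, w_end.month, w_end.day))
#     return g.get_news("Red Sea Crisis")
#
# articles = collect(fetch, date(2023, 10, 1), date(2024, 3, 10), days=1)
```

For the date range in this issue, one-day windows mean 161 separate queries, so it is probably worth adding a pause between calls to avoid being rate limited.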

@MonikaBarget

This is a related issue suggesting some workarounds: #31
