how can I get more than 100 link #31
Comments
By default, Google News returns 100 results. Getting more than 100 results requires a custom implementation. I'll try to implement it ASAP.
Hello, how can I get more than 100 links per query?
Hey, is more than 100 links an option?
To get more links, you can call gnews from within a for loop, beginning at a start date and advancing the period on each iteration.
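The loop described above might look roughly like the following sketch. It is not the commenter's original example, which is missing from this thread. It assumes a `google_news` object whose `start_date` and `end_date` attributes accept `(year, month, day)` tuples and whose `get_news(keyword)` returns a list of dicts, as in the code further down this thread. The helper names `date_windows` and `collect_news`, the 7-day default step, and de-duplication on a `"url"` key are all assumptions made for illustration.

```python
from datetime import date, timedelta


def date_windows(start, end, step_days=7):
    """Yield consecutive (window_start, window_end) pairs covering start..end inclusive."""
    current = start
    while current <= end:
        # Each window spans step_days days, clipped to the overall end date.
        window_end = min(current + timedelta(days=step_days - 1), end)
        yield current, window_end
        current = window_end + timedelta(days=1)


def collect_news(google_news, keyword, start, end, step_days=7):
    """Query one date window at a time and concatenate the results."""
    seen = set()
    articles = []
    for window_start, window_end in date_windows(start, end, step_days):
        # Assumed interface: date limits are set as (year, month, day) tuples.
        google_news.start_date = (window_start.year, window_start.month, window_start.day)
        google_news.end_date = (window_end.year, window_end.month, window_end.day)
        for item in google_news.get_news(keyword):
            key = item.get("url")  # assumed key; items without it are kept as-is
            if key is not None:
                if key in seen:
                    continue  # skip an article already returned by another window
                seen.add(key)
            articles.append(item)
    return articles
```

Each window is still subject to the 100-result limit, so a busy keyword probably needs a smaller `step_days`.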
I have also written a program that recursively calls the function on half of the time range until all the data is extracted. I've also added a tqdm bar to show the progress:
from tqdm.notebook import tqdm_notebook as tqdm
from datetime import datetime, timedelta

def get_related_news(google_news, keyword: str, start_date: datetime, end_date: datetime, bar=None) -> list[dict]:
    google_news.start_date = (start_date.year, start_date.month, start_date.day)
    google_news.end_date = (end_date.year, end_date.month, end_date.day)
    if bar is None:
        bar = tqdm(total=(end_date - start_date).days + 1, desc="Getting News")
    # Get the news results
    results = google_news.get_news(keyword)
    num = len(results)
    if (num >= 99) and ((end_date - start_date) > timedelta(days=4)):
        # Recursively call the function for 1/2 the time and add them up
        mid_date = start_date + (end_date - start_date) / 2
        mid_date = datetime(mid_date.year, mid_date.month, mid_date.day)
        # Merge the results (later half first, so the newest articles come first)
        results = get_related_news(google_news, keyword, mid_date + timedelta(days=1), end_date, bar) \
            + get_related_news(google_news, keyword, start_date, mid_date, bar)
        # Check the tqdm bar and close it
        if bar.total == bar.n:
            bar.close()
        # Return directly since the results are already sorted
        return results
    sorted_results = sorted(results,
                            key=lambda x: datetime.strptime(x['published date'], "%a, %d %b %Y %H:%M:%S %Z"),
                            reverse=True)
    # Update the tqdm bar
    update_proportion = (end_date - start_date).days + 1
    bar.update(update_proportion)
    if bar.total == bar.n:
        bar.close()
    return sorted_results
Hi Ranahaani,
Could you please explain in detail how we can get more than 100 URLs? I changed the GNews parameters (max result = 10000), but it doesn't work.