
Write script to read in sources, order them based on primary & secondary keyword matches and count. #162

Open

devsaxena974 opened this issue Mar 21, 2023 · 1 comment

Assignees

Labels

Backend

Projects

SCOPE Spring 2023

Collaborator

devsaxena974 commented Mar 21, 2023

No description provided.

devsaxena974 created this issue from a note in SCOPE Spring 2023 (To-Do)

devsaxena974 assigned devsaxena974, SamiraR123, l-zheng24 and sigaloid

l-zheng24 added the Backend label

l-zheng24 moved this from To-Do to In Progress in SCOPE Spring 2023

Collaborator

sigaloid commented Mar 26, 2023

Finished the ranking: 31ac015

Todo:

Integrate ranking with GDELT results.
- Currently, the results from GDELT are not ranked in a way that is meaningful to us.
Re-rank the results based on the keyword rankings.
- Fetch each result's content (Readability-lxml, readability, newspaper3k (obsolete?), Prof. Nwala's scraper, etc.?)
- Rank with the new rank function, using the primary keyword, secondary keywords, and article texts.
Do we save the full text in the database?
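
For reference, here is a rough sketch of what the re-ranking step could look like once article text has been fetched. This is not the code from 31ac015; the `Source` class and the `count_matches` and `rank_sources` names are hypothetical, and it assumes keywords are plain words or phrases matched case-insensitively on word boundaries. Sources are ordered by primary keyword count first, then by total secondary keyword count.

```python
import re
from dataclasses import dataclass


@dataclass
class Source:
    # Hypothetical container: one fetched article and where it came from.
    url: str
    text: str


def count_matches(text: str, keyword: str) -> int:
    """Count case-insensitive whole-word occurrences of keyword in text.

    Assumes the keyword starts and ends with a word character, so that
    the \\b word-boundary anchors behave as expected.
    """
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def rank_sources(sources, primary, secondary):
    """Return sources sorted by (primary hits, secondary hits), highest first."""

    def score(source):
        primary_hits = count_matches(source.text, primary)
        secondary_hits = sum(count_matches(source.text, kw) for kw in secondary)
        # Tuples compare element by element, so primary hits dominate
        # and secondary hits only break ties.
        return (primary_hits, secondary_hits)

    return sorted(sources, key=score, reverse=True)


if __name__ == "__main__":
    articles = [
        Source("https://example.com/a", "Flood warning issued. Rain expected."),
        Source("https://example.com/b", "Flood, flood, flood in the city."),
        Source("https://example.com/c", "Nothing relevant here."),
    ]
    for article in rank_sources(articles, "flood", ["rain", "city"]):
        print(article.url)
```

Because `sorted` is stable, sources with identical scores keep their incoming order, so the original GDELT ordering would still act as the final tie-breaker.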

devsaxena974 moved this from In Progress to Done in SCOPE Spring 2023
