Write script to read in sources, order them based on primary & secondary keyword matches and count. #162

Open
devsaxena974 opened this issue Mar 21, 2023 · 1 comment

Comments

@devsaxena974
Collaborator

No description provided.

@devsaxena974 devsaxena974 created this issue from a note in SCOPE Spring 2023 (To-Do) Mar 21, 2023
@l-zheng24 l-zheng24 moved this from To-Do to In Progress in SCOPE Spring 2023 Mar 26, 2023
@sigaloid
Collaborator

Finished the ranking: 31ac015

Todo:

  • Integrate the ranking with the GDELT results.
    • Currently, GDELT returns results in an order that is not meaningful for our purposes.
  • Re-rank the results based on the keyword rankings.
    • Fetch each result's content (readability-lxml, readability, newspaper3k (obsolete?), Prof. Nwala's scraper, etc.).
    • Rank with the new rank function, using the primary keyword, secondary keywords, and article texts.
  • Open question: do we save the full text in the database?
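
For discussion, here is a rough sketch of what the re-rank step could look like once the article texts have been fetched. This is not the code from 31ac015. The `Article` class, the function names, and the `primary_weight` of 3.0 are all assumptions made for illustration. It counts case-insensitive whole-word matches and weights primary keyword hits above secondary ones.

```python
import re
from dataclasses import dataclass


@dataclass
class Article:
    # Hypothetical container; the real project may store sources differently.
    url: str
    text: str


def count_matches(text: str, keyword: str) -> int:
    """Count case-insensitive whole-word occurrences of `keyword` in `text`."""
    # \b assumes the keyword starts and ends with a word character.
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def score(article: Article, primary: str, secondary: list[str],
          primary_weight: float = 3.0) -> float:
    """Weighted match count; the weight is an arbitrary illustrative choice."""
    primary_hits = count_matches(article.text, primary)
    secondary_hits = sum(count_matches(article.text, kw) for kw in secondary)
    return primary_weight * primary_hits + secondary_hits


def rank(articles: list[Article], primary: str,
         secondary: list[str]) -> list[Article]:
    """Return articles sorted from highest to lowest score."""
    return sorted(articles,
                  key=lambda a: score(a, primary, secondary),
                  reverse=True)
```

Calling `rank(articles, "flood", ["rain", "damage"])` returns the same `Article` objects ordered by score. Ties keep their incoming order, so the original GDELT order acts as the tie-breaker.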

@devsaxena974 devsaxena974 moved this from In Progress to Done in SCOPE Spring 2023 May 1, 2023

4 participants