awavering/CC-Bill-Tracker

CC-Bill-Tracker

These MapReduce functions use Common Crawl data to measure how congressional legislation spreads across the internet.

Program Tasks:

  1. Count the pages that mention a bill in any of its forms
  2. Record the domains of pages that mention a bill in any of its forms, and output the 50 domains that mention it most often, each with its count of mentioning pages
  3. Output the 50 most frequent words across all pages that mention a bill in any of its forms, excluding a set of 100 very common words
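
The three tasks can be illustrated with a small in-memory sketch. This is not the repository's code: the names `BILL_FORMS`, `STOP_WORDS`, `map_page`, `reduce_counts`, and `analyze` are hypothetical, the bill spellings are invented examples, the stop list is a short stand-in for the real 100-word set, and pages are assumed to arrive as `(url, text)` pairs. It also assumes that words are ranked by total occurrences rather than by the number of pages they appear on.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical inputs: spellings of one bill and a short stand-in stop list.
BILL_FORMS = ["h.r. 1234", "hr 1234"]
STOP_WORDS = {"the", "a", "of", "and", "to", "in", "on", "is"}


def mentions_bill(text):
    """Return True if the page text contains any spelling of the bill."""
    lowered = text.lower()
    return any(form in lowered for form in BILL_FORMS)


def map_page(url, text):
    """Map step: emit (key, 1) pairs for a page that mentions the bill."""
    if not mentions_bill(text):
        return
    yield ("page", "*"), 1                      # task 1: page count
    yield ("domain", urlparse(url).netloc), 1   # task 2: per-domain count
    for word in re.findall(r"[a-z']+", text.lower()):
        if word not in STOP_WORDS:
            yield ("word", word), 1             # task 3: word count


def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals


def analyze(pages, top_n=50):
    """Run map and reduce over (url, text) pairs and return the three results."""
    totals = reduce_counts(
        pair for url, text in pages for pair in map_page(url, text)
    )
    page_count = totals[("page", "*")]
    domains = Counter({k[1]: v for k, v in totals.items() if k[0] == "domain"})
    words = Counter({k[1]: v for k, v in totals.items() if k[0] == "word"})
    return page_count, domains.most_common(top_n), words.most_common(top_n)
```

Calling `analyze(pages)` on a list of `(url, text)` pairs returns the page count, the top domains, and the top words in that order. Tagging each key with its task name lets a single reduce pass serve all three tasks, which is one possible design and may differ from how the repository splits the work.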

These functions are called from the file TotalAnalysis.
