The collection of scrapers and parsers I have written.

ccxzhang/scrapers-and-parsers

Web-scrapers

The following scrapers were written either as part of projects I worked on or just for fun.

  1. Bipartisan-Index: scraper and raw data for the Lugar Center’s Bipartisan Index for the 116th Congress. The Code folder contains code to scrape House and Senate bills and to extract personal details of members of Congress. The Data folder contains two JSON files extracted from the Biographical Directory of the United States Congress.

  2. Paraguay: scraper for Paraguay's Comptroller General (Contraloría General de la República).

  3. FBpages: scraper built on Selenium to extract information from public Facebook pages.

  4. Zhihu: scraper to collect answers from Zhihu (a Chinese counterpart to Quora), which loads its content via Ajax.

  5. Selenium-Tutorial: presentation slides on Selenium, covering its common uses.
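
As a rough illustration of the Ajax-style scraping mentioned in item 4, the sketch below follows "next page" links in JSON responses until the endpoint reports that it is finished. It is not the code from this repository: the function name `iter_ajax_items` and the response fields `data`, `paging`, `is_end`, and `next` are assumptions chosen for the example, and the HTTP call is replaced by canned responses so the snippet runs offline.

```python
from typing import Callable, Iterator


def iter_ajax_items(
    fetch_json: Callable[[str], dict],
    first_url: str,
    max_pages: int = 100,
) -> Iterator[dict]:
    """Yield items from a paginated JSON endpoint by following 'next' links.

    The payload layout ('data', 'paging.is_end', 'paging.next') is an
    assumption made for this sketch, not a documented API.
    """
    url = first_url
    for _ in range(max_pages):  # hard cap so a bad 'next' link cannot loop forever
        payload = fetch_json(url)
        yield from payload.get("data", [])
        paging = payload.get("paging", {})
        # Stop when the endpoint says it is done or gives no further link.
        if paging.get("is_end", True) or not paging.get("next"):
            break
        url = paging["next"]


# Canned responses standing in for real HTTP calls (hypothetical data).
FAKE_PAGES = {
    "page1": {"data": [{"id": 1}, {"id": 2}], "paging": {"is_end": False, "next": "page2"}},
    "page2": {"data": [{"id": 3}], "paging": {"is_end": True, "next": None}},
}

items = list(iter_ajax_items(FAKE_PAGES.__getitem__, "page1"))
print([item["id"] for item in items])
```

In a real scraper, `fetch_json` would wrap an HTTP client call that returns the decoded JSON body; keeping it as a parameter makes the paging logic easy to test without network access.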