irahorecka/craigslist-housing-miner

craigslist-housing-miner

Welcome to craigslist-housing-miner, a web scraping tool that can extract information from any or all Craigslist Housing posts around the world.
Note: This tool is intended only for personal use and data analysis.

About the application

This project uses asynchronous execution of processes to rapidly mine information from Craigslist Housing posts. The data is written to CSV files named in the following format:
CraigslistHousing_{country/state}_{region/subregion}.csv

For example, the CSV file for Dothan, Alabama would be named:
CraigslistHousing_alabama_dothan.csv

Likewise, the CSV file for Tokyo, Japan would be named:
CraigslistHousing_japan_tokyo.csv
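
The naming pattern above can be sketched in a few lines of Python. This is only an illustration: `csv_filename` and its parameter names are hypothetical and are not taken from the project's source, and the lowercasing and space-to-underscore normalization is an assumption based on the two examples.

```python
def csv_filename(area: str, region: str) -> str:
    """Build a name following CraigslistHousing_{country/state}_{region/subregion}.csv.

    Hypothetical helper: the normalization below is an assumption,
    not the project's confirmed behavior.
    """
    def normalize(part: str) -> str:
        # Lowercase and replace spaces so the parts match the README examples.
        return part.strip().lower().replace(" ", "_")

    return f"CraigslistHousing_{normalize(area)}_{normalize(region)}.csv"


print(csv_filename("alabama", "dothan"))
print(csv_filename("japan", "tokyo"))
```

The two calls reproduce the Dothan and Tokyo file names shown above.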

Running the application

  1. Clone this GitHub repository.
  2. Install the required dependencies:
    pip install -r requirements.txt
  3. Run main.py:
    python main.py

The user is given two prompts:
Input a list of appropriate countries.
If no list is provided, a global search will be conducted:

Enter a list of the country keywords you would like to search. You may find the full list of country keywords here. For example:
['united_states', 'japan', 'canada']

Would you like to include geotags of your Craigslist posts [y/n]:
Type y if you would like to receive geographic coordinates for every Craigslist post:
geotag=True

Note: acquiring geotags will take a considerable amount of time.
To mitigate this, you can omit geotags by typing n:
geotag=False
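
As a rough sketch of how the two answers might be interpreted, the snippet below turns the typed country list into a Python list and the y/n answer into a boolean. The function names `parse_countries` and `parse_geotag` are hypothetical, and the parsing approach (using `ast.literal_eval`) is an assumption; the actual handling in main.py may differ.

```python
import ast
from typing import List


def parse_countries(raw: str) -> List[str]:
    """Parse input such as "['united_states', 'japan']" into a list of keywords.

    Hypothetical helper. An empty answer returns an empty list, which
    stands for the global search described above.
    """
    raw = raw.strip()
    if not raw:
        return []
    # literal_eval only accepts Python literals, so arbitrary code is not run.
    value = ast.literal_eval(raw)
    if not isinstance(value, (list, tuple)) or not all(isinstance(v, str) for v in value):
        raise ValueError("expected a list of country keyword strings")
    return list(value)


def parse_geotag(raw: str) -> bool:
    """Map a y/n answer to the geotag flag (hypothetical helper)."""
    answer = raw.strip().lower()
    if answer not in ("y", "n"):
        raise ValueError("expected 'y' or 'n'")
    return answer == "y"
```

For instance, `parse_countries("['united_states', 'japan', 'canada']")` yields a three-item list, and `parse_geotag("n")` yields `False`.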

The application exits once the process is finished. If it does not, you may have to press CTRL + C repeatedly in your terminal to exit it properly.

Data

All data is stored in the craigslist-housing-miner/data/{date data acquired} directory.
For example:
craigslist-housing-miner/data/2020-06-14
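
A dated directory like the one above could be built as follows. This is a minimal sketch: `output_dir` is a hypothetical helper, and it assumes the date is written in ISO format (YYYY-MM-DD), as the example suggests.

```python
from datetime import date
from pathlib import Path


def output_dir(root: Path, acquired: date) -> Path:
    """Return root/data/{date data acquired} (hypothetical helper)."""
    # isoformat() gives YYYY-MM-DD, matching the 2020-06-14 example.
    return root / "data" / acquired.isoformat()


target = output_dir(Path("craigslist-housing-miner"), date(2020, 6, 14))
print(target.as_posix())
```

Passing `date.today()` instead of a fixed date would name the directory after the day the data is acquired.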

Summary

craigslist-housing-miner is a useful tool if you are interested in studying housing posts on Craigslist. However, the tool currently has two limitations:

  1. The project is not yet available as a PyPI package.
  2. There is no GUI to facilitate easy selection of housing type(s), countries, regions, etc.

These features are scheduled to be implemented in the near future.

Fin