notthatbreezy/nyc-taxi-spark-ml

Project Description

An analysis of NYC taxi-cab data using Python and Spark.

(Incomplete) Instructions

Download the full dataset from http://www.andresmh.com/nyctaxitrips/, or use the subset in data/.

Download the weather data using python/get_weather_data.py (fill in your forecast.io API key first).

Fix the hardcoded paths in python/generate-models.py so that they point to the correct data and python directories.

Run locally with spark-submit
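For the weather step above, one option is to read the forecast.io API key from the environment instead of editing it into the script. The sketch below only illustrates that idea: the helper `load_api_key` and the variable name `FORECAST_IO_API_KEY` are hypothetical and are not taken from python/get_weather_data.py.

```python
import os


def load_api_key(env=None, var_name="FORECAST_IO_API_KEY"):
    """Return the API key found in an environment mapping.

    Hypothetical helper: neither this function nor the variable name
    exists in the repository; both are assumptions for illustration.
    """
    # Fall back to the real process environment when no mapping is given.
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError("Set %s before downloading weather data" % var_name)
    return key
```

With this in place, the key could be exported in the shell once and kept out of version control.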

ToDo: Clean up hardcoded paths
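One possible way to tackle this ToDo is to accept the directories as command-line arguments. This is a minimal sketch and not the current code in python/generate-models.py: the function `parse_paths` and the flags `--data-dir` and `--python-dir` are assumed names chosen for illustration.

```python
import argparse
import os


def parse_paths(argv=None):
    """Parse the data and python directories from command-line arguments.

    Hypothetical replacement for the hardcoded paths; the flag names and
    defaults are assumptions, not part of the existing script.
    """
    parser = argparse.ArgumentParser(
        description="Locate input directories for the taxi model script"
    )
    parser.add_argument("--data-dir", default="data",
                        help="directory holding the taxi and weather data")
    parser.add_argument("--python-dir", default="python",
                        help="directory holding the project's Python modules")
    args = parser.parse_args(argv)
    # Resolve to absolute paths so the job does not depend on the
    # directory it happens to be launched from.
    return os.path.abspath(args.data_dir), os.path.abspath(args.python_dir)
```

Under these assumptions, arguments placed after the script name on the spark-submit command line, for example `--data-dir /path/to/data`, would be passed through to the script.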

NOTE: This is still a work in progress; the model developed here is expository only.
