pachanteau/3_uber_pickups

PROJECT Unsupervised Machine Learning

Uber Pickups

Company's Description 📇

Uber is one of the most famous startups in the world. It started as a ride-sharing application for people who couldn't afford a taxi. Uber has since expanded into food delivery with Uber Eats, package delivery, freight transportation, and even urban transportation with Jump Bike and Lime, which the company funded. The company's goal is to revolutionize transportation across the globe. It now operates in about 70 countries and 900 cities and generates over $14 billion in revenue! 😮

Project 🚧

One of the main pain points that Uber's team found is that drivers are sometimes not around when users need them. For example, a user might be in San Francisco's Financial District while Uber drivers are looking for customers in the Castro.

(If you are not familiar with the Bay Area, check out Google Maps)

Even though the two neighborhoods are not that far apart, users would still have to wait 10 to 15 minutes before being picked up, which is too long. Uber's research shows that users are willing to wait 5 to 7 minutes; beyond that, they cancel their ride.

Therefore, Uber's data team would like to work on a project where the app recommends hot-zones in major cities for drivers to be in at any given time of day.

Goals 🎯

Uber already has data about pickups in major cities. Your objective is to create algorithms that determine which hot-zones drivers should be in. Therefore you will:

Create an algorithm to find hot zones

Visualize results on a nice dashboard

Scope of this project 🖼️

To start off, Uber wants to try this feature in New York City. Therefore you will only focus on this city. Data can be found here:

👉👉 Uber Trip Data 👈👈 (https://full-stack-bigdata-datasets.s3.eu-west-3.amazonaws.com/Machine+Learning+non+Supervis%C3%A9/Projects/uber-trip-data.zip)

You only need to focus on New York City for this project
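As one possible starting point for the hot-zone algorithm described in the goals, the sketch below clusters pickup coordinates with scikit-learn's KMeans and ranks the resulting centers by pickup count. It runs on synthetic points rather than the real archive, and the column names `Lat` and `Lon`, the three made-up hot spots, and the helper `kmeans_hot_zones` are all assumptions for illustration, not part of the project brief.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic stand-in for real pickups: three made-up hot spots in New York City.
rng = np.random.default_rng(42)
centers = np.array([
    [40.758, -73.985],  # around Midtown
    [40.645, -73.785],  # around JFK airport
    [40.707, -74.010],  # around the Financial District
])
points = np.vstack([c + rng.normal(scale=0.003, size=(200, 2)) for c in centers])
# Column names are assumptions; adapt them to the actual CSV headers.
pickups = pd.DataFrame(points, columns=["Lat", "Lon"])


def kmeans_hot_zones(df, n_zones=3, seed=0):
    """Return one row per cluster center, sorted by number of pickups (hypothetical helper)."""
    model = KMeans(n_clusters=n_zones, n_init=10, random_state=seed)
    labels = model.fit_predict(df[["Lat", "Lon"]].to_numpy())
    zones = pd.DataFrame(model.cluster_centers_, columns=["Lat", "Lon"])
    zones["pickups"] = np.bincount(labels, minlength=n_zones)
    return zones.sort_values("pickups", ascending=False).reset_index(drop=True)


hot_zones = kmeans_hot_zones(pickups)
print(hot_zones)
```

KMeans needs the number of zones up front, so `n_zones` is a tuning choice here rather than something the data decides.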

Deliverable 📬

Produce a map of hot-zones using any Python library (Plotly or anything else). You should at least describe hot-zones per day of the week. Compare results from at least two unsupervised algorithms, such as KMeans and DBSCAN. Your maps should look something like this:
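Before any map is drawn, the per-day comparison of the two algorithms could be sketched as follows. This is a minimal example on synthetic data, not the expected solution: the column names `Date/Time`, `Lat`, and `Lon`, the helpers `make_day` and `compare_day`, and the values `eps_km=0.5` and `min_samples=20` are assumptions chosen for illustration. DBSCAN is run with the haversine metric, which expects latitude and longitude in radians.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN, KMeans

EARTH_RADIUS_KM = 6371.0088


def make_day(rng, date, centers, n_per_center=150, n_noise=30):
    """Build synthetic pickups for one date: dense blobs plus scattered noise."""
    blobs = [np.array(c) + rng.normal(scale=0.002, size=(n_per_center, 2)) for c in centers]
    noise = np.column_stack([
        rng.uniform(40.55, 40.90, n_noise),
        rng.uniform(-74.10, -73.75, n_noise),
    ])
    coords = np.vstack(blobs + [noise])
    # Column names are assumptions; adapt them to the actual CSV headers.
    return pd.DataFrame({"Date/Time": pd.Timestamp(date), "Lat": coords[:, 0], "Lon": coords[:, 1]})


def compare_day(group, n_zones=3, eps_km=0.5, min_samples=20):
    """Fit KMeans and DBSCAN on one day's pickups and summarize both."""
    coords = group[["Lat", "Lon"]].to_numpy()
    KMeans(n_clusters=n_zones, n_init=10, random_state=0).fit(coords)
    dbscan = DBSCAN(
        eps=eps_km / EARTH_RADIUS_KM,  # neighborhood radius converted to radians
        min_samples=min_samples,
        metric="haversine",
        algorithm="ball_tree",
    ).fit(np.radians(coords))
    labels = dbscan.labels_
    return {
        "kmeans_zones": n_zones,  # fixed in advance by the caller
        "dbscan_zones": len(set(labels) - {-1}),  # inferred from density
        "dbscan_noise_share": float(np.mean(labels == -1)),
    }


rng = np.random.default_rng(0)
pickups = pd.concat([
    # 2014-08-04 is a Monday: two made-up hot spots.
    make_day(rng, "2014-08-04", [(40.758, -73.985), (40.707, -74.010)]),
    # 2014-08-09 is a Saturday: three made-up hot spots.
    make_day(rng, "2014-08-09", [(40.758, -73.985), (40.645, -73.785), (40.714, -73.961)]),
], ignore_index=True)
pickups["day"] = pickups["Date/Time"].dt.day_name()

rows = [{"day": day, **compare_day(group)} for day, group in pickups.groupby("day")]
summary = pd.DataFrame(rows).set_index("day")
print(summary)
```

The design point this illustrates is that KMeans always returns the number of zones it is asked for and assigns every pickup to one of them, while DBSCAN infers the number of zones from density and labels isolated pickups as noise (`-1`). The cluster labels or centers from either model can then be passed to a mapping library for each day.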