The goal of this statistical analysis is to understand how trip duration relates to the other features in the dataset.
Predict the trip duration from specific features extracted from the dataset.
- Scrum
- id: a unique identifier for each trip
- vendor_id: a code indicating the provider associated with the trip record
- pickup_datetime: date and time when the meter was engaged
- dropoff_datetime: date and time when the meter was disengaged
- passenger_count: the number of passengers in the vehicle (driver entered value)
- pickup_longitude: the longitude where the meter was engaged
- pickup_latitude: the latitude where the meter was engaged
- dropoff_longitude: the longitude where the meter was disengaged
- dropoff_latitude: the latitude where the meter was disengaged
- store_and_fwd_flag: indicates whether the trip record was held in vehicle memory before being sent to the vendor because the vehicle had no connection to the server (Y = store and forward; N = not a store and forward trip)
- trip_duration: duration of the trip in seconds
- Number of rows: 1458644
- Number of columns: 11
Source: Kaggle website [Kaggle]
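
The analysis relates trip duration to distance, hour, and day, none of which are columns in the dataset, so they have to be derived from the coordinates and the pickup timestamp. The following is a minimal sketch, assuming the great-circle (haversine) distance is an acceptable proxy for trip distance and that `pickup_datetime` is formatted as `YYYY-MM-DD HH:MM:SS`. The function names and the sample row are illustrative and are not taken from the project code or the dataset.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (approximation)
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def time_features(pickup_datetime):
    """Return (hour, weekday name) for a 'YYYY-MM-DD HH:MM:SS' string (assumed format)."""
    ts = datetime.strptime(pickup_datetime, "%Y-%m-%d %H:%M:%S")
    return ts.hour, WEEKDAYS[ts.weekday()]


# Made-up example row using the column names listed above.
row = {
    "pickup_datetime": "2016-03-14 17:24:55",
    "pickup_latitude": 40.7679, "pickup_longitude": -73.9822,
    "dropoff_latitude": 40.7656, "dropoff_longitude": -73.9646,
}
distance_km = haversine_km(row["pickup_latitude"], row["pickup_longitude"],
                           row["dropoff_latitude"], row["dropoff_longitude"])
hour, weekday = time_features(row["pickup_datetime"])
```

The haversine distance ignores the street grid, so it underestimates the distance actually driven; it is only meant as a first feature to compare against `trip_duration`.
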
- Trip Duration per Hour and per Day
- Is there a relationship between Distance and Trip Duration?
- Distance per Hour and per Day
- Does Vendor Id have a relationship with Trip Duration?
- Which Vendor has the most Trips?
- Does the number of Passengers affect the Trip Duration?
- Which Day has the highest number of Passengers?
- What’s the number of Trips per Hour and per Day?
- Features Correlating with Trip Duration
- ARIMA Model
- VS Code
- Trello
- Jupyter
- GitHub
- PowerPoint
- Zoom
- Python
- Pandas
- numpy
- seaborn
- plotly
- sklearn
- Find out which factors affect the trip duration.
- Find how many trips per hour are reached over the course of a day.
- Check whether there is a specific day that has a lot of traffic.
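
The goals about trips per hour and busy days come down to counting trips grouped by pickup hour and by weekday. The following is a minimal sketch using only the standard library, assuming `pickup_datetime` values are formatted as `YYYY-MM-DD HH:MM:SS`. The `trip_counts` helper and the sample timestamps are hypothetical and are not part of the project code or the dataset.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def trip_counts(pickup_datetimes):
    """Count trips per pickup hour (0-23) and per weekday name."""
    per_hour = Counter()
    per_day = Counter()
    for value in pickup_datetimes:
        ts = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
        per_hour[ts.hour] += 1
        per_day[WEEKDAYS[ts.weekday()]] += 1
    return per_hour, per_day


# Made-up pickup timestamps for illustration.
sample = [
    "2016-03-14 17:24:55",
    "2016-03-14 17:40:00",
    "2016-03-15 08:05:10",
]
per_hour, per_day = trip_counts(sample)

# The weekday with the most trips in the sample and its trip count.
busiest_day, busiest_day_trips = per_day.most_common(1)[0]
```

With pandas, a comparable count can be obtained by parsing the column with `pd.to_datetime` and calling `value_counts()` on `.dt.hour` or `.dt.day_name()`.
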