Skip to content

What are the different factors which affect the target variable? What business recommendations can we give based on the analysis? How can we use different ensemble techniques - Bagging, Boosting, and Stacking to build a model to predict the count of bikes rented?

Notifications You must be signed in to change notification settings

NatenaelTBekele/Bike_Sharing_Demand_Prediction

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 

Repository files navigation

Bike_Sharing

Context:

Bike-sharing systems are a new generation of traditional bike rentals where the whole process from membership, rental and return has become automatic. Through these systems, the user can easily rent a bike from a particular position and return to another position. Currently, there are about over 500 bike-sharing programs around the world which are composed of over 500 thousand bicycles. Today, there exists a great interest in these systems due to their important role in traffic, environmental, and health issues.

Problem Statement:

'Travel Along' is a new bike-sharing company and wants to expand its customer count and provide better services at a reasonable cost. They have conducted several surveys and collated the data about weather, weekends, holidays, etc. from the past 2 years.

As a recently hired data scientist at 'Travel Along', you have been asked to analyze the patterns in the data and figure out the key areas which can help the organization to grow and manage the customer demands. Further, you need to use this information to predict the count of bikes shared so that the company can take prior decisions for surge hours.

Data Description:

The bike-sharing rental process is highly correlated to the environmental and seasonal settings. For instance, weather conditions, precipitation, the day of week, season, the hour of the day, etc. can affect the rental behaviors.

  • instant: record index
  • dteday : date
  • season : season (1:spring, 2:summer, 3:fall, 4:winter)
  • yr : year (0: 2011, 1:2012)
  • mnth : month ( 1 to 12)
  • hr : hour (0 to 23)
  • holiday : whether the day is holiday or not
  • weekday : day of the week
  • workingday : if day is neither weekend nor holiday then 1, otherwise is 0.
  • weathersit :
    • 1: Clear, Few clouds, Partly cloudy
    • 2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
    • 3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
    • 4: Heavy Rain + Ice Pallets + Thunderstorm + Mist, Snow + Fog
  • temp : Normalized temperature in Celsius. The values are divided by 41 (max)
  • atemp: Normalized feeling temperature in Celsius. The values are divided to 50 (max). The “feel like” temperature relies on environmental data including the ambient air temperature, relative humidity, and wind speed to determine how weather conditions feel to bare skin.
  • hum: Normalized humidity. The values are divided by 100 (max)
  • windspeed: Normalized wind speed. The values are divided by 67 (max)
  • casual: count of casual users
  • registered: count of registered users
  • cnt: count of total rental bikes including both casual and registered

About

What are the different factors which affect the target variable? What business recommendations can we give based on the analysis? How can we use different ensemble techniques - Bagging, Boosting, and Stacking to build a model to predict the count of bikes rented?

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published