README file for Airbnb Price Prediction using Ensemble Learning Methods
Authors:
- CHAGNON Pierre
- MOHAMED Shamir
- SARFATI Alban
- TAYLOR Thomas
- TRIGANO Elie
This project uses ensemble learning methods to predict the price of Airbnb listings in New York City. The data set comes from Kaggle and contains around 47,000 New York City listings from 2019. The project is part of the Ensemble Learning course at CentraleSupélec and was done in collaboration with Shamir Mohamed, Alban Sarfati, Thomas Taylor, and Elie Trigano.
Introduction:
Airbnb has disrupted the hospitality industry by offering individuals the opportunity to list their own properties as rental places. However, determining the optimal price for a listing can be challenging, and the variation in types of listings can make it difficult for renters to get an accurate sense of fair pricing. This project aims to use ensemble learning methods to predict the price of Airbnb listings in New York City.
Dataset Description:
The data set used for this project is obtained from Kaggle and contains around 47,000 listings in New York City in 2019. The data set includes 15 features on listings, including the name of the listing, neighborhood, price, review information, and availability.
Access to the Data:
The data set can be downloaded from Kaggle at the following link: https://www.kaggle.com/datasets/dgomonov/new-york-city-airbnb-open-data
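Once downloaded, the file can be loaded with pandas. The snippet below is a minimal sketch, not the code used in the notebooks: the helper name `load_listings` is hypothetical, and it assumes the CSV is saved as `AB_NYC_2019.csv` and has a numeric `price` column. Dropping rows with a missing or zero price is also an assumption about cleaning, not a documented step of this project.

```python
import pandas as pd


def load_listings(path="AB_NYC_2019.csv"):
    """Load the listings CSV and drop rows without a usable price.

    Hypothetical helper: assumes the file has a numeric `price` column.
    """
    df = pd.read_csv(path)
    # Assumption: a missing or zero price is a data-entry artifact,
    # so such rows are excluded before any modeling.
    df = df[df["price"].notna() & (df["price"] > 0)]
    return df.reset_index(drop=True)
```

Calling `load_listings()` from the repository root should return a DataFrame with one row per remaining listing, ready for the EDA and preprocessing steps.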
Objectives:
- Perform an exploratory data analysis (EDA) on the data set
- Preprocess the data set
- Perform feature engineering
- Try a decision tree and different ensemble learning methods to predict the price of a listing
- Compare the results of the different models
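The last two objectives can be illustrated with a small scikit-learn sketch. It is only an illustration under stated assumptions: the data is synthetic (generated with `make_regression`) rather than the Airbnb listings, the helper `compare_models` is hypothetical, cross-validated RMSE is an assumed metric, and scikit-learn's `GradientBoostingRegressor` stands in for XGBoost to keep the example dependency-light. The models and settings used in the notebooks may differ.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor


def compare_models(X, y, cv=5, random_state=0):
    """Return {model name: mean cross-validated RMSE}; lower is better."""
    models = {
        "decision_tree": DecisionTreeRegressor(random_state=random_state),
        "random_forest": RandomForestRegressor(
            n_estimators=50, random_state=random_state
        ),
        "gradient_boosting": GradientBoostingRegressor(random_state=random_state),
    }
    scores = {}
    for name, model in models.items():
        # scikit-learn reports the negated RMSE so that higher is better;
        # flip the sign back to get a plain RMSE.
        neg_rmse = cross_val_score(
            model, X, y, cv=cv, scoring="neg_root_mean_squared_error"
        )
        scores[name] = float(-neg_rmse.mean())
    return scores


# Synthetic stand-in for the preprocessed listing features and prices.
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
results = compare_models(X, y)
for name, rmse in sorted(results.items(), key=lambda item: item[1]):
    print(f"{name}: RMSE = {rmse:.2f}")
```

Evaluating every model with the same folds and the same metric keeps the comparison fair; swapping in the real feature matrix and price column only requires replacing `X` and `y`.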
Content:
- README.md: this file
- AB_NYC_2019.csv: the data set
- DT.ipynb: the notebook containing the second part of the project, a decision tree made from scratch
- main.ipynb: the notebook containing all parts of the project: EDA, preprocessing, feature engineering, model selection, model evaluation, and model comparison
- modeling.ipynb: the notebook containing the preprocessing, model selection, hyperparameter tuning, evaluation, and scoring
- Ensemble_Learning_Final_Report.pdf: final report of our project
- ENSEMBLE LEARNING _ Airbnb price predictions & DT.pdf: slide presentation of our project
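For readers unfamiliar with what "a decision tree made from scratch" involves, here is a minimal pure-Python sketch of a regression tree. It is not the implementation in DT.ipynb: the function names, the dictionary-based tree layout, the toy data, and the split criterion (minimising the children's sum of squared errors) are all assumptions made for illustration.

```python
def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(X, y):
    """Return the (feature index, threshold) that most reduces total SSE,
    or None if no split improves on the unsplit node."""
    best, best_cost = None, sse(y)
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [t for row, t in zip(X, y) if row[j] <= threshold]
            right = [t for row, t in zip(X, y) if row[j] > threshold]
            if not left or not right:
                continue
            cost = sse(left) + sse(right)
            if cost < best_cost:
                best, best_cost = (j, threshold), cost
    return best


def build_tree(X, y, max_depth=3, min_samples=2):
    """Grow a regression tree; each leaf stores the mean target value."""
    can_split = max_depth > 0 and len(y) >= min_samples
    split = best_split(X, y) if can_split else None
    if split is None:
        return {"value": sum(y) / len(y)}
    j, threshold = split
    left = [i for i, row in enumerate(X) if row[j] <= threshold]
    right = [i for i, row in enumerate(X) if row[j] > threshold]
    return {
        "feature": j,
        "threshold": threshold,
        "left": build_tree([X[i] for i in left], [y[i] for i in left],
                           max_depth - 1, min_samples),
        "right": build_tree([X[i] for i in right], [y[i] for i in right],
                            max_depth - 1, min_samples),
    }


def predict_one(tree, row):
    """Walk from the root to a leaf and return the leaf's mean value."""
    while "value" not in tree:
        go_left = row[tree["feature"]] <= tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["value"]


# Toy data (made up): two numeric features per row and a price-like target.
X_toy = [[1, 1], [2, 1], [3, 0], [10, 0], [11, 1], [12, 0]]
y_toy = [100, 110, 105, 300, 310, 305]
tree = build_tree(X_toy, y_toy, max_depth=1)
print(predict_one(tree, [2, 0]), predict_one(tree, [11, 1]))
```

With `max_depth=1` the tree is a single split (a "stump"); ensemble methods such as random forests and boosting combine many such trees, which is why a from-scratch tree is a natural building block for the rest of the project.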
Bonus:
- We implemented a research paper using tree-based classifiers to predict the direction of stock market prices. The code and results can be found here: https://github.com/albnsft/TreeBasedStockMarketDirection
Requirements:
- Python
- Jupyter Notebook
- Numpy
- Pandas
- Matplotlib
- Seaborn
- Scikit-learn
- XGBoost
- Warnings (part of the Python standard library, no separate installation needed)
- Category Encoders