williammollers/housing-regression-midtermproject

IRONHACK MIDWAY PROJECT: Case Study Regression (HOUSE SALES IN SEATTLE)

This project is part of the Data Analytics Bootcamp (Oct 2020 - Jan 2021) at IRONHACK, Berlin, Germany.

-- Project Status: [Completed]

Project Intro/Objective

The purpose of this project is to predict the price of houses (provided in the .csv) from the remaining variables, which describe housing sales in the Seattle area in 2014-2015. As a secondary objective, we were tasked with finding out whether properties priced over $650k share particular characteristics that set them apart from cheaper properties. This was my first individual project completed during the bootcamp. It was designed to test my knowledge of MySQL and Python (and associated libraries, e.g. NumPy, scikit-learn, pandas) and to serve as an introduction to Tableau.

For the exact specifications of the various parts of the project, please see the .md files in the respective directories. The .csv that was used is also provided in this repository.
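The secondary objective (comparing properties above $650k with cheaper ones) can be sketched roughly as below. This is only an illustration using the Python standard library: the rows, the column names (`price`, `sqft_living`, `waterfront`) and the helper `compare_segments` are assumptions made for the example, not taken from the project's .csv or notebook.

```python
from statistics import mean

# Hypothetical rows; column names and values are illustrative only,
# they are not copied from the project's dataset.
houses = [
    {"price": 450_000, "sqft_living": 1500, "waterfront": 0},
    {"price": 520_000, "sqft_living": 1800, "waterfront": 0},
    {"price": 700_000, "sqft_living": 2600, "waterfront": 0},
    {"price": 1_200_000, "sqft_living": 3400, "waterfront": 1},
]

THRESHOLD = 650_000  # price cut-off named in the project objective


def compare_segments(rows, feature, threshold=THRESHOLD):
    """Return the mean of `feature` for rows above the threshold and for the rest."""
    above = [r[feature] for r in rows if r["price"] > threshold]
    below = [r[feature] for r in rows if r["price"] <= threshold]
    return mean(above), mean(below)


above_mean, below_mean = compare_segments(houses, "sqft_living")
print(f"mean sqft_living above threshold: {above_mean}, at or below: {below_mean}")
```

Repeating the comparison for each feature gives a first impression of which variables differ most between the two price segments.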

Methods Used

  • Linear Regression & other forms of regression analysis
  • Machine Learning
  • Data Visualization
  • Kanban
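As a minimal sketch of the first method listed above, the snippet below fits a one-predictor linear regression by ordinary least squares. The project itself used scikit-learn; this dependency-free version is a stand-in for illustration, and the data, the variable names and the helper `fit_simple_ols` are assumptions rather than the project's actual code.

```python
from statistics import mean


def fit_simple_ols(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (one predictor)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical, perfectly linear toy data (living area in sqft vs. sale price).
sqft_living = [1000, 1500, 2000, 2500, 3000]
price = [300_000, 400_000, 500_000, 600_000, 700_000]

slope, intercept = fit_simple_ols(sqft_living, price)


def predict(sqft):
    """Predict a price from living area using the fitted line."""
    return slope * sqft + intercept


print(predict(1800))
```

With real housing data the fit would not be exact, and several predictors would normally be combined in a multiple regression.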

Technologies

  • Python
  • MySQL
  • pandas, Jupyter
  • scikit-learn
  • NumPy
  • Tableau

Project Description

The data used in this project was provided by the IRONHACK team and in general it was a fairly clean dataset.

At the start, I was tasked with exploring the data in MySQL and then carrying out further EDA, cleaning and wrangling in my Jupyter Notebook using Python and its associated libraries. Visualisation was completed in Tableau.
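The kind of cleaning step described above might look roughly like the following. The project did this in pandas; the sketch below uses only the standard library so that it runs anywhere. The inline CSV text, the column names and the helper `load_clean_rows` are hypothetical and do not reflect the project's actual file.

```python
import csv
import io

# Hypothetical CSV content; the real file and its columns may differ.
RAW_CSV = """id,price,bedrooms,sqft_living
1,450000,3,1500
1,450000,3,1500
2,,2,900
3,700000,4,2600
"""


def load_clean_rows(text):
    """Parse CSV text, dropping exact duplicate rows and rows with a missing price."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        key = tuple(row.items())
        if key in seen or not row["price"]:
            continue  # skip duplicates and rows without a price
        seen.add(key)
        cleaned.append({
            "id": row["id"],
            "price": float(row["price"]),
            "bedrooms": int(row["bedrooms"]),
            "sqft_living": int(row["sqft_living"]),
        })
    return cleaned


rows = load_clean_rows(RAW_CSV)
print(len(rows), "rows kept")
```

Dropping rows with a missing target is only one option; imputing values is another, and which is appropriate depends on the data.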

There were many challenges in the project, including the following:

  1. This was my first time doing something like this, so I spent a lot of time on structure and hope that the project is easy to read.
  2. We had barely touched Tableau before this project, so some visualisations are far from ideal, but I am sure this will improve by the next commit.
  3. The project took place during the 3rd week of Germany's second lockdown. This made exchanging opinions with colleagues very difficult, and it was truly an individual project.

Needs of this project

  • data exploration/descriptive statistics
  • data processing/cleaning
  • statistical modeling
  • writeup/reporting

Getting Started

  1. Clone this repo (for help see this tutorial).
  2. Raw data is kept here (https://github.com/williammollers/housing-regression-ironhack-midtermproject-nov-2020/tree/master/PYTHON_REGRESSION_ANALYSIS) within this repo.
  3. Data processing/transformation scripts are being kept here

Featured Notebooks/Analysis/Deliverables

All in all, we had to deliver a MySQL query book, a Jupyter notebook, a Tableau dashboard, this README file and a presentation (to be added).

Contact
