mjoy296/Housing_ML


Machine-Learning-Project

NYCDSA Machine Learning Project: House Prices

Introduction

Data visualization and data wrangling are key skills for a junior data scientist role. However, there is also great demand for predictive modeling and an understanding of machine learning theory. The Machine Learning Kaggle Project expects one to demonstrate finesse at handling data, including business understanding and feature engineering. This project will also test one’s ability to work in a team setting with fellow data scientists.

What We’re Looking For

Synthesis of machine learning material is the goal of this project. While the dataset is drawn from a Kaggle competition, the purpose should be to demonstrate an understanding of machine learning theory.

For this project, your primary task is to employ machine learning techniques to make accurate predictions from a given dataset. The framework is the House Prices: Advanced Regression Techniques competition from Kaggle. While Kaggle competitions generally focus on predictive accuracy, you will be expected to lead your audience through descriptive insights as well. Your project should aim not only to create a model that predicts well, but also to describe the insights you drew from exploring the data.
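As a rough illustration of what "predicts well" can mean here: the Kaggle House Prices competition scores submissions by the root-mean-squared error between the logarithm of the predicted and the logarithm of the observed sale price. The sketch below computes that metric in plain Python. The function name `rmse_log` and the sample prices are hypothetical and are not part of the project materials.

```python
import math

def rmse_log(actual, predicted):
    """RMSE between log(actual) and log(predicted) sale prices."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [
        (math.log(p) - math.log(a)) ** 2 for a, p in zip(actual, predicted)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical sale prices in dollars, for illustration only.
actual_prices = [208500.0, 181500.0, 223500.0]
predicted_prices = [200000.0, 190000.0, 223500.0]
score = rmse_log(actual_prices, predicted_prices)
```

Working on the log scale means that a 10% miss on an inexpensive house is penalized about as much as a 10% miss on an expensive one, which is worth keeping in mind when choosing a target transformation.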

Successful projects will encompass (but are not limited to) the following:

Submission by the deadline.

Background knowledge of dataset(s).

Communication of motivation: why do we care?

Research questions of interest: what do you want to find out?

Answers to research questions: what have you uncovered?

Presentation skills.

Time management (not going over the allotted time).

Ability to answer audience questions effectively and efficiently.

Balance of complexity and simplicity.

Explanation of future work: what would you do if given more time, data, etc.?

Demonstration of EDA skills:

Numeric methodology.

Graphic methodology.

Demonstration of machine learning skills:

Supervised methodology.

Unsupervised methodology.

Ability to assess model weaknesses and identify improvements.

Ability to manage a team workflow.
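One common way to approach the "assess model weaknesses" item above is k-fold cross-validation: hold out each fold in turn, fit on the rest, and compare the per-fold errors. The following is a minimal sketch in plain Python, assuming a trivial baseline that predicts the mean of the training folds. The helper names `k_fold_indices` and `cross_validate_mean_baseline` and the sample values are hypothetical, and a real project would more likely shuffle the rows and use a library implementation.

```python
import math

def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds each take one additional sample.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate_mean_baseline(targets, k):
    """Return per-fold RMSE of a baseline that predicts the training mean."""
    scores = []
    for held_out in k_fold_indices(len(targets), k):
        held_out_set = set(held_out)
        train = [t for i, t in enumerate(targets) if i not in held_out_set]
        prediction = sum(train) / len(train)
        mse = sum((targets[i] - prediction) ** 2 for i in held_out) / len(held_out)
        scores.append(math.sqrt(mse))
    return scores

# Hypothetical log sale prices, for illustration only.
log_prices = [12.25, 12.11, 12.32, 11.85, 12.43, 11.87]
fold_scores = cross_validate_mean_baseline(log_prices, k=3)
```

A baseline like this gives a reference error that any fitted model should beat, and a large spread across folds suggests the estimate is sensitive to which rows are held out.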

The Details

Your project proposal declaration is due on Wednesday, Feb 21. You must declare your team on the project proposal document. This is a team project with respect to the final deliverable. Every student must work with at least one other student for the project and presentation, and the maximum group size is 4. We encourage collaboration and knowledge generation.

All code, data, etc. used to generate the graphics and slides for your presentation are due on GitHub by Sunday, Mar 4 at 11:59pm. No exceptions. Only one teammate needs to submit on behalf of each group. The team leader needs to add the other team members by email when submitting the project.

You will be required to deliver a 10-20 minute presentation, depending on group size (up to 10 minutes for groups of size ≤ 2, up to 20 minutes for groups of size 3 or 4), and respond to any audience questions. Time slots will be randomly assigned on this calendar, so all projects must be submitted on time. No exceptions.

Friday, Mar 2 will be devoted to working on your project. Make sure that you have a significant chunk done so you can make the best use of the instructor and classmate resources available on project day. Your team needs to book a 15-30 minute session with your TA to go over what you have done and what you can improve.

An associated blog post will be due by Sunday, Mar 11 at 11:59pm. No exceptions. Remember, this is a living and breathing document. You may continue to develop and edit your project far beyond the deadline, as no project will ever truly be complete. You may choose to co-author a single blog post describing your whole project, or submit individual blog posts highlighting your own personal project workflow (e.g., if your team specifically delineated responsibilities, etc.).

MLWave is a great website for Kaggle, and you can find everything you need on it.

Machine Learning Mastery is another awesome website for general machine learning topics.

If you are interested in how these machine learning algorithms are implemented rather than just how to use them, feel free to check out this repository.

For any lingering questions, please do not hesitate to reach out; we are always here to help!

Good luck!
