Dataset

SF Crime Data

Who the partners are

Brenna Manning and David Abrahams

What Brenna wants to learn

More about machine learning algorithms, how they work, and how to optimize them

  • Gradient Boosting
  • Random Forest
  • More that we have not tried yet

Better ways to extract meaningful information and statistics from a dataset

How to make more accurate predictions

How to make clearer, more visually appealing, and more informative visualizations

What David wants to learn

It’d be really cool to learn how to make awesome FiveThirtyEight-esque visualizations, for example visualizing certain features spatially on top of a map of SF. I want to learn about Patsy and other ways to manipulate data and engineer features quickly, without too many lines of code. I want to gain better intuition about how to make cool plots (I usually just search Stack Overflow and consult ThinkStats2 to figure out how to make the graphs I want). I want to learn more about complex machine learning algorithms and how to choose which one(s) to use. I also want to learn what a pipeline is.

How will this project fulfil our learning goals?

The location and time components of this dataset allow us to explore some interesting ways of visualizing our data, such as plotting points over a map or perhaps animating them over time. We will consult online resources and code demos to create spatial visualizations, and make sure we understand how the code works. This project will also let us explore many different algorithms as we try to improve our predictions. Neither of us entirely understands Gradient Boosting or how Random Forests choose their branches, so we will learn about them. As a stretch goal, we will also learn about and implement a Neural Network, which neither of us knows anything about.
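
As a starting point for the "how Random Forests choose their branches" question, here is a minimal sketch of how a single decision tree could pick a split on one numeric feature by minimizing Gini impurity. This is an illustration under stated assumptions, not our project code: the function names `gini` and `best_split`, the hour-of-day feature, and the toy labels are all invented for the example. As we understand it, a Random Forest repeats this kind of search in many trees, each trained on a bootstrap sample and typically considering only a random subset of features at each split.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_impurity) for the best split on one feature.

    Returns (None, impurity_of_all_labels) if no split improves on not splitting.
    """
    n = len(values)
    best_threshold, best_impurity = None, gini(labels)
    # Candidate thresholds are midpoints between consecutive distinct values.
    distinct = sorted(set(values))
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        # Weight each side's impurity by how many rows fall on that side.
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity


# Hypothetical toy data: hour of day and an invented crime category.
hours = [1, 2, 3, 14, 15, 16]
categories = ["ASSAULT", "ASSAULT", "ASSAULT", "THEFT", "THEFT", "THEFT"]
threshold, impurity = best_split(hours, categories)
print(threshold, impurity)
```

On this toy data the two categories separate cleanly by hour, so the search settles on a threshold between the early and late hours with zero remaining impurity. Real data will rarely split this cleanly, which is part of why a tree keeps splitting and why a forest averages over many trees.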