-
Chicago Crime Data (work in progress): An exploration of reported Chicago crime data. This dataset reflects reported incidents of crime (with the exception of murders, where data exists for each victim) that occurred in the City of Chicago from 2001 to present, minus the most recent seven days. Data is extracted from the Chicago Police Department’s CLEAR (Citizen Law Enforcement Analysis and Reporting) system.
This report explores the data visually for trends, and also looks for structure using PCA and Gaussian Mixture Models. The latter show that crime reporting follows two distinct patterns: Weekday and Weekend. Weekday crime reporting tends to peak around noon. On weekends, the flow of crime reports is steady throughout the day.
However, we do see some weekdays behave like weekends and, conversely, some weekends behave like weekdays. Specifically, there are Tuesdays that behave like weekends, i.e. more crime is reported. Some of those happen to be Christmas, New Year's, and July 4th, all American holidays. And the two Sundays that behaved like weekdays coincide with playoff games, which might say more about Chicagoans than its reported crime rates do.
In summary, more crimes are reported (or perhaps committed?) during hours of leisure.
Keywords: Pandas, Data visualization, PCA, Gaussian Mixture Model -
California Housing Prices (work in progress): An exploration into predicting housing prices in California districts from census data.
Keywords: Machine Learning, Random Forest Regression model, Stratified data -
Online dating stats: An analysis, with posterior distributions, of dating data for a Latino test account compared to similar demographics.
Keywords: A/B Test, Bayesian inference, Pandas, Data visualization, Credible Interval -
Split Test Analysis with Bayes Statistics: A product split test analysis starting from a table of conversion rates.
Keywords: A/B Test, Bayesian inference, Pandas, Data visualization, Calculating A/B Test Sample Size, Credible Interval -
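As a rough illustration of the split-test idea, here is a minimal sketch of a Bayesian comparison of two conversion rates using only the Python standard library. It is not the notebook's code: the visitor and conversion counts are hypothetical, the flat Beta(1, 1) prior is an assumption, and the helper names `beta_posterior_samples` and `credible_interval` are made up for this example.

```python
import random


def beta_posterior_samples(conversions, visitors, n_samples=20000,
                           rng=None, prior_a=1.0, prior_b=1.0):
    """Draw samples from the Beta posterior of a conversion rate.

    Assumes a Beta(prior_a, prior_b) prior and a binomial likelihood.
    """
    rng = rng or random.Random()
    a = prior_a + conversions
    b = prior_b + (visitors - conversions)
    return [rng.betavariate(a, b) for _ in range(n_samples)]


def credible_interval(samples, mass=0.95):
    """Equal-tailed credible interval estimated from posterior samples."""
    ordered = sorted(samples)
    tail = (1.0 - mass) / 2.0
    lower = ordered[int(tail * len(ordered))]
    upper = ordered[int((1.0 - tail) * len(ordered)) - 1]
    return lower, upper


# Hypothetical split-test table: (conversions, visitors) per variant.
rng = random.Random(0)
samples_a = beta_posterior_samples(120, 2400, rng=rng)
samples_b = beta_posterior_samples(150, 2400, rng=rng)

# Posterior probability that variant B converts better than variant A.
prob_b_better = sum(b > a for a, b in zip(samples_a, samples_b)) / len(samples_a)

# 95% credible interval for the absolute difference in conversion rate.
lift_low, lift_high = credible_interval(
    [b - a for a, b in zip(samples_a, samples_b)]
)

print(f"P(B > A) = {prob_b_better:.3f}")
print(f"95% credible interval for B - A: ({lift_low:.4f}, {lift_high:.4f})")
```

Sampling from the posteriors keeps the sketch short and makes derived quantities, such as the difference between variants, easy to summarize with a credible interval.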
Micro-hydro power generation: Due diligence on the viability of utilizing micro-hydro power generators in California's San Joaquin Valley irrigation canals. This is a work in progress!
Keywords: Entrepreneur ventures, Business Development, Return on investment, Net present value, Lists of cash flows, Levelized cost of electricity, Returns over time
see code on GitHub
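For readers unfamiliar with the metrics named in the micro-hydro keywords, the following is a minimal sketch of net present value and levelized cost of electricity computed from lists of yearly cash flows. It is an illustration only, not the project's code: every number (capital cost, operating cost, energy output, electricity price, discount rate) is a hypothetical placeholder, and the function names are invented for this example.

```python
def net_present_value(rate, cash_flows):
    """Discount a list of yearly cash flows (year 0 first) back to today."""
    return sum(cf / (1.0 + rate) ** year for year, cf in enumerate(cash_flows))


def levelized_cost_of_electricity(rate, costs, energy_kwh):
    """Discounted lifetime cost divided by discounted lifetime energy."""
    discounted_costs = net_present_value(rate, costs)
    discounted_energy = net_present_value(rate, energy_kwh)
    return discounted_costs / discounted_energy


# Hypothetical inputs for a single small generator (not from the project).
years = 10
discount_rate = 0.06
capital_cost = 50_000.0         # paid up front, in dollars
annual_om_cost = 1_000.0        # operations and maintenance, dollars per year
annual_energy_kwh = 100_000.0   # energy produced per year
price_per_kwh = 0.08            # assumed sale price, dollars per kWh

# Year 0 holds the investment; years 1..N hold revenue minus operating cost.
annual_net = annual_energy_kwh * price_per_kwh - annual_om_cost
cash_flows = [-capital_cost] + [annual_net] * years
costs = [capital_cost] + [annual_om_cost] * years
energy = [0.0] + [annual_energy_kwh] * years

npv = net_present_value(discount_rate, cash_flows)
lcoe = levelized_cost_of_electricity(discount_rate, costs, energy)

print(f"NPV:  {npv:,.2f} dollars")
print(f"LCOE: {lcoe:.4f} dollars per kWh")
```

With a constant sale price, the two metrics agree on viability: the NPV is positive exactly when the LCOE falls below the price per kWh.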