
Cristian Heredia | Data exploration expert, maximizing signal-to-noise through storytelling. Ask me how to use data to inform the business decisions.


caheredia/caheredia.github.io


Data Science Portfolio


Table Of Contents

Data Storytelling

  • Chicago Crime Data (work in progress): An exploration of reported Chicago crime data. This dataset reflects reported incidents of crime (with the exception of murders, where data exists for each victim) that occurred in the City of Chicago from 2001 to present, minus the most recent seven days. Data is extracted from the Chicago Police Department’s CLEAR (Citizen Law Enforcement Analysis and Reporting) system.

    This report explores the data visually for trends and also looks for structure using PCA and Gaussian Mixture Models. The latter show that crime reporting follows two distinct patterns: weekday and weekend. Weekday crime reports tend to peak around noon, while weekend reports arrive at a steady rate throughout the day.

    However, some weekdays behave like weekends and, conversely, some weekend days behave like weekdays. Specifically, there are Tuesdays that behave like weekends, i.e. more crime is reported. Some of those happen to be Christmas, New Year's, and July 4th, all American holidays. And the two Sundays that behaved like weekdays coincide with playoff games, which might say more about Chicagoans than the city's reported crime rates do.

    In summary, more crimes are reported (or perhaps committed?) during hours of leisure.
    Keywords: Pandas, Data visualization, PCA, Gaussian Mixture Model

  • California Housing Prices (work in progress): An exploration into predicting housing prices in California districts from census data.
    Keywords: Machine Learning, Random Forest Regression model, Stratified data

  • Online dating stats: An analysis, with posterior distributions, of dating data for a Latino test account compared to similar demographics.
    Keywords: A/B Test, Bayesian inference, Pandas, Data visualization, Credible Interval

  • Split Test Analysis with Bayes Statistics: A product split test analysis starting from a table of conversion rates.
    Keywords: A/B Test, Bayesian inference, Pandas, Data visualization, Calculating A/B Test Sample Size, Credible Interval

  • Micro-hydro power generation: Due diligence on the viability of utilizing micro-hydro power generators in California's San Joaquin Valley irrigation canals. This is a work in progress!
    Keywords: Entrepreneur ventures, Business Development, Return on investment, Net present value, Lists of cash flows, Levelized cost of electricity, Returns over time
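
As a rough illustration of the kind of analysis named in the Split Test Analysis entry above, the sketch below compares two variants with Beta posteriors and reports equal-tailed credible intervals. It is not taken from the project's notebooks: the uniform Beta(1, 1) prior, the function names `beta_posterior_samples` and `credible_interval`, and the conversion counts are all assumptions made for the example.

```python
import random


def beta_posterior_samples(conversions, visitors, n_samples=20000, seed=0):
    """Draw samples from the posterior of a conversion rate.

    Assumes a uniform Beta(1, 1) prior, so the posterior is
    Beta(1 + conversions, 1 + visitors - conversions).
    """
    rng = random.Random(seed)
    alpha = 1 + conversions
    beta = 1 + visitors - conversions
    return [rng.betavariate(alpha, beta) for _ in range(n_samples)]


def credible_interval(samples, mass=0.95):
    """Equal-tailed credible interval estimated from posterior samples."""
    ordered = sorted(samples)
    tail = (1.0 - mass) / 2.0
    lower = ordered[int(tail * len(ordered))]
    upper = ordered[int((1.0 - tail) * len(ordered)) - 1]
    return lower, upper


# Hypothetical split-test table: (conversions, visitors) per variant.
samples_a = beta_posterior_samples(conversions=120, visitors=2400, seed=1)
samples_b = beta_posterior_samples(conversions=156, visitors=2400, seed=2)

# Monte Carlo estimate of the posterior probability that B converts better than A.
prob_b_beats_a = sum(b > a for a, b in zip(samples_a, samples_b)) / len(samples_a)

ci_a = credible_interval(samples_a)
ci_b = credible_interval(samples_b)

print(f"P(B > A) is roughly {prob_b_beats_a:.3f}")
print(f"95% credible interval, A: {ci_a[0]:.4f} to {ci_a[1]:.4f}")
print(f"95% credible interval, B: {ci_b[0]:.4f} to {ci_b[1]:.4f}")
```

Sampling keeps the example within the standard library; with a conjugate Beta prior the interval could also be read from the Beta quantile function if SciPy is available.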

See code on GitHub
