Scrape/Munge UCR Data #34

Open
seanjtaylor opened this issue Feb 6, 2017 · 2 comments

@seanjtaylor
Collaborator

The FBI's UCR (Uniform Crime Reporting) data is updated yearly and contains aggregate crime stats at the MSA level. This will be our target variable.

Some URLs:

2015 MSA: https://ucr.fbi.gov/crime-in-the-u.s/2015/crime-in-the-u.s.-2015/tables/table-6
2014 MSA: https://ucr.fbi.gov/crime-in-the-u.s/2014/crime-in-the-u.s.-2014/tables/table-6
2013 MSA: https://ucr.fbi.gov/crime-in-the-u.s/2013/crime-in-the-u.s.-2013/tables/6tabledatadecpdf/table-6

We need at least 5-10 years of this. Scraping it shouldn't be too hard, but the data munging will be. Ideally we end up with a CSV file with these column headers: year,msa,offense category,count
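A minimal sketch of that target layout, assuming the scrape yields one wide row per MSA with one column per offense. The input keys (`violent_crime`, `property_crime`), the sample area names, and the counts are all made up for illustration; the real table columns would need to be mapped in.

```python
import csv
import io

# Hypothetical offense columns; the real table's columns would replace these.
OFFENSE_COLUMNS = ["violent_crime", "property_crime"]


def wide_to_long(rows, year):
    """Yield (year, msa, offense category, count) tuples from wide rows."""
    for row in rows:
        for offense in OFFENSE_COLUMNS:
            raw = row.get(offense, "")
            # Counts may carry thousands separators, e.g. "1,234".
            cleaned = raw.replace(",", "").strip()
            if not cleaned:
                continue  # skip missing cells rather than guessing a value
            yield (year, row["msa"], offense, int(cleaned))


def write_long_csv(records):
    """Serialize long-format records to CSV text with the agreed headers."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["year", "msa", "offense category", "count"])
    writer.writerows(records)
    return buf.getvalue()


# Made-up sample rows; these are not real UCR figures.
sample = [
    {"msa": "Example City, ST M.S.A.", "violent_crime": "1,234", "property_crime": "5,678"},
    {"msa": "Other Town, ST M.S.A.", "violent_crime": "90", "property_crime": ""},
]
long_records = list(wide_to_long(sample, 2015))
csv_text = write_long_csv(long_records)
```

The long format keeps one row per (year, MSA, offense) so that years with different offense columns can be stacked without changing the schema.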

Once we get this we can model how incident-level reports aggregate up to these numbers.

seanjtaylor added this to the crime-data milestone Feb 6, 2017
@bbrewington
Contributor

Couple things to consider:

(1) How to handle an MSA or MD that has two "total" lines: [Total area actually reporting] and [Estimated Total]. It looks like the estimated total extrapolates the reported value to 100% of the population. In the 2015 data, [Total area actually reporting] covers anywhere from 75% to 100% of an MSA (see Akron, OH in the 2015 data for an example of less than 100% actually reporting).

(2) Do we want just MSA (Metropolitan Statistical Area) data, or MD (Metropolitan Division) data as well? MDs are subsets of MSAs as far as I can tell. Here's an example of an MSA with MDs: Chicago-Naperville-Elgin, IL-IN-WI M.S.A.
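One possible way to keep both questions open during munging is to tag each area by type and record which total line was used, so the choice can be revisited later. The sketch below assumes area headings end in "M.S.A." or "M.D." (only the "M.S.A." suffix appears in the example above; "M.D." is a guess) and that total rows are labeled as quoted in (1). The function names and sample rows are hypothetical.

```python
# Sketch only: the row labels and the "M.D." suffix are assumptions about the table text.

def area_type(name):
    """Classify an area heading by its trailing suffix."""
    stripped = name.strip()
    if stripped.endswith("M.S.A."):
        return "MSA"
    if stripped.endswith("M.D."):
        return "MD"
    return "unknown"


def pick_total(rows, prefer_estimated=True):
    """Return (kind, row) for the preferred total line of one area.

    Falls back to the other total line when the preferred one is absent.
    """
    found = {}
    for row in rows:
        label = row["label"].strip().lower()
        if label.startswith("estimated total"):
            found["estimated"] = row
        elif label.startswith("total area actually reporting"):
            found["reported"] = row
    order = ("estimated", "reported") if prefer_estimated else ("reported", "estimated")
    for kind in order:
        if kind in found:
            return kind, found[kind]
    return None, None


# Made-up rows for illustration; these are not real UCR figures.
partial = [
    {"label": "Total area actually reporting", "violent_crime": 800},
    {"label": "Estimated total", "violent_crime": 1000},
]
complete = [
    {"label": "Total area actually reporting", "violent_crime": 500},
]

kind, row = pick_total(partial)  # picks the estimated line when both exist
```

Storing `kind` alongside each count would make it possible to filter or compare reported versus estimated totals after the fact, instead of baking one choice into the CSV.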

@bbrewington
Contributor

#35
