(1) How should we handle an MSA or MD that has two "total" lines, [Total area actually reporting] and [Estimated Total]? The estimated total appears to extrapolate the reported value to 100% of the population. In the 2015 data, [Total area actually reporting] covers anywhere from 75% to 100% of an MSA (see Akron, OH in the 2015 data for an example with less than 100% actually reporting).
(2) Do we want only MSA (Metropolitan Statistical Area) data, or MD (Metropolitan Division) data as well? As far as I can tell, MDs are subsets of MSAs. An example of an MSA with MDs: Chicago-Naperville-Elgin, IL-IN-WI M.S.A.
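One possible answer to question (1), sketched in Python: prefer the estimated total when it exists, and accept the reported total only when it already covers the whole area. This is a sketch of one policy, not a decision. The `rows` dict layout, the `coverage` and `counts` keys, and the `pick_total` name are all hypothetical, and the numbers in the usage example are made up.

```python
from typing import Optional

# Row labels taken from the question above, lowercased for matching.
ACTUAL = "total area actually reporting"
ESTIMATED = "estimated total"


def pick_total(rows: dict) -> Optional[dict]:
    """Choose a single total row for one MSA or MD.

    `rows` (hypothetical layout) maps a row label to
    {"coverage": fraction of population, "counts": {offense: count}}.
    """
    # Normalize labels so "Estimated Total" and "Estimated total" match.
    normalized = {label.strip().lower(): row for label, row in rows.items()}
    if ESTIMATED in normalized:
        return normalized[ESTIMATED]
    actual = normalized.get(ACTUAL)
    # Without an estimate, trust the reported total only at full coverage.
    if actual is not None and actual["coverage"] >= 1.0:
        return actual
    return None


# Made-up figures for illustration only.
example = {
    "Total area actually reporting": {"coverage": 0.9, "counts": {"Violent crime": 90}},
    "Estimated Total": {"coverage": 1.0, "counts": {"Violent crime": 100}},
}
chosen = pick_total(example)
```

Returning `None` for partially reporting areas that lack an estimate keeps the choice explicit, so those areas can be dropped or flagged downstream instead of being silently undercounted.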
The FBI's UCR data is updated yearly and contains aggregate crime stats at the MSA level. This will be our target variable.
Some URLs:
2015 MSA: https://ucr.fbi.gov/crime-in-the-u.s/2015/crime-in-the-u.s.-2015/tables/table-6
2014 MSA: https://ucr.fbi.gov/crime-in-the-u.s/2014/crime-in-the-u.s.-2014/tables/table-6
2013 MSA: https://ucr.fbi.gov/crime-in-the-u.s/2013/crime-in-the-u.s.-2013/tables/6tabledatadecpdf/table-6
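A minimal scraping sketch, assuming each of the pages above renders Table 6 as an ordinary HTML `<table>` (not verified for every year; note the 2013 URL already follows a different path pattern). It uses only the standard library and parses an HTML string, so fetching is left out. The `TableRows` class and `extract_rows` function are hypothetical names.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse internal whitespace and line breaks in the cell text.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html: str) -> list:
    """Return all table rows found in `html` as lists of cell strings."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows
```

The real pages likely have multi-row headers and nested area/total rows, so the output of `extract_rows` is raw material for the munging step, not the final table.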
We need at least 5-10 years of this. Scraping shouldn't be too hard, but the data munging will be. Ideally we end up with a CSV file with these column headers:
year,msa,offense category,count
Once we get this we can model how incident-level reports aggregate up to these numbers.
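To make the target layout concrete, here is a rough sketch of reshaping one wide table (one row per area, one column per offense) into the long `year,msa,offense category,count` form. It assumes the first column holds the area name, that cells are integer counts with thousands separators, and that blank cells mean no count was published. `to_long` and `write_csv` are hypothetical helpers, and the sample row is made up.

```python
import csv
import io


def to_long(year, header, rows):
    """Yield (year, msa, offense category, count) tuples from wide rows.

    header: column names, the first being the area name (assumption).
    rows: lists of strings aligned with header; counts may contain commas.
    """
    offenses = header[1:]
    for row in rows:
        msa = row[0]
        for offense, raw in zip(offenses, row[1:]):
            raw = raw.replace(",", "").strip()
            if not raw:
                continue  # blank cell: skip instead of inventing a zero
            yield (year, msa, offense, int(raw))


def write_csv(records) -> str:
    """Serialize long-format records to CSV text with the target header."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["year", "msa", "offense category", "count"])
    writer.writerows(records)
    return buf.getvalue()


# Made-up sample row for illustration only.
header = ["Area", "Violent crime", "Property crime"]
rows = [["Example M.S.A.", "1,234", ""]]
csv_text = write_csv(to_long(2015, header, rows))
```

Keeping one record per (year, msa, offense category) makes it straightforward to append additional years to the same file.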