The purpose of this repository is to inform the New York Police Department (NYPD) and the general public about future arrest rates in New York City.

ridahbhatti/NYPD-Arrests-2021

NYPD-Arrests-2021

NYPD Arrests Forecasts Updated 13 days ago

The purpose of this statistical analysis is to help inform the New York Police Department (NYPD) about future arrest rates. The NYPD maintains a database of arrests made within the five boroughs. The data file provides a breakdown of every arrest made in NYC by the NYPD from 2006 through the end of the previous calendar year. The goal of the modeling in this project is to capture genuine patterns and relationships in the historical arrest data without replicating past events that will not occur again. These genuine patterns should be modeled and extrapolated.

The NYPD can use short-term, medium-term, and long-term forecasts, depending on the organization's goal and application. Short-term forecasts are useful for scheduling of demand, personnel, and transportation. Medium-term forecasts are useful for determining future resource requirements, in order to hire personnel or provide necessary equipment. Long-term forecasts are used for strategic planning, which must take environmental factors and internal resources into account.

Arrest rates may change based on a multitude of factors, including but not limited to natural disasters, unemployment, and pandemics. Thus, a forecasting method is required that allows for trend and seasonality when they are present and is robust to sudden changes in the underlying patterns. A time series model is also useful when the relationships that govern the system's behavior are difficult to measure, or when the main concern is only to predict what will happen, not to know why it happens. Prediction intervals are a useful way of presenting the uncertainty in forecasts.

Time series models used for forecasting include decomposition models, exponential smoothing models, and ARIMA models. R is a statistical software platform whose forecast package provides several algorithms for automatic forecasting. Automation has definite benefits and dangers. One benefit is that it saves time and money. The creator of the forecast package, Prof. Rob Hyndman, states, “There’s always going to be particular time series where your automatic algorithm, whichever one you use, where the automatic algorithm does not do so well. A good strategy that I encourage my clients to do is to try to identify the series that are not being forecast well and just look at those ones. And let the automatic algorithm do the bulk of the series. And then you can concentrate on spending your analyst time, which is expensive, on the cases where the automation is not working so well.”

The simplest time series forecasting methods use only information on the variable to be forecast. These methods are not used to discover the factors that affect its behavior; they simply extrapolate trend and seasonal patterns.

The first step in forecasting is defining the problem and understanding how the forecast will be used. The second step is to gather data that can be used to fit a good statistical model. The third step is exploratory analysis: graphing the initial data and looking for consistent patterns, checking for significant trends, seasonality, correlation, and outliers. The next step is to choose and fit models, a choice that depends on the availability of historical data and often involves comparing different models. The selected models have estimated parameters that are used to make forecasts. The performance of the models can be tested once data for the forecast period becomes available; however, several methods have also been developed to assist in assessing forecast accuracy. In this project, a range of different forecasting approaches will be tested.
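The fit, forecast, and assess steps described above can be sketched in a few lines. The example below is an illustration only, not this project's code: it uses plain Python rather than the R forecast package, a synthetic monthly series in place of real arrest counts, a fixed smoothing weight `alpha=0.3` chosen arbitrarily, and a rough 95% interval that assumes roughly normal forecast errors. The two methods loosely correspond to the seasonal naive and simple exponential smoothing approaches; all function names are hypothetical.

```python
import math

def seasonal_naive(train, horizon, period=12):
    """Forecast each future month with the value from the same month one season earlier."""
    return [train[len(train) - period + (h % period)] for h in range(horizon)]

def simple_exp_smoothing(train, horizon, alpha=0.3):
    """Flat forecast from an exponentially weighted level (alpha is an arbitrary choice here)."""
    level = train[0]
    for y in train[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon

def mae(actual, forecast):
    """Mean absolute error between held-out values and forecasts."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def rough_interval(train, forecast, period=12, z=1.96):
    """Approximate 95% interval from in-sample seasonal-naive errors (assumes normal errors)."""
    residuals = [train[t] - train[t - period] for t in range(period, len(train))]
    rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    return [(f - z * rmse, f + z * rmse) for f in forecast]

# Synthetic monthly series (NOT real arrest data): downward trend plus a yearly cycle.
series = [1000 - 5 * t + 100 * math.sin(2 * math.pi * t / 12) for t in range(48)]

# Hold out the final 12 months to assess accuracy on data the models never saw.
train, test = series[:-12], series[-12:]

snaive_fc = seasonal_naive(train, horizon=12)
ses_fc = simple_exp_smoothing(train, horizon=12)

scores = {"seasonal_naive": mae(test, snaive_fc), "ses": mae(test, ses_fc)}
best = min(scores, key=scores.get)
intervals = rough_interval(train, snaive_fc)

print(scores, "lowest MAE:", best)
```

Holding out the most recent year mirrors the idea of testing a model once the forecast period becomes available, without having to wait for new data.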

DATA INFORMATION

NYPD Arrests Data (Historic): This project examines two data files titled “NYPD Arrests Data (Historic)” from data.cityofnewyork.us. This data is manually extracted every quarter and reviewed by the Office of Management Analysis and Planning before being posted on the NYPD website. Each record represents an arrest effected in NYC by the NYPD and includes information about the type of crime, the perpetrator, the location, and the time of enforcement. This data can be used by the public to explore the nature of police enforcement activity. In exploring this data set, we refer to the NYPD Penal Law Codes, which were obtained in a file provided through the NYPD Complaint Dataset footnotes. The Penal Law file lists the PD code value, the NYS law code, and the charge category (Felony, Misdemeanor, Violation, Infraction), along with general and specific crime descriptions.
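Because each record is a single arrest, a time series for forecasting has to be built by counting records per period. The sketch below shows one way to do that with Python's standard library. It is an assumption-laden illustration: the sample rows are made up, and the column names `ARREST_KEY`, `ARREST_DATE`, and `LAW_CAT_CD` and the `MM/DD/YYYY` date format are assumed to match the published export and should be checked against the actual file. The function name `monthly_counts` is hypothetical.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Made-up sample rows standing in for the real export; the column names are assumptions.
SAMPLE = """ARREST_KEY,ARREST_DATE,LAW_CAT_CD
1,01/15/2020,F
2,01/20/2020,M
3,02/03/2020,M
4,02/28/2020,V
5,02/29/2020,F
"""

def monthly_counts(csv_text, date_col="ARREST_DATE", date_fmt="%m/%d/%Y"):
    """Count arrest records per (year, month), returned in chronological order."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        day = datetime.strptime(row[date_col], date_fmt)
        counts[(day.year, day.month)] += 1
    return dict(sorted(counts.items()))

counts = monthly_counts(SAMPLE)
print(counts)
```

For the full historic file, the same function could be fed the file's text in place of `SAMPLE`, although a streaming reader would be a better fit for a data set of that size.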
