PROJECT SUMMARY: This project analyzes Texas accident data from June 2016 to December 2019. The data set contains 298,062 observations and 49 variables in a file of about 50 MB. As part of the project, I will analyze patterns in the accident data, visualize them, predict accident severity with regression and K-Nearest Neighbors machine learning algorithms and measure the prediction accuracy, and run a time series analysis of the data.
- TEXASAccidentsDataAnalysis.ipynb
- US_Accidents_Dec19_TX_Cleaned.zip (data file, zipped to reduce its size for GitHub)
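The zipped data file can be read without unpacking it by hand. The following is a minimal sketch that uses only the Python standard library; it assumes the archive contains a single UTF-8 CSV file with a header row. The function name `load_accidents` is hypothetical and is not taken from the notebook.

```python
import csv
import io
import zipfile


def load_accidents(zip_path):
    """Return the rows of the first CSV file found in the archive as dicts."""
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: the archive holds one CSV; take the first one found.
        csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError("no CSV file found in " + str(zip_path))
        with archive.open(csv_names[0]) as raw:
            # Assumption: the CSV is UTF-8 encoded with a header row.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))
```

For example, `rows = load_accidents("US_Accidents_Dec19_TX_Cleaned.zip")` would give one dictionary per accident record, keyed by column name. All values are strings at this point, so numeric columns need converting before analysis.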
Choropleth maps provide an easy way to visualize how a measurement varies across a geographic area or to show the level of variability within a region. The following map shows the count of accidents that occurred in each county.
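A choropleth of this kind needs one value per region, here the number of accidents per county. The sketch below shows that aggregation step only, not the map drawing. It assumes each record is a dictionary with a `County` key, as in the public US Accidents data; the function name `accidents_per_county` and the sample rows are made up for illustration.

```python
from collections import Counter


def accidents_per_county(rows, county_field="County"):
    """Count accident records per county, skipping rows without a county."""
    counts = Counter()
    for row in rows:
        county = (row.get(county_field) or "").strip()
        if county:
            counts[county] += 1
    return counts


# Tiny made-up sample, only to show the shape of the input and output.
sample = [
    {"County": "Harris"},
    {"County": "Harris"},
    {"County": "Dallas"},
    {"County": ""},
]
print(accidents_per_county(sample).most_common())
# [('Harris', 2), ('Dallas', 1)]
```

The resulting counts can then be joined to the county polygons by county name, so that each polygon is shaded according to its count.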
The following map shows only Harris County, with all of its accident points rendered on top of the map. For the map, I selected the Texas counties shapefile.
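Before the points can be drawn over the county outline, the records have to be narrowed to Harris County and turned into coordinate pairs. The sketch below shows one way to do that filtering. It assumes the records carry `County`, `Start_Lat`, and `Start_Lng` fields, as in the public US Accidents data; the function name `county_points` and the sample coordinates are made up for illustration. Reading the shapefile and drawing the map are left out.

```python
def county_points(rows, county="Harris", county_field="County",
                  lat_field="Start_Lat", lng_field="Start_Lng"):
    """Return (longitude, latitude) pairs for accidents in one county."""
    points = []
    for row in rows:
        if (row.get(county_field) or "").strip().lower() != county.lower():
            continue
        try:
            points.append((float(row[lng_field]), float(row[lat_field])))
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing or non-numeric coordinates
    return points


# Made-up sample rows; the coordinates are illustrative, not real records.
sample_rows = [
    {"County": "Harris", "Start_Lat": "29.76", "Start_Lng": "-95.37"},
    {"County": "Dallas", "Start_Lat": "32.78", "Start_Lng": "-96.80"},
    {"County": "Harris", "Start_Lat": "", "Start_Lng": "-95.40"},
]
print(county_points(sample_rows))
```

The pairs are ordered as (longitude, latitude) so that they map directly to the x and y axes of a plot.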
- US Accidents (3.0 million records): https://www.kaggle.com/sobhanmoosavi/us-accidents#US_Accidents_Dec19.csv
- A Countrywide Traffic Accident Dataset: https://arxiv.org/pdf/1906.05409.pdf
- Accident Risk Prediction based on Heterogeneous Sparse Data: New Dataset and Insights: https://arxiv.org/pdf/1909.09638.pdf
- How You Can Avoid Car Accident in 2020: https://medium.com/@RonghuiZhou/how-you-can-avoid-car-accident-in-2020-c9626c9b6f68