stef4k/Business-Intelligence-Big-Data-Assignment

LA CRIME ANALYSIS

Assignment for the course Business Intelligence and Big Data Analytics (4th year, winter semester):

Use SQL Server (database and analysis services) or MySQL + Pentaho:

  1. A large data set is found, cleaned, and loaded into a data warehouse. A data cube and various metrics are then created. [40%]
  2. A visualization tool (Tableau or Power BI) is used to create various data visualizations. [20%]
  3. The warehouse data is used for data mining tasks such as classification, association rules, and clustering. Use the methods and models of a commercial system or an open-source tool. Implement at least two models. [40%]
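
To make the data cube idea in step 1 concrete, here is a minimal sketch in plain Python. It is not the SQL Server or Pentaho cube built for the assignment: the rows, the three dimensions, and the `build_cube` helper are all hypothetical, and the only measure is an incident count.

```python
from collections import Counter
from itertools import combinations

# Hypothetical cleaned fact rows: (year, area, crime_type). Values are made up.
FACTS = [
    (2010, "Central", "Burglary"),
    (2010, "Central", "Theft"),
    (2010, "Hollywood", "Theft"),
    (2011, "Central", "Theft"),
    (2011, "Hollywood", "Burglary"),
    (2011, "Hollywood", "Theft"),
]
DIMENSIONS = ("year", "area", "crime_type")
ALL = "*"  # marker for a dimension that has been rolled up


def build_cube(facts, n_dims):
    """Count incidents for every combination of kept and rolled-up dimensions."""
    cube = Counter()
    for row in facts:
        for k in range(n_dims + 1):
            for kept in combinations(range(n_dims), k):
                key = tuple(row[i] if i in kept else ALL for i in range(n_dims))
                cube[key] += 1
    return cube


cube = build_cube(FACTS, len(DIMENSIONS))
print(cube[(2010, ALL, ALL)])             # incidents in 2010, any area or type
print(cube[(ALL, "Hollywood", "Theft")])  # thefts in Hollywood, any year
print(cube[(ALL, ALL, ALL)])              # grand total
```

Each key is one cell of the cube, so a roll-up is a lookup with `ALL` in the rolled-up positions. A real warehouse would store the dimensions in separate tables and let the OLAP engine compute these aggregates.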

The analysis covers crimes in Los Angeles from 2010 to 2019. All the information about the dataset can be found here. The repository contains:

  • LA CRIMES PRESENTATION.pdf is the presentation of the analysis, its application, and the conclusions. It is the quickest way to review the analysis.
  • Cleaning data.ipynb is the notebook that runs the ETL process (mostly extracting and cleaning the data).
  • Data Visualizations.ipynb contains some examples of complex visualizations of the crime data in Python. For all the visualizations, see the VISUALIZATIONS section of the book.
  • Clustering.ipynb describes three different clustering analyses, with their visualizations and conclusions.
  • Machine Learning.ipynb trains decision trees and their extensions to predict the crime type of individual incidents from the remaining information.
  • LA Crimes Book Greek.pdf is the full book with the detailed analysis, written in Greek.
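
As a rough illustration of the kind of analysis in Clustering.ipynb, the sketch below runs plain k-means (Lloyd's algorithm) on a few coordinate pairs. This README does not say which algorithms the notebook uses, so k-means is an assumption chosen for illustration; the `kmeans` helper, the points, and the starting centroids are all hypothetical.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: assign each point to its nearest centroid, then
    move each centroid to the mean of its assigned points."""
    centroids = list(centroids)
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Made-up (latitude, longitude) pairs forming two well-separated groups.
points = [
    (34.05, -118.25), (34.06, -118.24), (34.04, -118.26),
    (34.20, -118.45), (34.21, -118.44), (34.19, -118.46),
]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

Starting centroids are passed in explicitly to keep the run deterministic. On real incident data the number of clusters and the features to cluster on would need to be chosen and justified.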

Execution

First follow the instructions and run Cleaning data.ipynb, then run the other notebooks.
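
One possible way to script that order is sketched below. It only builds and prints `jupyter nbconvert` commands; it assumes Jupyter is installed and that any manual steps from the instructions (such as obtaining the dataset) are already done. The order of the three notebooks after the cleaning one is an assumption, and `build_commands` is a hypothetical helper.

```python
import shlex

# Cleaning comes first, as stated above. The order of the remaining
# notebooks is an assumption.
NOTEBOOKS = [
    "Cleaning data.ipynb",
    "Data Visualizations.ipynb",
    "Clustering.ipynb",
    "Machine Learning.ipynb",
]


def build_commands(notebooks):
    """Return one nbconvert argv list per notebook, in execution order."""
    return [
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", nb]
        for nb in notebooks
    ]


commands = build_commands(NOTEBOOKS)
for argv in commands:
    print(shlex.join(argv))  # quoted so the spaces in file names survive a shell
```

To actually run the notebooks, each `argv` could be passed to `subprocess.run(argv, check=True)` so that a failure in the cleaning step stops the rest.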