
A git repo to store all of the files for task-3 of GRIP internship by TSF


SuhruthY/GRIP_Task3


Exploring the Global Terrorism Data: Tableau Public

Overview

 Despite the many strategies pursued by security and intelligence agencies throughout the world, terrorist activity remains a major challenge of this millennium. In recent years, data-mining techniques have evolved to allow the identification of patterns and associations in criminal activity.

 The GTD has recently released a new version that includes data on terrorist attacks through 2019, with the year 1993 provided as a separate file. The GTD codebook provides detailed documentation about the data and its inclusion criteria. I used SQLite to subset the data and then cleaned and preprocessed it in Python and R. I then used Tableau, a data visualization tool, to join the subsets and explore the data.
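
 As a rough illustration of the subsetting step, here is a minimal sketch using Python's built-in sqlite3 module. The table name gtd, the choice of columns, and the sample rows are assumptions made for this example and are not the project's actual schema or data; the column names follow the GTD codebook's variable names (eventid, iyear, country_txt, nkill, nwound).

```python
import sqlite3

def subset_by_year(conn, start_year, end_year):
    """Return selected columns for events within an inclusive year range."""
    query = (
        "SELECT eventid, iyear, country_txt, nkill, nwound "
        "FROM gtd WHERE iyear BETWEEN ? AND ? "
        "ORDER BY iyear, eventid"
    )
    return conn.execute(query, (start_year, end_year)).fetchall()

# Tiny in-memory stand-in for the real database (made-up rows, hypothetical table).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gtd (eventid INTEGER, iyear INTEGER, "
    "country_txt TEXT, nkill REAL, nwound REAL)"
)
conn.executemany(
    "INSERT INTO gtd VALUES (?, ?, ?, ?, ?)",
    [
        (1, 2017, "A", 2, 5),
        (2, 2018, "B", None, 1),  # missing death count stays NULL
        (3, 2019, "A", 0, 0),
    ],
)

rows = subset_by_year(conn, 2018, 2019)
```

 Missing values come back as None, so the later cleaning step still has to decide how to treat them.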

 After working through this project, one can dive deeper into each feature and explore its relationships with the others. Various statistical analyses can then be applied to understand and discover patterns in terrorist attacks and to attempt to predict future ones.

Literature Survey

 Fatalities (number of deaths) and casualties (number of deaths and injuries) are the key quantities I dealt with in this project. The research topics I drew on include correlation factors that influence terrorist attacks (by ISSST), country-level terrorism trends, and the identification of subgroups to prevent mass casualties. Most of these works focus on statistical methods and machine learning approaches for detecting terrorist activity.

 The 2019 background report provided by START was my starting point for understanding the GTD as a whole. It showcases an intuitive way of understanding the GTI score. It also studies various regional trends and trends among terrorist groups.

Procedure

Global Terrorism Database

 The GTD defines a terrorist attack as the threatened or actual use of illegal force and violence by a non-state actor to attain a political, economic, religious, or social goal through fear, coercion, or intimidation. To consider an incident for inclusion in the GTD, all three of the following attributes must be present:

  • The incident must be intentional
  • The act must entail some level of violence
  • The perpetrators of the incidents must be sub-national actors

 In addition, at least two of the following three criteria must be present:

  • Criterion 1: The act must be aimed at attaining a political, economic, religious, or social goal.
  • Criterion 2: There must be evidence of an intention to coerce, intimidate, or convey some other message to a larger audience.
  • Criterion 3: The action must be outside the context of legitimate warfare activities.
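
 The inclusion rule above (all three attributes, plus at least two of the three criteria) can be sketched as a small predicate. The function and parameter names are hypothetical and chosen for this example only.

```python
def meets_inclusion_criteria(intentional, violent, subnational,
                             crit1, crit2, crit3):
    """Return True if an incident satisfies the rule described above.

    All three attributes must hold, and at least two of the
    three additional criteria must be met.
    """
    mandatory = bool(intentional) and bool(violent) and bool(subnational)
    criteria_met = sum(bool(c) for c in (crit1, crit2, crit3))
    return mandatory and criteria_met >= 2

# All attributes present, criteria 1 and 3 met.
included = meets_inclusion_criteria(True, True, True, True, False, True)
# All attributes present, but only one criterion met.
too_few_criteria = meets_inclusion_criteria(True, True, True, False, False, True)
# Every criterion met, but the perpetrator is not a sub-national actor.
state_actor = meets_inclusion_criteria(True, True, False, True, True, True)
```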

Some things to be noted while working

  • Geo-political boundaries of many countries have changed over time.
  • The location coordinates (latitudes and longitudes) follow the WGS 1984 standard.
  • The success of a terrorist attack is defined according to the tangible effects of the attack, not in terms of the larger goals of the perpetrators.

 Check out my take on finding the GTI score here: Calculating Global Terrorism Index

3-Phase Model

The project is divided into three smaller phases, which aim to:

  • Find a metric to quantify terrorism as a whole.
  • Compare consecutive years by casualties level.
  • Uncover hidden patterns in frequency count and textual data.

 In the first part, we explore geospatial trends with GTI scores in mind and look for global KPIs that can be derived from the data. In the second part, we compare terrorism between two consecutive years and study the most influential terrorist groups over the years. The last part tries to unearth hidden patterns in the textual data and correlations between categories.
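
 To make the second phase concrete, here is a minimal sketch of a year-over-year casualty comparison. It takes casualties as deaths plus injuries, as defined in the Literature Survey, and treats missing counts as zero, which is an assumption of this example. The function names and the sample events are made up.

```python
from collections import defaultdict

def casualties_by_year(events):
    """Sum casualties (deaths + injuries) per year.

    `events` is an iterable of (year, nkill, nwound) tuples;
    missing counts (None) are treated as 0 in this sketch.
    """
    totals = defaultdict(float)
    for year, nkill, nwound in events:
        totals[year] += (nkill or 0) + (nwound or 0)
    return dict(totals)

def year_over_year_change(totals):
    """Return {year: change versus the previous year} for consecutive years only."""
    years = sorted(totals)
    return {
        curr: totals[curr] - totals[prev]
        for prev, curr in zip(years, years[1:])
        if curr == prev + 1
    }

# Made-up events: (year, deaths, injuries).
events = [(2017, 2, 5), (2018, None, 1), (2018, 3, 0), (2019, 0, 0)]
totals = casualties_by_year(events)
changes = year_over_year_change(totals)
```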

Statistical Techniques

 I used Correspondence Analysis to examine the association between the categorical variables and fatality. A new categorical feature with the levels very-low, low, high, and very-high was created from the number of deaths. For each variable in the study, I built a frequency table against the fatality levels. A distance measure was then calculated to obtain the final contributions. After that, I applied Singular Value Decomposition to derive two principal components that explain more than 90% of the variance.
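
 The steps above can be sketched with NumPy as a textbook simple correspondence analysis: standardized residuals of the frequency table followed by an SVD. This is not necessarily the exact procedure used in the project, and the frequency table below is made up (rows are categories of one variable, columns are the four fatality levels).

```python
import numpy as np

def correspondence_analysis(table):
    """Simple correspondence analysis of a contingency table.

    Returns the row coordinates on the first two principal axes and
    the share of total inertia explained by each of those axes.
    Assumes no row or column of the table sums to zero.
    """
    counts = np.asarray(table, dtype=float)
    P = counts / counts.sum()              # correspondence matrix
    r = P.sum(axis=1)                      # row masses
    c = P.sum(axis=0)                      # column masses
    expected = np.outer(r, c)              # expected proportions under independence
    S = (P - expected) / np.sqrt(expected) # standardized residuals (chi-square scaling)
    U, sing, _ = np.linalg.svd(S, full_matrices=False)
    inertia = sing ** 2                    # principal inertias
    explained = inertia / inertia.sum()
    row_coords = (U * sing) / np.sqrt(r)[:, None]  # row principal coordinates
    return row_coords[:, :2], explained[:2]

# Made-up counts: 3 categories x 4 fatality levels
# (very-low, low, high, very-high).
freq = [
    [30, 10, 5, 1],
    [10, 20, 10, 5],
    [2, 8, 15, 20],
]
coords, explained = correspondence_analysis(freq)
```

 With only three rows the residual matrix has at most two non-trivial dimensions, so the two axes account for essentially all of the inertia in this toy table; on real data the explained share has to be checked rather than assumed.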

 Iterative topic modeling was performed on the textual data to understand the term frequency and document frequency of the top words in each year.
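
 The counting behind term frequency and document frequency can be sketched as follows. This is only the per-year counting step, not a topic model, and the stopword list, function names, and sample texts are assumptions made for this example.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"a", "the", "near", "was"}

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def tf_df_by_year(docs):
    """Compute term frequency and document frequency per year.

    `docs` is an iterable of (year, text) pairs. Term frequency counts
    every occurrence; document frequency counts each term once per document.
    """
    tf = defaultdict(Counter)
    df = defaultdict(Counter)
    for year, text in docs:
        tokens = tokenize(text)
        tf[year].update(tokens)
        df[year].update(set(tokens))
    return tf, df

# Made-up incident summaries.
docs = [
    (2018, "Bomb exploded near a market; the bomb injured two."),
    (2018, "Gunmen attacked a market."),
    (2019, "A bomb was defused."),
]
tf, df = tf_df_by_year(docs)
top_2018 = tf[2018].most_common(2)
```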

 You can find a detailed explanation in Exploring the Global Terrorism Data: The Backend

Conclusion & Future Scope

 In summary, we have derived a potential metric to quantify terrorism, observed the similarities and dissimilarities between consecutive years, and uncovered patterns through different statistical methods.

 This project could be a starting point for understanding the detailed correlations between various features in the GTD. You can study latent class growth modeling to find chronological patterns, cluster the events into groups using different methodologies, analyze killing ranges, and examine the origin and activity of terrorist organizations.

 Also, note that we have worked only with the GTD throughout the project. Future work could draw on other terrorism databases such as RAND, the MIPT Terrorism Knowledge Base, the Worldwide Incidents Tracking System, Tocsearch, etc.

References