GitHub - katrienantonio/workshop-loss-reserv-fraud: Course material for a workshop on loss modelling, reserving and insurance fraud analytics

Workshop Analytics for Loss Modelling, Reserving and Insurance Fraud

by Katrien Antonio and Jonas Crevecoeur

Course materials for the online Loss Modelling, Reserving and Insurance Fraud Analytics course in June 2021.

📆 June 3, 10 & 17 2021
⏰ From 3pm to 6pm
📌 online via MS Teams

Course materials will be posted in the days before the workshop.

Overview

This three-half-day workshop introduces the essential concepts of building insurance loss models with R.

On the first half-day, you will gain insights in the foundations of handling loss data, including useful data wrangling and visualization steps. You will cover a variety of discrete and continuous loss distributions, and techniques to build more flexible distributions from standard distributions (by mixing and splicing). You will learn how to fit these models to actual data and inspect their goodness-of-fit. Then, you will use the fitted model to estimate risk measures.

The second half-day then puts focus on reserving analytics. Starting from a granular data set with the development of individual claims, you will learn how to get useful insights from the data. Step-by-step you will move from the granular data to aggregated data in a run-off triangle. You will fit the famous chain ladder method and examine its validity on the given data. Alternative modelling strategies based on recent research will be discussed.

The third half-day then covers challenges in building fraud detection models from tabular and networked data. You will acquire insights in the foundations of these analytic methods, learn how to set-up the model building process, and focus on building a good understanding of the resulting model output and predictions.

Leaving this workshop, you should have a firm grasp of the working principles of a variety of loss models for frequency and severity data and be able to explore their use in practical settings. Moreover, you should have acquired the fundamental insights to explore some other methods on your own.

Schedule and Course Material

The detailed schedule is subject to small changes.

Session	Duration	Description	Lecture material
Day 0	your own pace	Prework	sheets prework in html and in pdf
Day 1	3 - 3.15	Prologue	sheets in html and in pdf
	3.15 - 3.50	Data sets	sheets in html
	4 - 4.50	Frequency models	sheets in html
	5 - 6	Severity models	sheets in html
Day 2	3 - 3.20	Motivation and strategies	sheets in html and pdf
	3.20 - 3.50	Reserving data structures (part 1): daily and yearly data	sheets in html and pdf
	4 - 4.15	Reserving data structures (part 2): individual and aggregated data	sheets in html and pdf
	4.15 - 4.50	Claims reserving with triangles	sheets in html and pdf
	5 - 5.45	When the chain ladder method fails	sheets in html and pdf
	5.45 - 6	Research outlook	sheets in html and pdf
Day 3	3 - 3.30	Motivation and fraud detection cycle	sheets in pdf
	3.30 - 3.50	Network and tabular data
	4 - 4.30	Ranking algorithm
	4.30 - 4.50	Featurization
	5 - 5.30	Supervised learning with imbalanced data
	5.30 - 5.50	Evaluation metrics
	5.50 - 6	Wrap-up

Half-day 1: Loss modelling analytics

Topics include:

data sets used in the course: MTPL and SecuraRe losses
data handling and visualization tools with {ggplot} and {dplyr}
building frequency models: Poisson, Negative Binomial, ZI and Hurdle, maximum likelihood estimation and goodness-of-fit
building severity models: simple to complex parametric distributions, splicing to construct a global fit, mixing, estimate a risk measure.

Download lecture sheets in html and pdf.

Half-day 2: Claims reserving analytics

Topics include:

data sets used in the course: individual claims, daily data
data handling and visualization tools with {ggplot} and {dplyr}
from daily to yearly data, from individual claims to triangles
fitting well known claims reserving methods
when the simple methods fail: what else is there?

Download lecture sheets in html and pdf.

Half-day 3: Insurance fraud analytics

Topics include:

insurance fraud detection strategies and numbers
network and tabular data
ranking algorithms, e.g. PageRank, BiRank
network featurization steps
supervised learning with imbalanced data, sampling methods like SMOTE
evaluation metrics, eg AUROC, AUPR, top decile lift
wrap up.

Download lecture sheets pdf.

Prework

The workshop requires a basic understanding of R. We prepared some prework sheets that you can take a look at before the workshop (html or pdf).

Being familiar with statistical or machine learning methods is not required. The workshop gradually builds up these concepts, with an emphasis on hands-on demonstrations and exercises.

Software Requirements

You have two options to join the coding exercises covered during the workshop. Either you join the RStudio cloud workspace dedicated to the workshop, and then you’ll run R in the cloud, from your browser. Or you use your local installation of R and RStudio.

We kindly ask participants to join the RStudio Cloud as default!

RStudio Cloud - default!

You will join our workspace on R Studio Cloud. This enables a very accessible set-up for working with R in the cloud for the less experienced user!

https://rstudio.cloud/spaces/147546/join?access_code=YgLiHaQpPvZdcbh943QMO%2FqtF%2FKTct855m5tYtmH

Here are the steps you should take (before the workshop):

visit the above link
log in by creating an account for RStudio Cloud or by using your Google or GitHub login credentials
join the space
at the top of your screen you see ‘Projects’, click ‘Projects’
with the ‘copy’ button (on the right) you can make your own version of the ‘day 1 on loss modelling’ project; in this copy you can work on the exercises, add comments etc.
you should now be able to visit the project and see the ‘scripts’ and ‘data’ folders on the right. Open and run the ‘installation-instructions.R’ script from the scripts folder, to see if everything works fine.

We will have everything set up for you in the correct way. You only have to login!

Local installation - optional (make sure you have access to the RStudio Cloud as sketched above)!

Alternatively, you can bring a laptop with a recent version of R and RStudio installed. Make sure you can connect your laptop to the internet (or download the course material one day before the start of the workshop). You will need:

R (at least 3.5.2 https://cloud.r-project.org/bin/windows/base/)
RStudio (https://www.rstudio.com/products/rstudio/download/#download)

In the prework folder you will find a step-by-step guide to installing R and RStudio (though a bit outdated).

Please run the following script in your R session to install the required packages

packages <- c("tidyverse", "here", "gridExtra", "caret", "rsample", "recipes", "grid", "rstudioapi", "MASS", "actuar", "statmod", "ReIns", "pscl", "zoo", "igraph", "expm", "pROC", "ChainLadder", "lubridate")
new_packages <- packages[!(packages %in% installed.packages()[,"Package"])]
if(length(new_packages)) install.packages(new_packages)

if(sum(!(packages %in% installed.packages()[, "Package"]))) {
  stop(paste('The following required packages are not installed:\n', 
             paste(packages[which(!(packages %in% installed.packages()[, "Package"]))], collapse = ', ')));
} else {
  message("Everything is set up correctly. You are ready to go.")
}

Instructors

Katrien Antonio is professor in insurance data science at KU Leuven and associate professor at University of Amsterdam. She teaches courses on data science for insurance, life and non-life insurance mathematics and loss models. Research-wise Katrien puts focus on pricing, reserving and fraud analytics, as well as mortality dynamics.

Jonas Crevecoeur is a post-doctoral researcher in biostatistics at KU Leuven. He recently obtained his PhD within the insurance research group at KU Leuven and holds the degrees of MSc in Mathematics, MSc in Insurance Studies and MSc in Financial and Actuarial Engineering (KU Leuven). Before starting the PhD program he worked as an intern with QBE Re (Belgium office) where he studied multiline products and copulas. Jonas was a PhD fellow of the Research Foundation - Flanders (FWO, PhD fellowship fundamental research).

Happy learning!

Name		Name	Last commit message	Last commit date
Latest commit History 32 Commits
README_cache/gfm		README_cache/gfm
data		data
image		image
scripts		scripts
sheets		sheets
README.md		README.md
README.rmd		README.rmd

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README_cache/gfm

README_cache/gfm

data

data

image

image

scripts

scripts

sheets

sheets

README.md

README.md

README.rmd

README.rmd

Repository files navigation

Workshop Analytics for Loss Modelling, Reserving and Insurance Fraud

Overview

Schedule and Course Material

Half-day 1: Loss modelling analytics

Half-day 2: Claims reserving analytics

Half-day 3: Insurance fraud analytics

Prework

Software Requirements

RStudio Cloud - default!

Local installation - optional (make sure you have access to the RStudio Cloud as sketched above)!

Instructors

About

Releases

Packages

Contributors 2

Languages

katrienantonio/workshop-loss-reserv-fraud

Folders and files

Latest commit

History

Repository files navigation

Workshop Analytics for Loss Modelling, Reserving and Insurance Fraud

Overview

Schedule and Course Material

Half-day 1: Loss modelling analytics

Half-day 2: Claims reserving analytics

Half-day 3: Insurance fraud analytics

Prework

Software Requirements

RStudio Cloud - default!

Local installation - optional (make sure you have access to the RStudio Cloud as sketched above)!

Instructors

About

Topics

Resources

Stars

Watchers

Forks

Languages