Dataverse Hack Hackathon. Insurance-Claim-Prediction.

This repository contains my solution for Dataverse Hack 2022, a hackathon from Analytics Vidhya.

About the Hackathon: Insurance Claim Prediction

An insurance policy is an agreement between a company and a customer by which a company undertakes to provide a guarantee of compensation for specified loss, damage or illness in return for the payment of a specified premium. A premium is a sum of money that the customer needs to pay regularly to an insurance company for this guarantee.

For example, you pay a premium of Rs. 3000/- each year for car insurance with a coverage of Rs. 100,000/-. If the car is severely damaged in an accident, the insurance company will bear the cost of the damage, up to Rs. 100,000.

You may wonder how a company can bear such a high cost when it charges a premium of only Rs. 3000/- per year. This is where the concept of probability comes into the picture. For example, there might be thousands of customers paying a premium of Rs. 3000 every year just like you, but only a few of them (say 2-3) would have an accident in a given year. This way everyone shares the risk of everyone else.
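The risk-sharing arithmetic can be made concrete with a small sketch. The premium and coverage are the figures from the example. The pool size of 1000 policyholders and the count of 3 accidents are illustrative assumptions, since the text only says "thousands" and "2-3".

```python
# Figures from the example in the text.
premium = 3000        # Rs. paid per policyholder per year
coverage = 100_000    # Rs. maximum payout per accident

# Illustrative assumptions (not given exactly in the text).
policyholders = 1000  # hypothetical size of the insured pool
accidents = 3         # hypothetical number of accidents in one year

# Money collected from the whole pool in one year.
total_premiums = premium * policyholders

# Payout if every accident costs the full coverage amount.
worst_case_payout = coverage * accidents

# Average claim cost carried by each policyholder in the pool.
expected_cost_per_policy = coverage * accidents / policyholders

print(f"Premiums collected:       Rs. {total_premiums:,}")
print(f"Worst-case payout:        Rs. {worst_case_payout:,}")
print(f"Expected cost per policy: Rs. {expected_cost_per_policy:,.0f}")
```

Under these assumed numbers the pooled premiums comfortably exceed the payouts, which is why estimating the claim rate accurately matters so much to the insurer.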

Our client is an insurance company that provides car insurance to its customers. In this hackathon, you will work closely with the insurer to understand the behaviour of the policyholders.

Insurance Claim Prediction

Predict whether the policyholder will file a claim in the next 6 months or not.

Steps of my solution:

Univariate analysis of features. Check for normal distribution and outliers
Bivariate analysis of features. Check the correlation between the features and the target variable
Inspect the correlation matrix
Feature engineering. Created new features 'max_torque_scalar' and 'max_power_scalar'
Dummy encoding of 'transmission_type' and 'rear_brakes_type'
Mapping features 'is_esc', 'is_adjustable_steering', 'is_tpms', 'is_parking_sensors', 'is_parking_camera', 'is_front_fog_lights', 'is_rear_window_wiper', 'is_rear_window_washer', 'is_rear_window_defogger', 'is_brake_assist', 'is_power_door_locks', 'is_central_locking', 'is_power_steering', 'is_driver_seat_height_adjustable', 'is_day_night_rear_view_mirror', 'is_ecw', 'is_speed_alert'
One-hot encoding for 'area_cluster', 'make', 'segment', 'model', 'fuel_type', 'engine_type', 'steering_type'
The dataset is imbalanced
Modelling. Models: Decision Tree Classifier, Random Forest Classifier, AdaBoost Classifier. My final model is the Random Forest Classifier, with f1_score=0.122254758418741 on train and f1_score=0.117362955807776 on test
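The preprocessing and modelling steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code, and it rests on several assumptions: that the raw 'max_torque' and 'max_power' columns hold strings such as '60Nm@3500rpm' and the "scalar" feature is the leading number, that the 'is_*' flags are 'Yes'/'No' strings, that the target column is named 'is_claim', and that the imbalance is handled with `class_weight="balanced"`. The tiny data frame is synthetic and uses only a few of the listed columns.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the hackathon data (all values are made up).
raw = pd.DataFrame({
    "max_torque": ["60Nm@3500rpm", "113Nm@4400rpm", "250Nm@2750rpm", "60Nm@3500rpm"] * 10,
    "max_power": ["40.36bhp@6000rpm", "88.50bhp@6000rpm", "113.45bhp@4000rpm", "40.36bhp@6000rpm"] * 10,
    "transmission_type": ["Manual", "Automatic", "Automatic", "Manual"] * 10,
    "is_esc": ["No", "Yes", "Yes", "No"] * 10,
    "fuel_type": ["CNG", "Petrol", "Diesel", "CNG"] * 10,
    "is_claim": [0, 0, 1, 0] * 10,  # assumed target name; 1 = claim filed
})


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Feature engineering: keep the leading number of each text column (assumed format).
    for src, dst in [("max_torque", "max_torque_scalar"), ("max_power", "max_power_scalar")]:
        out[dst] = out[src].str.extract(r"^([\d.]+)", expand=False).astype(float)
        out = out.drop(columns=src)
    # Map the Yes/No flags to 1/0, leaving the target untouched.
    flag_cols = [c for c in out.columns if c.startswith("is_") and c != "is_claim"]
    for col in flag_cols:
        out[col] = out[col].map({"Yes": 1, "No": 0})
    # Two-level column: a single dummy column is enough.
    out = pd.get_dummies(out, columns=["transmission_type"], drop_first=True, dtype=int)
    # Multi-level column: one-hot encode every level.
    out = pd.get_dummies(out, columns=["fuel_type"], dtype=int)
    return out


features = preprocess(raw)
X = features.drop(columns="is_claim")
y = features["is_claim"]

# Stratify so both splits keep the same share of the rare positive class.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# class_weight="balanced" is one possible way to account for the imbalance.
model = RandomForestClassifier(n_estimators=50, class_weight="balanced", random_state=0)
model.fit(X_train, y_train)

train_f1 = f1_score(y_train, model.predict(X_train))
test_f1 = f1_score(y_test, model.predict(X_test))
print(f"train f1: {train_f1:.3f}, test f1: {test_f1:.3f}")
```

Scores on this toy data say nothing about the real competition data. The point is only the shape of the pipeline: the F1 score is computed on the positive (claim) class, which is why it is a more informative metric than accuracy when claims are rare.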

Katerinafomkina/Insurance-Claim-Prediction.-Classification-Problem
