Build an algorithm that predicts whether a user will download an app after clicking a mobile app ad. (Top 6%) 🥉 222/3946

shejz/TalkingData-AdTracking-Fraud-Detection

Description

Fraud risk is everywhere, but for companies that advertise online, click fraud can happen at an overwhelming volume, resulting in misleading click data and wasted money. Ad channels can drive up costs by simply clicking on the ad at a large scale. With over 1 billion smart mobile devices in active use every month, China is the largest mobile market in the world and therefore suffers from huge volumes of fraudulent traffic.

  • Build a machine learning model to determine whether a click is fraudulent.

Competition link: TalkingData AdTracking Fraud Detection Challenge

Model and LB score (AUC-ROC)

Model: xgboost and lightgbm

Evaluation Metric: area under the ROC curve (AUC-ROC)
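To make the metric concrete, here is a minimal sketch of AUC-ROC computed directly from its pairwise definition: the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half. The function name `auc_roc` and the toy labels and scores are illustrative assumptions, not taken from this repository. In practice `sklearn.metrics.roc_auc_score` computes the same quantity far more efficiently.

```python
def auc_roc(labels, scores):
    """AUC-ROC from its pairwise definition (ties count as 0.5).

    Quadratic in the number of examples, so only suitable for illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 3 of the 4 positive/negative pairs are ordered correctly.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = auc_roc(labels, scores)
print(auc)  # 0.75
```

Because AUC depends only on how examples are ranked, it is not inflated by the heavy class imbalance the way plain accuracy would be.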

Training and validation: Some models were trained on the data from 11.07-11.09, and others on the data from 11.07-11.08. 50 million randomly selected rows were used for validation.
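The random hold-out described above could be sketched as follows. This is only an illustration under stated assumptions: the helper name `split_indices`, the seed, and the toy sizes are hypothetical, and the README does not say how the author drew the rows or whether they were removed from the training set.

```python
import numpy as np


def split_indices(n_rows, n_valid, seed=0):
    """Return (train_idx, valid_idx), with n_valid rows drawn uniformly without replacement."""
    if not 0 < n_valid < n_rows:
        raise ValueError("n_valid must be between 1 and n_rows - 1")
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_rows)
    # Sorting keeps each subset in its original (chronological) row order.
    valid_idx = np.sort(perm[:n_valid])
    train_idx = np.sort(perm[n_valid:])
    return train_idx, valid_idx


# Toy sizes for illustration; the README's validation set is 50 million rows.
train_idx, valid_idx = split_indices(n_rows=1_000, n_valid=250)
print(len(train_idx), len(valid_idx))
```

With a pandas DataFrame, the two index arrays can be passed to `DataFrame.iloc` to build the training and validation frames.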

Model | Public score | Private score | Final rank
LGBM | 0.98122 | 0.98206 | 223rd (Top 6%), bronze medal 🥉

The libraries used are:

  • numpy
  • pandas
  • matplotlib
  • seaborn
  • sklearn
  • lightgbm
  • xgboost

Challenges:

  • Large dataset (TalkingData provides 185 million training samples, about 7 GB in size.)
  • Imbalanced Data
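Both challenges have common mitigations, sketched below under stated assumptions. The column names follow the competition's public data description rather than this README, the integer widths are a guess at ranges that fit the data, the helper names `load_clicks` and `scale_pos_weight` are hypothetical, and the inline CSV rows are made up. The README does not say which techniques the author actually used.

```python
import io

import pandas as pd

# Assumed column names (from the competition's data description) with compact
# integer dtypes, which use far less memory than the default int64.
DTYPES = {
    "ip": "uint32",
    "app": "uint16",
    "device": "uint16",
    "os": "uint16",
    "channel": "uint16",
    "is_attributed": "uint8",
}


def load_clicks(source):
    """Read only the listed columns, parsing each with its compact dtype."""
    return pd.read_csv(source, usecols=list(DTYPES), dtype=DTYPES)


def scale_pos_weight(labels):
    """Ratio of negative to positive examples in a 0/1 label column."""
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    if n_pos == 0:
        raise ValueError("no positive examples")
    return n_neg / n_pos


# Made-up rows standing in for the real training file.
sample = io.StringIO(
    "ip,app,device,os,channel,click_time,is_attributed\n"
    "1001,3,1,13,379,2017-11-07 00:00:01,0\n"
    "1002,3,1,19,379,2017-11-07 00:00:02,0\n"
    "1003,9,1,13,244,2017-11-07 00:00:03,1\n"
    "1004,12,2,22,178,2017-11-07 00:00:04,0\n"
    "1005,3,1,13,379,2017-11-07 00:00:05,0\n"
)
clicks = load_clicks(sample)
weight = scale_pos_weight(clicks["is_attributed"])
print(clicks.dtypes)
print(weight)  # 4.0 for this toy sample (4 negatives, 1 positive)
```

A ratio of this kind is commonly passed as the `scale_pos_weight` parameter that both LightGBM and XGBoost accept, so that the rare positive class carries more weight during training.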

Exploratory Data Analysis (EDA): Nbviewer

Solution References:
