create a streaming application as a simple real-time fraud detection system backed by Apache Kafka using a Python client.

Andy-Pham-72/kafka-mini-project-2

Kafka Mini Project

Building A Streaming Fraud Detection System With Kafka and Python

Objective:

In this project, we create a streaming application backed by Apache Kafka using a Python client. It is a simple real-time fraud detection system. We will generate a stream of synthetic transactions and use a Python script to process that stream and detect which transactions are potentially fraudulent.

Prerequisites:

Below is the folder map of all the files in the project:

.
├── docker-compose.yml 
├── detector
│ ├── Dockerfile
│ ├── app.py
│ └── requirements.txt
├── generator
│ ├── Dockerfile
│ ├── app.py
│ ├── transactions.py
│ └── requirements.txt
├── start.sh
├── start_main_docker_compose.sh
├── read_whole_topic.sh
├── restart.sh
└── stop.sh

We will produce fake transactions on one end, then filter and log those that look suspicious on the other end. This will include:

  • a transaction generator (which produces the synthetic data for the process).
  • a fraud detector.

Both applications will run in Docker containers and interact with the Kafka cluster.
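As an illustration of what the transaction generator could produce, here is a minimal sketch. The field names (`source`, `target`, `amount`, `currency`), the account ID length, and the amount range are assumptions made for this example and are not taken from the project's `transactions.py`.

```python
import json
import random
import string
from typing import Iterator, Optional

ACCOUNT_ID_LENGTH = 12  # hypothetical length, chosen for illustration


def random_account_id(rng: random.Random) -> str:
    """Return a random numeric account identifier."""
    return "".join(rng.choices(string.digits, k=ACCOUNT_ID_LENGTH))


def create_random_transaction(rng: random.Random) -> dict:
    """Build one synthetic transaction using assumed field names."""
    return {
        "source": random_account_id(rng),
        "target": random_account_id(rng),
        # Assumed range of 1.00 to 1000.00 so amounts vary widely.
        "amount": round(rng.uniform(1, 1000), 2),
        "currency": "USD",
    }


def generate_transactions(count: int, seed: Optional[int] = None) -> Iterator[bytes]:
    """Yield `count` transactions serialized as UTF-8 JSON, ready to publish."""
    rng = random.Random(seed)
    for _ in range(count):
        yield json.dumps(create_random_transaction(rng)).encode("utf-8")
```

Each yielded value is a byte string, which is the form a Kafka producer typically expects as a message value. Passing a `seed` makes the stream reproducible, which can help when testing the detector.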

Architecture diagram

Screen Shot 2021-12-31 at 3 57 29 PM

The fraud detector mechanism

The fraud detector is a typical example of a stream processing application. It takes a stream of transactions as an input, performs the filtering task, then outputs the result into two separate streams - those that are legitimate, and those that are suspicious, an operation also known as branching.
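The branching operation can be sketched independently of Kafka. In the example below, the two output topic names and the `send` callback are hypothetical; the project's actual topic names and client calls may differ.

```python
from typing import Callable, Dict, Iterable

# Hypothetical topic names for the two output streams.
LEGIT_TOPIC = "streaming.transactions.legit"
FRAUD_TOPIC = "streaming.transactions.fraud"


def branch(
    transactions: Iterable[dict],
    is_suspicious: Callable[[dict], bool],
    send: Callable[[str, dict], None],
) -> Dict[str, int]:
    """Route each transaction to one of two topics and count the results."""
    counts = {LEGIT_TOPIC: 0, FRAUD_TOPIC: 0}
    for transaction in transactions:
        # Exactly one output stream receives each input record.
        topic = FRAUD_TOPIC if is_suspicious(transaction) else LEGIT_TOPIC
        send(topic, transaction)
        counts[topic] += 1
    return counts
```

In a running detector, `transactions` would presumably come from a Kafka consumer and `send` would wrap a Kafka producer. Keeping both as parameters lets the routing logic be tested without a broker.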

Screen Shot 2021-12-31 at 3 57 49 PM

Assumption: In the real world, detecting fraud is a complex problem that depends on many different metrics. In this project, we keep the rule simple: it is illegal to send more than $900.00 at a time. As a result, any transaction whose amount is greater than 900 is considered fraudulent.
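The rule above can be written as a small predicate. This is a sketch, assuming each transaction is a dictionary with an `amount` field (a hypothetical field name); `Decimal` is used here so that the comparison at exactly 900.00 is not affected by floating-point rounding.

```python
from decimal import Decimal

FRAUD_THRESHOLD = Decimal("900.00")  # the limit stated in the assumption above


def is_suspicious(transaction: dict) -> bool:
    """Return True when the amount is strictly greater than the threshold."""
    # str() avoids carrying binary float error into the Decimal value.
    amount = Decimal(str(transaction["amount"]))
    return amount > FRAUD_THRESHOLD
```

Under this rule a transaction of exactly $900.00 counts as legitimate, because only amounts greater than 900 are flagged.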

Steps to run the application

  1. From the Bash shell, run:
$ chmod +x ./start.sh
$ ./start.sh
  2. In another Bash shell tab, run:
$ chmod +x ./start_main_docker_compose.sh
$ ./start_main_docker_compose.sh

Then we should see this output: Screen Shot 2021-12-31 at 3 45 33 PM

We can also see the legitimate transactions, whose amounts do not exceed our $900.00 threshold: Screen Shot 2021-12-31 at 3 46 17 PM

  3. To read the whole topic, run this command:
$ docker-compose -f docker-compose.kafka.yml exec broker kafka-console-consumer --bootstrap-server localhost:9092 --topic queueing.transactions --from-beginning

or run:

$ chmod +x ./read_whole_topic.sh
$ ./read_whole_topic.sh

To see the total number of messages read, press Ctrl + C: Screen Shot 2021-12-31 at 4 14 27 PM

  4. Press Ctrl + C to stop the kafka-console-consumer. To stop the generator and delete all the containers, networks, and volumes, run:
$ chmod +x ./stop.sh
$ ./stop.sh
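As an alternative to the kafka-console-consumer command above, the whole topic could also be read from Python while the cluster is still running. The sketch below assumes the `kafka-python` package, that the broker is reachable from the host at `localhost:9092`, and that message values are UTF-8 JSON; none of these are confirmed by the project, so adjust them to the actual setup.

```python
import json
from typing import Iterable, List

TOPIC = "queueing.transactions"       # topic name used in the console command
BOOTSTRAP_SERVERS = "localhost:9092"  # assumption: broker reachable from the host


def decode_all(raw_values: Iterable[bytes]) -> List[dict]:
    """Decode raw message values, assuming each one is UTF-8 JSON."""
    return [json.loads(raw.decode("utf-8")) for raw in raw_values]


def read_whole_topic() -> List[dict]:
    """Read every message currently in the topic, then stop."""
    # Imported lazily so decode_all works without kafka-python installed.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        TOPIC,
        bootstrap_servers=BOOTSTRAP_SERVERS,
        auto_offset_reset="earliest",  # start at the oldest available message
        enable_auto_commit=False,      # do not record offsets for this read
        consumer_timeout_ms=5000,      # stop after 5 s with no new messages
    )
    try:
        return decode_all(message.value for message in consumer)
    finally:
        consumer.close()
```

Calling `len(read_whole_topic())` would then give a message count comparable to the one the console consumer prints on exit.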
