jgoodman8/streaming-retail-analysis

Streaming Retail Analysis

This repo serves as an introduction to stream processing. It shows how to process streaming data with Apache Storm Trident (Java) and Apache Spark (Scala API).

Both projects consume the same invoice data, which covers purchases and cancellations. A simulator reads a CSV file line by line and sends each line to a Kafka topic. Each line represents a product purchase or cancellation within an invoice.
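The simulator's read-and-forward loop can be sketched roughly as follows. This is a minimal illustration, not the code in this repo: the class name `InvoiceSimulator`, the method `replay`, and the `delayMillis` parameter are hypothetical, and the Kafka producer is abstracted behind a plain `Consumer<String>` so the sketch runs without a broker.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.function.Consumer;

// Hypothetical sketch of a line-by-line CSV simulator (names are assumptions).
class InvoiceSimulator {

    // Reads the CSV line by line and hands each non-blank line to `sink`.
    // `sink` stands in for whatever publishes the line to the Kafka topic.
    // `delayMillis` optionally pauses between lines to mimic a live stream.
    // Returns the number of lines forwarded.
    static int replay(BufferedReader csv, Consumer<String> sink, long delayMillis)
            throws IOException, InterruptedException {
        int sent = 0;
        String line;
        while ((line = csv.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines rather than emit empty messages
            }
            sink.accept(line);
            sent++;
            if (delayMillis > 0) {
                Thread.sleep(delayMillis);
            }
        }
        return sent;
    }
}
```

With the Kafka Java client on the classpath, the sink could plausibly be something like `line -> producer.send(new ProducerRecord<>(topic, line))`, where `producer` and `topic` are configured elsewhere. Keeping the sink abstract also makes the loop easy to test against an in-memory list.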

Apache Spark

The Spark project is located in the spark_streaming/ folder. Setup and run instructions are in the README file in that folder.

Apache Storm Trident

The Trident project is located in the kafka_trident/ folder. Setup and run instructions are in the README file in that folder.

About

An introduction to stream processing with Spark and Trident, applied to retail invoice analysis. Kafka provides the real-time data input.
