CliffLolo/aws-streaming-data-pipeline

Serverless event-driven streaming data pipeline on AWS

Context

This project addresses a business problem: how to process a high volume of page-view data in real time, derive meaningful insights from it, and use those insights to improve website performance and user experience.

With millions of page views per day, analyzing user behaviour in real time is challenging, and traditional batch processing may not keep up with the volume and velocity of the data. The data is also complex: user behaviour spans multiple devices and sessions, which makes actionable insights hard to obtain unless the data is processed in near real time.

By building a serverless streaming pipeline on AWS, I aim to provide a scalable, efficient, and cost-effective solution that ingests and processes large volumes of data in real time, derives meaningful insights, and stores the data for further analysis. This lets website owners better understand user behaviour, identify trends and patterns, and make data-driven decisions that improve website performance and user experience, ultimately leading to increased engagement, retention, and revenue.

Persona - Architect/Data Engineer/Developer

AWS Products/Services - AWS Lambda, AWS Glue, Amazon S3, Amazon Kinesis, Amazon QuickSight.

Architecture

[Figure: architectural diagram]

Why this architecture?

  • Serverless
  • Event-driven
  • Scalable
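To make the serverless, event-driven idea concrete, here is a minimal sketch of a Lambda function that a Kinesis stream could invoke with a batch of page-view records. It is not taken from this repository. The record layout (`Records[].kinesis.data`, base64-encoded) is the standard shape Lambda receives from Kinesis, while the JSON payload schema with a `page` field and the returned summary are assumptions made purely for illustration.

```python
import base64
import json
from collections import Counter


def lambda_handler(event, context):
    """Count page views per page for one batch of Kinesis records.

    Assumes (hypothetically) that each record carries a JSON object
    with a "page" field, e.g. {"page": "/home"}.
    """
    views_per_page = Counter()
    skipped = 0

    for record in event.get("Records", []):
        try:
            # Lambda delivers the Kinesis payload base64-encoded.
            payload = base64.b64decode(record["kinesis"]["data"])
            page_view = json.loads(payload)
            views_per_page[page_view["page"]] += 1
        except (KeyError, ValueError, TypeError):
            # Malformed or unexpected records are counted, not fatal.
            skipped += 1

    return {"views_per_page": dict(views_per_page), "skipped": skipped}
```

Because the function runs only when records arrive and each invocation handles one batch independently, it needs no servers to manage. In a real pipeline the per-batch summary would more likely be written to a store such as Amazon S3 than returned to the caller.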
