CMPS-278 - Winter 2017

Data Pipeline

Project Abstract

Lambda Architecture (LA) is a system design methodology for robust, scalable, and fault-tolerant data processing. Its design goals include low latency, fault tolerance, support for a wide range of workloads, and linear scalability. The architecture is divided into three layers: the batch layer, the serving layer, and the speed layer.

We attempted to build a Lambda Architecture-based application that ingests and processes transit data published in the General Transit Feed Specification (GTFS) format. Ingestion is handled by a component built on Apache Kafka: a Kafka producer polls the LA Metro GTFS feed and publishes each update to the Kafka server. Kafka then feeds the same data to both the batch layer and the speed layer.
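The fan-out described above, where one ingested stream is read independently by two layers, can be illustrated with a minimal sketch. It is plain Java and does not use the Kafka client API; the `TopicLog` class, the group names, and the sample records are hypothetical stand-ins chosen for illustration. The idea it mimics is that a topic is an append-only log and each consumer group tracks its own read offset, so the batch and speed layers each see every record.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory stand-in for a single topic (hypothetical; not the Kafka client API). */
final class TopicLog {
    private final List<String> records = new ArrayList<>();
    // Next unread offset for each consumer group.
    private final Map<String, Integer> offsets = new HashMap<>();

    /** Producer side: append one record polled from the feed. */
    void publish(String record) {
        records.add(record);
    }

    /** Consumer side: return the records this group has not read yet, then advance its offset. */
    List<String> poll(String group) {
        int from = offsets.getOrDefault(group, 0);
        List<String> unread = new ArrayList<>(records.subList(from, records.size()));
        offsets.put(group, records.size());
        return unread;
    }

    public static void main(String[] args) {
        TopicLog topic = new TopicLog();
        // Made-up placeholder records, not real feed data.
        topic.publish("vehicle-1,position-a");
        topic.publish("vehicle-2,position-b");
        System.out.println(topic.poll("batch-layer")); // both records
        System.out.println(topic.poll("speed-layer")); // the same two records
        System.out.println(topic.poll("speed-layer")); // empty list: nothing new since last poll
    }
}
```

Because offsets are kept per group, a slow batch job does not hold back the speed layer, and either layer can be restarted without affecting what the other has read.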

The batch layer aggregates data over fixed windows such as one hour, one day, or one week. We used MapReduce to compute these aggregates. The speed layer processes real-time data: we used Apache Storm, which reads the live stream from Kafka and processes it to produce real-time analytics. The serving layer is our datastore for the aggregated data. We used HDFS, where the data would be stored as flat files.
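The windowed aggregation in the batch layer can be sketched as a key-then-count computation. This is a minimal illustration in plain Java, not the Hadoop MapReduce API and not the project's actual job; the `WindowAggregator` class, its method names, and the choice of counting events (rather than some other aggregate) are assumptions made for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of windowed aggregation; plain Java, not the Hadoop MapReduce API. */
final class WindowAggregator {
    static final long HOUR_SECONDS = 3_600L;
    static final long DAY_SECONDS = 86_400L;

    /** "Map" step: key an event by the start of the window that contains it. */
    static long windowStart(long epochSeconds, long windowSeconds) {
        return Math.floorDiv(epochSeconds, windowSeconds) * windowSeconds;
    }

    /** "Reduce" step: count the events that share a window key. */
    static Map<Long, Long> countPerWindow(List<Long> eventTimes, long windowSeconds) {
        Map<Long, Long> counts = new TreeMap<>();
        for (long t : eventTimes) {
            counts.merge(windowStart(t, windowSeconds), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up event timestamps in seconds, for illustration only.
        List<Long> times = List.of(0L, 10L, 3_599L, 3_600L, 7_300L);
        System.out.println(countPerWindow(times, HOUR_SECONDS)); // {0=3, 3600=1, 7200=1}
        System.out.println(countPerWindow(times, DAY_SECONDS));  // {0=5}
    }
}
```

Changing only the window length switches between hourly, daily, and weekly aggregates, which is why the same job shape can serve all the batch sizes mentioned above.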

Repository files

src/ucsc/cmps278/lambdaArch
target/maven-archiver
.gitignore
LICENSE
README.md
arch.png
control.sh
index.html
input.txt
pom.xml

VedangJoshi/Lambda-Architecture
