peterklipfel/firesuit


firesuit

For Amazon

Running the AWS script will spin up a couple of instances.

This will be scripted soon. To set up Storm after running the AWS script, use juju ssh to log in to the correct worker and fetch the topology that you want to run. In our test case, we used the zip file downloaded from https://github.com/peterklipfel/scala-storm-starter as our topology. Then, to create the deployable jar, we ran sbt assembly.

Then, to deploy the topology, we ran:

/opt/storm/storm-0.8.1/bin/storm jar /home/ubuntu/scala-storm-starter-master/target/scala-2.9.2/scala-storm-starter-assembly-0.0.2-SNAPSHOT.jar storm.starter.topology.ExclamationTopology ExclamationTopology
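
Until the deployment is scripted, the submission step can be sketched as a small helper. This is only an illustration: storm_jar_command is a hypothetical name, not part of firesuit, and the paths are copied from the command above. It assumes the usual storm jar argument order of jar path, main class, then topology arguments.

```python
import shlex


def storm_jar_command(storm_bin, jar_path, main_class, *topology_args):
    """Build the argument list for a `storm jar` submission (hypothetical helper)."""
    return [storm_bin, "jar", jar_path, main_class, *topology_args]


# Values copied from the test deployment described above.
cmd = storm_jar_command(
    "/opt/storm/storm-0.8.1/bin/storm",
    "/home/ubuntu/scala-storm-starter-master/target/scala-2.9.2/"
    "scala-storm-starter-assembly-0.0.2-SNAPSHOT.jar",
    "storm.starter.topology.ExclamationTopology",
    "ExclamationTopology",
)

# Print the command for review instead of running it.
print(shlex.join(cmd))

# On the worker itself, the same list could be handed to
# subprocess.run(cmd, check=True) to perform the actual submission.
```

Keeping the command as a list of arguments rather than one string means a different Storm version or topology class can be swapped in without re-quoting anything.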

A framework for standing up big data applications

This project is my undergraduate senior project. It aims to provide an opinionated boilerplate and framework for setting up data applications. I have worked with big data tools for a relatively short time, but I have noticed that people often repeat the same patterns. Nathan Marz had the right idea with a real-time complex event processing unit and a batch processing unit that picks up afterward. However, the ecosystem is growing rapidly. It would be nice to have a framework that lets people swap out functional layers without losing functionality, similar to the way web frameworks provide an opinionated way for people to organize the tools of their choice (JavaScript libraries, databases, etc.).

The first step is creating a deployment for the system using Ubuntu Juju.

After this, I would like to ensure that there is a way to keep track of the schema shared between the different components. If Hadoop and Storm both save into Cassandra, they must agree on the same schema.
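
One possible way to picture the shared schema is a single record definition that both writers validate against before saving. The sketch below is an assumption for illustration only: EventRecord, its fields, and to_row are hypothetical names, and no real Cassandra, Hadoop, or Storm API is involved.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class EventRecord:
    """Hypothetical row layout shared by the batch and real-time layers."""
    event_id: str
    source: str
    count: int


def to_row(record_cls, values):
    """Check a dict against the shared schema and return a typed record."""
    expected = {f.name: f.type for f in fields(record_cls)}
    if set(values) != set(expected):
        raise ValueError(
            f"columns {sorted(values)} do not match schema {sorted(expected)}"
        )
    for name, typ in expected.items():
        if not isinstance(values[name], typ):
            raise TypeError(f"{name} should be of type {typ.__name__}")
    return record_cls(**values)


# Both writers go through the same function, so neither can drift
# away from the schema without raising an error.
batch_row = to_row(EventRecord, {"event_id": "a1", "source": "hadoop", "count": 3})
stream_row = to_row(EventRecord, {"event_id": "a1", "source": "storm", "count": 1})
```

With one definition in one place, a change to the column set has to be made once, and any component still writing the old shape fails loudly instead of silently saving mismatched rows.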

The ultimate goal is to provide abstractions over these technologies that allow them to be swapped out.