FEATURES

Scribengin exists to provide an easily configurable, general solution to the problem of moving big data in a reliable, highly available way. It is built on top of proven technologies to deliver a flexible and scalable solution.

###Flexibility

- Support for many different kinds of data sources and sinks
- Sources and sinks can easily be extended to support new kinds of data sources and sinks
- Can move data from any source to any sink
- Custom processors: filter, transform, or copy data within the pipeline
- Out-of-the-box support for:
  - HDFS
  - Kafka
  - S3
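
To make the idea of custom processors concrete, here is a minimal, language-neutral sketch of a source-to-sink pipeline with filter and transform steps. It is not Scribengin's API: `run_pipeline`, `drop_empty`, `uppercase_payload`, and the record layout are all hypothetical names chosen for illustration.

```python
from typing import Callable, Iterable, Iterator, List, Optional

# Hypothetical record shape: a plain dict with "id" and "payload" keys.
Record = dict
# A processor returns a (possibly modified) record, or None to drop it.
Processor = Callable[[Record], Optional[Record]]


def drop_empty(record: Record) -> Optional[Record]:
    """Filter step: discard records that have an empty payload."""
    return record if record.get("payload") else None


def uppercase_payload(record: Record) -> Optional[Record]:
    """Transform step: return a copy of the record with an upper-cased payload."""
    return {**record, "payload": record["payload"].upper()}


def run_pipeline(source: Iterable[Record],
                 processors: List[Processor]) -> Iterator[Record]:
    """Pass each source record through every processor, in order."""
    for record in source:
        current: Optional[Record] = record
        for process in processors:
            current = process(current)
            if current is None:
                break  # a filter dropped the record; skip remaining steps
        if current is not None:
            yield current


source = [
    {"id": 1, "payload": "hello"},
    {"id": 2, "payload": ""},
    {"id": 3, "payload": "world"},
]
# A list stands in for the sink here.
sink = list(run_pipeline(source, [drop_empty, uppercase_payload]))
```

Because the pipeline only sees an iterable on one side and a consumer on the other, any source can in principle be paired with any sink, which is the property the list above describes.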

###Scalability

- Built with YARN
- Easily configure how many nodes work on any given dataflow
- Run multiple dataflows simultaneously
- Chain dataflows together
- Scalable, expandable, and highly available

###Reliability

- Data is guaranteed not to be lost, and duplication is kept to a minimum
- Nodes in the cluster that fail unexpectedly are replaced automatically
- Dataflows resume after an unexpected failure
- Dataflows can be paused, stopped, or resumed by an administrator
- Reliable Kafka writer: improvements to the default Kafka writer that help it overcome Kafka failures
- Configuration is stored in a central, highly available place for all nodes in the cluster (based on ZooKeeper)
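
The source does not describe how the reliable Kafka writer works internally. As a generic illustration only, one common way for a writer to ride out transient failures is to retry with exponential backoff. The sketch below assumes that approach; `write_with_retry`, `TransientWriteError`, and `FlakySink` are hypothetical names and do not belong to Scribengin or to any Kafka client library.

```python
import time


class TransientWriteError(Exception):
    """Hypothetical error raised when a write fails but may succeed later."""


def write_with_retry(send, record, max_attempts=5, base_delay=0.01,
                     sleep=time.sleep):
    """Call send(record), retrying with exponential backoff on transient errors."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send(record)
        except TransientWriteError:
            if attempt == max_attempts:
                raise  # give up and surface the failure to the caller
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))


class FlakySink:
    """Stand-in sink that fails a fixed number of times before accepting writes."""

    def __init__(self, failures):
        self.failures = failures
        self.written = []

    def send(self, record):
        if self.failures > 0:
            self.failures -= 1
            raise TransientWriteError("broker unavailable")
        self.written.append(record)
        return len(self.written)
```

A retry like this favors not losing data over never duplicating it: if a write succeeds but its acknowledgement is lost, the record may be sent again, so a real system typically pairs retries with some form of deduplication.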

###Visibility

- Graphical, real-time metrics monitoring using Kibana
- Logs stored in an expandable, highly available repository (Elasticsearch)

###Testability

- Can be deployed locally using Docker images to simulate a working cluster for testing or development
- Can be deployed in a single JVM for unit testing
