README updates #6

Open
wants to merge 1 commit into
base: main
32 changes: 30 additions & 2 deletions README.md
@@ -1,6 +1,21 @@
# Cast Iron
# <img src="images/cast-iron.jpg" alt="drawing" width="150" height="100"/> CAST-IRON

The platform-agnostic data storage and ETL system leveraging MinIO and other tools to enable mission end states.

CAST-IRON (Cloud Agnostic Storage Technology for Intelligent Record-enrichment and Open Navigation) is a novel, open source, generalizable big data platform to perform Extract, Transform, and Load (ETL) operations.

CAST-IRON is designed to provide:

* ingestion of a variety of data
* deployment in a myriad of usage contexts (e.g., at the edge, on bare metal, in the cloud)
* plug-and-play processing definitions that users can manage from a simple interface
* easy access to containerized services providing novel processing capabilities


CAST-IRON leverages:
* Object Storage ([AWS S3], [MinIO], etc.) for easy notification and data access
* Message Queue ([Kafka], [Nats], etc.) for rapid relay of data to components
* Worker ([Ray], [Faust], etc.) for scalable, distributed data processing
* ETL (Bash Scripts, Python) for rapid and flexible data transformation

This repository contains the docker-compose recipe for starting up the backbone of Cast-Iron as well as the Cast-Iron Worker.
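
The way the four components listed above fit together can be sketched in plain Python. This is only an illustration under stated assumptions: the `ObjectCreated` event shape, the `transform` step, and the `worker` loop are hypothetical stand-ins that use an in-memory queue, not the actual MinIO notification format, message queue client, or Cast-Iron Worker code.

```python
import queue
from dataclasses import dataclass


@dataclass
class ObjectCreated:
    """Stand-in for an object-storage notification (hypothetical shape)."""
    bucket: str
    key: str
    body: bytes


def transform(body: bytes) -> list[list[str]]:
    """Toy ETL step: split CSV-like bytes into rows of fields."""
    return [line.split(",") for line in body.decode().splitlines() if line]


def worker(events: "queue.Queue[ObjectCreated]") -> dict[str, list[list[str]]]:
    """Drain the queue and run the ETL step on each notified object."""
    results = {}
    while not events.empty():
        event = events.get()
        results[f"{event.bucket}/{event.key}"] = transform(event.body)
    return results


if __name__ == "__main__":
    # The in-memory queue plays the role of the message queue.
    events = queue.Queue()
    events.put(ObjectCreated("etl", "inbox/report.csv", b"id,name\n1,skillet\n"))
    print(worker(events))
```

In a real deployment the queue would be an external broker and the worker a long-running, distributed process; the sketch only shows the direction in which data flows.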

@@ -10,6 +25,7 @@ Main Parts of the system include:
* Worker ([Celery], [Faust], etc.)
* ETL (Bash Scripts, Python)

<img src="images/cast-iron-workflow.jpg" alt="drawing"/>

## Getting Started

@@ -29,6 +45,17 @@ Once started, the following areas are accessible:
* Port: 5432
* Database: castiron

## Ingest Some Files

1. Download some data to ingest from `cast-iron-recipe-postgres`

1. Go to http://localhost:9000 and drag and drop a .toml file into the appropriate directory. This configures how the system will handle files that are uploaded to various buckets and folders.

1. Drag and drop various types of mock data into the appropriate directory in MinIO. As each file is ingested, it will be moved between the directories defined by the processor and .toml file.

1. After a file has been ingested, you can navigate to the appropriate location defined in the .toml file or processor and find your data results.



[AWS S3]: https://aws.amazon.com/s3/
[Celery]: https://docs.celeryproject.org/en/stable/index.html
@@ -41,3 +68,4 @@ Once started, the following areas are accessible:
[PostgreSQL]: https://www.postgresql.org/
[SQLite]: https://www.sqlite.org/index.html
[MS SQL]: https://www.microsoft.com/en-us/sql-server
[Ray]: https://www.ray.io
Binary file added images/cast-iron-workflow.jpg
Binary file added images/cast-iron.jpg