Open source security data lake for threat hunting, detection & response, and cybersecurity analytics at petabyte scale on AWS
Updated Mar 1, 2024 - Rust
Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
Use SQL to build ELT pipelines on a data lakehouse.
Lakehouse storage system benchmark
Sample data lakehouse deployed in Docker containers using Apache Iceberg, MinIO, Trino, and a Hive Metastore. Can be used for local testing.
Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work
Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming and DMS
A sample implementation of stream writes to an Iceberg table on GCS using Flink and reading it using Trino
Streaming ETL job cases in AWS Glue to integrate Iceberg and create an in-place updatable data lake on Amazon S3
Hands-on workshop with Iceberg, Redpanda, Debezium and Kafka-Connect
Hands-on workshop with Apache Iceberg
This is a collection of AWS CDK projects that show how to ingest streaming data directly from Amazon Managed Streaming for Apache Kafka (MSK) and MSK Serverless into Apache Iceberg tables in S3 with AWS Glue Streaming.
Using Apache Flink to write to S3 in Apache Iceberg format
Automated setup of Apache Iceberg on Amazon S3 using Terraform and AWS Glue Data Catalog. Explore the power of a Lakehouse architecture for data management and analysis, featuring schema discovery, metadata management, and efficient querying with Amazon Athena.
Miscellaneous code and writings for MLOps
Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming using Amazon MSK Serverless and MSK Connect (Debezium)
Run an open-source data lakehouse locally using Docker Compose
React Components to visualize Apache Iceberg tables
Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming using Amazon MSK and MSK Connect (Debezium)