delta-lake
Here are 138 public repositories matching this topic...
End-to-end demo of Spark Streaming ingestion with Delta Lake on Qubole
Updated May 4, 2020 - Python
Example of how to use Kafka and Spark to handle streaming submissions of URLs.
Updated Oct 4, 2021 - Python
Data science Docker environment: Spark cluster with a JupyterLab interface
Updated Jan 18, 2022 - Jupyter Notebook
A transformation pipeline for Delta Lake using AWS SDK for Pandas
Updated Jul 12, 2023 - Python
Schema mappings in SQL and PySpark for ELT pipelines to normalize data to OCSF
Updated Jan 25, 2023 - Python
Distributed Systems - Principles and Paradigms
Updated Nov 2, 2023 - Jupyter Notebook
Spark Structured Streaming application transferring Avro data from Kafka with Schema Registry to Delta Lake
Updated May 1, 2020 - Scala
Updated May 15, 2020
Type annotations for delta-spark
Updated Nov 22, 2021 - Python
Data pipeline that processes Formula 1 data with Azure Databricks, Delta Lake, and Azure Data Factory
Updated Jul 14, 2023 - Python
Updated Jun 10, 2023 - Jupyter Notebook
Data Streaming with Debezium, Kafka, Spark Streaming, Delta Lake, and MinIO
Updated May 15, 2024 - Python
Example of a local PySpark setup, including Delta Lake, for unit testing
Updated May 21, 2024 - Python
🛸 This project showcases an Extract, Load, Transform (ELT) pipeline built with Python, Apache Spark, Delta Lake, and Docker. The objective of the project is to scrape UFO sighting data from NUFORC and process it through the Medallion architecture to create a star schema in the Gold layer that is ready for analysis.
Updated Jul 3, 2023 - Python
Shed light on your data layout in order to monitor the health of your Lakehouse tables and identify when data maintenance operations should be performed.
Updated Jul 31, 2023 - Scala