The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Updated May 15, 2024 - Python
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
Database replication platform that leverages change data capture. Stream production data from databases to your data warehouse (Snowflake, BigQuery, Redshift) in real-time.
🧙 Build, run, and manage data pipelines for integrating and transforming data.
Airbyte connectors (sources & destinations) + Airbyte CDK for JavaScript/TypeScript
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
Efficient data transformation and modeling framework that is backwards compatible with dbt.
PyAirbyte brings the power of Airbyte to every Python developer.
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
Singer tap for Canny. Built with the Meltano Singer SDK.
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
The open source high performance ELT framework powered by Apache Arrow
Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.
Rapidly Access, Process, Analyze & Visualize Your Data
One framework to develop, deploy and operate data workflows with Python and SQL.
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.
ReplicaDB is an open source tool for database replication, designed for efficiently transferring bulk data between relational and non-relational databases.
Flink CDC is a streaming data integration tool