⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
Updated May 23, 2024 - Python
Collect, aggregate, and visualize a data ecosystem's metadata
Main repo including core data model, data marts, reference data, terminology, and the clinical concept library
OpenMetadata is a unified platform for discovery, observability, and governance powered by a central metadata repository, in-depth lineage, and seamless team collaboration.
HiveMQ Edge is an MQTT gateway that enables interoperability between OT devices and IT systems. It translates diverse protocols into MQTT for streamlined communication and helps organize data into a unified namespace, making managing and streaming data across your infrastructure easier.
System Design, Solution Architecture, Data Systems Practice
Data policy IN, dynamic view OUT: PACE is the Policy As Code Engine. It helps you programmatically create and apply a data policy to a processing platform like Databricks, Snowflake or BigQuery (or plain ol' Postgres, even!) with definitions imported from Collibra, Datahub, ODD and the like.
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
Egeria core
SQL Lineage Analysis Tool powered by Python
Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow.
Data governance through AWS LakeFormation credentials vending API
Integration for collecting metadata from Great Expectations
On this site I share personal thoughts about data, data governance, data quality, metadata, and side projects.
ODD Specification is a universal open standard for collecting metadata.
Pebblo enables developers to safely load data and promote their Gen AI app to deployment
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
Snowflake infrastructure-as-code. Provision environments, automate deploys, CI/CD. Manage RBAC, users, roles, and data access. Declarative Python Resource API. Change Management tool for the Snowflake data warehouse.