snowflake-migration

This repo contains some of the utilities that were used for the Redshift to Snowflake migration, which is described in more detail in the blog post here.

Some internal Python libraries are not released as part of this repo. Their imports start with faire.internal.* and need to be replaced before the contents of this repo can be used.

Redshift to Snowflake parser

Utilities to translate the Redshift SQL dialect to Snowflake are located in parser/snowflake_parser.py, which contains an exhaustive list of the patterns that we encountered while migrating ETLs at Faire.
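As a rough illustration of the pattern-based approach, here is a minimal sketch that strips Redshift-only table attributes from DDL. The `RULES` list and the `translate` function are hypothetical names chosen for this example and are not taken from parser/snowflake_parser.py; the actual parser covers many more patterns.

```python
import re

# Hypothetical (pattern, replacement) rules for illustration only.
# Each one removes a Redshift-specific DDL attribute that Snowflake rejects.
RULES = [
    # Column compression encoding, e.g. "id INT ENCODE zstd"
    (re.compile(r"\s+ENCODE\s+\w+", re.IGNORECASE), ""),
    # Table distribution style, e.g. "DISTSTYLE KEY"
    (re.compile(r"\s*\bDISTSTYLE\s+\w+", re.IGNORECASE), ""),
    # Table-level distribution key, e.g. "DISTKEY(id)"
    (re.compile(r"\s*\bDISTKEY\s*\([^)]*\)", re.IGNORECASE), ""),
    # Table-level sort key, optionally COMPOUND or INTERLEAVED
    (re.compile(r"\s*\b(?:COMPOUND\s+|INTERLEAVED\s+)?SORTKEY\s*\([^)]*\)",
                re.IGNORECASE), ""),
]


def translate(sql: str) -> str:
    """Apply every rule in order and return the rewritten statement."""
    for pattern, replacement in RULES:
        sql = pattern.sub(replacement, sql)
    return sql


redshift_ddl = (
    "CREATE TABLE t (id INT ENCODE zstd) "
    "DISTSTYLE KEY DISTKEY(id) COMPOUND SORTKEY(id);"
)
print(translate(redshift_ddl))
```

Regex rules like these are easy to extend one pattern at a time, but they do not understand SQL structure, so each rule should be checked against real queries before it is trusted.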

Mode Reports migration

mode_utils contains the helper script, reverse-engineered with the help of Chrome DevTools, that converts and migrates Mode reports from Redshift to Snowflake using the Mode API.

Table Validation

table_validation contains the implementation of a YAML-based framework that we used to perform automated data parity checks between Redshift and Snowflake tables using Datafold.

The framework does the following:

  • Copy the Redshift table to S3
  • Create a watermarked Snowflake table from the dumped S3 file
  • Create a watermarked copy of the ETL Snowflake table
  • Run a data parity check between the two copied tables in Snowflake and save the result in a Mode table, including the diff URL generated by Datafold
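The steps above can be sketched as an ordered plan of SQL statements. This is only an illustration under stated assumptions: `ValidationSpec`, its fields, and `plan_steps` are hypothetical names, "watermarked" is assumed to mean filtering both sides to rows at or before a cutoff value, and the Datafold call is left as a placeholder because its interface is not described here.

```python
from dataclasses import dataclass


@dataclass
class ValidationSpec:
    """Hypothetical per-table config; field names are illustrative."""
    redshift_table: str
    snowflake_table: str
    s3_prefix: str         # S3 location that receives the Redshift dump
    stage: str             # Snowflake external stage pointing at s3_prefix
    iam_role: str          # role Redshift assumes for UNLOAD
    watermark_column: str
    watermark_value: str   # cutoff applied to both sides (assumption)


def plan_steps(spec: ValidationSpec) -> list[tuple[str, str]]:
    """Return (target, statement) pairs in execution order.

    Values are interpolated directly, so they must come from trusted config.
    """
    rs_copy = f"{spec.snowflake_table}_rs_copy"
    sf_copy = f"{spec.snowflake_table}_sf_copy"
    cutoff = f"{spec.watermark_column} <= '{spec.watermark_value}'"
    # UNLOAD takes the query as a string literal, so inner quotes are doubled.
    inner = f"SELECT * FROM {spec.redshift_table} WHERE {cutoff}".replace("'", "''")
    return [
        ("redshift",
         f"UNLOAD ('{inner}') TO '{spec.s3_prefix}' "
         f"IAM_ROLE '{spec.iam_role}' FORMAT AS PARQUET"),
        ("snowflake",
         f"CREATE OR REPLACE TABLE {rs_copy} LIKE {spec.snowflake_table}"),
        ("snowflake",
         f"COPY INTO {rs_copy} FROM @{spec.stage} "
         "FILE_FORMAT = (TYPE = PARQUET) "
         "MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE"),
        ("snowflake",
         f"CREATE OR REPLACE TABLE {sf_copy} AS "
         f"SELECT * FROM {spec.snowflake_table} WHERE {cutoff}"),
        # Placeholder: the parity check between the two copies runs here.
        ("diff", f"{rs_copy} vs {sf_copy}"),
    ]


spec = ValidationSpec(
    redshift_table="public.orders",
    snowflake_table="analytics.orders",
    s3_prefix="s3://example-bucket/validation/orders/",
    stage="validation_stage",
    iam_role="arn:aws:iam::123456789012:role/example-unload",
    watermark_column="updated_at",
    watermark_value="2021-01-01",
)
for target, statement in plan_steps(spec):
    print(f"[{target}] {statement}")
```

Applying the same cutoff on both sides keeps rows that arrive during the comparison from showing up as false differences.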