Faire/snowflake-migration

snowflake-migration

This repo contains some of the utilities used for the Redshift to Snowflake migration at Faire. They are described in more detail in the accompanying blog post.

Some internal Python libraries are not released as part of this repo. They are imported under faire.internal.* and need to be replaced before the code in this repo can be used.

Redshift to Snowflake parser

The utilities that translate the Redshift SQL dialect to Snowflake are located in parser/snowflake_parser.py, which contains an exhaustive list of the patterns that we encountered while migrating ETLs at Faire.
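As a rough illustration of pattern-based dialect translation, the sketch below applies an ordered list of regular-expression rules to a SQL string. The rules shown (dropping Redshift-only clauses such as DISTSTYLE, DISTKEY, SORTKEY and ENCODE) and the names `RULES` and `translate` are assumptions chosen for the example. They are not taken from parser/snowflake_parser.py, which may be structured differently.

```python
import re

# Hypothetical rule table: (compiled pattern, replacement).
# These rules are illustrative assumptions, not the repo's actual patterns.
RULES = [
    # Table-level distribution style, e.g. "DISTSTYLE KEY".
    (re.compile(r"\s*\bDISTSTYLE\s+\w+", re.IGNORECASE), ""),
    # Distribution key, e.g. "DISTKEY(id)".
    (re.compile(r"\s*\bDISTKEY\s*\([^)]*\)", re.IGNORECASE), ""),
    # Sort key, optionally prefixed with COMPOUND or INTERLEAVED.
    (
        re.compile(
            r"\s*\b(?:COMPOUND\s+|INTERLEAVED\s+)?SORTKEY\s*\([^)]*\)",
            re.IGNORECASE,
        ),
        "",
    ),
    # Column compression encoding, e.g. "ENCODE zstd".
    (re.compile(r"\s+ENCODE\s+\w+", re.IGNORECASE), ""),
]


def translate(sql: str) -> str:
    """Apply every rule in order and return the rewritten SQL."""
    for pattern, replacement in RULES:
        sql = pattern.sub(replacement, sql)
    return sql


redshift_ddl = (
    "CREATE TABLE t (id INT ENCODE zstd, name VARCHAR(10)) "
    "DISTSTYLE KEY DISTKEY(id) SORTKEY(id, name);"
)
print(translate(redshift_ddl))
```

Keeping the rules in a flat, ordered list makes it easy to append a new pattern each time a migrated ETL surfaces one. Plain regular expressions do not understand SQL structure, so a rule like this can also match inside string literals or comments.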

Mode Reports migration

mode_utils contains the helper script, reverse-engineered with the help of Chrome DevTools, that converts and migrates Mode reports from Redshift to Snowflake using the Mode API.

Table Validation

table_validation contains the implementation of a YAML-based framework that we used to perform automated data parity checks between Redshift and Snowflake tables using Datafold.

The framework does the following:

  • Copy Redshift table to S3
  • Create a watermarked Snowflake table from dumped S3 file
  • Create a watermarked copy of the ETL Snowflake table
  • Run a data parity check between the two copied tables in Snowflake and save the result in a Mode table, including the diff URL generated by Datafold
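The steps above can be sketched as a small plan builder that turns one table's configuration into an ordered list of operations. Everything in this sketch is hypothetical: the `TableCheck` fields, the step names, the `validation` schema, and the reading of "watermark" as a cutoff value that keeps both copies over the same rows. The repo's YAML schema and the Datafold and Mode integration are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableCheck:
    # All field names are hypothetical; the repo's YAML schema may differ.
    redshift_table: str   # source table in Redshift
    snowflake_table: str  # table produced by the migrated ETL in Snowflake
    s3_prefix: str        # S3 location that receives the Redshift dump
    watermark: str        # assumed cutoff so both copies cover the same rows


def build_plan(check: TableCheck) -> list[dict]:
    """Return the four validation steps in the order they would run."""
    name = check.snowflake_table.split(".")[-1]
    from_redshift = f"validation.{name}_redshift"    # hypothetical schema
    from_snowflake = f"validation.{name}_snowflake"  # hypothetical schema
    return [
        {"step": "unload_to_s3",
         "source": check.redshift_table, "target": check.s3_prefix},
        {"step": "load_from_s3",
         "source": check.s3_prefix, "target": from_redshift,
         "watermark": check.watermark},
        {"step": "copy_etl_table",
         "source": check.snowflake_table, "target": from_snowflake,
         "watermark": check.watermark},
        {"step": "diff", "left": from_redshift, "right": from_snowflake},
    ]


check = TableCheck(
    redshift_table="analytics.orders",
    snowflake_table="analytics.orders",
    s3_prefix="s3://example-bucket/orders/",
    watermark="2023-01-01",
)
for step in build_plan(check):
    print(step)
```

Separating the plan from its execution keeps each table's check declarative, which suits a configuration-driven framework: a runner can load many such entries and execute the same four steps for each.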

About

Repository containing various utils related to Snowflake migration at Faire.
