Redshift set VARCHAR(256) to VARCHAR(MAX) #2487

Open
eakmanrq opened this issue Apr 22, 2024 · 1 comment
Labels
Engine: Redshift (Issues related to Redshift) · Improvement (Improves existing functionality)

Comments

@eakmanrq
Contributor

If Redshift gets a VARCHAR column of uncertain precision, it defaults the precision to 256. For users who aren't explicitly defining column types or casting columns, it's annoying to have to remember to cast to VARCHAR(MAX) in order to override that default.

This change would have SQLMesh detect when a VARCHAR(256) column is being created and default it to VARCHAR(MAX) instead. https://docs.aws.amazon.com/redshift/latest/dg/r_Character_types.html#r_Character_types-varchar-or-character-varying
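
To make the intent concrete, here is a minimal sketch of the kind of rewrite described above, built on sqlglot (which SQLMesh uses under the hood). The function name, the traversal, and treating a lone 256 parameter as "unspecified precision" are assumptions for illustration, not SQLMesh's actual implementation:

```python
# Hypothetical sketch: widen Redshift's implicit VARCHAR(256) default to VARCHAR(MAX).
# Assumes sqlglot is available; the helper name and traversal are illustrative only.
import sqlglot
from sqlglot import exp


def widen_default_varchars(sql: str) -> str:
    tree = sqlglot.parse_one(sql, read="redshift")
    for dt in tree.find_all(exp.DataType):
        if dt.this == exp.DataType.Type.VARCHAR:
            params = dt.expressions
            # Redshift's implicit default precision is 256, so treat a bare
            # VARCHAR(256) as "precision not really specified" and widen it.
            if len(params) == 1 and params[0].sql() == "256":
                dt.replace(exp.DataType.build("VARCHAR(MAX)", dialect="redshift"))
    return tree.sql(dialect="redshift")


print(widen_default_varchars("CREATE TABLE t (c VARCHAR(256), d VARCHAR(10))"))
# Expected (roughly): CREATE TABLE t (c VARCHAR(MAX), d VARCHAR(10))
```

One caveat: this would also widen columns where a user intentionally wrote VARCHAR(256), since after parsing that case is indistinguishable from the implicit default.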

To complete this task, the Schema Differ needs to be updated to detect precision changes between VARCHAR columns and issue an ALTER when the precision is increased.
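
As a rough illustration of what that Schema Differ check could produce, here is a sketch. The function and constant names are hypothetical; it relies on the fact that Redshift can widen a VARCHAR in place with ALTER COLUMN ... TYPE, while shrinking one requires rebuilding the column:

```python
# Hypothetical sketch of the precision check described above; names and the
# integer-based comparison are assumptions, not SQLMesh's actual schema differ.
from typing import Optional

REDSHIFT_VARCHAR_MAX = 65535  # VARCHAR(MAX) is equivalent to VARCHAR(65535)


def varchar_precision_alter(table: str, column: str, old: int, new: int) -> Optional[str]:
    # Redshift can increase a VARCHAR's length in place; a decrease would
    # require rebuilding the column, so only widening produces an ALTER here.
    if new > old:
        size = "MAX" if new >= REDSHIFT_VARCHAR_MAX else str(new)
        return f"ALTER TABLE {table} ALTER COLUMN {column} TYPE VARCHAR({size})"
    return None


print(varchar_precision_alter("my_schema.my_table", "payload", 256, REDSHIFT_VARCHAR_MAX))
# ALTER TABLE my_schema.my_table ALTER COLUMN payload TYPE VARCHAR(MAX)
```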

eakmanrq added the Improvement and Engine: Redshift labels on Apr 22, 2024
@jmarch

jmarch commented May 23, 2024

In my experience, this might be a dangerous path to take...

When I was using Redshift (years ago), there was a hard limit on the maximum row width in the query processing engine. For wide result sets with many VARCHAR columns, it was fairly easy to reach that limit. I also found a significant cost in I/O and memory use when using unbounded VARCHAR lengths. Some queries would take "forever" but would run just fine when typed more strictly.

So before I'd presume you can MAX all VARCHARs by default, I'd suggest running some basic tests on a Redshift instance to see how many columns you can use before hitting these limits.

All this said, I tried to find the limit in their docs just now, but was unable to locate it. Perhaps they've improved the engine since I last had these issues.
