Support AWS DMS Partitions in Method store_parquet_metadata #2303

Open
FleischerT opened this issue May 30, 2023 · 2 comments
Labels
enhancement New feature or request needs-triage

Comments

@FleischerT

Is your idea related to a problem? Please describe.
I am trying to create Glue tables for data in S3 that AWS DMS writes as partitioned Parquet files. The problem is that AWS DMS writes the partition folders in the format "2023/05/01/" rather than in Hive style, such as "year=2023/month=05/day=01/".
When I try to create the Glue tables with the Wrangler method "store_parquet_metadata", the partitions are not recognized, because the internal method "_extract_partitions_metadata_from_paths" filters for "=" in the path.
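
To illustrate the behavior described above, here is a minimal sketch of "="-based partition extraction. The helper `hive_partitions_from_path` and the example S3 paths are hypothetical and written only for this illustration; this is not the library's actual implementation of "_extract_partitions_metadata_from_paths".

```python
from typing import Dict


def hive_partitions_from_path(path: str) -> Dict[str, str]:
    """Collect key=value directory segments (hypothetical helper, not library code)."""
    partitions: Dict[str, str] = {}
    # Drop the last segment (the file name); only directories can be partitions.
    for segment in path.rstrip("/").split("/")[:-1]:
        # Segments without "=" are ignored, which is why DMS-style
        # date folders such as 2023/05/01 yield no partitions.
        if "=" in segment:
            key, value = segment.split("=", 1)
            partitions[key] = value
    return partitions


# Assumed example paths, not taken from a real bucket.
hive_path = "s3://bucket/table/year=2023/month=05/day=01/part-0.parquet"
dms_path = "s3://bucket/table/2023/05/01/part-0.parquet"

hive_result = hive_partitions_from_path(hive_path)
dms_result = hive_partitions_from_path(dms_path)
```

With the Hive-style path the helper finds the keys year, month, and day; with the DMS-style path it finds nothing, which matches the symptom reported here.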

Describe the solution you'd like
Currently, only Hive-compliant partitioning seems to be supported. It would be better if the partition keys could be passed when calling the method.
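
A rough sketch of what passing the partition keys could look like: the caller supplies the key names, and they are matched by position against the folder segments below the table root. The function `positional_partitions_from_path` and its `root` and `partition_keys` parameters are hypothetical names for this illustration and are not part of the existing API.

```python
from typing import Dict, List


def positional_partitions_from_path(
    path: str, root: str, partition_keys: List[str]
) -> Dict[str, str]:
    """Map folder segments below `root` to caller-supplied keys, by position.

    Hypothetical helper for illustration only.
    """
    prefix = root.rstrip("/") + "/"
    if not path.startswith(prefix):
        raise ValueError(f"{path!r} is not located under {root!r}")
    # Everything between the table root and the file name is a partition folder.
    segments = path[len(prefix):].split("/")[:-1]
    if len(segments) != len(partition_keys):
        raise ValueError(
            f"expected {len(partition_keys)} partition folders, found {len(segments)}"
        )
    return dict(zip(partition_keys, segments))


# Assumed example path in the DMS date-folder layout.
dms_partitions = positional_partitions_from_path(
    "s3://bucket/table/2023/05/01/part-0.parquet",
    root="s3://bucket/table",
    partition_keys=["year", "month", "day"],
)
```

The sketch raises an error when the number of folders does not match the number of keys, so a mismatched layout is reported rather than silently producing partial partitions.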

@FleischerT FleischerT added the enhancement New feature or request label May 30, 2023
@kukushking
Contributor

Hi @FleischerT, that's correct: we currently only support Hive-style partitions. We'll discuss with the team and get back to you.

@webysther
Contributor

Why not support something like DatePartitionHive in S3Settings?
