[FEAT] Save out SplinkDataFrame metadata #2054

Open
ADBond opened this issue Mar 13, 2024 · 0 comments
Labels: enhancement (New feature or request)

Comments

ADBond (Contributor) commented Mar 13, 2024

As mentioned in #1971, the parquet format supports arbitrary key-value metadata. As SplinkDataFrames now support such metadata (in particular, it is used for storing table-creation thresholds), it would be nice if this could be written to and read from parquet.
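
For reference, a minimal sketch of the underlying parquet feature via pyarrow (this is not Splink's implementation; the table, metadata key and threshold value are made up):

```python
import json

import pyarrow as pa
import pyarrow.parquet as pq

table = pa.table({"unique_id": [1, 2], "first_name": ["joe", "jane"]})

# Parquet key-value metadata maps bytes -> bytes, so richer values need serialising.
splink_metadata = {b"splink_metadata": json.dumps({"threshold": 0.95}).encode()}

# Merge with any schema metadata already present so nothing is dropped.
existing = table.schema.metadata or {}
table = table.replace_schema_metadata({**existing, **splink_metadata})
pq.write_table(table, "df_predict.parquet")

# Reading it back: the key-value pairs live on the schema metadata.
read_back = pq.read_table("df_predict.parquet")
metadata = json.loads(read_back.schema.metadata[b"splink_metadata"])
print(metadata)  # {'threshold': 0.95}
```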

Backend notes:

  • Supported in duckdb, though I think currently (0.10.0) only via a literal struct in SQL (which would therefore need to be carefully constructed), rather than via e.g. a subquery; see the sketch after this list
  • Doesn't appear to be directly supported in spark; we could possibly go via pyarrow
  • athena uses arrow under the hood, so should be okay
  • For postgres/sqlite we don't currently have a to_parquet(), but we could look into implementing one
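
A minimal sketch of the duckdb route, assuming duckdb >= 0.10.0 (for the KV_METADATA copy option and the parquet_kv_metadata function). The table, metadata key and threshold value are illustrative only:

```python
import json

import duckdb

con = duckdb.connect()
con.execute(
    "CREATE TABLE df_predict AS SELECT 1 AS unique_id, 0.95 AS match_probability"
)

# KV_METADATA only takes a literal struct, so the metadata has to be spliced
# into the SQL string; real code would need to escape quotes in the values.
metadata = {"splink_metadata": json.dumps({"threshold": 0.95})}
kv_literal = ", ".join(f"'{key}': '{value}'" for key, value in metadata.items())

con.execute(
    f"COPY df_predict TO 'df_predict.parquet' "
    f"(FORMAT PARQUET, KV_METADATA {{{kv_literal}}})"
)

# duckdb can also read the key-value metadata back out:
print(
    con.execute(
        "SELECT key, value FROM parquet_kv_metadata('df_predict.parquet')"
    ).fetchall()
)
```
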
ADBond added the enhancement (New feature or request) label on Mar 13, 2024