Partitioned CSV writes with gzip do not add .csv.gz extension and fail to read back in with read_csv_auto #11889
What happens?
Using
COPY (FROM ...) TO 'table.csv.d' (FORMAT 'csv', COMPRESSION 'gzip', PARTITION_BY (col1, col2));
generates individual CSV files that are compressed but don't have the .gz
extension. This causes issues with downstream tools that rely on filename inspection to determine the compression type.
To Reproduce
Example:
Output:
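The mismatch can also be checked independently of DuckDB by scanning the output directory for files whose contents start with the gzip magic bytes (0x1f 0x8b) but whose names lack a .gz suffix. The following is a minimal Python sketch of that check, not part of the original report: the helper names `find_unlabelled_gzip` and `add_gz_suffix` are hypothetical, and `table.csv.d` is assumed to be the output path from the COPY statement above. The rename helper is only a possible downstream workaround, not a fix for the underlying behaviour.

```python
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def find_unlabelled_gzip(root):
    """Return files under `root` that hold gzip data but lack a .gz suffix."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and files that are already labelled as gzip.
        if not path.is_file() or path.suffix == ".gz":
            continue
        with path.open("rb") as fh:
            if fh.read(2) == GZIP_MAGIC:
                hits.append(path)
    return hits


def add_gz_suffix(paths):
    """Rename each path by appending .gz and return the new paths."""
    renamed = []
    for path in paths:
        target = path.with_name(path.name + ".gz")
        path.rename(target)
        renamed.append(target)
    return renamed


if __name__ == "__main__":
    # Assumed output directory from the COPY statement in this report.
    for p in find_unlabelled_gzip("table.csv.d"):
        print(p)
```

Running the script against the partitioned output directory lists every compressed file that is missing the extension; passing that list to `add_gz_suffix` renames the files in place so that extension-based tools can detect the compression.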
OS:
x64
DuckDB Version:
0.10.2
DuckDB Client:
CLI
Full Name:
Teague Sterling
Affiliation:
23andMe
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with a source build
Did you include all relevant data sets for reproducing the issue?
Not applicable - the reproduction does not require a data set
Did you include all code required to reproduce the issue?
Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?