
BUG: failure if manually specifying engine="pyarrow" in to_parquet #214

jorisvandenbossche opened this issue Aug 12, 2022 · 1 comment


@jorisvandenbossche
Member

I just noticed that when the argument engine="pyarrow" is explicitly passed to to_parquet(), the write still fails with the same error:

import pandas as pd
import geopandas as gpd
import dask_geopandas as dgpd

dft = pd.util.testing.makeDataFrame()
dft["geometry"] = gpd.points_from_xy(dft.A, dft.B)
df = gpd.GeoDataFrame(dft)
df = dgpd.from_geopandas(df, npartitions=1)
df.to_parquet("mydf.parquet", engine="pyarrow")

Originally posted by @FlorisCalkoen in #198 (comment)

@jorisvandenbossche
Member Author

Ah, that is "expected": passing engine="pyarrow" selects dask's built-in "pyarrow" engine, whereas dask-geopandas extends that engine so it handles the geometry dtype properly.

But of course, we should prevent people from accidentally passing engine="pyarrow" and thus silently overriding our own engine. Seems we need something more elaborate than the simple partial to do that:

from functools import partial
import dask.dataframe as dd

# GeoArrowEngine is dask-geopandas' geometry-aware parquet engine
to_parquet = partial(dd.to_parquet, engine=GeoArrowEngine)
to_parquet.__doc__ = dd.to_parquet.__doc__
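One possible shape for that "something more elaborate" is a wrapper that validates the engine keyword instead of letting a caller-supplied value shadow the default. This is only a sketch of the idea, not the actual dask-geopandas fix: make_to_parquet and fake_dd_to_parquet are hypothetical names, and the stub merely stands in for dask.dataframe.to_parquet so the dispatch behavior can be shown without a dask installation.

```python
import functools


def make_to_parquet(base_to_parquet, forced_engine):
    """Wrap a to_parquet function so that a caller-supplied engine=
    argument cannot silently replace the geometry-aware engine."""

    @functools.wraps(base_to_parquet)
    def to_parquet(df, path, *args, engine="auto", **kwargs):
        # Reject anything other than the default or the forced engine,
        # instead of silently writing without geometry support.
        if engine not in ("auto", forced_engine):
            raise ValueError(
                f"engine={engine!r} is not supported; the geometry-aware "
                f"engine is always used when writing GeoDataFrames"
            )
        return base_to_parquet(df, path, *args, engine=forced_engine, **kwargs)

    return to_parquet


# Stub standing in for dask.dataframe.to_parquet, used only to
# demonstrate the dispatch behavior; it returns the engine it received.
def fake_dd_to_parquet(df, path, engine=None):
    return engine


to_parquet = make_to_parquet(fake_dd_to_parquet, "GeoArrowEngine")
print(to_parquet(None, "mydf.parquet"))  # -> GeoArrowEngine
```

With this shape, to_parquet(df, path, engine="pyarrow") raises a ValueError rather than quietly dropping the geometry handling; raising loudly is arguably better than silently substituting the correct engine, since it tells the user their keyword had no effect.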
