
Version v0.8.0

@github-actions github-actions released this 06 May 22:08
· 1 commit to main since this release
46c35a7

Improvements

  • Add support for writing with Arrow as the mechanism for transferring data
    from Python to GDAL (requires GDAL >= 3.8). This is provided through the
    new pyogrio.raw.write_arrow function, or by using the use_arrow=True
    option in pyogrio.write_dataframe (#314, #346).
  • Add support for fids filter to read_arrow and open_arrow, and to
    read_dataframe with use_arrow=True (#304).
  • Add some missing properties to read_info, including layer name, geometry name
    and FID column name (#365).
  • read_arrow and open_arrow now provide
    GeoArrow-compliant extension metadata,
    including the CRS, when using GDAL 3.8 or higher (#366).
  • The open_arrow function can now be used without a pyarrow dependency. By
    default, it will now return a stream object implementing the
    Arrow PyCapsule Protocol
    (i.e. having an __arrow_c_stream__ method). This object can then be consumed
    by your Arrow implementation of choice that supports this protocol. To keep
    the previous behaviour of returning a pyarrow.RecordBatchReader, specify
    use_pyarrow=True (#349).
  • Warn when reading from a multilayer file without specifying a layer (#362).
  • Allow writing to a new in-memory datasource using an io.BytesIO object (#397).
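
The Arrow-related items above can be sketched roughly as follows. This is an illustrative sketch, not the project's own example: the helper names is_arrow_stream, count_rows and copy_layer are hypothetical, and consuming the stream through pyarrow.RecordBatchReader.from_stream assumes pyarrow >= 15. Any other Arrow implementation that supports the PyCapsule protocol could be used instead.

```python
def is_arrow_stream(obj):
    """Return True if obj exposes the Arrow PyCapsule stream method."""
    return hasattr(obj, "__arrow_c_stream__")


def count_rows(path):
    """Count features by streaming record batches from open_arrow."""
    # Lazy imports: pyarrow is only needed to consume the stream, not to open it.
    import pyarrow as pa
    from pyogrio.raw import open_arrow

    # By default (use_pyarrow=False) the second item is a PyCapsule stream object.
    with open_arrow(path) as (meta, stream):
        if not is_arrow_stream(stream):
            raise TypeError("expected an Arrow PyCapsule stream")
        # Assumption: pyarrow >= 15, which provides RecordBatchReader.from_stream.
        reader = pa.RecordBatchReader.from_stream(stream)
        # Batches must be consumed while the data source is still open.
        return sum(batch.num_rows for batch in reader)


def copy_layer(src_path, dst_path):
    """Read a layer and write it back out, using Arrow in both directions."""
    import pyogrio

    # use_arrow=True on write requires GDAL >= 3.8 and pyarrow.
    df = pyogrio.read_dataframe(src_path, use_arrow=True)
    pyogrio.write_dataframe(df, dst_path, use_arrow=True)
```

Passing use_pyarrow=True to open_arrow would instead return a pyarrow.RecordBatchReader directly, which makes the from_stream step unnecessary.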

Bug fixes

  • Fix error in write_dataframe if input has a date column and
    non-consecutive index values (#325).
  • Fix encoding issues on Windows for some formats (e.g. ".csv") and always write ESRI
    Shapefiles using UTF-8 by default on all platforms (#361).
  • Raise an exception in read_arrow or read_dataframe(..., use_arrow=True) if
    a boolean column is detected, due to an error in GDAL when reading boolean
    values with the FlatGeobuf / GPKG drivers (#335, #387); this has been fixed
    in GDAL >= 3.8.3.
  • Properly ignore fields not listed in the columns parameter when reading from
    the data source without the Arrow API (#391).
  • Properly handle decoding of ESRI Shapefiles with user-provided encoding
    option for read, read_dataframe, and open_arrow, and correctly encode
    Shapefile field names and text values to the user-provided encoding for
    write and write_dataframe (#384).
  • Fix bug preventing reading from bytes or file-like objects in read_arrow /
    open_arrow (#407).
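
One way to work around the boolean-column exception noted above is to enable the Arrow read path only when the linked GDAL is new enough. The sketch below assumes that pyogrio.__gdal_version__ is a version tuple such as (3, 8, 5); the helper names arrow_bool_read_is_safe and read_with_bools are hypothetical.

```python
def arrow_bool_read_is_safe(gdal_version):
    """Return True if the GDAL version tuple is at least 3.8.3."""
    return tuple(gdal_version[:3]) >= (3, 8, 3)


def read_with_bools(path):
    """Read a layer that may contain boolean columns (e.g. GPKG, FlatGeobuf)."""
    import pyogrio  # imported lazily so the helper above works without it

    # Fall back to the non-Arrow reader on GDAL versions older than 3.8.3.
    use_arrow = arrow_bool_read_is_safe(pyogrio.__gdal_version__)
    return pyogrio.read_dataframe(path, use_arrow=use_arrow)
```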

Packaging

  • The GDAL library included in the wheels is updated from GDAL 3.7.2 to 3.8.5.

Potentially breaking changes

  • Using a where expression combined with a list of columns that does not include
    the column referenced in the expression is not recommended. It will now
    return results based on driver-dependent behavior, which may include either
    returning empty results (even if non-empty results are expected from the where
    parameter) or raising an exception (#391). Previous versions of pyogrio
    incorrectly set ignored fields against the data source, allowing it to return
    non-empty results in these cases.
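
A defensive pattern for the change above is to make sure every column referenced in the where expression is also requested, and to drop the extras after reading. This is only a sketch: columns_for_query and read_filtered are hypothetical helpers, and the caller is assumed to supply the names used in the expression, since nothing here parses the where string.

```python
def columns_for_query(columns, where_columns):
    """Return columns plus any where-referenced names missing from it, in order."""
    merged = list(columns)
    for name in where_columns:
        if name not in merged:
            merged.append(name)
    return merged


def read_filtered(path, columns, where, where_columns):
    """Read with a where filter, always requesting the columns it references."""
    import pyogrio  # imported lazily so the helper above works without it

    requested = columns_for_query(columns, where_columns)
    df = pyogrio.read_dataframe(path, columns=requested, where=where)
    # Drop columns that were requested only to satisfy the filter.
    extra = [name for name in requested if name not in columns]
    return df.drop(columns=extra)
```

For example, read_filtered(path, ["name"], "pop > 1000", ["pop"]) would request both name and pop from the data source and return only name plus the geometry column.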