Opening duration of large files #3136
-
The fastest option would be to use gpd.read_file(path, engine="pyogrio", use_arrow=True). Another option is to try reading the file in parallel with … Note that the comparison with raster is not very relevant, as 1) you may be lazily reading it, not copying it to memory, and 2) mapping data from a raster to memory is straightforward, while deserialisation of geometries and conversion to shapely objects takes time.
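For a concrete illustration of those two suggestions (a minimal sketch: the file path and partition count are placeholders, and dask-geopandas is only one possible way to read in parallel, not necessarily the approach linked above):

```python
import geopandas as gpd

# Fastest single-process option: the pyogrio engine with Arrow-accelerated
# deserialisation (requires the pyogrio and pyarrow packages).
gdf = gpd.read_file("data/large_layer.shp", engine="pyogrio", use_arrow=True)

# Hedged sketch of reading in parallel with dask-geopandas (assumption: one
# common approach; the partition count is a placeholder to tune per machine).
import dask_geopandas

dgdf = dask_geopandas.read_file("data/large_layer.shp", npartitions=8)
gdf_parallel = dgdf.compute()  # materialise as a regular GeoDataFrame
```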
-
In addition to what Martin wrote: Shapefile is an old file format with quite a few disadvantages, and regarding performance it isn't the most optimised format either. So if wanted/needed, switching to another format is a possible additional source of improvement. E.g. reading a GeoPackage file is in general already faster, but with … For illustration, a comparison between reading a file with 3.2 million polygons: …
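As a rough sketch of how such a comparison could be run locally (file names are placeholders, and this is not the benchmark referenced above), one could convert the Shapefile to a GeoPackage once and then time fresh reads of both files:

```python
import time

import geopandas as gpd

# One-off conversion: read the Shapefile once and write it out as a GeoPackage
# (file names are placeholders).
gdf = gpd.read_file("data/large_layer.shp", engine="pyogrio", use_arrow=True)
gdf.to_file("data/large_layer.gpkg", layer="large_layer", driver="GPKG")

# Time a read of each format with the same reader settings.
for path in ["data/large_layer.shp", "data/large_layer.gpkg"]:
    start = time.perf_counter()
    gpd.read_file(path, engine="pyogrio", use_arrow=True)
    print(f"{path}: {time.perf_counter() - start:.1f} s")
```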
-
Hello,
I am working with rasters and vector datasets with sizes well beyond 1 GB (for many, probably still not very big). However, while rasters take at most a few seconds to open with rasterio, vector datasets (shapefiles) may take up to hours to open using GeoPandas.
The vector datasets may contain dozens of columns and geometries with many vertices.
Is there a way to accelerate this process?