Opening same GPKG file/layer for write => mysterious bug #700

Open
culebron opened this issue Dec 19, 2018 · 6 comments
@culebron

culebron commented Dec 19, 2018

I wrote a class to stream dataframes into a GeoPackage, and by mistake it opened the file for writing twice. It would successfully write the data, but would crash when the program was terminating.

ERROR 1: sqlite3_exec(CREATE TRIGGER "trigger_insert_feature_count_sample" AFTER INSERT ON "sample" BEGIN UPDATE gpkg_ogr_contents SET feature_count = feature_count + 1 WHERE lower(table_name) = lower('sample'); END;) failed: trigger "trigger_insert_feature_count_sample" already exists
ERROR 1: sqlite3_exec(CREATE TRIGGER "trigger_delete_feature_count_sample" AFTER DELETE ON "sample" BEGIN UPDATE gpkg_ogr_contents SET feature_count = feature_count - 1 WHERE lower(table_name) = lower('sample'); END;) failed: trigger "trigger_delete_feature_count_sample" already exists
ERROR 1: Spatial index already existing
Traceback (most recent call last):
  File "fiona/_err.pyx", line 201, in fiona._err.GDALErrCtxManager.__exit__
fiona._err.CPLE_AppDefinedError: b'Spatial index already existing'
Exception ignored in: 'fiona._shim.gdal_flush_cache'
Traceback (most recent call last):
  File "fiona/_err.pyx", line 201, in fiona._err.GDALErrCtxManager.__exit__

At first I thought it had something to do with threads created in the background. It turned out that a condition was checked the wrong way, so fiona.open ran again on every chunk.

Here's code that reproduces this.

import fiona
from collections import OrderedDict

schema = {'geometry': 'Point', 'properties': OrderedDict()}
_handler = fiona.open('sample.gpkg', 'w', driver='GPKG', schema=schema)
row = {'geometry': {'type': 'Point', 'coordinates': (2, 2)}, 'properties': {}}
data = [row, row]

_handler.writerecords(data)

# Bug trigger: the same file is opened in 'w' mode again while the first handle is still open.
_handler = fiona.open('sample.gpkg', 'w', driver='GPKG', schema=schema)
_handler.writerecords(data)

_handler.close()

My suggestion is to check whether the layer already exists (or is already open for writing, if that can be detected), and at least show a warning. Without one, I suspected the cause was data not being flushed, or the file being closed too quickly.

@drnextgis
Contributor

Can you please attach an example of a tiny *.gpkg file to reproduce this issue?

@culebron
Author

You don't need one, it's created right there in the code.

@sgillies sgillies added the bug label Dec 21, 2018
@sgillies sgillies self-assigned this Dec 21, 2018
@sgillies
Member

@culebron this is an interesting issue. The second call to fiona.open('sample.gpkg', 'w', driver='GPKG', schema=schema) should delete the layer created in the first. We're fortunate that this only results in a Python exception and doesn't crash the Python process itself.

I encourage you to write to datasets only within the context of a with fiona.open() block. This will guard you against issues like the one you've reported, which I'm not sure how to solve at the moment.

@culebron
Author

Yep. In my case a with block is not applicable, because open and close happen in different functions. So the options are:

  • use contextlib.ExitStack
  • open/close in append mode
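A rough sketch of the first option, assuming the handle is a context manager (as the with fiona.open() advice implies). `ChunkWriter` and its `opener` argument are hypothetical names for illustration, not Fiona API.

```python
from contextlib import ExitStack


class ChunkWriter:
    """Open the output lazily, exactly once, and close it from another method."""

    def __init__(self, opener):
        # `opener` is an assumption: a zero-argument callable that returns
        # a context manager exposing writerecords().
        self._opener = opener
        self._stack = ExitStack()
        self._handle = None

    def write(self, records):
        # Open only on the first chunk; later chunks reuse the same handle.
        if self._handle is None:
            self._handle = self._stack.enter_context(self._opener())
        self._handle.writerecords(records)

    def close(self):
        # Runs __exit__ for everything entered on the stack, then resets.
        self._stack.close()
        self._handle = None
```

For the script above, the opener could be `lambda: fiona.open('sample.gpkg', 'w', driver='GPKG', schema=schema)`. The stack gives the same cleanup a with block would, while letting write and close live in different functions.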

@culebron
Author

culebron commented Nov 2, 2019

Just googled this bug myself another time while writing multiprocessing code. LOL.

@sgillies
Member

sgillies commented Nov 2, 2019

@culebron this remains a complicated issue. If Fiona datasets don't call GDALClose when they are deallocated, data will not be written to disk and/or memory will leak in some cases. And this is what happens in your script: GDALClose is called twice for the same file, and the second call comes very late and in an unexpected way. Ideally, the GPKG driver should lock if it's not able to gracefully handle a double close, don't you think? I wonder if the issue isn't better solved in GDAL/OGR... maybe one of us should ask on gdal-dev to see what Even's perspective is.
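As a rough illustration of how two handles for the same file can coexist, here is a sketch with a stand-in class. `FakeDataset` is hypothetical and only mimics "close on deallocation". It shows CPython's reference-counting behavior when a name is rebound, and says nothing about when Fiona or GDAL actually release the first handle.

```python
events = []


class FakeDataset:
    """Stand-in for a dataset handle that closes itself when deallocated."""

    def __init__(self, label):
        self.label = label
        self.closed = False
        events.append(f"open {label}")

    def close(self):
        # Idempotent close, so an explicit close plus __del__ logs only once.
        if not self.closed:
            self.closed = True
            events.append(f"close {self.label}")

    def __del__(self):
        self.close()


def run():
    handler = FakeDataset("first")
    # The right-hand side runs before the name is rebound, so "second"
    # is opened while "first" is still alive.
    handler = FakeDataset("second")
    handler.close()


run()
print(events)
```

On CPython the recorded order is open first, open second, close first, close second: the first handle is released only after the second one has been opened.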

@sgillies sgillies added this to the 2.0 milestone Jul 4, 2020