TypeError: self._hds cannot be converted to a Python object for pickling #3

Open
arkanoid87 opened this issue Feb 1, 2019 · 3 comments

@arkanoid87

It seems that rasterio's _hds object is no longer serializable:

distributed.protocol.pickle - INFO - Failed to serialize ("('filled-2f9fe0560be0502eda038fa941309294', 0, 0)", <dask_rasterio.write.RasterioDataset object at 0x7f8f9deac828>, (slice(0, 748, None), slice(0, 22415, None)), <unlocked _thread.lock object at 0x7f8f9cb2af58>, False). Exception: self._hds cannot be converted to a Python object for pickling
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/miniconda3/envs/jupyter/lib/python3.6/site-packages/distributed/protocol/pickle.py in dumps(x)
     37     try:
---> 38         result = pickle.dumps(x, protocol=pickle.HIGHEST_PROTOCOL)
     39         if len(result) < 1000:

~/miniconda3/envs/jupyter/lib/python3.6/site-packages/rasterio/_io.cpython-36m-x86_64-linux-gnu.so in rasterio._io.DatasetWriterBase.__reduce_cython__()

TypeError: self._hds cannot be converted to a Python object for pickling
@sgillies

Rasterio datasets can't be pickled and can't be shared between processes or threads. The workaround is to distribute dataset identifiers (paths or URIs) and then open the datasets in the new threads. See rasterio/rasterio#1731.
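
A minimal sketch of that workaround, under stated assumptions: `ReopenableDataset` and `roundtrip` are hypothetical names, not part of rasterio or dask_rasterio. The class pickles only the path and open options, and reopens the dataset on first use in whichever worker receives it. The opener is injectable so the sketch runs without rasterio; passing `rasterio.open` as the opener is the assumed real-world use.

```python
import pickle


class ReopenableDataset:
    """Hypothetical picklable handle: stores how to open a dataset, not the open dataset."""

    def __init__(self, opener, path, **open_kwargs):
        self.opener = opener            # assumed to be rasterio.open in real use
        self.path = str(path)           # the identifier that actually gets distributed
        self.open_kwargs = open_kwargs
        self._handle = None             # opened lazily, once per copy

    def get(self):
        # Open on first use in whichever process or thread holds this copy.
        if self._handle is None:
            self._handle = self.opener(self.path, **self.open_kwargs)
        return self._handle

    def close(self):
        if self._handle is not None:
            self._handle.close()
            self._handle = None

    def __getstate__(self):
        # Drop the live handle so only the path and options are pickled.
        state = self.__dict__.copy()
        state["_handle"] = None
        return state


def roundtrip(handle):
    """Simulate shipping the handle to another worker via pickle."""
    return pickle.loads(pickle.dumps(handle))
```

With rasterio installed, something like `ReopenableDataset(rasterio.open, "test_read.tiff")` could be sent to workers in place of the open dataset, with each worker calling `get()` to obtain its own handle. Whether this fits dask_rasterio's write path is not settled in this thread.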

@snowman2

@sgillies thanks for your input on this issue: corteva/rioxarray#210

@lionlai1989

lionlai1989 commented Apr 22, 2022

Intriguingly, the following code works.

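import multiprocessing as mp
from pathlib import Path

import numpy as np
import rasterio
import rasterio.windows


def default_profile():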
    return {
        "count": 1,
        "driver": "GTiff",
        "dtype": "float32",
        "nodata": -999999.0,
        "width": 100,
        "height": 100,
        "transform": rasterio.Affine(1.0, 0.0, 0.0, 0.0, 1.0, 0.0),
        "tiled": True,
        "interleave": "band",
        "compress": "lzw",
        "blockxsize": 256,
        "blockysize": 256,
    }

def read_dataset(dataset, window):
    dataset.read(window=window)
    print("Read dataset successfully.")

def write_dataset(dataset, pixels, window):
    dataset.write(pixels, window=window)
    print("Write dataset successfully.")

if __name__ == "__main__":
    mp.set_start_method("fork")
    window = rasterio.windows.Window(col_off=0, row_off=0, width=20, height=20)
    pixels = np.ones((1, 20, 20))
    profile = default_profile()  # separate name, so the function is not rebound

    with rasterio.open(Path("test_write.tiff"), mode="w", **profile) as dataset_write:
        with rasterio.open(Path("test_read.tiff"), mode="r") as dataset_read:

            p1 = mp.Process(target=read_dataset, args=(dataset_read, window))
            p2 = mp.Process(target=write_dataset, args=(dataset_write, pixels, window))

            p1.start()
            p2.start()
            p1.join()
            p2.join()

The output:

Read dataset successfully.
Write dataset successfully.

Am I missing something here? @sgillies
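
One possible explanation, not confirmed by the maintainers in this thread: with the "fork" start method, `multiprocessing` does not pickle the `Process` arguments at all. The child inherits a copy of the parent's memory, so the pickling error never has a chance to fire. The sketch below shows this with a `threading.Lock`, which cannot be pickled either. The helper names `touch_lock`, `run_with_fork` and `is_picklable` are made up for illustration, and "fork" is only available on POSIX systems.

```python
import multiprocessing as mp
import pickle
import threading


def touch_lock(lock):
    # Runs in the child; the lock arrived through the forked memory image.
    with lock:
        pass


def run_with_fork():
    """Pass an unpicklable object to a forked child and return its exit code."""
    ctx = mp.get_context("fork")  # POSIX only
    lock = threading.Lock()
    proc = ctx.Process(target=touch_lock, args=(lock,))
    proc.start()
    proc.join()
    return proc.exitcode


def is_picklable(obj):
    """Return True if pickle.dumps accepts the object."""
    try:
        pickle.dumps(obj)
    except Exception:
        return False
    return True
```

Under the "spawn" start method the same arguments would have to be pickled to reach the child, as they are when dask distributed serializes a task. The fork example running without an error therefore does not show that the datasets are safe to share between processes.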
