
Appending values to a numpy array in an HDF5 file created by deepdish.io #22

Open
Omer80 opened this issue Sep 29, 2016 · 4 comments

Comments

@Omer80

Omer80 commented Sep 29, 2016

Hi, I like the deepdish.io interface to HDF5 and would like to use it to store numpy arrays on the go while a simulation is running. Is it possible to append values to a numpy array that I've already saved?

That is, I would like to create a dictionary of the form {u1:numpy_array1,u2:numpy_array2}, in which each numpy_array is, for example, a (128 * 128) array. At each iteration of my code, I would like to append another (128 * 128) array along a third axis.

@Shaunakde

So far I have gotten around this by doing an expensive extract, append, and write cycle. I will look into this tonight; it might help both of us.

SD

@gustavla
Member

This is currently not supported; however, the HDF5 standard does make this possible using chunking. That is, an array is stored as separate chunks, so when the array is extended, its data does not have to be contiguous in the file (keeping it contiguous would require a full re-allocation of the file). I know the PyTables backend can set chunking, so I think deepdish could support this. This feature request has come up before and I think it would be really neat, so I will treat it as high priority.

If I don't get to this soon, anyone is welcome to propose a PR.

There are some design decisions that need to be made first. The chunk size needs to be known the first time you save the file, so either deepdish sets it to something appropriate or the user specifies it explicitly. What is the API for extending an array? Perhaps a separate call altogether, like dd.io.extend('test.h5', '/foo', x, axis=0)?
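The chunked, extendable storage discussed above can be sketched directly against PyTables, the backend mentioned earlier. This is an illustration under assumptions, not deepdish functionality: `dd.io.extend` is only a proposal in this thread, and the helpers `create_store` and `extend`, the file name, the node names, and the chunk shape of one frame per chunk are all hypothetical choices. Whether `dd.io.load` can read a file written this way is not checked here.

```python
import os
import tempfile

import numpy as np
import tables

FRAME_SHAPE = (128, 128)  # example frame size taken from the question


def create_store(path, names):
    """Hypothetical helper: create one extendable array per name."""
    with tables.open_file(path, mode="w") as h5:
        for name in names:
            h5.create_earray(
                h5.root,
                name,
                atom=tables.Float64Atom(),
                shape=FRAME_SHAPE + (0,),       # the 0 marks the extendable axis
                chunkshape=FRAME_SHAPE + (1,),  # assumed: one frame per chunk
            )


def extend(path, node, frame):
    """Hypothetical stand-in for the proposed extend call (third axis only)."""
    with tables.open_file(path, mode="a") as h5:
        # Only new chunks are written; existing data is left in place.
        h5.get_node(node).append(frame[:, :, np.newaxis])


path = os.path.join(tempfile.mkdtemp(), "sim_chunked.h5")
create_store(path, ["u1", "u2"])

for step in range(3):
    extend(path, "/u1", np.full(FRAME_SHAPE, float(step)))

with tables.open_file(path, mode="r") as h5:
    u1 = h5.get_node("/u1").read()
    u2_shape = h5.get_node("/u2").shape
```

The chunk shape is fixed when the array is created, which matches the point above that the chunk size has to be decided on the first save.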

@Omer80
Author

Omer80 commented Oct 1, 2016

I like the idea of creating a separate call for extending an array.

@Omer80
Author

Omer80 commented Jan 15, 2017

Hi guys, I wanted to ask whether there is any update on the option of appending data to a numpy array inside an HDF5 file created with deepdish.io.
