b2h5py

b2h5py provides h5py with transparent, automatic optimized reading of n-dimensional slices of Blosc2-compressed datasets. This optimized slicing leverages direct chunk access (skipping the slow HDF5 filter pipeline) and 2-level partitioning into chunks and then smaller blocks (so that less data is actually decompressed).

Benchmarks of this technique show 2x-5x speed-ups compared with normal filter-based access. Comparable results are obtained with a similar technique in PyTables, see Optimized Hyper-slicing in PyTables with Blosc2 NDim.

Usage

This optimized access works for slices with step 1 on Blosc2-compressed datasets using the native byte order. It is enabled by monkey-patching the h5py.Dataset class to extend the slicing operation. The easiest way to do this is:

import b2h5py.auto

After that, optimization will be attempted for any slicing of a dataset (of the form dataset[...] or dataset.__getitem__(...)). If the optimization is not possible in a particular case, normal h5py slicing code will be used (which performs HDF5 filter-based access, backed by hdf5plugin to support Blosc2).

You may instead just import b2h5py and explicitly enable the optimization globally by calling b2h5py.enable_fast_slicing(), and disable it again with b2h5py.disable_fast_slicing(). You may also enable it temporarily by using a context manager:

with b2h5py.fast_slicing():
    # ... code that will use Blosc2 optimized slicing ...

Finally, you may explicitly enable optimizations for a given h5py dataset by wrapping it in a B2Dataset instance:

b2dset = b2h5py.B2Dataset(dset)
# ... slicing ``b2dset`` will use Blosc2 optimization ...

Building

Just install PyPA build (e.g. pip install build), enter the source code directory and run pyproject-build to get a source tarball and a wheel under the dist directory.

Installing

To install as a wheel from PyPI, run pip install b2h5py.

You may also install the wheel that you built in the previous section, or enter the source code directory and run pip install . from there.

Running tests

If you have installed b2h5py, just run python -m unittest discover b2h5py.tests.

Otherwise, just enter its source code directory and run python -m unittest.

You can also run the h5py tests with the patched Dataset class to check that patching does not break anything. You may install the h5py-test extra (e.g. pip install b2h5py[h5py-test] and run python -m b2h5py.tests.test_patched_h5py.

Name		Name	Last commit message	Last commit date
Latest commit History 121 Commits
.github/workflows		.github/workflows
b2h5py		b2h5py
doc		doc
examples		examples
.gitignore		.gitignore
LICENSE		LICENSE
README.rst		README.rst
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

.github/workflows

.github/workflows

b2h5py

b2h5py

doc

doc

examples

examples

.gitignore

.gitignore

LICENSE

LICENSE

README.rst

README.rst

pyproject.toml

pyproject.toml

Repository files navigation

b2h5py

Usage

Building

Installing

Running tests

About

Releases 6

Packages

Contributors 2

Languages

License

Blosc/b2h5py

Folders and files

Latest commit

History

Repository files navigation

b2h5py

Usage

Building

Installing

Running tests

About

Resources

License

Stars

Watchers

Forks

Languages