GitHub - seung-lab/chunkflow: Compose chunk operators to create a pipeline for local or distributed petabyte-scale computation

Problem

Petabyte-scale 3D image processing is slow and computationally demanding;
Computation has to be distributed with linear scalability;
Local clusters and public cloud resources are rarely fully used at the same time;
Duplicated code across a variety of routine tasks is hard to maintain.

Features

Composable operators. The chunk operators can be composed on the command line for flexible usage.
Hybrid cloud. Distributed computation runs on both local and cloud computers. The task-scheduling frontend and the computationally heavy backend are decoupled using AWS Simple Queue Service. The backend can be any computer with an internet connection and cloud authentication. Thanks to this robust design, cheap unstable instances (preemptible instances in Google Cloud, spot instances in AWS) can be used to reduce cost by about threefold!
Petabyte scale. We have used chunkflow to output over eighteen petabytes of images and scaled up to 3600 nodes with NVIDIA GPUs across three regions in Google Cloud, and chunkflow remained reliable.
Operators work with 3D image volumes.
You can plug in your own code as an operator.
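
The decoupling described above relies on a queue that hides a task while a worker holds it and makes it visible again if the worker never acknowledges completion. The sketch below is a minimal in-memory stand-in for that behavior, not chunkflow's code and not the AWS SQS API; the class `VisibilityQueue`, its methods, and the task names are hypothetical. It shows why a preempted worker does not lose its task.

```python
from collections import deque


class VisibilityQueue:
    """In-memory stand-in for a queue with a visibility timeout (hypothetical)."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._pending = deque()   # tasks ready to be handed out
        self._in_flight = {}      # task -> time at which it becomes visible again

    def send(self, task):
        self._pending.append(task)

    def receive(self, now):
        # Tasks whose worker went silent past the deadline return to the queue.
        for task, deadline in list(self._in_flight.items()):
            if now >= deadline:
                del self._in_flight[task]
                self._pending.append(task)
        if not self._pending:
            return None
        task = self._pending.popleft()
        self._in_flight[task] = now + self.visibility_timeout
        return task

    def delete(self, task):
        # A worker calls this only after the task finished successfully.
        self._in_flight.pop(task, None)

    def is_empty(self):
        return not self._pending and not self._in_flight


queue = VisibilityQueue(visibility_timeout=10.0)
for name in ("chunk-0", "chunk-1"):
    queue.send(name)

first = queue.receive(now=0.0)     # worker A takes a task, then is preempted
second = queue.receive(now=1.0)    # worker B takes the next task
queue.delete(second)               # worker B finishes and acknowledges
hidden = queue.receive(now=5.0)    # worker A's task is still hidden
retry = queue.receive(now=11.0)    # timeout expired: the task is handed out again
queue.delete(retry)
```

Time is passed in explicitly so the example is deterministic; a real queue service tracks the clock itself.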

Check out the Documentation for installation and usage. Try it out by following the tutorial.

Image Segmentation Example

Perform convolutional net inference to segment a 3D image volume with a single command!

#!/bin/bash

chunkflow \
    load-tif --file-name path/of/image.tif -o image \
    inference --convnet-model path/of/model.py --convnet-weight-path path/of/weight.pt \
        --input-patch-size 20 256 256 --output-patch-overlap 4 64 64 --num-output-channels 3 \
        -f pytorch --batch-size 12 --mask-output-chunk -i image -o affs \
    plugin -f agglomerate --threshold 0.7 --aff-threshold-low 0.001 --aff-threshold-high 0.9999 -i affs -o seg \
    neuroglancer -i image,affs,seg -p 33333 -v 30 6 6

You can then see your 3D image and segmentation directly in Neuroglancer!
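
To give a feel for what a patch size of 20 256 256 with an overlap of 4 64 64 means, the sketch below computes where overlapping patches would start inside a chunk. It is a simplified illustration, not chunkflow's actual patch scheduling: it assumes input and output patches have the same size, the chunk size (68, 832, 832) is made up, and the function names are hypothetical.

```python
from itertools import product


def patch_starts(chunk_len, patch_len, overlap):
    """Start offsets along one axis so that patches cover [0, chunk_len)."""
    if patch_len > chunk_len:
        raise ValueError("patch is larger than the chunk")
    if not 0 <= overlap < patch_len:
        raise ValueError("overlap must be in [0, patch_len)")
    stride = patch_len - overlap
    starts = list(range(0, chunk_len - patch_len + 1, stride))
    # Add a final patch flush with the far edge if the strides leave a gap.
    if starts[-1] + patch_len < chunk_len:
        starts.append(chunk_len - patch_len)
    return starts


def patch_grid(chunk_size, patch_size, overlap):
    """All (z, y, x) patch start corners for a 3D chunk."""
    per_axis = [
        patch_starts(c, p, o)
        for c, p, o in zip(chunk_size, patch_size, overlap)
    ]
    return list(product(*per_axis))


# Hypothetical chunk size; patch size and overlap mirror the command above.
grid = patch_grid(
    chunk_size=(68, 832, 832),
    patch_size=(20, 256, 256),
    overlap=(4, 64, 64),
)
z_starts = patch_starts(68, 20, 4)
uneven = patch_starts(70, 20, 4)   # chunk length not a multiple of the stride
```

With these numbers each axis needs four patches, so the chunk is covered by 64 overlapping patches. When the chunk length does not fit the stride exactly, the last patch is shifted back so it ends at the chunk boundary.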

Composable Operators

After installation, you can simply type chunkflow and it will list all the operators with their help messages. We keep adding new operators and will keep the list updated here. For detailed usage, please check out our Documentation.

Operator Name	Function
adjust-bbox	Adjust the corner offset of a bounding box
channel-voting	Vote across channels of semantic map
cleanup	Remove empty files to clean up storage
connected-components	Threshold the boundary map to get a segmentation
copy-var	Copy a variable to a new name
create-chunk	Create a fake chunk for easy testing
create-info	Create info file of Neuroglancer Precomputed volume
crop-margin	Crop the margin of a chunk
debug	Add breakpoint to debug the task content
delete-chunk	Delete chunk in task to reduce RAM requirement
delete-task-in-queue	Delete the task in AWS SQS queue
downsample-upload	Downsample the chunk hierarchically and upload to volume
download-mesh	Download meshes from Neuroglancer Precomputed volume
evaluate-segmentation	Compare segmentation chunks
fetch-task-from-file	Fetch task from a file
fetch-task-from-sqs	Fetch task from AWS SQS queue one by one
generate-tasks	Generate tasks one by one
gaussian-filter	2D Gaussian blurring operated in-place
inference	Convolutional net inference
log-summary	Summary of logs
mark-complete	Mark task completion as an empty file
mask	Black out the chunk based on another mask chunk
mask-out-objects	Mask out selected or small objects
multiply	Multiply chunks with another chunk
mesh	Build 3D meshes from segmentation chunk
mesh-manifest	Collect mesh fragments for object
neuroglancer	Visualize chunks using neuroglancer
normalize-contrast-nkem	Normalize image contrast using histograms
normalize-intensity	Normalize image intensity to -1:1
normalize-section-shang	Normalization algorithm created by Shang
plugin	Import local code as a customized operator.
quantize	Quantize the affinity map
load-h5	Read HDF5 files
load-npy	Read NPY files
load-json	Read JSON files
load-pngs	Read PNG files
load-precomputed	Cutout chunk from a local/cloud storage volume
load-tif	Read TIFF files
load-skeleton	Load skeletons
load-synapses	Load synapses from a file
load-zarr	Read Zarr files
setup-env	Prepare storage info files and produce tasks
skip-task-by-file	If a result/flag file already exists, skip this task
skip-task-by-blocks-in-volume	If all the blocks already exist in the volume, skip this task
skip-all-zero	If a chunk is all zeros, skip this task
skip-none	If an item in task is None, skip this task
threshold	Use a threshold to segment the probability map
view	Another chunk viewer in browser using CloudVolume
save-h5	Save chunk as HDF5 file
save-points	Save point cloud as a HDF5 file.
save-pngs	Save chunk as a series of PNG files
save-precomputed	Save chunk to local/cloud storage volume
save-tif	Save chunk as TIFF file
save-synapses	Save synapses as a HDF5 file.
save-swc	Save skeletons as a SWC file.
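
The idea of chaining operators like those listed above can be sketched with Python generators, where each operator consumes a stream of tasks and yields tasks for the next one. This is only an illustration of the composition pattern under simplifying assumptions (1D lists instead of 3D chunks, plain dicts as tasks); the functions borrow operator names from the table but are hypothetical and do not reflect chunkflow's internals.

```python
from functools import reduce


def generate_tasks(volume_size, chunk_size):
    """Yield one task per chunk along a single axis (1D for brevity)."""
    for start in range(0, volume_size, chunk_size):
        yield {"start": start, "stop": min(start + chunk_size, volume_size)}


def load(tasks, source):
    """Attach the slice of data that each task covers."""
    for task in tasks:
        task["chunk"] = source[task["start"]:task["stop"]]
        yield task


def skip_all_zero(tasks):
    """Drop tasks whose chunk contains only zeros."""
    for task in tasks:
        if any(task["chunk"]):
            yield task


def threshold(tasks, level):
    """Binarize each chunk at the given level."""
    for task in tasks:
        task["chunk"] = [1 if value > level else 0 for value in task["chunk"]]
        yield task


def compose(stream, *operators):
    """Feed the stream through each operator in order."""
    return reduce(lambda current, op: op(current), operators, stream)


data = [0, 0, 0, 0, 0.2, 0.9, 0.8, 0.1, 0.6, 0.3]
pipeline = compose(
    generate_tasks(len(data), chunk_size=4),
    lambda tasks: load(tasks, data),
    skip_all_zero,
    lambda tasks: threshold(tasks, 0.5),
)
results = [(task["start"], task["chunk"]) for task in pipeline]
```

Because every stage is lazy, a task flows through the whole chain before the next one is generated, and a skipped task never reaches the later, more expensive stages.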

Affiliation

This package is developed at Princeton University and Flatiron Institute.

Reference

We have a paper for this repo:

@article{wu_chunkflow_2021,
	title = {Chunkflow: hybrid cloud processing of large {3D} images by convolutional nets},
	issn = {1548-7105},
	shorttitle = {Chunkflow},
	url = {https://www.nature.com/articles/s41592-021-01088-5},
	doi = {10.1038/s41592-021-01088-5},
	journal = {Nature Methods},
	author = {Wu, Jingpeng and Silversmith, William M. and Lee, Kisuk and Seung, H. Sebastian},
	year = {2021},
	pages = {1--2}
}
