
GPU support, rotation-based recon, MSVC support

@carterbox released this 23 May 17:33
  • Version is now pulled from git tags instead of from the VERSION file
  • dxchange is no longer a required dependency
  • CMake build system to handle addition of code in two new languages: C++ and CUDA
  • Python bindings to C++/CUDA code still go through the C interface (i.e. no direct binding)
  • SIRT and MLEM have been implemented on the GPU and CPU using a rotation-based algorithm (see the conceptual sketch after this list)
    • GPU support has been validated for Windows and Linux
    • CPU version uses OpenCV for rotation
      • OpenCV distributed via conda + MinGW on Windows does not work; use the MSVC compiler on Windows.
    • GPU version uses NPP for rotation
    • benchmarking on NVIDIA P100: ~11x slower than gridrec but vastly improved reconstruction quality
    • benchmarking on NVIDIA V100: per-slice speed-up over ray-based algorithm is ~650x, e.g. a TomoBank reconstruction (2048p + 1,500 proj angles) formerly requiring ~6.5 hours is completed in ~40 seconds
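
The rotation-based approach replaces per-ray tracing with whole-image rotations: the slice is rotated so that each projection axis aligns with the image grid, line integrals become simple column sums, and the residual is smeared back and counter-rotated. The NumPy/SciPy sketch below illustrates one SIRT-style update built on that idea; it is a conceptual illustration only, not TomoPy's C++/CUDA implementation, and scipy.ndimage.rotate stands in for the OpenCV (CPU) and NPP (GPU) rotation routines.

```python
# Conceptual sketch of a rotation-based SIRT update for a single slice.
# Not TomoPy's implementation; assumes parallel-beam geometry.
import numpy as np
from scipy.ndimage import rotate

def sirt_step(image, sinogram, angles_deg, step=1.0):
    """One rotation-based SIRT update.

    image      : (N, N) current estimate of the slice
    sinogram   : (num_angles, N) measured projections
    angles_deg : projection angles in degrees
    """
    update = np.zeros_like(image)
    for sino_row, angle in zip(sinogram, angles_deg):
        # Forward project: rotate the slice so the projection axis is vertical,
        # then sum along columns to approximate the parallel-beam line integrals.
        rotated = rotate(image, angle, reshape=False, order=1)
        simulated = rotated.sum(axis=0)

        # Residual between measurement and simulation, spread back evenly
        # along each column (backprojection of a parallel-beam residual).
        residual = (sino_row - simulated) / image.shape[0]
        smeared = np.tile(residual, (image.shape[0], 1))

        # Rotate the backprojected residual back to the image frame and accumulate.
        update += rotate(smeared, -angle, reshape=False, order=1)

    return image + step * update / len(angles_deg)
```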
  • Support for Microsoft Visual C++ (MSVC) compiler
    • Implemented gridrec in C++ (uses std::complex) which is enabled by default on Windows
  • To enable the new algorithms, pass accelerated=True to tomopy.recon for SIRT and MLEM (see the example below)
    • Other options are available, but unless you explicitly understand their effects, use the defaults.
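
As a concrete illustration, the call below reconstructs a simulated phantom with the accelerated SIRT implementation. The algorithm, accelerated, and num_iter keywords follow the usage described in these notes; treat the specific values as placeholders and check the tomopy.recon docstring for your installed version.

```python
import tomopy

# Simulated data so the example is self-contained.
obj = tomopy.shepp3d(size=128)     # 3D Shepp-Logan phantom
theta = tomopy.angles(180)         # projection angles in radians
proj = tomopy.project(obj, theta)  # simulate projections

recon = tomopy.recon(
    proj,
    theta,
    algorithm='sirt',   # or 'mlem'
    accelerated=True,   # opt in to the new rotation-based GPU/CPU code path
    num_iter=50,
)
```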
  • Multi-GPU support is available
    • Automatic detection of number of available devices
    • Multiple threads started at the Python level are automatically spread across the available GPUs
  • Secondary thread-pools created in C++ code to provide highly efficient communication with the GPU and additional parallelism on the CPU.
    • When running on the GPU, set the ncore parameter of tomopy.recon to the number of available GPUs (see the example at the end of these notes).
    • Each "Python" thread creates a unique secondary thread-pool with a default size of 2 * number-of-cpus. This is intentional and, in general, the larger the secondary thread-pool, the more efficiently the CPU-GPU communication latency is hidden. However, in general, more than 24 threads per thread-pool provides no benefit (all latency is essentially hidden at that point)