Center parameter not implemented for all GPU methods #425

lriordanchen · 2019-06-25T10:41:42Z

Hello,

I'm having trouble implementing the GPU support.

When I run the reconstruction without GPU support, it works as expected, but when I add in the line to run it on the GPU, the reconstruction contains artifacts consistent with using the incorrect rotation center in the reconstruction algorithm.

import tomopy

# import data, set theta, find rotation center using tomopy.find_center_pc

params = {'algorithm':'sirt', 'num_iter':10}
recon = zeros(shape=(num_channels, image_height, image_width, image_width))
for ix in range(num_channels):
    print('Reconstructing channel: ', ix+1)
    recon_channel = tomopy.recon(align_proj[ix], theta, center=rot_center[ix], 
                                 **params,
                                 accelerated=True, device="gpu" 
                                 )
    recon[ix] = recon_channel

So when I comment out the line <accelerated=True, device="gpu" >, the reconstruction looks reasonable but when I run it as above, I see the characteristic loops and double edges as if the rotation center is incorrect. What could be the reason for this, and how can I get this to reconstruct correctly when I run on the GPU?

I'm running on:

OS: Windows 10
TomoPy 1.5.0

jrmadsen · 2019-06-26T01:34:54Z

Hi @lriordanchen, the GPU algorithm sets up the problem differently than the CPU algorithms -- instead of keeping the ROI stationary and calculating the chord through the pixel at a given projection angle, it rotates the entire ROI to be perpendicular with the projection angle and interpolates the new pixels. at this point, the algorithms proceeds essentially the same as the CPU algorithms except all the weights are 1. Then the ROI is back rotated and interpolated to the original coordinates.

So TL;DR -- you need to pad your sinogram with ones or zeros so a rotation of 45 degrees doesn't result in a data loss.

For the built-in phantoms, here is an example of the padding being applied:

tomopy/benchmarking/phantom.py

Line 36 in cfeb9ca

obj = tomopy.misc.morph.pad(obj, axis=1, mode='constant')

@dgursoy or @carterbox can probably give you more exact/accurate instructions on how to do this though.

carterbox · 2019-07-02T16:53:09Z

@jrmadsen, you're describing a different problem than what @lriordanchen is reporting. The problem here is that in an experiment, the center of rotation for the object isn't always perfectly aligned with the field of view. This is why we have the center parameter. sirt_cuda takes a parameter const float* center which is an array of x (horizontal) coordinates. This is the location of the "true" center of rotation in each slice. However, this parameter seems to be unused by sirt_cuda.

tomopy/source/gpu/sirt.cu

Lines 162 to 247 in cfeb9ca

    
           sirt_cuda(const float* cpu_data, int dy, int dt, int dx, const float* center, 
        
                     const float* theta, float* cpu_recon, int ngridx, int ngridy, int num_iter, 
        
                     RuntimeOptions* opts) 
        
           { 
        
               printf("[%lu]> %s : nitr = %i, dy = %i, dt = %i, dx = %i, nx = %i, ny = %i\n", 
        
                      GetThisThreadID(), __FUNCTION__, num_iter, dy, dt, dx, ngridx, ngridy); 
        
               // thread counter for device assignment 
        
               static std::atomic<int> ntid; 
        
               // compute some properties (expected python threads, max threads, device assignment) 
        
               int pythread_num = ntid++; 
        
               int device       = pythread_num % cuda_device_count();  // assign to device 
        
               TIMEMORY_AUTO_TIMER(""); 
        
               // GPU allocated copies 
        
               cuda_set_device(device); 
        
               printf("[%lu] Running on device %i...\n", GetThisThreadID(), device); 
        
               uintmax_t    recon_pixels = scast<uintmax_t>(dy * ngridx * ngridy); 
        
               auto         block        = opts->block_size[0]; 
        
               auto         grid         = ComputeGridSize(recon_pixels, block); 
        
               auto         main_stream  = create_streams(1); 
        
               float*       update    = gpu_malloc_and_memset<float>(recon_pixels, 0, *main_stream); 
        
               init_data_t  init_data = GpuData::initialize(opts, device, dy, dt, dx, ngridx, ngridy, 
        
                                                           cpu_recon, cpu_data, update); 
        
               data_array_t gpu_data  = std::get<0>(init_data); 
        
               float*       recon     = std::get<1>(init_data); 
        
               float*       data      = std::get<2>(init_data); 
        
               uint32_t*    sum_dist  = cuda_compute_sum_dist(dy, dt, dx, ngridx, ngridy, theta); 
        
               NVTX_RANGE_PUSH(&nvtx_total); 
        
               for(int i = 0; i < num_iter; i++) 
        
               { 
        
                   // timing and profiling 
        
                   TIMEMORY_AUTO_TIMER(""); 
        
                   NVTX_RANGE_PUSH(&nvtx_iteration); 
        
                   START_TIMER(t_start); 
        
                   // sync the main stream 
        
                   stream_sync(*main_stream); 
        
                   // reset global update and sum_dist 
        
                   gpu_memset(update, 0, recon_pixels, *main_stream); 
        
                   // sync 
        
                   GpuData::sync(gpu_data); 
        
                   // execute the loop over slices and projection angles 
        
                   execute<data_array_t>(opts, dt, std::ref(gpu_data), sirt_gpu_compute_projection, 
        
                                         dy, dt, dx, ngridx, ngridy, theta); 
        
                   // sync the thread streams 
        
                   GpuData::sync(gpu_data); 
        
                   // sync the main stream 
        
                   stream_sync(*main_stream); 
        
                   // update the global recon with global update and sum_dist 
        
                   cuda_sirt_update_kernel<<<grid, block, 0, *main_stream>>>(recon, update, sum_dist, 
        
                                                                             dx, recon_pixels); 
        
                   // stop profile range and report timing 
        
                   NVTX_RANGE_POP(0); 
        
                   REPORT_TIMER(t_start, "iteration", i, num_iter); 
        
               } 
        
               // copy to cpu 
        
               gpu2cpu_memcpy<float>(cpu_recon, recon, recon_pixels, *main_stream); 
        
               // sync and destroy main stream 
        
               destroy_streams(main_stream, 1); 
        
               // cleanup 
        
               cudaFree(recon); 
        
               cudaFree(data); 
        
               cudaFree(update); 
        
               cudaFree(sum_dist); 
        
               NVTX_RANGE_POP(0); 
        
               // sync the device 
        
               cudaDeviceSynchronize(); 
        
           }

Right now, it looks like we are assuming that the center of rotation is located at (nx/2, ny/2).

tomopy/source/gpu/utils.cu

Lines 244 to 245 in cfeb9ca

    
           double center_x = (0.5 * nx) - 0.5; 
        
           double center_y = (0.5 * ny) - 0.5;

However, the old sirt uses the center parameter so the center of rotation is (center, ny/2).

I haven't run my own tests to confirm this is the case, but if @lriordanchen posts a photo of one of their reconstructions, we can easily confirm that this is the center issue and not a padding issue.

@lriordanchen, as a temporary measure, you can adjust the rotation centers of the sinograms before handing them off to SIRT by shifting them along the x-axis. However, the reason why the center parameter exists is probably to avoid interpolation errors from shifting the sinogram.

carterbox · 2019-07-02T16:59:10Z

This is both a bug and feature request. At the very least, there should be a warning that sirt_cuda does not have feature parity with sirt, but it is also requesting that a feature is implemented.

jrmadsen · 2019-07-02T18:17:48Z

Ah ok. Thanks for correcting this. Yes it should not be difficult to adapt the affine transform in utils.cu to handle this.

lriordanchen · 2019-07-03T13:22:11Z

Thanks @jrmadsen and @carterbox! Here is an example of what I was seeing:

Below I have an slice generated by running without the gpu:

and here I have the same image generated by running with the gpu:

Do you agree that it is a problem with the rotation center? Also, I found the same thing happened with mlem.

Adding on to the feature request in issue #422, would it be possible to have the padding @jrmadsen talked about automatically taken care of after the code has determined not to fallback to the cpu? That way the result, whether arrived at by using the gpu as asked for or by using the cpu as fallback, is calculated correctly.

Thanks again for all your help.

dgursoy · 2019-07-03T14:48:00Z

Thanks for bringing this up.

Currently the GPU implementation assumes that the center is at the middle of the detector's field of view no matter what you provide with the center argument. For a quick fix, one can shift the data by padding or interpolation such that it is centered correctly.

But in the long run we need to fix it in TomoPy. This requires implementation of rotation around an arbitrary point and deserves a new pull request. First, we should generate a warning for users that center is currently a dummy argument.

I'll post it here once the PR is open.

jdgelb · 2019-07-17T20:05:54Z

I've noticed that this issue also seems to apply when using Astra for SIRT (haven't yet tried any of the other Astra recon modules). Passing in the center argument seems to have no effect on the reconstruction (i.e., it assumes the center is the middle of the image even though I have specified the center shift). This issue does not apply to TomoPy's own Gridrec.

Here's my recon call for SIRT using Astra:

options = {'proj_type': 'cuda', 'method': 'SIRT_CUDA', 'num_iter': 100}
recon = tomopy.recon(proj, theta, center=484, algorithm=tomopy.astra, options=options, ncore=1)

And here is my call for Gridrec:

recon = tomopy.recon(proj, theta, center=484, algorithm='gridrec', filter_name='shepp', ncore=16)

Below is the output of each... the Gridrec results have clearly been center-shifted, while the Astra ones have not.

decarlof · 2019-08-02T19:01:46Z

@jdgelb as mentioned by @dgursoy for now you can just shift with:

options = {'proj_type': 'cuda', 'method': 'SIRT_CUDA', 'num_iter': 100}
shift = 28
proj = np.roll(proj, shift, axis=2)
recon = tomopy.recon(proj, theta, algorithm=tomopy.astra, options=options, ncore=1)

assuming you had a 1024 detector (512 - 484) = 28

canismarko · 2019-09-13T20:18:46Z

I think the @jdgelb astra issue is different from the tomopy GPU. I was getting ready to open a new issue but since it's related I'll post it here.

In the astra implementation, tomopy accounts for the center parameter by setting the ExtraDetectorOffset option on the projection geometry object.

However, this doesn't actually change the reconstruction in astra, as demonstrated in this MWE. The example is a modification of sample s003, with the addition of the extra detector offset on line 34. If I run the example, the reconstruction is the same no matter what value I put in for center.

I'm not sure on whether astra or tomopy has the issue, though.

lriordanchen added the question Troubleshooting requests label Jun 25, 2019

carterbox added bug A confirmed issue that needs to be fixed feature request Feature requests or discussions and removed question Troubleshooting requests labels Jul 2, 2019

carterbox changed the title ~~GPU accelerated method producing different result from non-accelerated method~~ Center parameter not implemented for all GPU methods Aug 5, 2019

carterbox added this to Needs triage in Bug Board Aug 9, 2019

carterbox moved this from Needs triage to Low priority in Bug Board Sep 24, 2019

carterbox added this to In progress in GPU Feature Development for TomoPy Jan 17, 2020

carterbox moved this from In progress to To do in GPU Feature Development for TomoPy Jan 17, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Center parameter not implemented for all GPU methods #425

Center parameter not implemented for all GPU methods #425

lriordanchen commented Jun 25, 2019

jrmadsen commented Jun 26, 2019

carterbox commented Jul 2, 2019 •

edited

carterbox commented Jul 2, 2019

jrmadsen commented Jul 2, 2019

lriordanchen commented Jul 3, 2019

dgursoy commented Jul 3, 2019

jdgelb commented Jul 17, 2019 •

edited

decarlof commented Aug 2, 2019 •

edited

canismarko commented Sep 13, 2019

Center parameter not implemented for all GPU methods #425

Center parameter not implemented for all GPU methods #425

Comments

lriordanchen commented Jun 25, 2019

jrmadsen commented Jun 26, 2019

carterbox commented Jul 2, 2019 • edited

carterbox commented Jul 2, 2019

jrmadsen commented Jul 2, 2019

lriordanchen commented Jul 3, 2019

dgursoy commented Jul 3, 2019

jdgelb commented Jul 17, 2019 • edited

decarlof commented Aug 2, 2019 • edited

canismarko commented Sep 13, 2019

carterbox commented Jul 2, 2019 •

edited

jdgelb commented Jul 17, 2019 •

edited

decarlof commented Aug 2, 2019 •

edited