Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Center parameter not implemented for all GPU methods #425

Open
lriordanchen opened this issue Jun 25, 2019 · 9 comments
Open

Center parameter not implemented for all GPU methods #425

lriordanchen opened this issue Jun 25, 2019 · 9 comments
Labels
bug A confirmed issue that needs to be fixed feature request Feature requests or discussions

Comments

@lriordanchen
Copy link

Hello,

I'm having trouble implementing the GPU support.

When I run the reconstruction without GPU support, it works as expected, but when I add in the line to run it on the GPU, the reconstruction contains artifacts consistent with using the incorrect rotation center in the reconstruction algorithm.

import tomopy

# import data, set theta, find rotation center using tomopy.find_center_pc

params = {'algorithm':'sirt', 'num_iter':10}
recon = zeros(shape=(num_channels, image_height, image_width, image_width))
for ix in range(num_channels):
    print('Reconstructing channel: ', ix+1)
    recon_channel = tomopy.recon(align_proj[ix], theta, center=rot_center[ix], 
                                 **params,
                                 accelerated=True, device="gpu" 
                                 )
    recon[ix] = recon_channel

So when I comment out the line <accelerated=True, device="gpu" >, the reconstruction looks reasonable but when I run it as above, I see the characteristic loops and double edges as if the rotation center is incorrect. What could be the reason for this, and how can I get this to reconstruct correctly when I run on the GPU?

I'm running on:

  • OS: Windows 10
  • TomoPy 1.5.0
@lriordanchen lriordanchen added the question Troubleshooting requests label Jun 25, 2019
@jrmadsen
Copy link
Collaborator

Hi @lriordanchen, the GPU algorithm sets up the problem differently than the CPU algorithms -- instead of keeping the ROI stationary and calculating the chord through the pixel at a given projection angle, it rotates the entire ROI to be perpendicular with the projection angle and interpolates the new pixels. at this point, the algorithms proceeds essentially the same as the CPU algorithms except all the weights are 1. Then the ROI is back rotated and interpolated to the original coordinates.

So TL;DR -- you need to pad your sinogram with ones or zeros so a rotation of 45 degrees doesn't result in a data loss.

For the built-in phantoms, here is an example of the padding being applied:

obj = tomopy.misc.morph.pad(obj, axis=1, mode='constant')

@dgursoy or @carterbox can probably give you more exact/accurate instructions on how to do this though.

@carterbox
Copy link
Member

carterbox commented Jul 2, 2019

@jrmadsen, you're describing a different problem than what @lriordanchen is reporting. The problem here is that in an experiment, the center of rotation for the object isn't always perfectly aligned with the field of view. This is why we have the center parameter. sirt_cuda takes a parameter const float* center which is an array of x (horizontal) coordinates. This is the location of the "true" center of rotation in each slice. However, this parameter seems to be unused by sirt_cuda.

tomopy/source/gpu/sirt.cu

Lines 162 to 247 in cfeb9ca

sirt_cuda(const float* cpu_data, int dy, int dt, int dx, const float* center,
const float* theta, float* cpu_recon, int ngridx, int ngridy, int num_iter,
RuntimeOptions* opts)
{
printf("[%lu]> %s : nitr = %i, dy = %i, dt = %i, dx = %i, nx = %i, ny = %i\n",
GetThisThreadID(), __FUNCTION__, num_iter, dy, dt, dx, ngridx, ngridy);
// thread counter for device assignment
static std::atomic<int> ntid;
// compute some properties (expected python threads, max threads, device assignment)
int pythread_num = ntid++;
int device = pythread_num % cuda_device_count(); // assign to device
TIMEMORY_AUTO_TIMER("");
// GPU allocated copies
cuda_set_device(device);
printf("[%lu] Running on device %i...\n", GetThisThreadID(), device);
uintmax_t recon_pixels = scast<uintmax_t>(dy * ngridx * ngridy);
auto block = opts->block_size[0];
auto grid = ComputeGridSize(recon_pixels, block);
auto main_stream = create_streams(1);
float* update = gpu_malloc_and_memset<float>(recon_pixels, 0, *main_stream);
init_data_t init_data = GpuData::initialize(opts, device, dy, dt, dx, ngridx, ngridy,
cpu_recon, cpu_data, update);
data_array_t gpu_data = std::get<0>(init_data);
float* recon = std::get<1>(init_data);
float* data = std::get<2>(init_data);
uint32_t* sum_dist = cuda_compute_sum_dist(dy, dt, dx, ngridx, ngridy, theta);
NVTX_RANGE_PUSH(&nvtx_total);
for(int i = 0; i < num_iter; i++)
{
// timing and profiling
TIMEMORY_AUTO_TIMER("");
NVTX_RANGE_PUSH(&nvtx_iteration);
START_TIMER(t_start);
// sync the main stream
stream_sync(*main_stream);
// reset global update and sum_dist
gpu_memset(update, 0, recon_pixels, *main_stream);
// sync
GpuData::sync(gpu_data);
// execute the loop over slices and projection angles
execute<data_array_t>(opts, dt, std::ref(gpu_data), sirt_gpu_compute_projection,
dy, dt, dx, ngridx, ngridy, theta);
// sync the thread streams
GpuData::sync(gpu_data);
// sync the main stream
stream_sync(*main_stream);
// update the global recon with global update and sum_dist
cuda_sirt_update_kernel<<<grid, block, 0, *main_stream>>>(recon, update, sum_dist,
dx, recon_pixels);
// stop profile range and report timing
NVTX_RANGE_POP(0);
REPORT_TIMER(t_start, "iteration", i, num_iter);
}
// copy to cpu
gpu2cpu_memcpy<float>(cpu_recon, recon, recon_pixels, *main_stream);
// sync and destroy main stream
destroy_streams(main_stream, 1);
// cleanup
cudaFree(recon);
cudaFree(data);
cudaFree(update);
cudaFree(sum_dist);
NVTX_RANGE_POP(0);
// sync the device
cudaDeviceSynchronize();
}

Right now, it looks like we are assuming that the center of rotation is located at (nx/2, ny/2).

tomopy/source/gpu/utils.cu

Lines 244 to 245 in cfeb9ca

double center_x = (0.5 * nx) - 0.5;
double center_y = (0.5 * ny) - 0.5;

However, the old sirt uses the center parameter so the center of rotation is (center, ny/2).

I haven't run my own tests to confirm this is the case, but if @lriordanchen posts a photo of one of their reconstructions, we can easily confirm that this is the center issue and not a padding issue.

@lriordanchen, as a temporary measure, you can adjust the rotation centers of the sinograms before handing them off to SIRT by shifting them along the x-axis. However, the reason why the center parameter exists is probably to avoid interpolation errors from shifting the sinogram.

@carterbox carterbox added bug A confirmed issue that needs to be fixed feature request Feature requests or discussions and removed question Troubleshooting requests labels Jul 2, 2019
@carterbox
Copy link
Member

This is both a bug and feature request. At the very least, there should be a warning that sirt_cuda does not have feature parity with sirt, but it is also requesting that a feature is implemented.

@jrmadsen
Copy link
Collaborator

jrmadsen commented Jul 2, 2019

Ah ok. Thanks for correcting this. Yes it should not be difficult to adapt the affine transform in utils.cu to handle this.

@lriordanchen
Copy link
Author

Thanks @jrmadsen and @carterbox! Here is an example of what I was seeing:

Below I have an slice generated by running without the gpu:
image_no_gpu

and here I have the same image generated by running with the gpu:
image_gpu

Do you agree that it is a problem with the rotation center? Also, I found the same thing happened with mlem.

Adding on to the feature request in issue #422, would it be possible to have the padding @jrmadsen talked about automatically taken care of after the code has determined not to fallback to the cpu? That way the result, whether arrived at by using the gpu as asked for or by using the cpu as fallback, is calculated correctly.

Thanks again for all your help.

@dgursoy
Copy link
Collaborator

dgursoy commented Jul 3, 2019

Thanks for bringing this up.

Currently the GPU implementation assumes that the center is at the middle of the detector's field of view no matter what you provide with the center argument. For a quick fix, one can shift the data by padding or interpolation such that it is centered correctly.

But in the long run we need to fix it in TomoPy. This requires implementation of rotation around an arbitrary point and deserves a new pull request. First, we should generate a warning for users that center is currently a dummy argument.

I'll post it here once the PR is open.

@jdgelb
Copy link

jdgelb commented Jul 17, 2019

I've noticed that this issue also seems to apply when using Astra for SIRT (haven't yet tried any of the other Astra recon modules). Passing in the center argument seems to have no effect on the reconstruction (i.e., it assumes the center is the middle of the image even though I have specified the center shift). This issue does not apply to TomoPy's own Gridrec.

Here's my recon call for SIRT using Astra:

options = {'proj_type': 'cuda', 'method': 'SIRT_CUDA', 'num_iter': 100}
recon = tomopy.recon(proj, theta, center=484, algorithm=tomopy.astra, options=options, ncore=1)

And here is my call for Gridrec:

recon = tomopy.recon(proj, theta, center=484, algorithm='gridrec', filter_name='shepp', ncore=16)

Below is the output of each... the Gridrec results have clearly been center-shifted, while the Astra ones have not.
image

@decarlof
Copy link
Contributor

decarlof commented Aug 2, 2019

@jdgelb as mentioned by @dgursoy for now you can just shift with:

options = {'proj_type': 'cuda', 'method': 'SIRT_CUDA', 'num_iter': 100}
shift = 28
proj = np.roll(proj, shift, axis=2)
recon = tomopy.recon(proj, theta, algorithm=tomopy.astra, options=options, ncore=1)

assuming you had a 1024 detector (512 - 484) = 28

@carterbox carterbox changed the title GPU accelerated method producing different result from non-accelerated method Center parameter not implemented for all GPU methods Aug 5, 2019
@carterbox carterbox added this to Needs triage in Bug Board Aug 9, 2019
@canismarko
Copy link

I think the @jdgelb astra issue is different from the tomopy GPU. I was getting ready to open a new issue but since it's related I'll post it here.

In the astra implementation, tomopy accounts for the center parameter by setting the ExtraDetectorOffset option on the projection geometry object.

However, this doesn't actually change the reconstruction in astra, as demonstrated in this MWE. The example is a modification of sample s003, with the addition of the extra detector offset on line 34. If I run the example, the reconstruction is the same no matter what value I put in for center.

I'm not sure on whether astra or tomopy has the issue, though.

@carterbox carterbox moved this from Needs triage to Low priority in Bug Board Sep 24, 2019
@carterbox carterbox moved this from In progress to To do in GPU Feature Development for TomoPy Jan 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug A confirmed issue that needs to be fixed feature request Feature requests or discussions
Projects
Bug Board
  
Low priority
Development

No branches or pull requests

7 participants