
Error in calculation that leads to out of bounds memory write. #99

nikitanodar opened this issue Jul 28, 2023 · 0 comments
nikitanodar commented Jul 28, 2023

Hello. I hit strange behavior where my app crashes, and it took me a while to figure out what was going on.

I think everything starts in the DOGM::gridCellOccupancyUpdate function. There, accumulate(weight_array, weights_accum); is executed, which calls thrust::inclusive_scan. The Thrust documentation says:

inclusive_scan is similar to std::partial_sum in the STL. The primary difference between the two functions is that std::partial_sum guarantees a serial summation order, while inclusive_scan requires associativity of the binary operation to parallelize the prefix sum.

Results are not deterministic for pseudo-associative operators (e.g., addition of floating-point types). Results for pseudo-associative operators may vary from run to run.

After the inclusive scan it can happen that the weights_accum array has DECREASING values, i.e. weights_accum[i] < weights_accum[i - 1]; in other words, the next value in the array can be smaller than the previous one. Later this array is used in the gridCellPredictionUpdateKernel kernel. Because weight_array_accum is not monotonically non-decreasing, the line

m_occ_pred = subtract(weight_array_accum, start_idx, end_idx);

can return a negative value. I can observe this with a simple printf in the kernel. The negative m_occ_pred can lead to a negative rho_b value (also observed by printing the value), which is stored in born_masses_array.
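
To make this easier to catch, here is a minimal sketch of a debug check that could be run right after accumulate(weight_array, weights_accum). It is not part of the repository and the helper name is made up; it just flags a prefix-sum array that decreases anywhere:

#include <thrust/adjacent_difference.h>
#include <thrust/count.h>
#include <thrust/device_ptr.h>
#include <thrust/device_vector.h>
#include <thrust/functional.h>

// Returns true if the accumulated array never decreases from one element to the next.
bool is_accum_non_decreasing(const float* weight_array_accum, int count)
{
    if (count < 2)
        return true;

    thrust::device_ptr<const float> accum(weight_array_accum);

    // diff[i] = accum[i] - accum[i - 1]; a negative entry means the scan decreased somewhere.
    thrust::device_vector<float> diff(count);
    thrust::adjacent_difference(accum, accum + count, diff.begin());

    const auto decreasing =
        thrust::count_if(diff.begin() + 1, diff.end(), thrust::placeholders::_1 < 0.0f);
    return decreasing == 0;
}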

Later, in DOGM::initializeNewParticles(), born_masses_array is used in an inclusive scan to update particle_orders_accum, which is then used in the normalize_particle_orders function. Inside this function it is assumed that the last value of the array is the maximum, and the normalization happens:

void normalize_particle_orders(float* particle_orders_array_accum, int particle_orders_count, int v_B)
{
    thrust::device_ptr<float> particle_orders_accum(particle_orders_array_accum);

    float max = 1.0f;
    cudaMemcpy(&max, &particle_orders_array_accum[particle_orders_count - 1], sizeof(float), cudaMemcpyDeviceToHost);
    thrust::transform(
        particle_orders_accum, particle_orders_accum + particle_orders_count, particle_orders_accum,
        GPU_LAMBDA(float x) { return x * (v_B / max); });
}

But because born_masses_array has negative values, it can happen that the last value of particle_orders_array_accum is NOT the maximum. As a result, after the thrust::transform, the particle_orders_array_accum array can contain values greater than v_B (which is new_born_particle_count).
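
For what it's worth, a more defensive variant would take the true maximum of the scanned array instead of its last element. This is only a sketch of that idea, my assumption rather than a tested patch; GPU_LAMBDA is the same macro used in the function above:

#include <thrust/device_ptr.h>
#include <thrust/extrema.h>
#include <thrust/transform.h>

void normalize_particle_orders_defensive(float* particle_orders_array_accum, int particle_orders_count, int v_B)
{
    thrust::device_ptr<float> particle_orders_accum(particle_orders_array_accum);

    // With a non-monotone inclusive scan the last element is not necessarily the
    // maximum, so look the maximum up explicitly.
    thrust::device_ptr<float> max_ptr =
        thrust::max_element(particle_orders_accum, particle_orders_accum + particle_orders_count);
    float max = *max_ptr;  // single device-to-host read through the device_ptr

    thrust::transform(
        particle_orders_accum, particle_orders_accum + particle_orders_count, particle_orders_accum,
        GPU_LAMBDA(float x) { return x * (v_B / max); });
}

This only hides the symptom, though; the underlying accumulation would still be non-monotone.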

Later, the particle_orders_array_accum array is passed to initNewParticlesKernel1. Inside the kernel, start_idx is calculated, which, as mentioned, can be greater than new_born_particle_count (observed with printf inside the kernel). Next, this wrong index is used to update birth_particle_array, and this causes an out-of-bounds write: Compute Sanitizer complains in this kernel, and later a Thrust vector throws an error, because it is designed to throw in its destructor.
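
Purely to illustrate where the write goes wrong, here is a hypothetical sketch (none of these names or the signature match the real kernel) of the kind of bounds guard that would turn the bad write into a skipped cell:

__global__ void init_new_particles_guarded(float* birth_particle_array,
                                           const float* particle_orders_array_accum,
                                           int cell_count, int new_born_particle_count)
{
    const int cell_idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (cell_idx >= cell_count)
        return;

    // start_idx is derived from the (possibly broken) normalized accumulation.
    const int start_idx = static_cast<int>(particle_orders_array_accum[cell_idx]);

    // With a non-monotone scan, start_idx can fall outside [0, new_born_particle_count).
    if (start_idx < 0 || start_idx >= new_born_particle_count)
        return;

    birth_particle_array[start_idx] = 0.0f;  // placeholder for the real particle initialization
}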

The problem is that all of this is difficult to reproduce in a small demo, but our bigger application crashes all the time. At the moment, I'm simply checking rho_b and setting it to 0 if it's negative.
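
In code, that workaround is just a clamp at the point where rho_b is computed, before it is stored in born_masses_array (a sketch; the helper name is made up):

__device__ __forceinline__ float clamp_rho_b(float rho_b)
{
    // Negative rho_b only shows up because of the non-monotone inclusive scan,
    // so treat it as zero birth mass.
    return rho_b < 0.0f ? 0.0f : rho_b;
}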
