Add NANOVDB_USE_SYNC_CUDA_MALLOC define to force sync CUDA malloc #1799

w0utert · 2024-04-25T12:55:12Z

In virtualized environments that slice up the GPU and share it between instances as vGPUs, GPU unified memory is usually disabled for security reasons. Asynchronous CUDA malloc/free depends on GPU unified memory, so until now it was not possible to deploy and run NanoVDB code in such environments.

This commit adds macros CUDA_MALLOC and CUDA_FREE and replaces all CUDA alloc/free calls with these macros. CUDA_MALLOC and CUDA_FREE expand to asynchronous CUDA malloc & free if the following two conditions are met:

- The CUDA version must be >= 11.2, the first version that supports cudaMallocAsync/cudaFreeAsync.
- NANOVDB_USE_SYNC_CUDA_MALLOC must be undefined.

In all other cases, CUDA_MALLOC and CUDA_FREE expand to synchronous cudaMalloc/cudaFree.

Since NanoVDB is distributed as a header-only library, the NANOVDB_USE_SYNC_CUDA_MALLOC flag should be set by the build system of the project that includes it.
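
The selection rule described above can be sketched roughly as follows. This is an illustration, not the code from the commit: the macro argument lists, the `kAsyncAlloc` constant, and the `roundTrip` helper are assumptions, and the `#else` branch at the top provides host-only stand-ins (plain malloc/free with the same call shapes as the CUDA runtime) so the sketch also builds without the CUDA toolkit.

```cpp
#include <cstddef>
#include <cstdlib>

#if defined(__CUDACC__)
#include <cuda_runtime.h>
#else
// Host-only stand-ins (assumption, for illustration): they mirror the call
// shapes of the CUDA runtime functions but only wrap malloc/free.
using cudaError_t  = int;
using cudaStream_t = void*;
constexpr cudaError_t cudaSuccess = 0;
#define CUDART_VERSION 12000
inline cudaError_t cudaMalloc(void** p, std::size_t n) { *p = std::malloc(n); return *p ? 0 : 2; }
inline cudaError_t cudaFree(void* p) { std::free(p); return 0; }
inline cudaError_t cudaMallocAsync(void** p, std::size_t n, cudaStream_t) { return cudaMalloc(p, n); }
inline cudaError_t cudaFreeAsync(void* p, cudaStream_t) { return cudaFree(p); }
#endif

// Async path only if the runtime is new enough AND the opt-out flag is absent.
#if (CUDART_VERSION >= 11020) && !defined(NANOVDB_USE_SYNC_CUDA_MALLOC)
#define CUDA_MALLOC(ptr, size, stream) cudaMallocAsync((ptr), (size), (stream))
#define CUDA_FREE(ptr, stream)         cudaFreeAsync((ptr), (stream))
constexpr bool kAsyncAlloc = true;
#else
// The stream argument is accepted but ignored on the synchronous path.
#define CUDA_MALLOC(ptr, size, stream) cudaMalloc((ptr), (size))
#define CUDA_FREE(ptr, stream)         cudaFree((ptr))
constexpr bool kAsyncAlloc = false;
#endif

// Hypothetical helper: allocate and release one buffer through the macros.
inline bool roundTrip(std::size_t bytes, cudaStream_t stream = nullptr)
{
    (void)stream; // unused when the synchronous path is selected
    void* d_ptr = nullptr;
    if (CUDA_MALLOC(&d_ptr, bytes, stream) != cudaSuccess) return false;
    return CUDA_FREE(d_ptr, stream) == cudaSuccess;
}
```

Passing `-DNANOVDB_USE_SYNC_CUDA_MALLOC` on the compiler command line would select the synchronous branch without touching any header, which is why the flag fits naturally in the build system.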

linux-foundation-easycla · 2024-04-25T12:55:16Z

✅login: w0utert / (0f46c21)
✅login: w0utert / (0f46c21, 4c01b38)

The committers listed above are authorized under a signed CLA.

kmuseth

Great contribution! However, before I approve it, let me try out your fix in the private development branch of NanoVDB. I will sync that repo up with this (public) repo in the coming week; it includes several changes and improvements :)

kmuseth

On closer inspection, why not simply replace this existing line:

#if CUDART_VERSION < 11020

with

#if (CUDART_VERSION < 11020) || defined(NANOVDB_USE_SYNC_CUDA_MALLOC)

This avoids the need to introduce the new macro and also works with existing client code of NanoVDB that may already be using cudaMallocAsync and cudaFreeAsync. I tried it on Linux but not yet on Windows :)

w0utert · 2024-05-03T08:03:35Z

@kmuseth that was the first solution I tried, but it didn’t work, because the cudaMallocAsync and cudaFreeAsync calls would still resolve to the CUDA ones and not the redefined ones from the NanoVDB header. Without any modification to our build I still got CUDA ‘not supported’ errors on allocations.

Maybe this can be worked around with some linker directives, but that seems brittle and could require annoying changes to the build system of projects that include the NanoVDB headers.

w0utert · 2024-05-03T08:10:29Z

This was on Linux by the way, so it’s interesting that it did work on your side. I assume there could be some link differences between our builds. I can double-check next week to verify this and find out why.

kmuseth · 2024-05-03T18:04:32Z

@w0utert I think you're right so I changed my implementation by simply placing the definitions of the functions in a namespace (nanovdb). So in the end my solution looks very much like yours, except I define functions vs macros. Let me create a PR that includes this fix plus a ton of other (long overdue) improvements to NanoVDB. I'll point you to the relevant changes so you can validate that it does indeed work for you.

A warning: this new PR introduces new namespaces in NanoVDB, so your client code might need to be tweaked. I can of course help.
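
A minimal sketch of the function-based alternative described above, assuming wrapper names `mallocAsync` and `freeAsync` (hypothetical; the actual names and nested namespaces in the follow-up PR may differ). Because the wrappers live in a namespace, a call such as `nanovdb::mallocAsync(...)` cannot be confused with the global `cudaMallocAsync` declared by the CUDA runtime. As before, the `#else` branch at the top only provides host stand-ins so the sketch builds without the CUDA toolkit.

```cpp
#include <cstddef>
#include <cstdlib>

#if defined(__CUDACC__)
#include <cuda_runtime.h>
#else
// Host-only stand-ins (assumption, for illustration): same call shapes as
// the CUDA runtime functions, implemented with malloc/free.
using cudaError_t  = int;
using cudaStream_t = void*;
constexpr cudaError_t cudaSuccess = 0;
#define CUDART_VERSION 12000
inline cudaError_t cudaMalloc(void** p, std::size_t n) { *p = std::malloc(n); return *p ? 0 : 2; }
inline cudaError_t cudaFree(void* p) { std::free(p); return 0; }
inline cudaError_t cudaMallocAsync(void** p, std::size_t n, cudaStream_t) { return cudaMalloc(p, n); }
inline cudaError_t cudaFreeAsync(void* p, cudaStream_t) { return cudaFree(p); }
#endif

namespace nanovdb {

// Hypothetical wrapper: falls back to cudaMalloc on old runtimes or when
// NANOVDB_USE_SYNC_CUDA_MALLOC is defined, otherwise uses cudaMallocAsync.
inline cudaError_t mallocAsync(void** d_ptr, std::size_t size, cudaStream_t stream)
{
#if (CUDART_VERSION < 11020) || defined(NANOVDB_USE_SYNC_CUDA_MALLOC)
    (void)stream; // ignored on the synchronous path
    return cudaMalloc(d_ptr, size);
#else
    return cudaMallocAsync(d_ptr, size, stream);
#endif
}

// Hypothetical wrapper: the matching release call for mallocAsync.
inline cudaError_t freeAsync(void* d_ptr, cudaStream_t stream)
{
#if (CUDART_VERSION < 11020) || defined(NANOVDB_USE_SYNC_CUDA_MALLOC)
    (void)stream; // ignored on the synchronous path
    return cudaFree(d_ptr);
#else
    return cudaFreeAsync(d_ptr, stream);
#endif
}

} // namespace nanovdb
```

Compared with macros, functions keep normal type checking and scoping, at the cost of requiring client code to call the namespaced wrappers instead of the CUDA names directly.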

w0utert · 2024-05-03T18:57:16Z

@kmuseth sounds good, that’s actually nicer than using macros! I will try your PR early next week and report back, but I’m pretty sure it will work.

w0utert · 2024-05-17T10:02:11Z

@kmuseth
I tested the feature/nanovdb_v32.7 branch from your fork and it works on Linux as well, thanks!

kmuseth · 2024-05-24T03:32:04Z

@w0utert excellent - so are you okay if we close this PR?

w0utert · 2024-05-24T07:42:20Z

@kmuseth yes, you can close this PR!

w0utert requested a review from kmuseth as a code owner April 25, 2024 12:55

w0utert mentioned this pull request Apr 25, 2024

[REQUEST] make nanoVDB CUDA async allocation optional so it can be used on vGPU #1798

Open

w0utert added 2 commits April 25, 2024 15:18

Fix typo in comment

4c01b38

Signed-off-by: Wouter Bijlsma <wouter.bijlsma@asml.com>

w0utert force-pushed the sync-cuda-malloc-flag branch from 2e63b1e to 4c01b38 Compare April 25, 2024 13:18

kmuseth reviewed May 2, 2024
