Create GPU TFTensor from CUDA array on GPU to avoid the device-to-host copy. #478

Open
serjl opened this issue Mar 2, 2020 · 0 comments

Is your feature request related to a problem? Please describe.
A TFTensor object is usually created from a CPU array, which means the data must first be copied from the device (GPU) to the host (CPU). This pipeline architecture is considerably slow for large data (e.g. large images).

Describe the solution you'd like
My pipeline does its image processing with CUDA (through the managedCuda wrapper) and its libraries (NPP). At some point I would like to feed my CNN with an image stored on the GPU as an NPP image, a CUDA array, or just a device pointer; call it d_array for the sake of convenience. Of course, one can copy it to the host to get a standard host array

d_array.CopyToHost(h_array);

and then define the usual

var tensor = new TFTensor (h_array);

Is there an option to get a device tensor d_tensor directly from d_array, avoiding the device-to-host copy, and feed it to the CNN?
