Hey fellow JAX enthusiasts 👋🏼,
I'm working on a project that repeatedly interpolates values, and I'm running into performance issues. Each iteration currently loads the grid values from a file and then interpolates them. The repeated loading and the host-to-device data transfer are causing a significant bottleneck.
I've thought about utilizing the constant memory on NVIDIA GPUs to store my grid, but I'm unsure how to implement this or if it's even the best solution. Moreover, I'm stumped on how to optimize this process for TPUs.
If anyone has experience with similar challenges or can offer suggestions on how to overcome this performance overhead, I'd greatly appreciate it! Some potential solutions I'm open to exploring include:
- Optimizing data transfer and loading
- Leveraging GPU/TPU architecture for faster computation
- Alternative interpolation methods or libraries
- Any other creative solutions you might have!
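To make the first two options concrete, here is a rough sketch of what I imagine they could look like. It is not my project code: the grid is a small synthetic stand-in for the values in the file, the names `grid_host`, `grid`, `interpolate`, and `coords` are made up for illustration, and I'm assuming linear interpolation via `jax.scipy.ndimage.map_coordinates` is an acceptable substitute for my actual scheme. The idea is to load the grid once, move it to the device once with `jax.device_put`, and reuse that device array inside a jitted function on every iteration.

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.scipy.ndimage import map_coordinates

# Hypothetical stand-in for the grid read from the .hdf5 file.
# In the real project this array would be loaded from disk once, before the loop.
# grid_host[i, j] = i + 0.5 * j, so linear interpolation is exact on it.
grid_host = np.add.outer(np.arange(64), 0.5 * np.arange(64)).astype(np.float32)

# Transfer to the default device (CPU, GPU, or TPU) a single time.
grid = jax.device_put(grid_host)


@jax.jit
def interpolate(grid, coords):
    # coords has shape (2, N): fractional row and column indices.
    # order=1 selects linear interpolation; mode="nearest" clamps at the edges.
    return map_coordinates(grid, [coords[0], coords[1]], order=1, mode="nearest")


# Query points given as (row, column) index pairs, stacked by axis.
coords = jnp.array([[0.0, 10.5, 63.0],
                    [0.0, 20.25, 63.0]], dtype=jnp.float32)

# The loop reuses the device-resident grid; nothing is reloaded from the file.
for _ in range(3):
    values = interpolate(grid, coords)
```

Because `grid` is already a device array, passing it to the jitted function should not trigger a fresh host-to-device copy each iteration. I don't know whether XLA would place it in GPU constant memory, so that part of my question still stands.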
Thanks in advance ❤️ for your input and expertise 🤝!
Below I am adding the code and the grid values. The grid is stored in HDF5 (.hdf5) format and is available at this link.