nd rasterizer is 10x slower than rasterizer #68
Comments
Hi! What N are you using? In rasterization, each pixel requires an N-d
array of workspace memory. For RGB, we can fit that in register memory and
specify its size statically at compile time. We wrote the N-d rasterizer for
the case where the necessary workspace exceeds available register memory and
must live in global memory. This means we can't make the same kinds of
optimizations as in the RGB rasterizer. If this is the case for you, then you
can either stick with the global-memory version, or you can rasterize in
batches with the current optimized RGB rasterizer (channels 0-3, 3-6, etc.).
We're considering adding an in-between version of the rasterizer for
`MAX_REGISTER_CHANNELS=16` with similar optimizations to the RGB rasterizer.
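A minimal sketch of the batched approach, for illustration only. `rasterize_rgb` here is a hypothetical placeholder for whatever 3-channel rasterizer call you use, not an actual function from this library, and zero-padding the last batch when N is not a multiple of 3 is an assumption of this sketch. Features are plain Python lists, one non-empty N-d vector per Gaussian.

```python
from typing import Callable, List

Features = List[List[float]]  # one N-d feature vector per Gaussian
Image = List[List[float]]     # one channel vector per pixel


def channel_batches(num_channels: int, batch_size: int = 3) -> List[range]:
    """Split channel indices into consecutive groups: 0-3, 3-6, ..."""
    return [
        range(start, min(start + batch_size, num_channels))
        for start in range(0, num_channels, batch_size)
    ]


def rasterize_in_batches(
    features: Features,
    rasterize_rgb: Callable[[Features], Image],  # hypothetical 3-channel rasterizer
    batch_size: int = 3,
) -> Image:
    """Render N-d features by calling a 3-channel rasterizer once per batch."""
    num_channels = len(features[0])
    image: Image = []
    for batch in channel_batches(num_channels, batch_size):
        pad = batch_size - len(batch)  # zero-pad a short final batch
        chunk = [[f[c] for c in batch] + [0.0] * pad for f in features]
        rendered = rasterize_rgb(chunk)
        if not image:
            image = [[] for _ in rendered]
        for pixel, values in zip(image, rendered):
            pixel.extend(values[: len(batch)])  # drop the padded channels
    return image


def toy_rasterizer(chunk: Features) -> Image:
    """Stand-in for testing: a single pixel holding the per-channel mean."""
    return [[sum(column) / len(chunk) for column in zip(*chunk)]]
```

With N=29 or N=30 this makes 10 calls to the 3-channel rasterizer, so any per-call overhead (sorting, tile binning, kernel launches) is paid 10 times.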
…On Thu, Nov 2, 2023 at 3:52 PM Zubair Irshad ***@***.***> wrote:
Hi, great work! The nd rasterizer is around 10x slower than the sh
rasterizer. To be precise, my model inference time with sh rasterization is
0.008 s, which gives me >100 FPS as described in the original Gaussian
splatting paper, but just adding nd rasterization increases it to 0.075 s,
i.e. 13 FPS.
Is there a way to make it better? Any intuition would be greatly
appreciated. With nd rasterization, it looks like we lose the main benefit
of Gaussian splatting, i.e. speed. Thank you again for the awesome work!
Thank you for the great intuition and detailed response. My channel size is currently 29, but I am considering increasing the feature size to 128 or even 256, which I worry will be slower than 13 FPS. I will try the batched RGB rasterizer in a for loop, as you suggested, and see if it gives a higher FPS. Thank you!
@vye16 Reporting back on what I found. Rasterizing the channels in batches with a for loop (i.e. 0-3, 3-6, etc.) instead of using ND rasterization performs slightly worse, not better. My guess is that this is because the loop has to run 10 times for the channel size I am trying, i.e. 30. Any other intuition for improving performance is greatly appreciated, thank you! To be more specific, the per-iteration time for a 640 by 480 image with ND rasterization and N=30 is
Update: with the batched implementation, FPS increased to
@vye16 @maturk Are there any plans to support larger register channel counts, i.e. MAX_REGISTER_CHANNELS>3, perhaps 16 or 32, to achieve the same level of optimization as the native sh rasterizer? I am happy to create a PR. However, just increasing this number causes errors elsewhere. I am also wondering whether there are any downsides to setting MAX_REGISTER_CHANNELS to 128, 256, or 512: would it affect memory usage? I assume GPUs with more memory could support this? Any intuition is greatly appreciated.
Hi Zubair, sorry for the late response. Currently the color rasterization
represents color as float3 (a CUDA vectorized type). We can make a version
that accepts N-d colors up to ~32 channels that could fit in shared memory
during rasterization. Unfortunately 128, 256, or 512 channels would be too
big to fit in shared memory in one pass, but it is possible to rasterize
them in batches of channels (0-32) that fit in shared memory. This is
unlikely to reach the performance of the RGB rasterizer, but it would be
better than the current ND rasterizer. We're not working on it in the near
term, but I'm happy to guide you if you'd like to make a PR.
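A rough back-of-the-envelope check of why ~32 channels is about the limit. The block size (16x16 threads), one N-d float vector per thread, and the 48 KB per-block shared-memory budget are assumptions of this sketch, not numbers stated in this thread; actual limits depend on the GPU and kernel configuration.

```python
import math

BYTES_PER_FLOAT = 4
# Assumptions of this sketch, not values taken from the thread:
THREADS_PER_BLOCK = 16 * 16     # assumed tile / thread-block size
SHARED_LIMIT_BYTES = 48 * 1024  # assumed per-block shared-memory budget


def shared_bytes(channels: int, threads: int = THREADS_PER_BLOCK) -> int:
    """Shared memory needed if each thread stages one N-d float vector."""
    return threads * channels * BYTES_PER_FLOAT


def fits_in_shared(channels: int, limit: int = SHARED_LIMIT_BYTES) -> bool:
    """True if the per-block staging buffer fits within the budget."""
    return shared_bytes(channels) <= limit


def num_passes(total_channels: int, channels_per_pass: int = 32) -> int:
    """Number of rasterization passes when batching channels."""
    return math.ceil(total_channels / channels_per_pass)


if __name__ == "__main__":
    for n in (3, 16, 32, 128, 256, 512):
        print(n, shared_bytes(n), fits_in_shared(n), num_passes(n))
```

Under these assumptions, 32 channels need 32 KB per block and fit, while 128 channels would need 128 KB and do not; 128, 256, and 512 channels would take 4, 8, and 16 passes of 32 channels each.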
Thanks @vye16! I am happy to work on it and make a PR. Any pointers on where to start and which parts to change first would be appreciated, thanks a lot!
Any update on this issue? I am also working on rendering high-dimensional features and want to know how to speed up the nd rasterizer.
#130 works towards this issue, let me know if you try it out! @zubair-irshad
This is great, I will check it asap. Thanks @kerrj.
Hi, great work! The nd rasterizer is around 10x slower than the sh rasterizer. To be precise, my model inference time with sh rasterization is 0.008 s, which gives me >100 FPS as described in the original Gaussian splatting paper, but just adding nd rasterization increases it to 0.075 s, i.e. 13 FPS.
Is there a way to make it better? Any intuition would be greatly appreciated. With nd rasterization, it looks like we lose the main benefit of Gaussian splatting, i.e. speed. Thank you again for the awesome work!
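For reference, the FPS and slowdown figures follow directly from the two timings quoted above. This is plain arithmetic on those numbers, not a new measurement.

```python
def fps(seconds_per_frame: float) -> float:
    """Frames per second for a given per-frame time."""
    return 1.0 / seconds_per_frame


SH_SECONDS = 0.008  # sh rasterizer, per frame (from the post)
ND_SECONDS = 0.075  # nd rasterizer, per frame (from the post)

sh_fps = fps(SH_SECONDS)            # about 125 FPS
nd_fps = fps(ND_SECONDS)            # about 13.3 FPS
slowdown = ND_SECONDS / SH_SECONDS  # about 9.4x

if __name__ == "__main__":
    print(f"sh: {sh_fps:.1f} FPS, nd: {nd_fps:.1f} FPS, slowdown: {slowdown:.1f}x")
```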