Just single Pixel rendering #108

Open
pknmax opened this issue Jan 23, 2024 · 18 comments

Comments

@pknmax

pknmax commented Jan 23, 2024

Hello @vye16, @liruilong940607, @kerrj,
Is it possible to use gsplat to render just a single pixel, i.e. return the color of a single ray, from different COLMAP camera positions?

I am interested in getting the final color at a particular location under different view directions, given a GS point cloud, COLMAP camera poses, and either the location of a target pixel to track in the first fully rasterized image (rendered from the first COLMAP camera pose in the sequence) or a 3D location in the scene (how would I find it?).

@hariharan1412

Hi again, were you able to achieve this?

@liruilong940607
Collaborator

Hey, I'm not sure what the actual need is here. If you want the rendering at a particular pixel, is there anything preventing you from rendering an image and reading the pixel color from it?

@hariharan1412

> Hey, I'm not sure what the actual need is here. If you want the rendering at a particular pixel, is there anything preventing you from rendering an image and reading the pixel color from it?

Yeah, I'll tell you what is stopping us. We don't just need the pixel color; we need the ray of Gaussians, with their different opacities, that project onto that particular pixel. In short, I need the corresponding (x, y, z) positions of the points in the point cloud that produced this particular pixel.

@liruilong940607
Collaborator

liruilong940607 commented May 6, 2024

Ah, I see. The current master does not support that, because everything is fused in CUDA.

However, we are working on an upgrade to the CUDA backend in PR #172, where a pure-Python implementation is supported, so you can easily get that information. In the code below we accumulate all Gaussians that contribute to each pixel in Python:

deltas = pixel_coords - means2d[camera_ids, gauss_ids]  # [M, 2]
c = conics[camera_ids, gauss_ids]  # [M, 3]
sigmas = (
    0.5 * (c[:, 0] * deltas[:, 0] ** 2 + c[:, 2] * deltas[:, 1] ** 2)
    + c[:, 1] * deltas[:, 0] * deltas[:, 1]
)  # [M]
alphas = torch.clamp_max(opacities[gauss_ids] * torch.exp(-sigmas), 0.999)
if prefix_trans is not None:
    prefix_trans = prefix_trans[camera_ids, pixel_ids_y, pixel_ids_x]
indices = (camera_ids * image_height * image_width + pixel_ids).long()
total_pixels = C * image_height * image_width
weights, trans = render_weight_from_alpha(
    alphas, ray_indices=indices, n_rays=total_pixels, prefix_trans=prefix_trans
)

If you are in a hurry, you can just fetch that PR and use it. The PR itself has already been thoroughly verified; we are just thinking about the best way to merge it without too much disruption to master.
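
For the original question of recovering which 3D Gaussians produced a given pixel, here is a minimal sketch that reuses the variables from the snippet above. It assumes pixel_ids is the row-major flat index py * image_width + px and that means3d holds the 3D Gaussian centers; cam_id, px, and py are hypothetical inputs, not names from the PR.

# Hypothetical lookup, reusing indices, gauss_ids, and weights from above.
target = cam_id * image_height * image_width + py * image_width + px
mask = indices == target                       # entries belonging to that one ray
contributing_ids = gauss_ids[mask]             # which Gaussians hit this pixel
contributing_weights = weights[mask]           # their alpha-compositing weights
contributing_xyz = means3d[contributing_ids]   # their 3D centers, shape [K, 3]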

@hariharan1412

Thank you so much ❤️ I'll look into it.

@hariharan1412

> Ah, I see. The current master does not support that, because everything is fused in CUDA. [...] If you are in a hurry, you can just fetch that PR and use it.

I'm running into an issue when I try to run the example script on its own:

Traceback (most recent call last):
  File "example.py", line 33, in <module>
    from nerfdata.dataset.colmap.dataset import Dataset
ModuleNotFoundError: No module named 'nerfdata'

Could you please help me solve this?

@liruilong940607
Collaborator

liruilong940607 commented May 7, 2024

That is a dependency we have not open-sourced yet.

So the example script in this PR is not directly executable. As I mentioned above, we are figuring out the best way to merge it without the closed-source dependency.

However, it should be fairly easy to swap out nerfdata with your own data parser -- it simply parses a COLMAP capture. And if you are using gsplat in your own codebase, you don't need nerfdata at all. The PR itself is ready to be plugged into your existing codebase to achieve your goal.
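
As a concrete illustration of swapping out nerfdata, here is a minimal sketch of a COLMAP parser built on pycolmap; the function name and the exact pycolmap calls (a recent version exposing cam_from_world) are assumptions, not part of the PR.

# Hypothetical drop-in for the nerfdata COLMAP parser, assuming a recent pycolmap.
import numpy as np
import pycolmap

def load_colmap_capture(sparse_dir):
    """Return intrinsics Ks [C, 3, 3] and world-to-camera viewmats [C, 4, 4]."""
    rec = pycolmap.Reconstruction(sparse_dir)
    Ks, viewmats = [], []
    for image in rec.images.values():
        cam = rec.cameras[image.camera_id]
        Ks.append(cam.calibration_matrix())         # 3x3 intrinsics
        w2c = np.eye(4)
        w2c[:3, :] = image.cam_from_world.matrix()  # 3x4 [R|t], world-to-camera
        viewmats.append(w2c)
    return np.stack(Ks), np.stack(viewmats)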

@hariharan1412

Thanks for your reply. I'm working on it, and in the meantime I have a small question: is there a render.py available in gsplat, like in graphdeco-inria's implementation, where I can render the splats directly once the model is fully trained, so I can debug and modify the code easily?

@liruilong940607
Collaborator

If what you are looking for is a minimal script that only renders a splat, we don't have one yet, but we will add it together with that PR soon.

It should not be very hard to create one quickly from the standalone example script in that PR.

@hariharan1412

> Ah, I see. The current master does not support that, because everything is fused in CUDA. [...] If you are in a hurry, you can just fetch that PR and use it.

Thank you for your work. I was able to train the model successfully, and I have one question: where is this accumulate() called in your code? I read through the code, but I don't think it is called anywhere, so could you please tell me where exactly to use this function?

@liruilong940607
Collaborator

liruilong940607 commented May 9, 2024

That's used in this function:

_render_colors, _render_alphas = _rendering(
    means, quats, scales, opacities, colors, viewmats, Ks, width, height
)

This is an exact substitute for the fused rendering function. We are currently only using it as a gradient checker for our fully fused implementation.

Note that if you call the _rendering function, you should expect much higher memory usage and much slower speed, since it runs rasterization in pure Python. That's the price you have to pay in exchange for flexibility.
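
To make the gradient-checker usage concrete, here is a sketch of such a comparison. It assumes the fused path is exposed as a rendering function with the same signature as _rendering; the exact name in the PR may differ.

# Compare the fused CUDA path against the pure-Python path on identical inputs.
colors_fused, _ = rendering(
    means, quats, scales, opacities, colors, viewmats, Ks, width, height
)
colors_py, _ = _rendering(
    means, quats, scales, opacities, colors, viewmats, Ks, width, height
)
torch.testing.assert_close(colors_fused, colors_py, atol=1e-4, rtol=1e-4)

# Backpropagate the same scalar through both graphs and compare gradients.
colors_fused.sum().backward()
grad_fused, means.grad = means.grad.clone(), None
colors_py.sum().backward()
torch.testing.assert_close(grad_fused, means.grad, atol=1e-4, rtol=1e-4)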

@hariharan1412

During training I can clearly see that rasterization is happening; however, when I try to run the rasterization on its own, I keep ending up with a big, plain, single-color image. Could you please help me?

Rendering script:

import numpy as np
import torch

from gsplat.experimental.cuda import _rendering

device = torch.device("cuda")


def test_rasterize_to_pixels():
    data_path = "results/splatfacto_model_data.npz"

    data = np.load(data_path)  # 3 cameras
    height, width = data["height"].item(), data["width"].item()
    viewmats = torch.from_numpy(data["viewmats"]).to(device)
    Ks = torch.from_numpy(data["Ks"]).to(device)

    means = torch.from_numpy(data["means3d"]).to(device)
    scales = torch.from_numpy(data["scales"]).to(device) * 0.1 + 0.1
    quats = torch.from_numpy(data["quats"]).to(device)
    opacities = torch.from_numpy(data["opacities"]).to(device).squeeze()

    # Override the loaded intrinsics with a dummy camera matrix, repeated per view.
    Ks = torch.tensor(
        [[320.0, 0.0, 320.0],
         [0.0, 320.0, 320.0],
         [0.0, 0.0, 1.0]]
    ).unsqueeze(0).to(device=device)
    N = data["viewmats"].shape[0]  # assuming viewmats is [C, 4, 4]
    Ks = Ks.repeat(N, 1, 1)

    colors = torch.from_numpy(data["colors"]).to(device)
    colors = colors.unsqueeze(0).repeat(N, 1, 1)

    print(colors.shape)
    C = len(Ks)

    viewmats.requires_grad = True
    quats.requires_grad = True
    scales.requires_grad = True
    means.requires_grad = True

    render_colors, render_alphas = _rendering(
        means, quats, scales, opacities, colors, viewmats, Ks, width, height
    )
    return render_colors, render_alphas


render_colors, render_alphas = test_rasterize_to_pixels()

But when I plot the render_colors, I end up with the same color on all pixels.

[attached image: the rendered output, a single flat color]

@liruilong940607
Collaborator

You probably don't want the * 0.1 + 0.1 in this line if you already have a trained model:

scales = torch.from_numpy(data["scales"]).to(device) * 0.1 + 0.1

@hariharan1412

Yeah, I changed that, but I'm still facing the same issue. Could you please tell me how I can pass my own camera extrinsics from transforms.json? When I try to load them, it says grad_fn is required; I tried requires_grad = True and False, and I still can't pass my own camera extrinsics. So far I'm passing dummy ones.
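
On passing your own extrinsics: below is a minimal sketch of loading them from a Nerfstudio-style transforms.json, under the assumption that each frame carries a camera-to-world transform_matrix which must be inverted into the world-to-camera viewmats gsplat expects.

# Hypothetical loader for a Nerfstudio/Blender-style transforms.json.
import json
import numpy as np
import torch

def viewmats_from_transforms(path, device="cuda"):
    with open(path) as f:
        meta = json.load(f)
    # Each frame stores a camera-to-world "transform_matrix".
    c2ws = np.array([fr["transform_matrix"] for fr in meta["frames"]], dtype=np.float32)
    w2cs = np.linalg.inv(c2ws)                 # gsplat expects world-to-camera
    return torch.from_numpy(w2cs).to(device)   # plain tensors; no requires_grad needed

Also note that Blender-convention poses look down -Z (OpenGL), while COLMAP-style renderers assume +Z forward, so a flip of the camera Y and Z axes may be needed before inverting.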

@abrahamezzeddine

abrahamezzeddine commented May 22, 2024

> Ah, I see. The current master does not support that, because everything is fused in CUDA. [...] If you are in a hurry, you can just fetch that PR and use it.

@liruilong940607 Thank you for your great work. I am wondering whether this is something the PR will do, or information it will make possible to extract. I asked about this in another thread, but I don't think the community knows yet, as it might not be implemented; reading this, though, it seems this will be supported soon by this PR? See the comment below and let me know if you think it will solve it:

With COLMAP, it is possible to extract data to determine which images and keypoints correspond to specific 3D points and to identify which images have been matched.

I am wondering if it is possible to track which Gaussian splats correspond to which keypoints, along with their associated image files and pixel coordinates, via some sort of index. Is there a way to "tap" into the code to keep track of this when using Nerfstudio/gsplat, or is that information lost in the process?

Specifically, I want to identify the originating image and pixel coordinate for each Gaussian splat (with respect to the Gaussian-splatted point cloud). While COLMAP provides this information for 3D points, I would like to know if it is possible to extend this tracking to the exported PLY file of the splats.

I've created a script for COLMAP that consolidates the points3D.bin, cameras.bin, and images.bin files into a single file. This simplifies interpreting the data, making clear which pixel coordinates, image filenames, and indices contribute to each 3D point. For example:

POINT3D_ID 3D_X 3D_Y 3D_Z R G B POINT2D_IDX PIXEL_X PIXEL_Y IMAGE_ID IMAGE_NAME
1054 -2.2514710427743658 0.34472229191518061 -0.31922396259922448 169 167 142 2166 2189.711181640625 2253.42578125 83 IMG_2692.JPG
1054 -2.2514710427743658 0.34472229191518061 -0.31922396259922448 169 167 142 3340 2936.756103515625 2486.4677734375 85 IMG_2694.JPG
1054 -2.2514710427743658 0.34472229191518061 -0.31922396259922448 169 167 142 2382 367.91998291015625 2102.93994140625 80 IMG_2689.JPG
446883 3.5806275984290243 0.52853412315400827 2.527234807406364 193 188 168 7873 3242.96533203125 2536.799072265625 262 IMG_2871.JPG
446883 3.5806275984290243 0.52853412315400827 2.527234807406364 193 188 168 7991 2603.289794921875 2463.16650390625 263 IMG_2872.JPG

I am looking to do the same with gsplat, if possible. It would be great to know how many times a point has been split, which point each split came from, and which images those split points are based on (or, if I at least know which initial point the splits are based on, I can backtrack to the COLMAP index). Then it would be possible for me to create a complete index, all the way from the splats back to the very first COLMAP point and its associated images.

In the end, I am looking to project the Gaussian splat points onto the images as an overlay, to see how well they match the images, and to compare them with the COLMAP sparse keypoints overlaid on the same images. If I can get some sort of index data, that would be golden!
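
For the overlay itself, projecting Gaussian centers into an image is standard pinhole projection; here is a minimal sketch assuming COLMAP-convention world-to-camera extrinsics (the function and variable names are illustrative).

# Project 3D Gaussian centers onto the image plane: x_pix ~ K [R|t] X_world.
import numpy as np

def project_points(means3d, K, w2c):
    pts_h = np.concatenate([means3d, np.ones((len(means3d), 1))], axis=1)  # [N, 4]
    cam = (w2c @ pts_h.T).T[:, :3]   # points in camera space
    cam = cam[cam[:, 2] > 0]         # keep only points in front of the camera
    uv = (K @ cam.T).T
    return uv[:, :2] / uv[:, 2:3]    # pixel coordinates, shape [M, 2]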

@liruilong940607
Collaborator

Hi @abrahamezzeddine, the splitting of the Gaussians is controlled on the Python side, not in CUDA, so it should be very doable to track the IDs of the Gaussians by hacking the Python code.

@abrahamezzeddine

> Hi @abrahamezzeddine, the splitting of the Gaussians is controlled on the Python side, not in CUDA, so it should be very doable to track the IDs of the Gaussians by hacking the Python code.

Hello @liruilong940607,
Could you please guide me a little? I am not very familiar with the gsplat code in general, so a little push in the right direction and I should be able to do the rest, hopefully. :)

@liruilong940607
Collaborator

The gsplat repo will have a big update very soon, which will include a standalone training script. For now I can only point you to the nerfstudio repo.

The splitting, cloning, and pruning happens here in nerfstudio's implementation:

https://github.com/nerfstudio-project/nerfstudio/blob/742963e07369b70912b786a100e94c98dec1b742/nerfstudio/models/splatfacto.py#L439-L494

So I guess what you want is to maintain a tensor storing the initial IDs, and split/clone/prune that tensor together with all the other attributes of the Gaussians.
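
A minimal sketch of that ID-tracking idea, assuming boolean split/clone/prune masks like those computed in splatfacto.py; the names and the concatenation order here are illustrative, and the order must mirror however the densification code reorders the other Gaussian attributes.

import torch

num_points = 100_000                  # e.g. the initial COLMAP point count
gauss_ids = torch.arange(num_points)  # one origin ID per Gaussian

def densify_ids(gauss_ids, split_mask, clone_mask, n_split=2):
    split_ids = gauss_ids[split_mask].repeat(n_split)  # children inherit the origin ID
    clone_ids = gauss_ids[clone_mask]                  # clones keep their origin ID
    kept_ids = gauss_ids[~split_mask]                  # Gaussians that were not split
    # Must match how means/scales/quats are concatenated during densification.
    return torch.cat([kept_ids, split_ids, clone_ids])

def prune_ids(gauss_ids, prune_mask):
    return gauss_ids[~prune_mask]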
