ScanNet data generation #18

Open
sangrockEG opened this issue Dec 12, 2022 · 1 comment

@sangrockEG

First of all, thank you for publishing your implementation.

I want to generate the ScanNet dataset using the learned weights.
For this, I downloaded the files, including last.ckpt, from Hugging Face.

Then, using the demo code, I tried to render the images of the first scene (scene0000_00).
For rendering without additional training or evaluation, I slightly modified the final block of scannet.gin as follows:

run.run_render = True
run.run_train = False
run.run_eval = False

After that, I ran the demo code with:

python -m run --ginc configs/scannet.gin --scene_name scene0000_00

However, when I run the demo code, it seems to take too much memory and fails with the following message:

Unable to allocate array with shape (1210619520, 3) and data type float64

This issue was also mentioned in #11.
The rendering loop (predict_step in /model/plenoxel_torch/model.py) seems to render the image tensors sequentially and keep all of them in RAM.
It might be better to fix this part for better accessibility of the dataset; a sketch of a streaming alternative is below.
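A minimal sketch of that streaming alternative: write each frame to disk inside the loop instead of stacking everything in memory. Here render_one_image and render_poses are hypothetical placeholders, not the actual names used by predict_step:

import os
import numpy as np
import imageio.v2 as imageio

out_dir = "rendered/scene0000_00"
os.makedirs(out_dir, exist_ok=True)

for i, pose in enumerate(render_poses):
    # render_one_image is a hypothetical stand-in for the per-pose render call
    rgb = render_one_image(pose)  # (H, W, 3) floats in [0, 1]
    rgb8 = (np.clip(rgb, 0.0, 1.0) * 255).astype(np.uint8)
    imageio.imwrite(os.path.join(out_dir, f"{i:06d}.png"), rgb8)
    # each frame is written immediately, so peak memory stays at one image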

Anyway, in my case, I just picked one pose (frame_id=0) and rendered a single image.
The code runs without error, but it returns an unexpected result.
Fortunately, I can at least see a room-like shape (probably the room of scene0000_00, right?).

[Image: rendered result of scene0000_00, frame 0]
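For context, picking the pose amounted to slicing the pose array. A sketch, assuming render_poses is an (N, 4, 4) array of camera-to-world matrices (the names here are placeholders):

frame_id = 0
single_pose = render_poses[frame_id : frame_id + 1]  # keep the batch dim: shape (1, 4, 4)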

It seems that there is a pose-related problem.
The following (intermediate) pose tensors might be helpful for figuring out what is wrong.

original pose (before the point-cloud-related processing)

[[[-9.554210e-01  1.196160e-01 -2.699320e-01  2.655830e+00]
  [ 2.952480e-01  3.883390e-01 -8.729390e-01  2.981598e+00]
  [ 4.080000e-04 -9.137200e-01 -4.063430e-01  1.368648e+00]
  [ 0.000000e+00  0.000000e+00  0.000000e+00  1.000000e+00]]]

render_pose (the one finally returned)

[[[-9.80858835e-01  2.35084399e-18 -1.94721569e-01  2.96767746e-01]
  [-1.16803752e-07  9.99999718e-01 -7.10082718e-07  3.07291136e-02]
  [ 1.94722179e-01 -1.46270149e-17 -9.80858767e-01  1.29165942e+00]
  [ 0.00000000e+00  0.00000000e+00  0.00000000e+00  1.00000000e+00]]]

I'm not very familiar with NeRF-related things, so the aforementioned trials might be wrong somewhere.
Any help would be greatly appreciated.

@Minhluu2911

Have you tried using trans_info.npz to convert the poses? After loading the poses from ScanNet, convert them using the code below:

import numpy as np

# Load the normalization info saved during training.
trans_info = np.load("path/to/trans_info.npz")
T = trans_info['T']                      # 4x4 alignment transform
pcd_mean = trans_info['pcd_mean']        # point-cloud centroid
scene_scale = trans_info['scene_scale']  # scene normalization scale

# Apply the same normalization to the (N, 4, 4) ScanNet camera-to-world poses.
poses = T @ poses
poses[:, :3, 3] -= pcd_mean              # re-center the translations
poses[:, :3, 3] *= scene_scale           # rescale the translations
poses = poses.astype(np.float32)
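If it helps with debugging the pose mismatch, the inverse mapping (back from the normalized space to the original ScanNet frame) should just be these steps in reverse. A sketch derived only from the snippet above, not from the repo:

poses_nerf = poses.copy()
poses_nerf[:, :3, 3] /= scene_scale      # undo the rescaling
poses_nerf[:, :3, 3] += pcd_mean         # undo the re-centering
poses_scannet = np.linalg.inv(T) @ poses_nerf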
