add high resolution monocular priors script #76

Draft · wants to merge 1 commit into master

Conversation

@pablovela5620 (Contributor) commented Apr 4, 2023

I added an option to generate high-resolution monocular priors similar to the MonoSDF script, with a few modifications. I made this a draft because I found some issues I'd like to ask about. I also temporarily added some visualization scripts to help debug. This is what most outputs currently look like:

[Image: cues-comparison — side-by-side of interpolated vs. high-resolution cues]

You can see that the high-resolution cues look much better than the interpolated ones (especially around edges), but there are some problems that @niujinshuchong may be able to help answer.

Here is a link to the custom dataset I generated using nerfstudio's ns-process-data:
https://drive.google.com/file/d/1EPVHdDuV3vCEaF2852FeJR9yi-kUblZS/view?usp=sharing

To generate the low-resolution cues, use:

python scripts/datasets/process_nerfstudio_to_sdfstudio.py --data INPUT_DIR --output OUTPUT_DIR --data-type colmap --scene-type object --mono-prior --crop-mult 2 --omnidata-path OMNIDATA_PATH --pretrained-models PRETRAINED_MODELS

To generate the high-resolution cues, use:

python scripts/datasets/process_nerfstudio_to_sdfstudio.py --data INPUT_DIR --output OUTPUT_DIR --data-type colmap --scene-type object --mono-prior --highres-mono-prior --crop-mult 2 --omnidata-path OMNIDATA_PATH --pretrained-models PRETRAINED_MODELS

To use the visualizer, first install the Rerun SDK:

pip install rerun-sdk

then, from the sdfstudio directory, run:

python compare_mono_priors.py --lowres-path data/corazon_studio-sdf-lowres/ --highres-path data/corazon_studio-sdf-highres/

assuming that is where the datasets were written.
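
For reference, here is a minimal sketch of the kind of logging the comparison script does (this is not the actual compare_mono_priors.py; the `*_depth.npy` file layout and entity names below are assumptions):

```python
# Hypothetical side-by-side depth viewer, not the script from this PR.
# Assumes each dataset directory contains matching *_depth.npy files;
# the real process_nerfstudio_to_sdfstudio.py layout may differ.
import argparse
from pathlib import Path

import numpy as np
import rerun as rr  # pip install rerun-sdk

parser = argparse.ArgumentParser()
parser.add_argument("--lowres-path", type=Path, required=True)
parser.add_argument("--highres-path", type=Path, required=True)
args = parser.parse_args()

rr.init("mono_prior_comparison", spawn=True)  # spawns the Rerun viewer

for i, low in enumerate(sorted(args.lowres_path.glob("*_depth.npy"))):
    high = args.highres_path / low.name
    if not high.exists():
        continue
    rr.set_time_sequence("frame", i)  # lets you scrub frames in the timeline
    rr.log("lowres/depth", rr.DepthImage(np.load(low)))
    rr.log("highres/depth", rr.DepthImage(np.load(high)))
```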

  1. I assume we are given a 768x768 image rather than a 1920x1080 image (this is what --crop-mult 2 in process_nerfstudio_to_sdfstudio.py produces). This is mostly because I noticed that a 1920x1080 input takes an extremely long time (a few hours) due to the large number of patches created, compared to the original 384x384 (a few seconds). 768x768 (a few minutes) seems like a good middle ground.
  2. The outputs from the function seem fine (higher fidelity vs. the interpolated ones) until I train a network, where I notice a significant degradation compared to just using the 384x384 depths upscaled with bilinear interpolation. Here is a sample frame where the high-res cue generation falls apart and produces garbled values:
    [Image: bad-cues — frame with garbled high-resolution cues]

I think this is probably due to the large white wall, which makes it difficult for the patches to be merged coherently since the depth/normal values don't align well. Is this something you've come across, @niujinshuchong?
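
My guess at why, in code form. If the merge aligns each patch to the global prediction with a per-patch scale/shift (my assumption about how the MonoSDF-style merge works, not the actual merging code), a flat wall makes that fit ill-conditioned:

```python
# Sketch of per-patch scale/shift alignment (an assumption about the merge,
# not the actual MonoSDF merging code).
import numpy as np

def align_patch(d_patch: np.ndarray, d_global: np.ndarray) -> np.ndarray:
    """Fit s, t minimizing ||s * d_patch + t - d_global||^2.

    d_global is the (upsampled) global prediction cropped to the patch region.
    """
    A = np.stack([d_patch.ravel(), np.ones(d_patch.size)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, d_global.ravel(), rcond=None)
    # On a flat white wall d_patch is nearly constant, so the two columns
    # of A are almost linearly dependent: the fit is ill-conditioned, and
    # tiny prediction noise gives adjacent patches wildly different (s, t),
    # which shows up as the garbled seams above.
    return s * d_patch + t
```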

One solution I could come up with is to simply throw these clearly bad frames away using some heuristic that takes previous frames into account, but that seems very brittle.
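
A possibly less brittle variant (a sketch, not something in this PR): instead of looking at previous frames, compare each high-res map against the same frame's upsampled low-res map and fall back when they disagree. The 0.2 threshold is a guess and would need tuning:

```python
# Hypothetical fallback filter: keep the high-res depth only when it roughly
# agrees with the bilinearly upsampled low-res depth for the same frame.
import numpy as np
import torch
import torch.nn.functional as F

def normalize(d: np.ndarray) -> np.ndarray:
    # Monocular depth is only defined up to scale and shift, so compare in a
    # median / mean-absolute-deviation normalized space (MiDaS-style).
    d = d - np.median(d)
    return d / (np.mean(np.abs(d)) + 1e-8)

def pick_depth(highres: np.ndarray, lowres: np.ndarray,
               thresh: float = 0.2) -> np.ndarray:
    # Upsample the 384x384 prediction to the high-res resolution.
    up = F.interpolate(
        torch.from_numpy(lowres)[None, None].float(),
        size=highres.shape, mode="bilinear", align_corners=False,
    )[0, 0].numpy()
    # A garbled merge (like the white-wall frame above) shows up as a large
    # residual between the two normalized maps.
    err = np.mean(np.abs(normalize(highres) - normalize(up)))
    return highres if err < thresh else up
```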

@niujinshuchong (Member)

Hi, thanks for working on this. Generating high-resolution monocular depth and normals is still an open research problem. I don't think the patch-merge approach we use in MonoSDF would generalise to a wide range of scenarios. As you showed above, the merged depth is not good on the white wall, and the merged normal map looks different from the low-res normal map. Here is a method, https://github.com/compphoto/BoostingMonocularDepth, that can generate high-resolution depth maps with very good results, but unfortunately normal input is not supported.

@pablovela5620 (Contributor, Author)

Understood; I'm not sure whether this is worth merging, then. It could be helpful for folks to have the generation code available to try, but I'll leave it to your discretion. If you deem it worth merging, I'll remove the visualization code and clean things up a bit. Otherwise, you're more than welcome to close this PR!
