Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use minimal solvers from poselib #2288

Open
wants to merge 28 commits into
base: main
Choose a base branch
from
Open

Conversation

ahojnnes
Copy link
Contributor

@ahojnnes ahojnnes commented Dec 3, 2023

No description provided.

CMakeLists.txt Show resolved Hide resolved
Comment on lines 48 to 55
include(FetchContent)
FetchContent_Declare(PoseLib
GIT_REPOSITORY https://github.com/PoseLib/PoseLib.git
GIT_TAG c23af0c51924e3337f7bb193bb72899ec6dd4d3d
)
message(STATUS "Configuring PoseLib...")
FetchContent_MakeAvailable(PoseLib)
message(STATUS "Configuring PoseLib... done")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pablospe Here. IMHO the alternative (PoseLib as submodule) is less convenient because it requires users to install an additional dependency or to add --recursive when cloning colmap (which is easy to forget).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, you are right and I agree. I was actually suggesting FetchContent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well... actually thinking it could be an option and see the opinions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Viktor's PR has been merged into master, see PoseLib/PoseLib#86

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, consider a scenario where someone is working on modifications into PoseLib which are used/assessed in context of colmap and such work is performed on local machine.

If i understand correctly, in case of FetchContent (if we ignore relatively new Dependency Providers functionality) it would be necessary to modify a call to FetchContent_Declare in order to point to a local folder with modified PoseLib sources, also, for those who use IDE, working on such sources would require one more IDE context.

In case of PoseLib dependency satisfied via git submodule no "temporary" modifications to colmap's cmake scripts would be required and such work could be performed in single IDE context.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am curious about what you meant exactly by ignoring Dependency Providers. Maybe you can expand more on this, I was reading the doc and it seems that it is not ignored (but I am not sure what you refer):
https://cmake.org/cmake/help/latest/command/cmake_language.html#dependency-providers

My concern with FetchContent is how it plays with code autocomplete (like in vscode or another IDE), since you need to configure it first with cmake before the tools can understand the PoseLib code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for a bit misleading wording, i have meant that one could use a Dependency Providers mechanism to override default behavior and to point cmake into standalone PoseLib instance which state could be preserved/shared between multiple configurations. The part about "ignoring" was to highlight the most probable action for a user facing the scenario i proposed to consider.

Regarding your concern about IDE features that becomes available only after cmake project is configured - i agree that this is another angle of the same issue.

@ahojnnes
Copy link
Contributor Author

ahojnnes commented Dec 16, 2023

On exact same two-view geometries, P3P from poselib vs. P3P from colmap doesn't seem to make any difference in practice. Runtime difference is not measureable.

$ ./src/colmap/exe/colmap model_analyzer --path ~/data/south-building/poselib/
I1216 13:33:07.592862 48175 model.cc:429] Cameras: 1
I1216 13:33:07.592973 48175 model.cc:430] Images: 128
I1216 13:33:07.592978 48175 model.cc:431] Registered images: 128
I1216 13:33:07.592980 48175 model.cc:433] Points: 61160
I1216 13:33:07.592981 48175 model.cc:434] Observations: 326980
I1216 13:33:07.592985 48175 model.cc:436] Mean track length: 5.346305
I1216 13:33:07.592995 48175 model.cc:438] Mean observations per image: 2554.531250
I1216 13:33:07.592999 48175 model.cc:441] Mean reprojection error: 0.512072px

$ ./src/colmap/exe/colmap model_analyzer --path ~/data/south-building/main/
I1216 13:33:13.860183 48237 model.cc:429] Cameras: 1
I1216 13:33:13.860323 48237 model.cc:430] Images: 128
I1216 13:33:13.860327 48237 model.cc:431] Registered images: 128
I1216 13:33:13.860330 48237 model.cc:433] Points: 61160
I1216 13:33:13.860332 48237 model.cc:434] Observations: 326980
I1216 13:33:13.860337 48237 model.cc:436] Mean track length: 5.346305
I1216 13:33:13.860347 48237 model.cc:438] Mean observations per image: 2554.531250
I1216 13:33:13.860350 48237 model.cc:441] Mean reprojection error: 0.512072px

@ahojnnes
Copy link
Contributor Author

ahojnnes commented Dec 16, 2023

Essentially identical results on E2E metrics when including poselib solvers in two-view geometry estimation. Poselib matching is ~10% faster due to faster minimal solver implementation.

$ ./src/colmap/exe/colmap model_analyzer --path ~/data/south-building/poselib-with-matching/
I1216 18:21:10.982481 109487 model.cc:429] Cameras: 1
I1216 18:21:10.982582 109487 model.cc:430] Images: 128
I1216 18:21:10.982586 109487 model.cc:431] Registered images: 128
I1216 18:21:10.982589 109487 model.cc:433] Points: 82357
I1216 18:21:10.982590 109487 model.cc:434] Observations: 482856
I1216 18:21:10.982595 109487 model.cc:436] Mean track length: 5.862962
I1216 18:21:10.982606 109487 model.cc:438] Mean observations per image: 3772.312500
I1216 18:21:10.982610 109487 model.cc:441] Mean reprojection error: 0.615324px

$ ./src/colmap/exe/colmap model_analyzer --path ~/data/south-building/main-with-matching/
I1216 18:21:16.408385 109543 model.cc:429] Cameras: 1
I1216 18:21:16.408489 109543 model.cc:430] Images: 128
I1216 18:21:16.408493 109543 model.cc:431] Registered images: 128
I1216 18:21:16.408496 109543 model.cc:433] Points: 82330
I1216 18:21:16.408499 109543 model.cc:434] Observations: 482931
I1216 18:21:16.408504 109543 model.cc:436] Mean track length: 5.865796
I1216 18:21:16.408514 109543 model.cc:438] Mean observations per image: 3772.898438
I1216 18:21:16.408517 109543 model.cc:441] Mean reprojection error: 0.615088px

@ahojnnes
Copy link
Contributor Author

Assuming unknown focal length, the results are also virtually identical. Overall reconstruction runtime is ~5% lower.

$ ./src/colmap/exe/colmap model_analyzer --path ~/data/south-building/main-with-focal-estimation/
I1217 15:56:08.626852 171591 model.cc:429] Cameras: 1
I1217 15:56:08.626966 171591 model.cc:430] Images: 128
I1217 15:56:08.626971 171591 model.cc:431] Registered images: 128
I1217 15:56:08.626973 171591 model.cc:433] Points: 82334
I1217 15:56:08.626976 171591 model.cc:434] Observations: 482940
I1217 15:56:08.626981 171591 model.cc:436] Mean track length: 5.865621
I1217 15:56:08.626991 171591 model.cc:438] Mean observations per image: 3772.968750
I1217 15:56:08.626996 171591 model.cc:441] Mean reprojection error: 0.615133px
$ ./src/colmap/exe/colmap model_analyzer --path ~/data/south-building/poselib-with-focal-estimation
I1217 15:56:00.155539 171526 model.cc:429] Cameras: 1
I1217 15:56:00.155656 171526 model.cc:430] Images: 128
I1217 15:56:00.155660 171526 model.cc:431] Registered images: 128
I1217 15:56:00.155663 171526 model.cc:433] Points: 82330
I1217 15:56:00.155665 171526 model.cc:434] Observations: 482936
I1217 15:56:00.155670 171526 model.cc:436] Mean track length: 5.865857
I1217 15:56:00.155681 171526 model.cc:438] Mean observations per image: 3772.937500
I1217 15:56:00.155685 171526 model.cc:441] Mean reprojection error: 0.615118px

@vlarsson
Copy link
Contributor

Assuming unknown focal length, the results are also virtually identical. Overall reconstruction runtime is ~5% lower.

$ ./src/colmap/exe/colmap model_analyzer --path ~/data/south-building/main-with-focal-estimation/
I1217 15:56:08.626852 171591 model.cc:429] Cameras: 1
I1217 15:56:08.626966 171591 model.cc:430] Images: 128
I1217 15:56:08.626971 171591 model.cc:431] Registered images: 128
I1217 15:56:08.626973 171591 model.cc:433] Points: 82334
I1217 15:56:08.626976 171591 model.cc:434] Observations: 482940
I1217 15:56:08.626981 171591 model.cc:436] Mean track length: 5.865621
I1217 15:56:08.626991 171591 model.cc:438] Mean observations per image: 3772.968750
I1217 15:56:08.626996 171591 model.cc:441] Mean reprojection error: 0.615133px
$ ./src/colmap/exe/colmap model_analyzer --path ~/data/south-building/poselib-with-focal-estimation
I1217 15:56:00.155539 171526 model.cc:429] Cameras: 1
I1217 15:56:00.155656 171526 model.cc:430] Images: 128
I1217 15:56:00.155660 171526 model.cc:431] Registered images: 128
I1217 15:56:00.155663 171526 model.cc:433] Points: 82330
I1217 15:56:00.155665 171526 model.cc:434] Observations: 482936
I1217 15:56:00.155670 171526 model.cc:436] Mean track length: 5.865857
I1217 15:56:00.155681 171526 model.cc:438] Mean observations per image: 3772.937500
I1217 15:56:00.155685 171526 model.cc:441] Mean reprojection error: 0.615118px

If you have a single camera, is the unknown focal length solver used? I assumed it would be running P3P once the camera is in the reconstruction? Also comparing to pycolmap at least, H matrix estimation was the problem where I saw the biggest improvement in terms of runtime. (iirc this is run for the geometric verification as well?) Might be worthwhile to replace this solver too.

@ahojnnes
Copy link
Contributor Author

ahojnnes commented Dec 18, 2023

@vlarsson I hardcoded to always use p4pf in the experiment above, but you are right that I probably did not do the same for the existing focal length estimation experiment. I will also take a look at replacing the homography solver. Thanks for the hint.

@ahojnnes
Copy link
Contributor Author

ahojnnes commented Dec 18, 2023

@vlarsson I double confirmed and my experimental results above are correct. In both cases, I forced focal length estimation for each camera from scratch. The runtime of p4pf is still faster, because we don't have to run 20 P3Ps in parallel but only a single RANSAC. p4pf likely also works better than the sampling based approach when the initial guess of the focal length is very far off from the true focal length.

@ahojnnes
Copy link
Contributor Author

ahojnnes commented Dec 18, 2023

@vlarsson I noticed that your homography solver doesn't normalize the image coordinates, which may not behave well for images with large image resolution due to squaring of pixel coordinates in the design matrix (e.g., for images with size of ~10'000 pixels, the matrix will have values of ~100M). Did you not observe any issues in this case?

EDIT: I did some quick and dirty experiments and it seems like the normalization of points is not necessary. I assume it accounts for a significant fraction of the overhead of the colmap homography solver...

@vlarsson
Copy link
Contributor

@vlarsson I noticed that your homography solver doesn't normalize the image coordinates, which may not behave well for images with large image resolution due to squaring of pixel coordinates in the design matrix (e.g., for images with size of ~10'000 pixels, the matrix will have values of ~100M). Did you not observe any issues in this case?

EDIT: I did some quick and dirty experiments and it seems like the normalization of points is not necessary. I assume it accounts for a significant fraction of the overhead of the colmap homography solver...

I have not seen any issues with this either, but I have also not done extensive experiments on it :) My gut feeling is that normalisation does not matter so much for the minimal 4p problem, but is mostly important if you want to run DLT with more points.

@ahojnnes
Copy link
Contributor Author

ahojnnes commented Jan 7, 2024

@tsattler Do you have any datasets that could be suitable to evaluate the changes in this branch? I ran the changes against some of the standard datasets from UNC and it all looks good. Do you have anything more suitable like your Aachen dataset to check in particular the new p4pf solver against? Cheers.

@tsattler
Copy link
Contributor

tsattler commented Jan 7, 2024

@ahojnnes Sure, I can do this. Can look into this in a week or so once I finished CVPR reviews and other things. If I don't reply in some time, can you please ping me?

@ahojnnes
Copy link
Contributor Author

ahojnnes commented Jan 7, 2024

Great, thank you. No urgency. I will try my best to remember :-)

@tsattler
Copy link
Contributor

tsattler commented Jan 7, 2024

Actually, playing with focal length solvers has been on my TODO list for a while. Is it only the P4Pf solver or also something that estimates the radial distortion?

@ahojnnes
Copy link
Contributor Author

ahojnnes commented Jan 7, 2024

Only focal length. I just swapped out the existing sampling+p3p with poselib's p4pf solver. As a next step, we could also try to estimate radial distortion but I wanted to keep this for a future pull request.

@vlarsson
Copy link
Contributor

vlarsson commented Jan 8, 2024

@tsattler I am also interested in this. Also for the two-view case. Let's talk more :)

@tsattler
Copy link
Contributor

tsattler commented Jan 8, 2024

@vlarsson Sounds good. We can try to setup a benchmark for these methods together.

@ahojnnes
Copy link
Contributor Author

ahojnnes commented Jan 23, 2024

@tsattler Gentle reminder for the above 🙂😉 T - 24 hours.

@tsattler
Copy link
Contributor

@ahojnnes Thanks for the reminder. It is on my TODO list and climbing higher in terms of priorities :)
What are the most important parts to cover in terms of experiments? Which solvers? Which types of experiments?

@ahojnnes
Copy link
Contributor Author

I am quite confident about all solvers except for the new p4pf estimator. The Aachen dataset could be useful?

@ahojnnes
Copy link
Contributor Author

ahojnnes commented Mar 1, 2024

@vlarsson @tsattler I now tested the performance of this branch, in particular the p4pf solver in replacement of the sampling based approach, and it appears that p4pf performs worse than sampling on a number of datasets with unknown focal length from the internet. Given my current doubts about the quality of p4pf, I am probably going to go ahead with this PR but undo the p4pf changes.

@ahojnnes
Copy link
Contributor Author

ahojnnes commented Mar 1, 2024

For example, on the cornell arts quad dataset, I get the following with p4pf:
image
versus the following for colmap's sampling based approach:
image

It's not that either of them are perfect but the p4pf one is visibly worse.

@S-o-T
Copy link
Contributor

S-o-T commented Mar 28, 2024

@ahojnnes i would like to one more time kindly ask to consider not using cmake's FetchContent as a mean to satisfy PoseLib dependency and rather integrate PoseLib either as a git submodule or as an externally provided library to be discovered via find_package.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants