
does the vocabulary we should use depend on our data? #33

Open
RashidLadj opened this issue Aug 13, 2020 · 2 comments

RashidLadj commented Aug 13, 2020

Hello @rmsalinas @shinsumicco,

I have been testing DBoW2, DBoW3, and FBoW, and there is something important I don't understand.
In DBoW2, a demo.cpp is provided with a dataset of 4 images: the first step is to extract the features, then build the vocabulary from these features, and finally compute the score between each pair of images.

FBoW is much the same, except that a prebuilt "vocabulary" file for the ORB descriptor is provided. I used it directly to compare each pair of images in my dataset, which gave me pretty good results. I also built my own vocabulary from my dataset and reran the test on it; the results were fairly good as well (maybe slightly worse than the first ones). The pairwise-scoring workflow I mean is sketched right after the questions below. So my questions are:

  • The existing vocabulary: which image dataset was it built with?
  • Does the vocabulary to use depend on our data? (I think not, since I tested my own dataset with the existing vocabulary and it gave me good results.)
  • For a vocabulary to be called rich and robust, should it be created from a huge dataset containing images from different places?
  • In addition, I would like to know the impact of the choice of L (depth/levels of the tree) and K (number of children of each node).
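
For reference, the pairwise scoring I mean is roughly the minimal sketch below. The fbow::Vocabulary::readFromFile / transform and fbow::fBow::score calls are the ones I saw in the FBoW examples; the include path and exact names may differ depending on how FBoW is installed.

```cpp
// Minimal sketch: load a prebuilt FBoW vocabulary, extract ORB descriptors
// for two images, convert them to bag-of-words vectors, and score the pair.
#include <iostream>
#include <vector>
#include <opencv2/core.hpp>
#include <opencv2/features2d.hpp>
#include <opencv2/imgcodecs.hpp>
#include <fbow/fbow.h>

int main(int argc, char **argv) {
    // argv[1]: vocabulary file (e.g. the provided ORB vocabulary), argv[2..3]: two images
    fbow::Vocabulary voc;
    voc.readFromFile(argv[1]);

    cv::Ptr<cv::ORB> orb = cv::ORB::create();
    cv::Mat img1 = cv::imread(argv[2], cv::IMREAD_GRAYSCALE);
    cv::Mat img2 = cv::imread(argv[3], cv::IMREAD_GRAYSCALE);

    std::vector<cv::KeyPoint> kps1, kps2;
    cv::Mat desc1, desc2;
    orb->detectAndCompute(img1, cv::Mat(), kps1, desc1);
    orb->detectAndCompute(img2, cv::Mat(), kps2, desc2);

    // Quantize the descriptors against the vocabulary and compare the BoW vectors.
    fbow::fBow bow1 = voc.transform(desc1);
    fbow::fBow bow2 = voc.transform(desc2);
    std::cout << "score = " << fbow::fBow::score(bow1, bow2) << std::endl;
    return 0;
}
```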

Thank you kindly for clarifying my ideas a little, and thank you for your code, which is very clean and will be used by a lot of people.

Good luck for the future.


shinsumicco commented Aug 15, 2020

@RashidLadj
I'm not an author of DBoW2, DBoW3, or FBoW, so let me answer your questions as far as my understanding allows.

The existing vocabulary: which image dataset was it built with?

Sorry, I'm not sure. Please ask the owner of the repository.

Does the vocabulary to use depend on our data? (I think not, since I tested my own dataset with the existing vocabulary and it gave me good results.)

The answer is YES.
However, I would expect the performance not to decrease greatly when you use the existing vocabulary, because it is intended to cover various types of scenes by being trained on a huge image dataset.

For a vocabulary to be called rich and robust, should it be created from a huge dataset containing images from different places?

The answer is YES.
To make the vocabulary tree more "generalized", it should be built from images that have a lot of variation.

In addition, I would like to know the impact of the choice of L (depth/levels of the tree) and K (number of children of each node).

A greater K improves robustness and generalization performance, but also increases the cost of the nearest-neighbor search at each node.
A greater L improves robustness and generalization performance, but also increases the number of nearest-neighbor searches.
Please consider extreme examples, e.g. L=1, K=3; L=1, K=100; and L=100, K=100.
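
As a rough back-of-the-envelope illustration of my own (not something from FBoW itself): with branching factor K and depth L the tree has about K^L leaf words, while quantizing a single descriptor costs about K*L distance comparisons (K children tested at each of L levels), which is why those extreme settings behave so differently.

```cpp
// Rough illustration: vocabulary size (~K^L words) vs. per-descriptor lookup
// cost (~K*L comparisons) for the extreme (L, K) examples above.
#include <cmath>
#include <cstdio>

int main() {
    const struct { int L, K; } configs[] = {{1, 3}, {1, 100}, {100, 100}};
    for (const auto &c : configs) {
        double words = std::pow(c.K, c.L);                    // number of leaf words
        double comparisons = static_cast<double>(c.K) * c.L;  // comparisons per descriptor
        std::printf("L=%d, K=%d -> ~%g words, ~%g comparisons per descriptor\n",
                    c.L, c.K, words, comparisons);
    }
    return 0;
}
```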


RashidLadj commented Aug 15, 2020

@shinsumicco

First of all, thank you very much for your answer. I know very well that you are not the author, but since you used it in OpenVSLAM, I figured you understood the issues around BoW quite well. Thank you once again.

I will run a more advanced test this week on the choice of L and K to fully understand their impact.

For the moment I kept the default values when building my own vocabulary from 17 images, which produced an FBoW file of 4.6 MB, while the one provided by the owner is about 46 MB for 190 images. Can we say that with 190 images we have covered enough variation in brightness and texture to judge it robust? That seems odd to me for such a small number, so more extensive tests on my side are to be expected.

I have one last question, if you can answer it: in the case of OpenVSLAM, did you use the vocabulary provided with FBoW?

There are also many open questions, for example: is BoW effective on equirectangular images?
