Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rationale for Vector of CartesianIndex{2} for output 'matches' of match_keypoints #45

Open
zygmuntszpak opened this issue Nov 9, 2017 · 16 comments

Comments

@zygmuntszpak
Copy link
Member

zygmuntszpak commented Nov 9, 2017

I was wondering if someone could explain the design choice for making the match_keypoints function return Vector{CartesianIndex{2}}.

Coming from MATLAB, I was expecting this function to return two matrices, either an N x 2 matrix, or a 2 x N matrix, where the first matrix contains the points in the first view, and the second matrix contains the points in the second view. I work in the field of Multiple View Geometry (MVG), and am in the process of writing a package for Julia that achieves feature parity with MATLAB's Computer Vision toolbox. Typical steps in MVG include converting between homogeneous coordinates (i.e. appending an extra dimension with the value of '1'), standardising the homogeneous coordinates (dividing each point by its third dimension) and converting back to Cartesian coordinates (removing the third coordinate after standardising). Furthermore, we often need to transform coordinates by making the data points have zero mean, and scaling each data point to lie inside a unit box. These steps are very easy when the points are represented as matrices. For example, it allows one to write code such as mean(pts[1,:]) to obtain the mean x coordinate. Transforming all of the data points is also simple, since it just involves multiplying the points by a matrix.

I am struggling to perform these types of operations elegantly with the Vector{CartesianIndex{2}} data structure. At present, I intend to loop over this data structure and explicitly form the two matrices I mentioned. However, this duplication of work and unnecessary performance hit feels like a hack.

Perhaps we could write a second variant of the match_keypoints function that returns two matrices instead? Or alternatively, there is some elegant Julian way of achieving the transformations I mentioned with this Vector of CartesianIndex type?

@zygmuntszpak zygmuntszpak changed the title Rational for Vector of CartesianIndex{2} for output 'matches' of match_keypoints Rationale for Vector of CartesianIndex{2} for output 'matches' of match_keypoints Nov 9, 2017
@timholy
Copy link
Member

timholy commented Nov 9, 2017

CartesianIndex is the canonical way to specify a multidimensional location: https://julialang.org/blog/2016/02/iteration. Matlab used arrays for everything because they traditionally didn't have any other types available. A Vector{CartesianIndex{2}} is much clearer than a N x M matrix because with the matrix you have to remember what N and M mean. (Rows or columns for the coordinates?)

Also relevant: while this package is pretty 2d-focused, I intend someday to add some features suitable for 3d biomedical images. Overall JuliaImages is in much better shape than most image frameworks for unifying "computer vision" and "multidimensional biomedical imaging"; traditionally those are pretty separate fields, but the intent with JuliaImages is to make sure the two communities can leverage each others' efforts where applicable. Hence we emphasize general constructs rather than special hacks when possible.

@timholy
Copy link
Member

timholy commented Nov 9, 2017

I think I focused on only a portion of your question, the CartesianIndex bit. Looking again, I see the output is a single array, a Vector{Vector{CartesianIndex{N}}}. (If nothing else, we should change that to a Vector{NTuple{2,CartesianIndex{N}}}.)

I would agree that one vector or 2 is more debatable. This is really the age-old struct-of-arrays vs array-of-structs question. In the current model we return a list of pairs, and I think you're asking for a pair of lists. That's not unreasonable, but I'm not certain one is a lot better than another.

If we switched to a tuple model, you could do the following:

julia> a = rand(1:100, 4, 100)
4×100 Array{Int64,2}:
 69  26  73  57  70   82  96  16  90  58  87   7  24  37  51  20   3  88   7  52  42    70  76  31  30  59  97  93  61  84  45  12  36  42  98  67  79   4  76  48   68
 54  87  69  62  60    4  52  39  25  93   1   3  53  26  52  41  99  52  95   7  46     79  72  39  46  29  82  18  73  21  21  15  36  37  67  41  43  61  33  42   87
 36  13  97   2   9   93  96  91  20  36  12  60  65  83  96  16  99  37  59  63  89     95  41  64  15  21   7  15  35   5  53  64  57  98  18  34  20  12  26  48  100
 11  81  43  52   8  100   6  21  25  20   1  90   4  23  56  78  71  59  93  48  12     86  70  25  54   2  23  75  55  13  54  21  54  20  67  30  98  36  39  94   62

julia> b = reinterpret(NTuple{2,CartesianIndex{2}}, a, (100,))
100-element Array{Tuple{CartesianIndex{2},CartesianIndex{2}},1}:
 (CartesianIndex{2}((69, 54)), CartesianIndex{2}((36, 11))) 
...

and its converse. Would that help?

As far as elegant ways to work with the current structure, now that I've looked I'd agree some operations are a little awkward now. You have to implement the mean of matches across the first image like this:

julia> (sum(p->p[1], b).I)./length(b)
(49.89, 51.13)

You're right that's not as easy as mean(b, 2). Are there things that are easier with the current structure? If we implemented the tuple change, would using reinterpret solve your problems? Or would returning 2 arrays be the better way to go?

@zygmuntszpak
Copy link
Member Author

Many thanks for the clarifications and examples. I think switching to a tuple model, Vector{NTuple{2,CartesianIndex{N}}}, would indeed be helpful. The reinterpret function doesn't actually copy any data does it? It just changes how the data is 'viewed'?

One benefit of keeping a 'list of pairs' model like you currently do would be when one considers matches between a sequence of views. In that case, one might naturally expect the match_keypoints function to return a list of triplets if considering three images etc.

If you switch to a 'pair of lists', then when you have three views you would need to return three lists etc. and when working with a video sequence this may get unwieldy very quickly.

One feature that would be nice to have is if there was an option to ask the match_keypoints to return homogeneous coordinates (i.e. an extra dimension with value 1). Projective geometry, and hence Multiple-View Geometry, operates on homogeneous coordinates, and so the first thing I have to do is to make a copy of the matches to convert them to homogeneous coordinates. I'm hoping that if the keypoints are already in homogeneous coordinates, then I could perform a reinterpret to convert them into a matrix, and then using in-place matrix multiplication, I could transform the points into a normalised coordinate system. Essentially, I am searching for a way to avoid having to make a copy of the points.

@SimonDanisch
Copy link
Member

I usually use GeometryTypes.Point for transformations. I just like to call it Point in GeometryTypes, but this is under the hood just a StaticArrays.SVector, so they're interchangeable.
With that you can do:

points::Vector{Point{2, Float32}}
const mat = eye(Mat4f0) # static 4x4 matrix of Float32
map!(points, points) do p
    p4 = mat * Point(p[1], p[2], 0, 1)
    (p4[1], p4[2]) # might as well return a tuple, since `convert(Point, x::Tuple)` is defined
end

Which uses much less memory and is very fast!

Of course you can also do:

indices::Vector{CartesianIndex{2}}
points = reinterpret(SVector{2, Int}, indices)

@zygmuntszpak
Copy link
Member Author

@timholy How do you handle the case where Keypoints are detected at the sub-pixel level? If the Keypoints are defined as CartesianIndex does that limit us to matching at the level of resolution of the pixel grid? Is it the case that the CartesianIndex constructor accepts only integers?

@timholy
Copy link
Member

timholy commented Nov 11, 2017

Yes, CartesianIndex requires integers, as its main role is for multidimensional indexing.

I agree there's reason to support sub-pixel alignment. Are there established methods for finding such keypoints? If so, I'm open to switching.

@zygmuntszpak
Copy link
Member Author

Yes, there are established methods for finding keypoints to sub-pixel accuracy. Actually, the methods are quite generic and can be applied regardless of what particular keypoint detection algorithm was employed.

For example, one approach is based on fitting a quadratic to a neighbourhood of pixels surrounding the keypoint and taking the sub-pixel estimate as the minimum of the quadratic. An alternative approach is to consider the x and y coordinates separately and then fit a parabola through 3 points---the keypoint and its two immediate neighbours. There are of course several other methods, some of which are customised for camera calibration. Sub-pixel accuracy is a necessity in multiple-view geometry.

What alternative to CartesianIndex do you propose?
Should it be a Vector{NTuple{2,SVector{N,Float64}}} ?

@timholy
Copy link
Member

timholy commented Nov 12, 2017

Yes, I think that's a great choice.

@zygmuntszpak
Copy link
Member Author

If we make the proposed change, we may presumably break peoples code? How do we proceed from here? What is the formal process of deprecating the code? Do you have an inkling of other code in JuliaImages that may be affected?

@timholy
Copy link
Member

timholy commented Nov 14, 2017

Great questions. Wherever possible, we don't break things, and Julia has a nice @deprecate mechanism for making these changes as gentle as possible. However, changing only the outputs of a function is a particularly difficult problem, one for which @deprecate isn't applicable.

I think there are two potential ways we could go. One would be to create a match_keypoints! function that allows the user to specify the desired output type, e.g.,

matches = NTuple{2,SVector{N,Float64}}[]  # N will be something specific here, of course
match_keypoints!(matches, keypoints1, keypoints2, ...)

If it's possible to make match_keypoints! sufficiently flexible that it can also handle the current

matches = Vector{Vector{CartesianIndex{N}}}

then there's really nothing lost in this change. It might be hard to handle both a vector and a tuple however (I wouldn't know until trying). In that case I'm inclined to switch to the tuple anyway, all those Vectors take up quite a lot more memory and get in the way of reinterpret.

The second is to not worry about it, make the switch, and hope that by using a version bump we allow people to cope as needed by introducing version bounds in the package manager. We're going to have to bump the minor version anyway because if we change the default output type, that's a breaking change.

As far as other dependencies, I'm not sure. I tihnk of ImageFeatures as one of the "top level" packages in the JuliaImages ecosystem, which is borne out by

julia> Pkg.dependents("ImageFeatures")
0-element Array{AbstractString,1}

(It was registered only a few months ago, so is fairly new in terms of its visibility to the package manager.) So in terms of registered packages, there is nothing to worry about. People's private code might be a different matter, of course.

@zygmuntszpak
Copy link
Member Author

zygmuntszpak commented Nov 19, 2017

@timholy What if we modify CartesianIndex so that it can store Real numbers, but when it is used to index into arrays, or used to iterate, only the integer components of the numbers are used? I've been thinking about the cleanest way of modifying the ImageFeatures.jl in light of having obtained a deeper understanding of the current implementation (see [https://github.com/JuliaImages/Images.jl/issues/682]). If I could store the sub-pixel locations in a CartesianIndex which behaves like the existing integer CartesianIndex for the purposes of indexing into an array, then I don't think any code change would be required for the ImageFeatures.jl package.

Thinking beyond the particular problem of sub-pixel corners, I can imagine that one may want to be able to iterate over a multi-dimensional array by using the integer part of a set of coordinates. For example, perhaps our data is represented by a discrete set of voxels, but we have done some interpolation and so our coordinates do not fall on the voxel grid. Nevertheless, we may want to write algorithms that access the discrete neighbours.

Do you think a modification of CartesianIndex along the lines of what I suggested is feasible?

@SimonDanisch
Copy link
Member

SimonDanisch commented Nov 19, 2017

You could also define your own point type, and define a Base.to_index function for it, to make indexing magically work ;) If you use your own descendant of StaticVector, you'd also get a massive amount of vector functionality for free!

struct Point{N, T}
    x::NTuple{N, T}
end

Base.to_indices(A::AbstractArray, p::Tuple{<: Point}) = round.(Int, p[1].x)
A = rand(2, 2)
p = Point((1.5, 1.5))
A[p]

You could also implement interpolation for that type!

It would be nice if this could be a descendant of StaticVector to get loads of advanced features for free, but getindex get's overloaded in StaticArrays in a way that this doesn't work - would be interesting to figure out if it was overloaded incorrectly and if it's easily fixable ;)
I opened an issue: JuliaArrays/StaticArrays.jl#344

@zygmuntszpak
Copy link
Member Author

@SimonDanisch Thank you very much for the advice and for taking the time to come up with an example. The example was very helpful. Since I am still familiarising myself with Julia it took me a while to understand the example. Hence I decided to write down what I understood for the benefit of any other newcomer who may stumble upon this thread.

The example:

struct Point{N, T}
    x::NTuple{N, T}
end

Base.to_indices(A::AbstractArray, p::Tuple{<: Point}) = round.(Int, p[1].x)
A = rand(2, 2)
p = Point((1.5, 1.5))
A[p]

The type Point contains a field x, which is a tuple type containing exactly N elements of type T.
The statement p = Point((1.5, 1.5)) constructs a new instance of a Point. In particular, it produces a Point{2,Float64}, so N = 2 and T = Float64.

The function Base.to_indices(A, I::Tuple) is what one would need to implement in order to convert the tuple I to a tuple of indices for use in indexing into an array A (see https://docs.julialang.org/en/stable/stdlib/arrays/#Base.to_indices).

This function is implemented as follows:

Base.to_indices(A::AbstractArray, p::Tuple{<: Point}) = round.(Int, p[1].x)

The statement :

round.(Int, p[1].x)

says that each element of the x should be rounded to the nearest integer. The ., called broadcasting, is what applies the round function to each entry of x.

The second argument of Base.to_indices , p::Tuple{<: Point}, declares that the variable p is a tuple type which contains Point types.

I got stuck on the fact that p is a declared to be a tuple. I was expecting p to be declared as NTuple{N, T}, and hence was expecting round.(Int, p.x). I suppose that the statement A[p] must implicitly wrap p, which is a Point{2,Float64}, with a single additional tuple. Since tuples can be references by square brackets, that would explain why we have p[1].x.

I was thinking that if we make the change to CartesianIndex as I suggest, then I would probably not need to touch most of the ImageFeatures.jl code. However, SimonDanish's suggestion is also an excellent solution and I will try to proceed with that in the meantime.

@SimonDanisch
Copy link
Member

I got stuck on the fact that p is a declared to be a tuple.

That's indeed confusing and is the aftermath of indexing a multi dimensional array with a single element.
to_indices takes a tuple because it gets called called like this in the abstract array code:

#note, that indices isn't splatted, so it turns into a tuple:
getindex(A::AbstractArray, indices...) = getindex(A, to_indices(indices)...) 

so calling getindex(A, Point(1.5, 1.5)) will call to_indices(A, (Point(1.5, 1.5),)).

@SimonDanisch
Copy link
Member

@timholy What if we modify CartesianIndex so that it can store Real numbers

I think that's a bad idea - personally I don't have anything against it. But all floating point indexing got deprecated in Julia Base, so I imagine that you will have a hard time getting back floating point indexing into a base type.

@SimonDanisch
Copy link
Member

Especially if one considers, that an interpolated getindex would be the correct overload for a floating point variant - I'd be surprised to see that make its way into Julia Base ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants