Unify boxes in result by default? #48

dpsanders · 2019-12-14T16:49:19Z

There's an implementation using LightGraphs in #5

It doesn't seem so easy to write an implementation by hand. It's basically a union-find algorithm

dpsanders · 2019-12-14T16:49:47Z

There should be a unify=true keyword argument (true by default)

Suavesito-Olimpiada · 2021-05-27T21:17:34Z

Hi! I decided to take a chance into this and found there is something weird with the implementation described above for 2 dimensions and above. The weird case is easily described with the next example

       ┌───────────┐
       │           │
       │           │
       │      B₁   │
┌──────┼───┐       │
│      │   │       │
│      └───┼───────┘
│          │
│    B₂    │ ┌────┐
│          │ │ B₃ │
│          │ └────┘
└──────────┘

with the above algorithm the boxes are reduced to this

┌──────────────────┐
│                  │
│                  │
│                  │
│                  │
│                  │
│       B₁∪B₂      │
│                  │
│           ┌────┐ │
│           │ B₃ │ │
│           └────┘ │
└──────────────────┘

But I think thay should be reduced to just B₁∪B₂∪B₃.

In #64 I add the unify keyword with a small but fast implementation of union-find (weighted and with path compression, reference on the algorithm here).

The implementation is quite fast, two examples are the next ones (first from the tutorial)

julia> using IntervalArithmetic, IntervalOptimisation, BenchmarkTools

julia> f(x) = x^2 - 2x + 1
f (generic function with 1 method)

julia> @time minval, minimisers = minimise(f, 0..2; tol=1e-6, unify=false);
  3.626429 seconds (15.91 M allocations: 1.621 GiB, 8.09% gc time)

julia> @show length(minimisers)
length(minimisers) = 2990
2990

julia> @btime unified_minimisers = IntervalOptimisation.unify_boxes(minimisers)
  29.142 ms (75 allocations: 117.45 KiB)
14-element Vector{Interval{Float64}}:
 [0.998687, 1.00136]
 [0.998642, 0.998687]
 [0.998635, 0.998642]
 [0.998623, 0.998627]
 [1.00136, 1.00137]
 [1.00135, 1.00137]
 [1.00138, 1.00139]
 [1.00137, 1.00138]
 [1.00135, 1.00136]
 [0.998632, 0.998634]
 [0.998621, 0.998623]
 [1.00137, 1.00138]
 [1.00137, 1.00138]
 [0.998617, 0.998619]

the other mine

julia> g(x) = abs(x[1]-x[2]) + (x[1]+x[2])^2
g (generic function with 1 method)

julia> @time minval, minimisers = minimise(g, (-1+1e-10..1)×(-1..1); tol=1e-10, unify=false); # initialize with these give a special bad behaviour
  1.815546 seconds (8.21 M allocations: 747.165 MiB, 18.85% gc time)

julia> @show length(minimisers)
length(minimisers) = 27670
27670

julia> @btime unified_minimisers = IntervalOptimisation.unify_boxes(minimisers)
  2.666 s (28 allocations: 946.11 KiB)
1-element Vector{IntervalBox{2, Float64}}:
 [-3.80622e-07, 3.80622e-07] × [-3.80672e-07, 3.80625e-07]

One thing is that if the implementation is changed so that the example showed at the begginig returns B₁∪B₂∪B₃, unify_boxes can be written to be faster (now is cuadratic on the number of minimisers, but it can be O(n log(n))) and avoid the union-find algorithm altogether. (edit: typos and wording)

I'm open to thoughts and ideas. Hope is useful.

P.D. Kudos to asciiflow for the ascii drawing. 😛

dpsanders · 2021-05-28T06:27:18Z

I don't understand why the algorithm doesn't reduce your example to B₁ ∪ B₂ ∪ B₃.

Suavesito-Olimpiada · 2021-05-28T06:53:45Z

Is because the B₁∩B₃ and B₂∩B₃ is empty, even if (B₁∪B₂)∩B₃ is not, and the algorithm is just testing pairwise the intersections.

dpsanders · 2021-05-28T15:47:29Z

Nice catch! Does your version give the correct result?

dpsanders · 2021-05-28T16:01:57Z

A solution seems to be to modify the list of intervals as you go by taking the union (hull) when two intervals are combined, and checking the intersection with this modified list instead of the original list.

dpsanders · 2021-05-28T16:04:10Z

Hmmm. But actually in your original example we would prefer not to unify (B1 ∪ B2) with B3, since they are actually disjoint. Maybe taking the hull is actually not a good idea.

dpsanders · 2021-05-28T16:05:23Z

We can just return separate lists for each connected component and the user can take the hull if they prefer; or we can give a convenience function to do so.

Suavesito-Olimpiada · 2021-05-28T22:00:41Z

Maybe we can add three options for unify, :hull, :connected and :none?

Suavesito-Olimpiada · 2021-05-28T22:06:14Z

I can add the algorithm for the hull too. I'll add it to the PR and make the changes on the documentation and add tests.

dpsanders · 2021-06-09T16:47:31Z

@Suavesito-Olimpiada: Sorry for the delay in replying. I think it would be great to have these different options. Is there any progress on this?

dpsanders · 2021-06-09T17:38:19Z

Note that there is also an implementation of disjoint sets in DataStructures.jl. I'm not sure how it compares to yours.

Suavesito-Olimpiada · 2021-06-09T20:49:26Z

I can check the differences between my inplementation and theirs, also see performan differences and reply back later.

Suavesito-Olimpiada · 2021-06-09T20:53:37Z

On the hull algorithm, I still don't have progress. But once I finish my semester I'll do it. :)

dpsanders · 2021-06-11T14:35:30Z

I think there should be a way to use interval trees for this. There is an implementation in Julia.

dpsanders · 2021-06-11T14:38:20Z

Or rather R-trees

Suavesito-Olimpiada · 2021-06-11T14:51:58Z

I'll check out both R-trees and Interval Trees, so I can understand better both structures. But as I see (just over-looking both articles) I think a variant of R-trees is closer to the hull idea (that is, so we can get the maximum number of non-intersecting hulls of intersecting intervals).

dpsanders · 2021-06-11T14:54:43Z

I think the idea of R-trees is to make it more efficient to check which other boxes a given box intersects.

Suavesito-Olimpiada · 2021-06-11T15:02:53Z

From the R-Tree article

The key idea of the data structure is to group nearby objects and represent them with their minimum bounding rectangle in the next higher level of the tree

and from the Interval Tree article

...it allows one to efficiently find all intervals that overlap with any given interval or point

dpsanders · 2021-06-11T15:05:31Z

And then

The key idea is to use the bounding boxes to decide whether or not to search inside a subtree. In this way, most of the nodes in the tree are never read during a search

The bounding boxes are used to narrow down the set of existing boxes that a new box must be checked against, as far as I understand.

dpsanders · 2021-06-11T15:06:31Z

See Samet, Foundations of Multidimensional and Metric Data Structures, chap. 3

Suavesito-Olimpiada · 2021-06-11T15:14:45Z

Ok, ok I think both structures are used to optimise the search of intervals that overlaps other intervals, but I need to check the differences. Thanks for the information and all the references, I'll check them out. 😃

Suavesito-Olimpiada linked a pull request May 27, 2021 that will close this issue

Add unify keyword and default if to true #64

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unify boxes in result by default? #48

Unify boxes in result by default? #48

dpsanders commented Dec 14, 2019

dpsanders commented Dec 14, 2019

Suavesito-Olimpiada commented May 27, 2021 •

edited

dpsanders commented May 28, 2021

Suavesito-Olimpiada commented May 28, 2021 •

edited

dpsanders commented May 28, 2021

dpsanders commented May 28, 2021

dpsanders commented May 28, 2021

dpsanders commented May 28, 2021

Suavesito-Olimpiada commented May 28, 2021

Suavesito-Olimpiada commented May 28, 2021

dpsanders commented Jun 9, 2021

dpsanders commented Jun 9, 2021

Suavesito-Olimpiada commented Jun 9, 2021

Suavesito-Olimpiada commented Jun 9, 2021

dpsanders commented Jun 11, 2021

dpsanders commented Jun 11, 2021

Suavesito-Olimpiada commented Jun 11, 2021

dpsanders commented Jun 11, 2021

Suavesito-Olimpiada commented Jun 11, 2021

dpsanders commented Jun 11, 2021

dpsanders commented Jun 11, 2021

Suavesito-Olimpiada commented Jun 11, 2021

Unify boxes in result by default? #48

Unify boxes in result by default? #48

Comments

dpsanders commented Dec 14, 2019

dpsanders commented Dec 14, 2019

Suavesito-Olimpiada commented May 27, 2021 • edited

dpsanders commented May 28, 2021

Suavesito-Olimpiada commented May 28, 2021 • edited

dpsanders commented May 28, 2021

dpsanders commented May 28, 2021

dpsanders commented May 28, 2021

dpsanders commented May 28, 2021

Suavesito-Olimpiada commented May 28, 2021

Suavesito-Olimpiada commented May 28, 2021

dpsanders commented Jun 9, 2021

dpsanders commented Jun 9, 2021

Suavesito-Olimpiada commented Jun 9, 2021

Suavesito-Olimpiada commented Jun 9, 2021

dpsanders commented Jun 11, 2021

dpsanders commented Jun 11, 2021

Suavesito-Olimpiada commented Jun 11, 2021

dpsanders commented Jun 11, 2021

Suavesito-Olimpiada commented Jun 11, 2021

dpsanders commented Jun 11, 2021

dpsanders commented Jun 11, 2021

Suavesito-Olimpiada commented Jun 11, 2021

Suavesito-Olimpiada commented May 27, 2021 •

edited

Suavesito-Olimpiada commented May 28, 2021 •

edited