optimized DFS iterator and added additional functionality #423

jkinable · 2017-08-04T20:45:14Z

I've re-implemented the DFS iterator:

new implementation is faster, more concise and better readable than original
added additional functionality which allows one to query the depth of a node in the DFS tree. Moreover, you can now also query the parent of a node in the DFS tree, as well as the edge between a child node and its parent. The latter is particularly useful in a multigraph. This functionality could (as far as I could tell) not be implemented through a TraversalListener. Obviously, this additional functionality requires a little more bookkeeping which comes with some computational overhead.

Let me know what you think about this new extension. I could also add the same functionality to the BFS iterator.

Finally, I believe there's a minor bug in the BFS iterator: the BFS iterator never invokes finishVertex(vertex);, i.e. the code should probably read:

@Override
    protected void encounterVertex(V vertex, E edge)
    {
        putSeenData(vertex, null);
        finishVertex(vertex);
        queue.add(vertex);
    }

jbduncan · 2017-08-04T21:59:03Z

jgrapht-core/src/main/java/org/jgrapht/traverse/DepthFirstIterator.java


-    private Deque<Object> stack = new ArrayDeque<>();
+    //private Stack<V> stack = new Stack<>();
+    private Stack<StackEntry> stack = new Stack<>();


Hi @jkinable. I've not had a look at the rest of the PR yet, but I wanted to say, looking at this, that I feel a bit iffy with Stack being used here rather than Deque.

If I may, I would like to suggest using Deque here instead, for a couple of reasons.

Firstly, according to the ArrayDeque javadoc, it's explained that ArrayDeque "is likely to be faster than Stack when used as a stack".

And finally, the Stack javadoc explains that it extends Vector, which IIRC has built-in synchronization, which I would reasonably guess makes Stack slower than necessary in this case, since AFAICT the stack isn't being used in an asynchronous fashion.

You seem to raise a good point. I completely forgot about this. I always find this amusingly contradictory: "This class is likely to be faster than Stack when used as a stack".
Any idea whether it is better to use a ArrayDeque or a LinkedList as stack?

Probably ArrayDeque I think.

LinkedList uses more memory per element than array-based data structures if I understand correctly (although, thinking more about it, this may be moot, since array-based data structures by their nature need to grow their internal arrays occasionally if they get too full, meaning array-based data structures are usually larger than necessary at any given point in time).

Furthermore I would guess that LinkedList is slower on modern hardware, since it uses pointers a lot, which I believe is not CPU-cache-friendly.

So I'm inclined to go for ArrayDeque unless profiling shows that LinkedList is better in some way that matters for JGraphT.

Hope this helps. :)

jsichi · 2017-08-16T21:51:32Z

@jkinable I haven't had a chance to review the DFS change yet...could we keep that separate from the other changes to make reviewing easier?

jkinable · 2017-08-17T13:52:16Z

@jsichi I've separated the DFS changes and the connectivity changes (separate PR).

jsichi · 2017-08-25T23:15:37Z

jgrapht-core/src/test/java/org/jgrapht/traverse/DepthFirstIteratorTest.java

@@ -55,7 +60,7 @@ String getExpectedStr2()
    @Override
    String getExpectedFinishString()
    {
-        return "6:4:9:2:8:7:5:3:1:orphan:";
+        return "1:3:6:5:7:9:4:8:2:orphan:";


You've eliminated the old white/gray/black bookkeeping that was producing the correct finish order. Why?

I'm not entirely sure that I understand your comment. In DepthFirstSearch there is no such thing as a 'correct finish order'. Quoting from the original author of the test class:

NOTE: This test uses hard-coded expected ordering isn't really guaranteed by the specification of
the algorithm. This could cause false failures if the traversal implementation changes.
In other words, alternative Depth First Search implementations may produce a different traversal order, while still producing a different but correct Depth First Search Tree.

The original implementation using white/gray/black bookkeeping was surprisingly slow. Some tests on reasonable sized graphs gave me one to two orders of magnitude speedup with the new implementation. Unless I misunderstood something, the book keeping was rather unnecessary.

It's true that the finish order is not deterministic, which I believe is what that comment meant. However, there are constraints on the allowable orderings:

http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/GraphAlgor/depthSearch.htm

So I don't think we can eliminate the existing logic. I'm guessing the dominant cost is in finding+removing nodes from the middle of the deque, so a better data structure is called for if we want to optimize that.

(If that's not possible, then instead of replacing the existing iterator, we could just add a new one.)

I studied the reference you stated - the coloring logic being used (white/grey/black) is just syntactic sugar which simplifies the discussion of the algorithm, as well as some of the proofs. The implementation in this PR satisfies all constraints wrt allowable orderings, while being significantly more efficient. It's quite easy to see how different (but valid) orderings can be obtained.

Example 1: triangle graph (a,b)(b,c)(c,a). A valid DFS ordering: [a,b,c]. Another valid ordering: [a,c,b].

Example 2: even without cycles, different orderings exist. For instance, in a graph with 2 edge (a,b)(a,c) again two orderings exist: [a,b,c] and [a,c,b].
The existence of different orderings is a direct consequence of the fact that in a graph, the neighbors of a vertex form an unordered set.

Just to be sure, I also checked some other reference works:

Graphs Networks and algorithms by Dieter Jungnickel

Introduction to Algorithms by Thomas H. Cormen , Charles E. Leiserson, Ronald L. Rivest, Clifford Stein
None of these references state additional constraints on feasible orderings aside from the ones that have already been implemented in this PR.

I'm referring to the constraint between discovery order and finish order. See the "parenthesis theorem" in the link above. Your getExpectedFinishString is identical to getExpectedStr2 because you are calling finishVertex as soon as you encounter a vertex, which is incorrect. We're not supposed to finish a vertex until we've finished everything reachable from it.

@jkinable @jsichi Indeed, you need some extra bookkeeping in order to call finishVertex() at the correct time. One solution is to add an extra flag on the stack entry to indicate whether this is the first time that this vertex has been added in the stack. When you pop() that entry, you can call finishVertex().

jsichi · 2017-08-25T23:15:49Z

jgrapht-core/src/test/java/org/jgrapht/traverse/DepthFirstIteratorTest.java

@@ -35,6 +39,7 @@
 * @author Liviu Rau, Patrick Sharp
 * @since Jul 30, 2003
 */
+//public class DepthFirstIteratorTest{}


Delete comment

jsichi · 2017-08-31T05:59:16Z

jgrapht-core/src/test/java/org/jgrapht/traverse/DepthFirstIteratorTest.java

@@ -55,7 +60,7 @@ String getExpectedStr2()
    @Override
    String getExpectedFinishString()
    {
-        return "6:4:9:2:8:7:5:3:1:orphan:";
+        return "1:3:6:5:7:9:4:8:2:orphan:";


I'm referring to the constraint between discovery order and finish order. See the "parenthesis theorem" in the link above. Your getExpectedFinishString is identical to getExpectedStr2 because you are calling finishVertex as soon as you encounter a vertex, which is incorrect. We're not supposed to finish a vertex until we've finished everything reachable from it.

d-michail · 2017-09-15T14:42:25Z

jgrapht-core/src/main/java/org/jgrapht/traverse/DepthFirstIterator.java

 */
 public class DepthFirstIterator<V, E>
-    extends CrossComponentIterator<V, E, DepthFirstIterator.VisitColor>
+    extends CrossComponentIterator<V, E, DepthFirstIterator.SearchNodeData>


Change to extends CrossComponentIterator<V, E, DepthFirstIterator<V, E>.SearchNodeData>.

mykolapolonskyi · 2018-08-15T14:59:31Z

hi there
I`ve found this PR very useful - can't understand status - has it been rejected?

jkinable · 2018-08-17T03:22:50Z

@mykolapolonskyi in order to make the DFS implementation much more efficient I had to take some shortcuts. The algorithm is supposed to invoke the finishVertex() method in a rather specific order, see: http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/GraphAlgor/depthSearch.htm
At least this is what the old (less efficient) implementation did. We always try to keep things the same as they were.
Notice however that:

the BestFirstIterator (incorrectly) doesn't call the finishVertex() method
aside from the above website, I don't know any other source that specifically states the correct finish order.

Either way, I need to find some time to finish this PR (meaning that I need to add some extra bookkeeping to call finishVertex in the right order).

Overmind91 · 2021-02-08T07:11:13Z

It is 2020 already, how much longer will it take?
For me, the library almost useless without this PR

Joris Kinable added 4 commits August 4, 2017 14:56

optimized DFS iterator and added additional functionality

6d39f13

Merge branch 'master' of https://github.com/jgrapht/jgrapht into dfs

1d3b72c

fixed merge conflict

e0a7821

fixed typo

d89dff5

jbduncan reviewed Aug 4, 2017

View reviewed changes

Joris Kinable and others added 3 commits August 4, 2017 18:14

swapped Stack for Deque

8bd3b2a

revising blockcutpointgraph

04ff42a

reimplemented biconnectivityinspector and blockcutpointgraph

a2c68a2

Joris Kinable added 2 commits August 17, 2017 09:34

reworked connectivity package

6fa0240

Reverting to commit 8bd3b2a

f23adda

jsichi requested changes Aug 25, 2017

View reviewed changes

deleted comment

95daef5

jsichi requested changes Aug 31, 2017

View reviewed changes

jkinable mentioned this pull request Sep 5, 2017

Feature request: Last traversed edge in BFI and DFI #427

Open

d-michail reviewed Sep 15, 2017

View reviewed changes

jkinable mentioned this pull request Apr 23, 2018

added methods to query the search tree in BFS iterator #560

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

optimized DFS iterator and added additional functionality #423

optimized DFS iterator and added additional functionality #423

jkinable commented Aug 4, 2017 •

edited

jbduncan Aug 4, 2017

jkinable Aug 4, 2017

jbduncan Aug 4, 2017 •

edited

jbduncan Aug 4, 2017

jsichi commented Aug 16, 2017

jkinable commented Aug 17, 2017

jsichi Aug 25, 2017

jkinable Aug 26, 2017

jsichi Aug 27, 2017

jkinable Aug 30, 2017

jsichi Aug 31, 2017

d-michail Sep 15, 2017

jsichi Aug 25, 2017

jsichi Aug 31, 2017

d-michail Sep 15, 2017

mykolapolonskyi commented Aug 15, 2018

jkinable commented Aug 17, 2018

Overmind91 commented Feb 8, 2021

optimized DFS iterator and added additional functionality #423

Are you sure you want to change the base?

optimized DFS iterator and added additional functionality #423

Conversation

jkinable commented Aug 4, 2017 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jbduncan Aug 4, 2017 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jsichi commented Aug 16, 2017

jkinable commented Aug 17, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mykolapolonskyi commented Aug 15, 2018

jkinable commented Aug 17, 2018

Overmind91 commented Feb 8, 2021

jkinable commented Aug 4, 2017 •

edited

jbduncan Aug 4, 2017 •

edited