ENH: Cache graphs objects when converting to a backend #7345

Merged
merged 13 commits into from Mar 31, 2024

Contributor

@eriknw eriknw commented Mar 13, 2024

This builds off of #7344.

Caching backend graph conversions can provide a huge performance (and usability) benefit to users. It comes at the cost of increased memory, so it would be nice to be able to control caching via config (#7225).

CC @rlratzel

@rlratzel

Thanks! This is great. I'll work on an example to demonstrate the speedup when multiple algo calls are made using the same Graph. It should be significant based on benchmark comparisons with and without conversion cost.

@eriknw
Contributor Author

eriknw commented Mar 14, 2024

How do y'all like this warning message?

In [4]: nx.pagerank(G, backend="cugraph")
/home/erwelch/git/networkx/networkx/utils/backends.py:910: UserWarning: Using cached graph for 'cugraph' backend in call to pagerank.

For the cache to be consistent (i.e., correct), the input graph must not have been manually mutated since the cached graph was created. Examples of manually mutating the graph data structures resulting in an inconsistent cache include:

>>> G[u][v][key] = val

and

>>> for u, v, d in G.edges(data=True):
...     d[key] = val

Using methods such as G.add_edge(u, v, weight=val) will correctly clear the cache to keep it consistent.
warnings.warn(
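
To make the two cases in the warning concrete, here is a minimal sketch using a toy stand-in class. `ToyGraph`, its `cache` attribute (playing the role of `G.__networkx_cache__`), and `total_weight` are all hypothetical names for illustration; this is not the NetworkX or nx-cugraph implementation.

```python
class ToyGraph:
    """Hypothetical stand-in for a graph; not the NetworkX implementation."""

    def __init__(self):
        self.adj = {}
        self.cache = {}  # plays the role of G.__networkx_cache__

    def add_edge(self, u, v, **attrs):
        # Both directions share one attribute dict.
        data = self.adj.setdefault(u, {}).setdefault(v, {})
        data.update(attrs)
        self.adj.setdefault(v, {})[u] = data
        self.cache.clear()  # mutating methods invalidate the cache


def total_weight(G):
    """Pretend backend algorithm that works on a cached, converted copy."""
    if "converted" not in G.cache:
        # "Conversion": snapshot each undirected edge's weight once.
        G.cache["converted"] = {
            frozenset((u, v)): d.get("weight", 1)
            for u, nbrs in G.adj.items()
            for v, d in nbrs.items()
        }
    return sum(G.cache["converted"].values())


G = ToyGraph()
G.add_edge(1, 2, weight=5)
first = total_weight(G)        # converts and caches the snapshot
G.adj[1][2]["weight"] = 20     # direct mutation: cache is not cleared
stale = total_weight(G)        # still computed from the old snapshot
G.add_edge(1, 2, weight=20)    # method call clears the cache
fresh = total_weight(G)        # reconverted from current data
```

In this toy, `stale` still reflects the old weight, while `fresh` picks up the new one because `add_edge` cleared the cache first.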

@eriknw
Contributor Author

eriknw commented Mar 15, 2024

In the dispatching meeting today, we proposed cache_converted_graphs as the configuration name (see #7225) and NETWORKX_CACHE_CONVERTED_GRAPHS as the environment variable to control whether to cache graph conversions.
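
A minimal sketch of how an environment variable like this could be read into a boolean. Only the variable name comes from the comment above; the helper name, the accepted spellings, and the explicit `default` argument are assumptions for illustration, not a description of how NetworkX parses it.

```python
import os


def cache_converted_graphs_enabled(environ, default):
    """Hypothetical helper: interpret NETWORKX_CACHE_CONVERTED_GRAPHS as a bool.

    `environ` is any mapping (pass os.environ in real use). `default` is
    returned when the variable is unset; the real default is not stated here.
    """
    raw = environ.get("NETWORKX_CACHE_CONVERTED_GRAPHS")
    if raw is None:
        return default
    # Assumed spellings for "enabled"; anything else counts as disabled.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


# Example: read from the real process environment with a caller-chosen default.
enabled = cache_converted_graphs_enabled(os.environ, default=True)
```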

Member

@dschult dschult left a comment

I've made a pass through. I hope my comments are clear. Please ask for clarification or push back against suggestions as desired. :}

@@ -146,7 +146,7 @@ def test_modularity_communities_directed_weighted():

# A large weight of the edge (2, 6) causes 6 to change group, even if it shares
# only one connection with the new group and 3 with the old one.
-    G[2][6]["weight"] = 20
+    G.add_edge(2, 6, weight=20)
Member

This indicates that caching would not work for this test. Did you anticipate that, or was it the result of trying the test in a cached environment that led to this test change?

I'm reluctant to switch all of our testing to use code that won't break caching.
Perhaps we could do it both ways -- a test that caching works (using add_edge) and a test that changing directly gives the wrong answer when using a backend with caching turned on. What do you think?

What breaks if we leave this test as is?

Contributor Author

@eriknw eriknw Mar 15, 2024

Nothing breaks if I leave this as is, because we disable caching when testing backends. But, I may sometimes experiment with enabling caching while testing, and if we can get it to work, it may be a nice option to add for testing (but probably not b/c of the maintenance burden).

Up to you. I can revert these changes to the tests. I don't think there would be that many more changes to be cache-friendly though, so I thought "might as well".

Member

How about, for now, revert and add a line like G.__networkx_cache__.clear()?
That signals that this kind of change would break caching. And when we get to testing caching in the tests, we can replace it with a test that caching would break and add the add_edge version that doesn't break.

Comment on lines 632 to 633
if attr is not None:
    G.__networkx_cache__.clear()
Member

I think we should leave out this barycenter cache clear. But I'm willing to leave it in for this PR.

I'd like to change barycenter to avoid the problem. When we make that change we can take this line out again.

Contributor Author

k, sounds good. I'd like to keep it in for now for maximum safety, but, yeah, I'd prefer to change the API so we can remove this. We can look towards more optimizations in the future.

@@ -94,6 +94,7 @@ def bidirectional_bfs():
path.append(u)
flow_value += augment(path)

R.__networkx_cache__.clear()
Member

I'm thinking that this function (edmonds_karp_core) should not be dispatchable. It is a helper function for the dispatchable function edmonds_karp. If it were written today, it would be a private function. I'm fine with making it a private function -- but it hasn't been (inertia, plus very unlikely backwards-compatibility concerns).

The other function in this module is edmonds_karp_impl which is also a helper function. Why is one helper function decorated as dispatchable and the other not? Can we pull the dispatchable from edmonds_karp_core? Would this mess you up in some way? Would it be better for me to make it a private function then?

(I guess I'm saying that this function should not be part of the networkx API in the sense of having backends implement it.)

Contributor Author

Sure, we can make it not dispatchable. When I initially added the dispatch decorator, I cast a pretty wide net, so if a function had a graph argument and its name didn't start with an underscore, I probably decorated it. We should probably remove it from more of these "supposed to be private" functions.

Member

I think it probably would be good to remove the dispatchable decorator from the helper functions that should be private. But it's tricky to determine which are helper functions. There are unfortunately quite a few of those in this library, so it makes sense to cast a wide net and go back to filter them out.

    use_cache
    and (cache := getattr(graph, "__networkx_cache__", None)) is not None
):
    cache = cache.setdefault("backends", {}).setdefault(backend_name, {})
Member

This line doesn't do what I think you intend here (cache is an empty dict after this line). When I proposed improving the setdefault function to CPython 15 years ago, the reviewer responded that setdefault probably shouldn't be used anymore -- it doesn't serve any function not provided elsewhere, and it's slow, confusing, and causes surprises. They couldn't remove it for backward-compatibility reasons. So... I was surprised it didn't disappear in Python 3. But in any case, this is more evidence that we should just ban setdefault from the library. :)

           if "backends" not in cache:
               cache["backends"] = {}
           if backend_name not in cache:
               cache[backend_name] = {}

Or, if you prefer

            cache["backends"] = cache.get("backends", {})
            cache[backend_name] = cache.get(backend_name, {})

Contributor Author

Fun story, thanks for sharing :)

I'll say the same thing to you--I don't think that code does what I think you intend. It's not what I intend.

My code is as I intend, and it works (I would have given a demo on Thursday if we had time). Consider this example:

>>> d = {}
>>> y = d.setdefault('x', {}).setdefault('y', {})
>>> d
{'x': {'y': {}}}
>>> y
{}
>>> d.setdefault('x', {}).setdefault('y', {}) is y
True

I don't use setdefault very often, but sometimes it's the right pattern to use, and I think this is one of those times. Heh, this is pretty much the only pattern it's good for. It's also gotten love since 15 years ago--it's faster, and it's now atomic (this is important to me).

I should probably add a comment that this creates nested dicts such as:

# Using `setdefault` creates nested dicts if they don't already exist.
# After calling setdefault, `cache` could be gotten like this:
# `cache = graph.__networkx_cache__["backends"][backend_name]`
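
For anyone reading along, a small side-by-side sketch of the two shapes being discussed. The variable names `flat` and `nested` and the placeholder key/value are made up for illustration; only the `if`-based snippet and the chained `setdefault` call come from the comments above.

```python
backend_name = "cugraph"

# Flat variant (the two `if` checks suggested above): both keys land on the
# top-level dict, so nothing is nested under "backends".
flat = {}
if "backends" not in flat:
    flat["backends"] = {}
if backend_name not in flat:
    flat[backend_name] = {}

# Chained setdefault: creates {"backends": {backend_name: {}}} if missing and
# returns the innermost dict, which is what gets bound to the name `cache`.
nested = {}
cache = nested.setdefault("backends", {}).setdefault(backend_name, {})
cache["some_key"] = "converted graph"  # placeholder key and value

print(flat)    # two sibling keys at the top level
print(nested)  # one key, with the backend dict nested inside it
```

Because `cache` is the same object as `nested["backends"][backend_name]`, anything stored through `cache` is visible from the outer dict on later calls.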

Member

Wow -- the resulting cache dict is the inner dict -- not the outer dict. Not what I thought this code was doing. And it all makes very good sense. And, I'll take back my comment about never using setdefault.

I think the code would be easier (at least for me) to read with different names for the two cache objects.
I also had to pause to re-read the if-clause here -- the "walrus" makes the code compact but harder for me to read (does two things, so I read it twice). I guess it saves a hasattr call.

What do you think about something like this code:

        if use_cache and hasattr(graph, "__networkx_cache__"):
            nx_cache = getattr(graph, "__networkx_cache__")
            # make nested {"backends": {backend_name: cache}} structure if not there.
            cache = nx_cache.setdefault("backends", {}).setdefault(backend_name, {})

Introducing a name like nx_cache is the part of this I think is really helpful. I think two names and the comment with a nested dict literal are sufficient for me to be able to read this 2 years from now.
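
As a rough, runnable version of that suggestion, the lookup could be wrapped in a small helper. The function name `get_backend_cache` and the two tiny classes are hypothetical and exist only to exercise it; this is a sketch of the pattern, not the code in `backends.py`.

```python
def get_backend_cache(graph, backend_name, use_cache=True):
    """Return the per-backend cache dict for `graph`, or None if unavailable."""
    if not use_cache:
        return None
    nx_cache = getattr(graph, "__networkx_cache__", None)
    if nx_cache is None:
        return None
    # Make the nested {"backends": {backend_name: cache}} structure if absent.
    return nx_cache.setdefault("backends", {}).setdefault(backend_name, {})


class WithCache:
    """Toy object that carries a cache dict, like a current graph would."""

    def __init__(self):
        self.__networkx_cache__ = {}


class WithoutCache:
    """Toy object with no cache attribute at all."""


g = WithCache()
cache = get_backend_cache(g, "cugraph")
same = get_backend_cache(g, "cugraph")            # second call reuses the dict
missing = get_backend_cache(WithoutCache(), "cugraph")
disabled = get_backend_cache(g, "cugraph", use_cache=False)
```

Using `getattr` with a default keeps a single attribute lookup (like the walrus version) while still giving the outer and inner dicts separate names.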

@eriknw
Contributor Author

eriknw commented Mar 17, 2024

Thanks for taking a close look @dschult. I really appreciate suggestions that help improve clarity. I've made the changes locally. I'm also paying off some technical debt, so it may be a day or two before I can update the PR. For example, I'm changing residual= so it's no longer a graph that the dispatcher knows about, which "feels right" to me and I hope will simplify some things.

@dschult
Member

dschult commented Mar 20, 2024

I guess we should add a sentence to the warning message about how to handle cases where they do want to mutate the graph directly. All they need to do is clear the cache. So perhaps add a final sentence like:

If you manually mutate the graph and you use caching, you should probably use `G.__networkx_cache__.clear()`.

Member

@dschult dschult left a comment

Thanks @eriknw ! These changes look good to me.

I do have a question about how the backends are envisioned for adjacency matrix info.
Q: When there is more than one edge attribute, there is more than one adjacency matrix to consider. Do we expect the backend graph structure to cache all the edge attribute matrices at once? I was thinking that each edge attribute would lead to a different backend representation (my grounding example being a sparse matrix representation). But reading through the latest commit, I am beginning to think we are designing this to clear all edge-attribute matrices when an unrelated edge attribute is changed.

How do nx-cugraph and nx-graphblas represent the graph? Does it include all edge attributes or only one at a time?

@eriknw
Contributor Author

eriknw commented Mar 21, 2024

nx-cugraph stores all edge attributes. It has a dict of arrays of values, and a dict of arrays of masks if necessary. The GraphBLAS backend uses a GraphBLAS Matrix to represent the graph. This can only handle a single attribute today, which can lead to multiple cached graphs.

How about I add a new function such as nx._clear_cache(G) that clears the cache? I like the idea of selectively clearing the cache based on what is changed, which will be easier to add to such a function.

@dschult
Member

dschult commented Mar 21, 2024

I like the idea of a function to clear the cache so if we change how we want to handle it, the changes in code will be less widely spread. Should that be a method on the Graph? That'd keep the function close to the data it is manipulating. But I'm open to a function if there's an advantage to that over a method.

@eriknw
Contributor Author

eriknw commented Mar 21, 2024

Good question. I prefer to use a function for now to see if we can keep it generic enough to operate on any backend graph or object that has __networkx_cache__ (edit: this will allow backends to use nx._clear_cache to clear their cache when appropriate). A function is also friendlier to duck-typed graphs (who knows what all networkx users do?), and it's safer to not add a required private method.

Similarly, a reason I don't assume graph objects have __networkx_cache__ is to not break users who are loading a networkx graph that was pickled with an earlier networkx version.
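
A minimal sketch of what such a generic function might look like, assuming it only needs to empty the cache dict when one exists. The name `clear_cache` and the two toy classes are hypothetical; the actual `nx._clear_cache` may do more (for example, the selective clearing mentioned above).

```python
def clear_cache(G):
    """Clear G's cache if it has one; silently do nothing otherwise."""
    cache = getattr(G, "__networkx_cache__", None)
    if cache is not None:
        cache.clear()


class DuckGraph:
    """Toy duck-typed graph that carries a cache dict."""

    def __init__(self):
        self.__networkx_cache__ = {"backends": {"cugraph": {"key": "value"}}}


class OldPickledGraph:
    """Toy stand-in for a graph object created before the cache existed."""


duck = DuckGraph()
clear_cache(duck)               # empties the existing cache dict in place
clear_cache(OldPickledGraph())  # no attribute, so this is a no-op
```

Because the function only relies on `getattr`, it works for any object that happens to carry the attribute and does not raise for objects that lack it.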

Member

@dschult dschult left a comment

I approve this PR. I'm sure there will be more changes as we move forward. But this is a good step!

@eriknw
Contributor Author

eriknw commented Mar 22, 2024

> I think it probably would be good to remove the dispatchable decorator from the helper functions that should be private. But it's tricky to determine which are helper functions. There are unfortunately quite a few of those in this library, so it makes sense to cast a wide net and go back to filter them out.

fyi, here is a tree view of all dispatched functions. Do any jump out at you that obviously shouldn't be dispatched?

Dispatched NetworkX functions (dispatch name in parentheses)
 ├─ algorithms
 │   ├─ approximation
 │   │   ├─ clique
 │   │   │   ├─ clique_removal
 │   │   │   ├─ large_clique_size
 │   │   │   ├─ max_clique
 │   │   │   └─ maximum_independent_set
 │   │   ├─ clustering_coefficient
 │   │   │   └─ average_clustering (approximate_average_clustering)
 │   │   ├─ connectivity
 │   │   │   ├─ all_pairs_node_connectivity (approximate_all_pairs_node_connectivity)
 │   │   │   ├─ local_node_connectivity (approximate_local_node_connectivity)
 │   │   │   └─ node_connectivity (approximate_node_connectivity)
 │   │   ├─ distance_measures
 │   │   │   └─ diameter (approximate_diameter)
 │   │   ├─ dominating_set
 │   │   │   ├─ min_edge_dominating_set
 │   │   │   └─ min_weighted_dominating_set
 │   │   ├─ kcomponents
 │   │   │   └─ k_components (approximate_k_components)
 │   │   ├─ matching
 │   │   │   └─ min_maximal_matching
 │   │   ├─ maxcut
 │   │   │   ├─ one_exchange
 │   │   │   └─ randomized_partitioning
 │   │   ├─ ramsey
 │   │   │   └─ ramsey_R2
 │   │   ├─ steinertree
 │   │   │   ├─ metric_closure
 │   │   │   └─ steiner_tree
 │   │   ├─ traveling_salesman
 │   │   │   ├─ asadpour_atsp
 │   │   │   ├─ christofides
 │   │   │   ├─ greedy_tsp
 │   │   │   ├─ held_karp_ascent
 │   │   │   ├─ simulated_annealing_tsp
 │   │   │   ├─ spanning_tree_distribution
 │   │   │   ├─ threshold_accepting_tsp
 │   │   │   └─ traveling_salesman_problem
 │   │   ├─ treewidth
 │   │   │   ├─ treewidth_decomp
 │   │   │   ├─ treewidth_min_degree
 │   │   │   └─ treewidth_min_fill_in
 │   │   └─ vertex_cover
 │   │       └─ min_weighted_vertex_cover
 │   ├─ assortativity
 │   │   ├─ connectivity
 │   │   │   └─ average_degree_connectivity
 │   │   ├─ correlation
 │   │   │   ├─ attribute_assortativity_coefficient
 │   │   │   ├─ degree_assortativity_coefficient
 │   │   │   ├─ degree_pearson_correlation_coefficient
 │   │   │   └─ numeric_assortativity_coefficient
 │   │   ├─ mixing
 │   │   │   ├─ attribute_mixing_dict
 │   │   │   ├─ attribute_mixing_matrix
 │   │   │   ├─ degree_mixing_dict
 │   │   │   └─ degree_mixing_matrix
 │   │   ├─ neighbor_degree
 │   │   │   └─ average_neighbor_degree
 │   │   └─ pairs
 │   │       ├─ node_attribute_xy
 │   │       └─ node_degree_xy
 │   ├─ asteroidal
 │   │   ├─ create_component_structure
 │   │   ├─ find_asteroidal_triple
 │   │   └─ is_at_free
 │   ├─ bipartite
 │   │   ├─ basic
 │   │   │   ├─ color
 │   │   │   ├─ degrees
 │   │   │   ├─ density
 │   │   │   ├─ is_bipartite
 │   │   │   ├─ is_bipartite_node_set
 │   │   │   └─ sets
 │   │   ├─ centrality
 │   │   │   ├─ betweenness_centrality (bipartite_betweenness_centrality)
 │   │   │   ├─ closeness_centrality (bipartite_closeness_centrality)
 │   │   │   └─ degree_centrality (bipartite_degree_centrality)
 │   │   ├─ cluster
 │   │   │   ├─ average_clustering (bipartite_average_clustering)
 │   │   │   ├─ latapy_clustering
 │   │   │   └─ robins_alexander_clustering
 │   │   ├─ covering
 │   │   │   └─ min_edge_cover (bipartite_min_edge_cover)
 │   │   ├─ edgelist
 │   │   │   ├─ parse_edgelist (bipartite_parse_edgelist)
 │   │   │   └─ read_edgelist (bipartite_read_edgelist)
 │   │   ├─ extendability
 │   │   │   └─ maximal_extendability
 │   │   ├─ generators
 │   │   │   ├─ alternating_havel_hakimi_graph
 │   │   │   ├─ complete_bipartite_graph
 │   │   │   ├─ configuration_model (bipartite_configuration_model)
 │   │   │   ├─ gnmk_random_graph
 │   │   │   ├─ havel_hakimi_graph (bipartite_havel_hakimi_graph)
 │   │   │   ├─ preferential_attachment_graph
 │   │   │   ├─ random_graph
 │   │   │   └─ reverse_havel_hakimi_graph
 │   │   ├─ matching
 │   │   │   ├─ eppstein_matching
 │   │   │   ├─ hopcroft_karp_matching
 │   │   │   ├─ minimum_weight_full_matching
 │   │   │   └─ to_vertex_cover
 │   │   ├─ matrix
 │   │   │   ├─ biadjacency_matrix
 │   │   │   └─ from_biadjacency_matrix
 │   │   ├─ projection
 │   │   │   ├─ collaboration_weighted_projected_graph
 │   │   │   ├─ generic_weighted_projected_graph
 │   │   │   ├─ overlap_weighted_projected_graph
 │   │   │   ├─ projected_graph
 │   │   │   └─ weighted_projected_graph
 │   │   ├─ redundancy
 │   │   │   └─ node_redundancy
 │   │   └─ spectral
 │   │       └─ spectral_bipartivity
 │   ├─ boundary
 │   │   ├─ edge_boundary
 │   │   └─ node_boundary
 │   ├─ bridges
 │   │   ├─ bridges
 │   │   ├─ has_bridges
 │   │   └─ local_bridges
 │   ├─ centrality
 │   │   ├─ betweenness
 │   │   │   ├─ betweenness_centrality
 │   │   │   └─ edge_betweenness_centrality
 │   │   ├─ betweenness_subset
 │   │   │   ├─ betweenness_centrality_subset
 │   │   │   └─ edge_betweenness_centrality_subset
 │   │   ├─ closeness
 │   │   │   ├─ closeness_centrality
 │   │   │   └─ incremental_closeness_centrality
 │   │   ├─ current_flow_betweenness
 │   │   │   ├─ approximate_current_flow_betweenness_centrality
 │   │   │   ├─ current_flow_betweenness_centrality
 │   │   │   └─ edge_current_flow_betweenness_centrality
 │   │   ├─ current_flow_betweenness_subset
 │   │   │   ├─ current_flow_betweenness_centrality_subset
 │   │   │   └─ edge_current_flow_betweenness_centrality_subset
 │   │   ├─ current_flow_closeness
 │   │   │   └─ current_flow_closeness_centrality
 │   │   ├─ degree_alg
 │   │   │   ├─ degree_centrality
 │   │   │   ├─ in_degree_centrality
 │   │   │   └─ out_degree_centrality
 │   │   ├─ dispersion
 │   │   │   └─ dispersion
 │   │   ├─ eigenvector
 │   │   │   ├─ eigenvector_centrality
 │   │   │   └─ eigenvector_centrality_numpy
 │   │   ├─ flow_matrix
 │   │   │   └─ flow_matrix_row
 │   │   ├─ group
 │   │   │   ├─ group_betweenness_centrality
 │   │   │   ├─ group_closeness_centrality
 │   │   │   ├─ group_degree_centrality
 │   │   │   ├─ group_in_degree_centrality
 │   │   │   ├─ group_out_degree_centrality
 │   │   │   └─ prominent_group
 │   │   ├─ harmonic
 │   │   │   └─ harmonic_centrality
 │   │   ├─ katz
 │   │   │   ├─ katz_centrality
 │   │   │   └─ katz_centrality_numpy
 │   │   ├─ laplacian
 │   │   │   └─ laplacian_centrality
 │   │   ├─ load
 │   │   │   ├─ edge_load_centrality
 │   │   │   └─ newman_betweenness_centrality
 │   │   ├─ percolation
 │   │   │   └─ percolation_centrality
 │   │   ├─ reaching
 │   │   │   ├─ global_reaching_centrality
 │   │   │   └─ local_reaching_centrality
 │   │   ├─ second_order
 │   │   │   └─ second_order_centrality
 │   │   ├─ subgraph_alg
 │   │   │   ├─ communicability_betweenness_centrality
 │   │   │   ├─ estrada_index
 │   │   │   ├─ subgraph_centrality
 │   │   │   └─ subgraph_centrality_exp
 │   │   ├─ trophic
 │   │   │   ├─ trophic_differences
 │   │   │   ├─ trophic_incoherence_parameter
 │   │   │   └─ trophic_levels
 │   │   └─ voterank_alg
 │   │       └─ voterank
 │   ├─ chains
 │   │   └─ chain_decomposition
 │   ├─ chordal
 │   │   ├─ chordal_graph_cliques
 │   │   ├─ chordal_graph_treewidth
 │   │   ├─ complete_to_chordal_graph
 │   │   ├─ find_induced_nodes
 │   │   └─ is_chordal
 │   ├─ clique
 │   │   ├─ enumerate_all_cliques
 │   │   ├─ find_cliques
 │   │   ├─ find_cliques_recursive
 │   │   ├─ make_clique_bipartite
 │   │   ├─ make_max_clique_graph
 │   │   ├─ max_weight_clique
 │   │   └─ node_clique_number
 │   ├─ cluster
 │   │   ├─ average_clustering
 │   │   ├─ clustering
 │   │   ├─ generalized_degree
 │   │   ├─ square_clustering
 │   │   ├─ transitivity
 │   │   └─ triangles
 │   ├─ coloring
 │   │   ├─ equitable_coloring
 │   │   │   ├─ equitable_color
 │   │   │   ├─ is_coloring
 │   │   │   ├─ is_equitable
 │   │   │   └─ pad_graph
 │   │   └─ greedy_coloring
 │   │       └─ greedy_color
 │   ├─ communicability_alg
 │   │   ├─ communicability
 │   │   └─ communicability_exp
 │   ├─ community
 │   │   ├─ asyn_fluid
 │   │   │   └─ asyn_fluidc
 │   │   ├─ centrality
 │   │   │   └─ girvan_newman
 │   │   ├─ community_utils
 │   │   │   └─ is_partition
 │   │   ├─ divisive
 │   │   │   ├─ edge_betweenness_partition
 │   │   │   └─ edge_current_flow_betweenness_partition
 │   │   ├─ kclique
 │   │   │   └─ k_clique_communities
 │   │   ├─ kernighan_lin
 │   │   │   └─ kernighan_lin_bisection
 │   │   ├─ label_propagation
 │   │   │   ├─ asyn_lpa_communities
 │   │   │   ├─ fast_label_propagation_communities
 │   │   │   └─ label_propagation_communities
 │   │   ├─ louvain
 │   │   │   ├─ louvain_communities
 │   │   │   └─ louvain_partitions
 │   │   ├─ lukes
 │   │   │   └─ lukes_partitioning
 │   │   ├─ modularity_max
 │   │   │   ├─ greedy_modularity_communities
 │   │   │   └─ naive_greedy_modularity_communities
 │   │   └─ quality
 │   │       ├─ inter_community_edges
 │   │       ├─ inter_community_non_edges
 │   │       ├─ intra_community_edges
 │   │       ├─ modularity
 │   │       └─ partition_quality
 │   ├─ components
 │   │   ├─ attracting
 │   │   │   ├─ attracting_components
 │   │   │   ├─ is_attracting_component
 │   │   │   └─ number_attracting_components
 │   │   ├─ biconnected
 │   │   │   ├─ articulation_points
 │   │   │   ├─ biconnected_component_edges
 │   │   │   ├─ biconnected_components
 │   │   │   └─ is_biconnected
 │   │   ├─ connected
 │   │   │   ├─ connected_components
 │   │   │   ├─ is_connected
 │   │   │   ├─ node_connected_component
 │   │   │   └─ number_connected_components
 │   │   ├─ semiconnected
 │   │   │   └─ is_semiconnected
 │   │   ├─ strongly_connected
 │   │   │   ├─ condensation
 │   │   │   ├─ is_strongly_connected
 │   │   │   ├─ kosaraju_strongly_connected_components
 │   │   │   ├─ number_strongly_connected_components
 │   │   │   ├─ strongly_connected_components
 │   │   │   └─ strongly_connected_components_recursive
 │   │   └─ weakly_connected
 │   │       ├─ is_weakly_connected
 │   │       ├─ number_weakly_connected_components
 │   │       └─ weakly_connected_components
 │   ├─ connectivity
 │   │   ├─ connectivity
 │   │   │   ├─ all_pairs_node_connectivity
 │   │   │   ├─ average_node_connectivity
 │   │   │   ├─ edge_connectivity
 │   │   │   ├─ local_edge_connectivity
 │   │   │   ├─ local_node_connectivity
 │   │   │   └─ node_connectivity
 │   │   ├─ cuts
 │   │   │   ├─ minimum_edge_cut
 │   │   │   ├─ minimum_node_cut
 │   │   │   ├─ minimum_st_edge_cut
 │   │   │   └─ minimum_st_node_cut
 │   │   ├─ disjoint_paths
 │   │   │   ├─ edge_disjoint_paths
 │   │   │   └─ node_disjoint_paths
 │   │   ├─ edge_augmentation
 │   │   │   ├─ bridge_augmentation
 │   │   │   ├─ collapse
 │   │   │   ├─ complement_edges
 │   │   │   ├─ greedy_k_edge_augmentation
 │   │   │   ├─ is_k_edge_connected
 │   │   │   ├─ is_locally_k_edge_connected
 │   │   │   ├─ k_edge_augmentation
 │   │   │   ├─ one_edge_augmentation
 │   │   │   ├─ partial_k_edge_augmentation
 │   │   │   ├─ unconstrained_bridge_augmentation
 │   │   │   ├─ unconstrained_one_edge_augmentation
 │   │   │   ├─ weighted_bridge_augmentation
 │   │   │   └─ weighted_one_edge_augmentation
 │   │   ├─ edge_kcomponents
 │   │   │   ├─ bridge_components
 │   │   │   ├─ general_k_edge_subgraphs
 │   │   │   ├─ k_edge_components
 │   │   │   └─ k_edge_subgraphs
 │   │   ├─ kcomponents
 │   │   │   └─ k_components
 │   │   ├─ kcutsets
 │   │   │   └─ all_node_cuts
 │   │   ├─ stoerwagner
 │   │   │   └─ stoer_wagner
 │   │   └─ utils
 │   │       ├─ build_auxiliary_edge_connectivity
 │   │       └─ build_auxiliary_node_connectivity
 │   ├─ core
 │   │   ├─ core_number
 │   │   ├─ k_core
 │   │   ├─ k_corona
 │   │   ├─ k_crust
 │   │   ├─ k_shell
 │   │   ├─ k_truss
 │   │   └─ onion_layers
 │   ├─ covering
 │   │   ├─ is_edge_cover
 │   │   └─ min_edge_cover
 │   ├─ cuts
 │   │   ├─ boundary_expansion
 │   │   ├─ conductance
 │   │   ├─ cut_size
 │   │   ├─ edge_expansion
 │   │   ├─ mixing_expansion
 │   │   ├─ node_expansion
 │   │   ├─ normalized_cut_size
 │   │   └─ volume
 │   ├─ cycles
 │   │   ├─ chordless_cycles
 │   │   ├─ cycle_basis
 │   │   ├─ find_cycle
 │   │   ├─ girth
 │   │   ├─ minimum_cycle_basis
 │   │   ├─ recursive_simple_cycles
 │   │   └─ simple_cycles
 │   ├─ d_separation
 │   │   ├─ find_minimal_d_separator
 │   │   ├─ is_d_separator
 │   │   └─ is_minimal_d_separator
 │   ├─ dag
 │   │   ├─ all_topological_sorts
 │   │   ├─ ancestors
 │   │   ├─ antichains
 │   │   ├─ compute_v_structures
 │   │   ├─ dag_longest_path
 │   │   ├─ dag_longest_path_length
 │   │   ├─ dag_to_branching
 │   │   ├─ descendants
 │   │   ├─ has_cycle
 │   │   ├─ is_aperiodic
 │   │   ├─ is_directed_acyclic_graph
 │   │   ├─ lexicographical_topological_sort
 │   │   ├─ root_to_leaf_paths
 │   │   ├─ topological_generations
 │   │   ├─ topological_sort
 │   │   ├─ transitive_closure
 │   │   ├─ transitive_closure_dag
 │   │   └─ transitive_reduction
 │   ├─ distance_measures
 │   │   ├─ barycenter
 │   │   ├─ center
 │   │   ├─ diameter
 │   │   ├─ eccentricity
 │   │   ├─ effective_graph_resistance
 │   │   ├─ kemeny_constant
 │   │   ├─ periphery
 │   │   ├─ radius
 │   │   └─ resistance_distance
 │   ├─ distance_regular
 │   │   ├─ intersection_array
 │   │   ├─ is_distance_regular
 │   │   └─ is_strongly_regular
 │   ├─ dominance
 │   │   ├─ dominance_frontiers
 │   │   └─ immediate_dominators
 │   ├─ dominating
 │   │   ├─ dominating_set
 │   │   └─ is_dominating_set
 │   ├─ efficiency_measures
 │   │   ├─ efficiency
 │   │   ├─ global_efficiency
 │   │   └─ local_efficiency
 │   ├─ euler
 │   │   ├─ eulerian_circuit
 │   │   ├─ eulerian_path
 │   │   ├─ eulerize
 │   │   ├─ has_eulerian_path
 │   │   ├─ is_eulerian
 │   │   └─ is_semieulerian
 │   ├─ flow
 │   │   ├─ boykovkolmogorov
 │   │   │   └─ boykov_kolmogorov
 │   │   ├─ capacityscaling
 │   │   │   └─ capacity_scaling
 │   │   ├─ dinitz_alg
 │   │   │   └─ dinitz
 │   │   ├─ edmondskarp
 │   │   │   └─ edmonds_karp
 │   │   ├─ gomory_hu
 │   │   │   └─ gomory_hu_tree
 │   │   ├─ maxflow
 │   │   │   ├─ maximum_flow
 │   │   │   ├─ maximum_flow_value
 │   │   │   ├─ minimum_cut
 │   │   │   └─ minimum_cut_value
 │   │   ├─ mincost
 │   │   │   ├─ cost_of_flow
 │   │   │   ├─ max_flow_min_cost
 │   │   │   ├─ min_cost_flow
 │   │   │   └─ min_cost_flow_cost
 │   │   ├─ networksimplex
 │   │   │   └─ network_simplex
 │   │   ├─ preflowpush
 │   │   │   └─ preflow_push
 │   │   ├─ shortestaugmentingpath
 │   │   │   └─ shortest_augmenting_path
 │   │   └─ utils
 │   │       ├─ build_flow_dict
 │   │       ├─ build_residual_network
 │   │       └─ detect_unboundedness
 │   ├─ graph_hashing
 │   │   ├─ weisfeiler_lehman_graph_hash
 │   │   └─ weisfeiler_lehman_subgraph_hashes
 │   ├─ graphical
 │   │   ├─ is_digraphical
 │   │   ├─ is_graphical
 │   │   ├─ is_multigraphical
 │   │   ├─ is_pseudographical
 │   │   ├─ is_valid_degree_sequence_erdos_gallai
 │   │   └─ is_valid_degree_sequence_havel_hakimi
 │   ├─ hierarchy
 │   │   └─ flow_hierarchy
 │   ├─ hybrid
 │   │   ├─ is_kl_connected
 │   │   └─ kl_connected_subgraph
 │   ├─ isolate
 │   │   ├─ is_isolate
 │   │   ├─ isolates
 │   │   └─ number_of_isolates
 │   ├─ isomorphism
 │   │   ├─ isomorph
 │   │   │   ├─ could_be_isomorphic
 │   │   │   ├─ fast_could_be_isomorphic
 │   │   │   ├─ faster_could_be_isomorphic
 │   │   │   └─ is_isomorphic
 │   │   ├─ tree_isomorphism
 │   │   │   ├─ assign_levels
 │   │   │   ├─ root_trees
 │   │   │   ├─ rooted_tree_isomorphism
 │   │   │   └─ tree_isomorphism
 │   │   └─ vf2pp
 │   │       ├─ vf2pp_all_isomorphisms
 │   │       ├─ vf2pp_is_isomorphic
 │   │       └─ vf2pp_isomorphism
 │   ├─ link_analysis
 │   │   ├─ hits_alg
 │   │   │   └─ hits
 │   │   └─ pagerank_alg
 │   │       ├─ google_matrix
 │   │       └─ pagerank
 │   ├─ link_prediction
 │   │   ├─ adamic_adar_index
 │   │   ├─ cn_soundarajan_hopcroft
 │   │   ├─ common_neighbor_centrality
 │   │   ├─ jaccard_coefficient
 │   │   ├─ preferential_attachment
 │   │   ├─ ra_index_soundarajan_hopcroft
 │   │   ├─ resource_allocation_index
 │   │   └─ within_inter_cluster
 │   ├─ lowest_common_ancestors
 │   │   ├─ all_pairs_lowest_common_ancestor
 │   │   ├─ lowest_common_ancestor
 │   │   └─ tree_all_pairs_lowest_common_ancestor
 │   ├─ matching
 │   │   ├─ is_matching
 │   │   ├─ is_maximal_matching
 │   │   ├─ is_perfect_matching
 │   │   ├─ max_weight_matching
 │   │   ├─ maximal_matching
 │   │   └─ min_weight_matching
 │   ├─ minors
 │   │   └─ contraction
 │   │       ├─ contracted_edge
 │   │       ├─ contracted_nodes
 │   │       └─ quotient_graph
 │   ├─ mis
 │   │   └─ maximal_independent_set
 │   ├─ moral
 │   │   └─ moral_graph
 │   ├─ node_classification
 │   │   ├─ harmonic_function
 │   │   └─ local_and_global_consistency
 │   ├─ non_randomness
 │   │   └─ non_randomness
 │   ├─ operators
 │   │   ├─ all
 │   │   │   ├─ compose_all
 │   │   │   ├─ disjoint_union_all
 │   │   │   ├─ intersection_all
 │   │   │   └─ union_all
 │   │   ├─ binary
 │   │   │   ├─ compose
 │   │   │   ├─ difference
 │   │   │   ├─ disjoint_union
 │   │   │   ├─ full_join
 │   │   │   ├─ intersection
 │   │   │   ├─ symmetric_difference
 │   │   │   └─ union
 │   │   ├─ product
 │   │   │   ├─ cartesian_product
 │   │   │   ├─ corona_product
 │   │   │   ├─ lexicographic_product
 │   │   │   ├─ modular_product
 │   │   │   ├─ power
 │   │   │   ├─ rooted_product
 │   │   │   ├─ strong_product
 │   │   │   └─ tensor_product
 │   │   └─ unary
 │   │       ├─ complement
 │   │       └─ reverse
 │   ├─ planarity
 │   │   ├─ check_planarity
 │   │   ├─ check_planarity_recursive
 │   │   ├─ get_counterexample
 │   │   ├─ get_counterexample_recursive
 │   │   └─ is_planar
 │   ├─ polynomials
 │   │   ├─ chromatic_polynomial
 │   │   └─ tutte_polynomial
 │   ├─ reciprocity
 │   │   ├─ overall_reciprocity
 │   │   └─ reciprocity
 │   ├─ regular
 │   │   ├─ is_k_regular
 │   │   ├─ is_regular
 │   │   └─ k_factor
 │   ├─ richclub
 │   │   └─ rich_club_coefficient
 │   ├─ shortest_paths
 │   │   ├─ astar
 │   │   │   ├─ astar_path
 │   │   │   └─ astar_path_length
 │   │   ├─ dense
 │   │   │   ├─ floyd_warshall
 │   │   │   ├─ floyd_warshall_numpy
 │   │   │   ├─ floyd_warshall_predecessor_and_distance
 │   │   │   └─ reconstruct_path
 │   │   ├─ generic
 │   │   │   ├─ all_pairs_all_shortest_paths
 │   │   │   ├─ all_shortest_paths
 │   │   │   ├─ average_shortest_path_length
 │   │   │   ├─ has_path
 │   │   │   ├─ shortest_path
 │   │   │   ├─ shortest_path_length
 │   │   │   └─ single_source_all_shortest_paths
 │   │   ├─ unweighted
 │   │   │   ├─ all_pairs_shortest_path
 │   │   │   ├─ all_pairs_shortest_path_length
 │   │   │   ├─ bidirectional_shortest_path
 │   │   │   ├─ predecessor
 │   │   │   ├─ single_source_shortest_path
 │   │   │   ├─ single_source_shortest_path_length
 │   │   │   ├─ single_target_shortest_path
 │   │   │   └─ single_target_shortest_path_length
 │   │   └─ weighted
 │   │       ├─ all_pairs_bellman_ford_path
 │   │       ├─ all_pairs_bellman_ford_path_length
 │   │       ├─ all_pairs_dijkstra
 │   │       ├─ all_pairs_dijkstra_path
 │   │       ├─ all_pairs_dijkstra_path_length
 │   │       ├─ bellman_ford_path
 │   │       ├─ bellman_ford_path_length
 │   │       ├─ bellman_ford_predecessor_and_distance
 │   │       ├─ bidirectional_dijkstra
 │   │       ├─ dijkstra_path
 │   │       ├─ dijkstra_path_length
 │   │       ├─ dijkstra_predecessor_and_distance
 │   │       ├─ find_negative_cycle
 │   │       ├─ goldberg_radzik
 │   │       ├─ johnson
 │   │       ├─ multi_source_dijkstra
 │   │       ├─ multi_source_dijkstra_path
 │   │       ├─ multi_source_dijkstra_path_length
 │   │       ├─ negative_edge_cycle
 │   │       ├─ single_source_bellman_ford
 │   │       ├─ single_source_bellman_ford_path
 │   │       ├─ single_source_bellman_ford_path_length
 │   │       ├─ single_source_dijkstra
 │   │       ├─ single_source_dijkstra_path
 │   │       └─ single_source_dijkstra_path_length
 │   ├─ similarity
 │   │   ├─ generate_random_paths
 │   │   ├─ graph_edit_distance
 │   │   ├─ optimal_edit_paths
 │   │   ├─ optimize_edit_paths
 │   │   ├─ optimize_graph_edit_distance
 │   │   ├─ panther_similarity
 │   │   └─ simrank_similarity
 │   ├─ simple_paths
 │   │   ├─ all_simple_edge_paths
 │   │   ├─ all_simple_paths
 │   │   ├─ is_simple_path
 │   │   └─ shortest_simple_paths
 │   ├─ smallworld
 │   │   ├─ lattice_reference
 │   │   ├─ omega
 │   │   ├─ random_reference
 │   │   └─ sigma
 │   ├─ smetric
 │   │   └─ s_metric
 │   ├─ sparsifiers
 │   │   └─ spanner
 │   ├─ structuralholes
 │   │   ├─ constraint
 │   │   ├─ effective_size
 │   │   ├─ local_constraint
 │   │   ├─ mutual_weight
 │   │   └─ normalized_mutual_weight
 │   ├─ summarization
 │   │   ├─ dedensify
 │   │   └─ snap_aggregation
 │   ├─ swap
 │   │   ├─ connected_double_edge_swap
 │   │   ├─ directed_edge_swap
 │   │   └─ double_edge_swap
 │   ├─ time_dependent
 │   │   └─ cd_index
 │   ├─ tournament
 │   │   ├─ hamiltonian_path
 │   │   ├─ is_reachable
 │   │   ├─ is_strongly_connected (tournament_is_strongly_connected)
 │   │   ├─ is_tournament
 │   │   ├─ random_tournament
 │   │   ├─ score_sequence
 │   │   └─ tournament_matrix
 │   ├─ traversal
 │   │   ├─ beamsearch
 │   │   │   └─ bfs_beam_edges
 │   │   ├─ breadth_first_search
 │   │   │   ├─ bfs_edges
 │   │   │   ├─ bfs_labeled_edges
 │   │   │   ├─ bfs_layers
 │   │   │   ├─ bfs_predecessors
 │   │   │   ├─ bfs_successors
 │   │   │   ├─ bfs_tree
 │   │   │   ├─ descendants_at_distance
 │   │   │   └─ generic_bfs_edges
 │   │   ├─ depth_first_search
 │   │   │   ├─ dfs_edges
 │   │   │   ├─ dfs_labeled_edges
 │   │   │   ├─ dfs_postorder_nodes
 │   │   │   ├─ dfs_predecessors
 │   │   │   ├─ dfs_preorder_nodes
 │   │   │   ├─ dfs_successors
 │   │   │   └─ dfs_tree
 │   │   ├─ edgebfs
 │   │   │   └─ edge_bfs
 │   │   └─ edgedfs
 │   │       └─ edge_dfs
 │   ├─ tree
 │   │   ├─ branchings
 │   │   │   ├─ branching_weight
 │   │   │   ├─ greedy_branching
 │   │   │   ├─ maximum_branching
 │   │   │   ├─ maximum_spanning_arborescence
 │   │   │   ├─ minimal_branching
 │   │   │   ├─ minimum_branching
 │   │   │   └─ minimum_spanning_arborescence
 │   │   ├─ coding
 │   │   │   ├─ from_nested_tuple
 │   │   │   ├─ from_prufer_sequence
 │   │   │   ├─ to_nested_tuple
 │   │   │   └─ to_prufer_sequence
 │   │   ├─ decomposition
 │   │   │   └─ junction_tree
 │   │   ├─ mst
 │   │   │   ├─ boruvka_mst_edges
 │   │   │   ├─ kruskal_mst_edges
 │   │   │   ├─ maximum_spanning_edges
 │   │   │   ├─ maximum_spanning_tree
 │   │   │   ├─ minimum_spanning_edges
 │   │   │   ├─ minimum_spanning_tree
 │   │   │   ├─ number_of_spanning_trees
 │   │   │   ├─ partition_spanning_tree
 │   │   │   ├─ prim_mst_edges
 │   │   │   └─ random_spanning_tree
 │   │   ├─ operations
 │   │   │   └─ join_trees
 │   │   └─ recognition
 │   │       ├─ is_arborescence
 │   │       ├─ is_branching
 │   │       ├─ is_forest
 │   │       └─ is_tree
 │   ├─ triads
 │   │   ├─ all_triads
 │   │   ├─ all_triplets
 │   │   ├─ is_triad
 │   │   ├─ random_triad
 │   │   ├─ triad_type
 │   │   ├─ triadic_census
 │   │   └─ triads_by_type
 │   ├─ vitality
 │   │   └─ closeness_vitality
 │   ├─ voronoi
 │   │   └─ voronoi_cells
 │   ├─ walks
 │   │   └─ number_of_walks
 │   └─ wiener
 │       ├─ gutman_index
 │       ├─ schultz_index
 │       └─ wiener_index
 ├─ classes
 │   └─ function
 │       └─ is_negatively_weighted
 ├─ convert
 │   ├─ from_dict_of_dicts
 │   ├─ from_dict_of_lists
 │   ├─ from_edgelist
 │   ├─ to_dict_of_lists
 │   └─ to_edgelist
 ├─ convert_matrix
 │   ├─ from_numpy_array
 │   ├─ from_pandas_adjacency
 │   ├─ from_pandas_edgelist
 │   ├─ from_scipy_sparse_array
 │   ├─ to_numpy_array
 │   ├─ to_pandas_adjacency
 │   ├─ to_pandas_edgelist
 │   └─ to_scipy_sparse_array
 ├─ drawing
 │   ├─ nx_agraph
 │   │   ├─ from_agraph
 │   │   └─ read_dot (agraph_read_dot)
 │   └─ nx_pydot
 │       ├─ from_pydot
 │       └─ read_dot (pydot_read_dot)
 ├─ generators
 │   ├─ atlas
 │   │   ├─ graph_atlas
 │   │   └─ graph_atlas_g
 │   ├─ classic
 │   │   ├─ balanced_tree
 │   │   ├─ barbell_graph
 │   │   ├─ binomial_tree
 │   │   ├─ circulant_graph
 │   │   ├─ circular_ladder_graph
 │   │   ├─ complete_graph
 │   │   ├─ complete_multipartite_graph
 │   │   ├─ cycle_graph
 │   │   ├─ dorogovtsev_goltsev_mendes_graph
 │   │   ├─ empty_graph
 │   │   ├─ full_rary_tree
 │   │   ├─ kneser_graph
 │   │   ├─ ladder_graph
 │   │   ├─ lollipop_graph
 │   │   ├─ null_graph
 │   │   ├─ path_graph
 │   │   ├─ star_graph
 │   │   ├─ tadpole_graph
 │   │   ├─ trivial_graph
 │   │   ├─ turan_graph
 │   │   └─ wheel_graph
 │   ├─ cographs
 │   │   └─ random_cograph
 │   ├─ community
 │   │   ├─ LFR_benchmark_graph
 │   │   ├─ caveman_graph
 │   │   ├─ connected_caveman_graph
 │   │   ├─ gaussian_random_partition_graph
 │   │   ├─ planted_partition_graph
 │   │   ├─ random_partition_graph
 │   │   ├─ relaxed_caveman_graph
 │   │   ├─ ring_of_cliques
 │   │   ├─ stochastic_block_model
 │   │   └─ windmill_graph
 │   ├─ degree_seq
 │   │   ├─ configuration_model
 │   │   ├─ degree_sequence_tree
 │   │   ├─ directed_configuration_model
 │   │   ├─ directed_havel_hakimi_graph
 │   │   ├─ expected_degree_graph
 │   │   ├─ havel_hakimi_graph
 │   │   └─ random_degree_sequence_graph
 │   ├─ directed
 │   │   ├─ gn_graph
 │   │   ├─ gnc_graph
 │   │   ├─ gnr_graph
 │   │   ├─ random_k_out_graph
 │   │   ├─ random_uniform_k_out_graph
 │   │   └─ scale_free_graph
 │   ├─ duplication
 │   │   ├─ duplication_divergence_graph
 │   │   └─ partial_duplication_graph
 │   ├─ ego
 │   │   └─ ego_graph
 │   ├─ expanders
 │   │   ├─ chordal_cycle_graph
 │   │   ├─ is_regular_expander
 │   │   ├─ margulis_gabber_galil_graph
 │   │   ├─ maybe_regular_expander
 │   │   ├─ paley_graph
 │   │   └─ random_regular_expander_graph
 │   ├─ geometric
 │   │   ├─ geographical_threshold_graph
 │   │   ├─ geometric_edges
 │   │   ├─ geometric_soft_configuration_graph
 │   │   ├─ navigable_small_world_graph
 │   │   ├─ random_geometric_graph
 │   │   ├─ soft_random_geometric_graph
 │   │   ├─ thresholded_random_geometric_graph
 │   │   └─ waxman_graph
 │   ├─ harary_graph
 │   │   ├─ hkn_harary_graph
 │   │   └─ hnm_harary_graph
 │   ├─ internet_as_graphs
 │   │   └─ random_internet_as_graph
 │   ├─ intersection
 │   │   ├─ general_random_intersection_graph
 │   │   ├─ k_random_intersection_graph
 │   │   └─ uniform_random_intersection_graph
 │   ├─ interval_graph
 │   │   └─ interval_graph
 │   ├─ joint_degree_seq
 │   │   ├─ directed_joint_degree_graph
 │   │   ├─ is_valid_directed_joint_degree
 │   │   ├─ is_valid_joint_degree
 │   │   └─ joint_degree_graph
 │   ├─ lattice
 │   │   ├─ grid_2d_graph
 │   │   ├─ grid_graph
 │   │   ├─ hexagonal_lattice_graph
 │   │   ├─ hypercube_graph
 │   │   └─ triangular_lattice_graph
 │   ├─ line
 │   │   ├─ inverse_line_graph
 │   │   └─ line_graph
 │   ├─ mycielski
 │   │   ├─ mycielski_graph
 │   │   └─ mycielskian
 │   ├─ nonisomorphic_trees
 │   │   ├─ nonisomorphic_trees
 │   │   └─ number_of_nonisomorphic_trees
 │   ├─ random_clustered
 │   │   └─ random_clustered_graph
 │   ├─ random_graphs
 │   │   ├─ barabasi_albert_graph
 │   │   ├─ connected_watts_strogatz_graph
 │   │   ├─ dense_gnm_random_graph
 │   │   ├─ dual_barabasi_albert_graph
 │   │   ├─ extended_barabasi_albert_graph
 │   │   ├─ fast_gnp_random_graph
 │   │   ├─ gnm_random_graph
 │   │   ├─ gnp_random_graph
 │   │   ├─ newman_watts_strogatz_graph
 │   │   ├─ powerlaw_cluster_graph
 │   │   ├─ random_kernel_graph
 │   │   ├─ random_lobster
 │   │   ├─ random_powerlaw_tree
 │   │   ├─ random_powerlaw_tree_sequence
 │   │   ├─ random_regular_graph
 │   │   ├─ random_shell_graph
 │   │   └─ watts_strogatz_graph
 │   ├─ small
 │   │   ├─ LCF_graph
 │   │   ├─ bull_graph
 │   │   ├─ chvatal_graph
 │   │   ├─ cubical_graph
 │   │   ├─ desargues_graph
 │   │   ├─ diamond_graph
 │   │   ├─ dodecahedral_graph
 │   │   ├─ frucht_graph
 │   │   ├─ heawood_graph
 │   │   ├─ hoffman_singleton_graph
 │   │   ├─ house_graph
 │   │   ├─ house_x_graph
 │   │   ├─ icosahedral_graph
 │   │   ├─ krackhardt_kite_graph
 │   │   ├─ moebius_kantor_graph
 │   │   ├─ octahedral_graph
 │   │   ├─ pappus_graph
 │   │   ├─ petersen_graph
 │   │   ├─ sedgewick_maze_graph
 │   │   ├─ tetrahedral_graph
 │   │   ├─ truncated_cube_graph
 │   │   ├─ truncated_tetrahedron_graph
 │   │   └─ tutte_graph
 │   ├─ social
 │   │   ├─ davis_southern_women_graph
 │   │   ├─ florentine_families_graph
 │   │   ├─ karate_club_graph
 │   │   └─ les_miserables_graph
 │   ├─ spectral_graph_forge
 │   │   └─ spectral_graph_forge
 │   ├─ stochastic
 │   │   └─ stochastic_graph
 │   ├─ sudoku
 │   │   └─ sudoku_graph
 │   ├─ time_series
 │   │   └─ visibility_graph
 │   ├─ trees
 │   │   ├─ prefix_tree
 │   │   ├─ prefix_tree_recursive
 │   │   ├─ random_labeled_rooted_forest
 │   │   ├─ random_labeled_rooted_tree
 │   │   ├─ random_labeled_tree
 │   │   ├─ random_tree
 │   │   ├─ random_unlabeled_rooted_forest
 │   │   ├─ random_unlabeled_rooted_tree
 │   │   └─ random_unlabeled_tree
 │   └─ triads
 │       └─ triad_graph
 ├─ linalg
 │   ├─ algebraicconnectivity
 │   │   ├─ algebraic_connectivity
 │   │   ├─ fiedler_vector
 │   │   ├─ spectral_bisection
 │   │   └─ spectral_ordering
 │   ├─ attrmatrix
 │   │   ├─ attr_matrix
 │   │   └─ attr_sparse_matrix
 │   ├─ bethehessianmatrix
 │   │   └─ bethe_hessian_matrix
 │   ├─ graphmatrix
 │   │   ├─ adjacency_matrix
 │   │   └─ incidence_matrix
 │   ├─ laplacianmatrix
 │   │   ├─ directed_combinatorial_laplacian_matrix
 │   │   ├─ directed_laplacian_matrix
 │   │   ├─ laplacian_matrix
 │   │   ├─ normalized_laplacian_matrix
 │   │   └─ total_spanning_tree_weight
 │   ├─ modularitymatrix
 │   │   ├─ directed_modularity_matrix
 │   │   └─ modularity_matrix
 │   └─ spectrum
 │       ├─ adjacency_spectrum
 │       ├─ bethe_hessian_spectrum
 │       ├─ laplacian_spectrum
 │       ├─ modularity_spectrum
 │       └─ normalized_laplacian_spectrum
 ├─ readwrite
 │   ├─ adjlist
 │   │   ├─ parse_adjlist
 │   │   └─ read_adjlist
 │   ├─ edgelist
 │   │   ├─ parse_edgelist
 │   │   ├─ read_edgelist
 │   │   └─ read_weighted_edgelist
 │   ├─ gexf
 │   │   └─ read_gexf
 │   ├─ gml
 │   │   ├─ parse_gml
 │   │   └─ read_gml
 │   ├─ graph6
 │   │   ├─ from_graph6_bytes
 │   │   └─ read_graph6
 │   ├─ graphml
 │   │   ├─ parse_graphml
 │   │   └─ read_graphml
 │   ├─ json_graph
 │   │   ├─ adjacency
 │   │   │   └─ adjacency_graph
 │   │   ├─ cytoscape
 │   │   │   └─ cytoscape_graph
 │   │   ├─ node_link
 │   │   │   └─ node_link_graph
 │   │   └─ tree
 │   │       └─ tree_graph
 │   ├─ leda
 │   │   ├─ parse_leda
 │   │   └─ read_leda
 │   ├─ multiline_adjlist
 │   │   ├─ parse_multiline_adjlist
 │   │   └─ read_multiline_adjlist
 │   ├─ pajek
 │   │   ├─ parse_pajek
 │   │   └─ read_pajek
 │   └─ sparse6
 │       ├─ from_sparse6_bytes
 │       └─ read_sparse6
 └─ relabel
     ├─ convert_node_labels_to_integers
     └─ relabel_nodes

@rlratzel rlratzel left a comment

Thanks! I can't wait to see this feature get merged. I didn't notice anything wrong but I did have a few questions.

networkx/algorithms/tree/branchings.py (outdated, resolved)
Comment on lines 1050 to 1052
"You may also use `G.__networkx_cache__.clear()` to "
"manually clear the cache, or set `G.__networkx_cache__` "
"to None to disable caching for G. Enable or disable "

Do we want to advertise direct access to the dunder? I wonder if it would be better to support manually clearing the cache later with a new API or even just nx._clear_cache().

Contributor Author

@eriknw eriknw Mar 26, 2024

Dan suggested this here: #7345 (comment)

Would it be better to suggest nx._clear_cache(G) instead? Should we make it not private?

I do like that better, but I don't feel strongly either way. I'm also thinking we can keep it named with the sunder since it's easier to go private->public later.
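
For readers following this thread, here is a minimal sketch of what a cache-clearing helper could look like. `ToyGraph` and `clear_cache` are hypothetical stand-ins written for illustration; they are not the NetworkX graph class or the actual `nx._clear_cache` implementation. The only details taken from the discussion are the `__networkx_cache__` attribute name and the idea that setting it to None (or deleting it) disables caching for a graph.

```python
class ToyGraph:
    """Hypothetical stand-in for a graph object (not the NetworkX class)."""

    def __init__(self):
        # Attribute name taken from the discussion; a dict means caching is enabled.
        self.__networkx_cache__ = {}


def clear_cache(G):
    """Clear G's cache if it has one; do nothing if caching is disabled."""
    cache = getattr(G, "__networkx_cache__", None)
    if cache is not None:
        cache.clear()


G = ToyGraph()
G.__networkx_cache__["backends"] = {"toy_backend": "converted graph"}
clear_cache(G)
cleared = dict(G.__networkx_cache__)  # the cache is empty again

G.__networkx_cache__ = None  # caching disabled for this graph
clear_cache(G)               # must not raise

H = ToyGraph()
del H.__networkx_cache__     # attribute removed entirely
clear_cache(H)               # must not raise either
```

A helper of this shape keeps callers from touching the dunder directly, which is the trade-off being weighed above: the attribute stays an implementation detail, and the helper can later be made public without changing call sites.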

networkx/utils/backends.py (outdated, resolved)
graph_name=graph_name,
)
if use_cache and nx_cache is not None:
# Remove old cached items that are no longer necessary since they

I think I lost track of how this works; can there be multiple converted graphs cached for each backend? I was under the impression the cache contains only the last converted graph for each backend, and the logic above was to support returning a cached graph that's potentially a superset (attribute-wise) of the graph needed. If that's not the case, I'm just wondering if this could use an unexpectedly large amount of memory (thinking of nx-cugraph instances and GPU memory in particular).
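
To make the question concrete, here is a toy model of a per-backend cache in which a cached conversion can serve any request whose edge attributes it already covers. Everything here (`find_cached`, `store`, the `"toy_backend"` name, and the pruning rule) is an assumption made for illustration, not a description of what `backends.py` actually does.

```python
def find_cached(cache, backend, needed_attrs):
    """Return a cached conversion whose edge attributes cover needed_attrs, else None."""
    needed = frozenset(needed_attrs)
    for cached_attrs, converted in cache.get(backend, {}).items():
        if needed <= cached_attrs:  # cached graph is a superset, attribute-wise
            return converted
    return None


def store(cache, backend, attrs, converted):
    """Store a conversion and drop older entries that it makes redundant."""
    attrs = frozenset(attrs)
    per_backend = cache.setdefault(backend, {})
    # Assumed pruning rule: an older entry is redundant when the new one
    # carries every attribute the older one has.
    for old_attrs in [a for a in per_backend if a <= attrs]:
        del per_backend[old_attrs]
    per_backend[attrs] = converted


cache = {}
store(cache, "toy_backend", {"weight"}, "graph with weight")
store(cache, "toy_backend", {"weight", "capacity"}, "graph with weight+capacity")

hit = find_cached(cache, "toy_backend", {"weight"})   # served by the superset entry
miss = find_cached(cache, "toy_backend", {"length"})  # nothing covers "length"
entries = len(cache["toy_backend"])                   # the subset entry was pruned
```

In this model, conversions with disjoint attribute sets would each keep their own entry, which is the kind of memory growth the question is worried about.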

@dschult
Member

dschult commented Mar 26, 2024

I think we are running into a problem of expectations. Each of us (Rick, Erik, Dan and Mridul) has a different idea of what the cache is supposed to store. We're conflating storing graph objects, storing function results, and storing partial results that could be used for future computations. While it might be possible to provide all of these with a single caching mechanism, it is also possible that it'd be better to provide separate mechanisms for the different types of caching.

As for caching converted graphs, the difficulty as I understand it is that some backends -- like cugraph -- convert all the node and edge attributes into the new structure, while other backends -- like graphblas -- only convert one edge attribute. So some will naturally only want one version of the graph cached while others will want an option for more than one cached representation. Yet others will likely want to cache partially computed results.

I'd like our caching mechanism to be cross-backend friendly. That is, one backend should not interfere with another's cache. Then each backend is responsible for updating their cached objects. I guess I'm envisioning a system where NetworkX provides a way for the backends to mark each cached object as needing to be cleared when certain NetworkX graph mutations are made. Then the backends don't directly clear the whole cache. They clear "their" stored objects as needed. And NetworkX clears cached objects from any backend when graph mutations are made. Which cached objects? The ones that have been tagged as being cleared for that mutation.

For example: we could have a dict of cache keys to be cleared when nodes are added/removed, another when edges are added/removed. I'm not sure how to handle the cases when node/edge attributes are changed.

I think the current setup tries to clear the whole cache when any mutation is made. And that might be the right implementation. But we will be losing the opportunity to cache e.g. a graphblas matrix representation for attribute "edge_length" when we update the attribute "edge_capacity". Is this a big deal? I don't know. :) But I guess the first check of our design could be "does it work well for cugraph, graphblas and an imagined scipy_sparse backend"?
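
The selective-clearing idea described above can be sketched as a cache whose entries are tagged with the mutation kinds that should invalidate them. This is only an illustration of the proposal: `TaggedCache`, its methods, and the mutation tags (including the per-attribute `"edge_attr:<name>"` tags, one possible answer to the open question about attribute changes) are hypothetical and not part of NetworkX.

```python
from collections import defaultdict


class TaggedCache:
    """Toy cache: entries are dropped only by the mutation kinds they are tagged with."""

    def __init__(self):
        self._data = {}
        self._by_mutation = defaultdict(set)  # mutation kind -> cache keys

    def discard(self, key):
        """Remove one entry and forget all of its tags."""
        self._data.pop(key, None)
        for keys in self._by_mutation.values():
            keys.discard(key)

    def put(self, key, value, invalidated_by):
        self.discard(key)  # avoid stale tags from an earlier put
        self._data[key] = value
        for kind in invalidated_by:
            self._by_mutation[kind].add(key)

    def get(self, key):
        return self._data.get(key)

    def notify(self, kind):
        """Called by the graph after a mutation of the given kind."""
        for key in list(self._by_mutation.get(kind, ())):
            self.discard(key)


cache = TaggedCache()
cache.put(("graphblas", "edge_length"), "matrix for edge_length",
          invalidated_by={"edge_added", "edge_removed", "edge_attr:edge_length"})
cache.put(("graphblas", "edge_capacity"), "matrix for edge_capacity",
          invalidated_by={"edge_added", "edge_removed", "edge_attr:edge_capacity"})

cache.notify("edge_attr:edge_capacity")  # only the capacity matrix is dropped
kept = cache.get(("graphblas", "edge_length"))
dropped = cache.get(("graphblas", "edge_capacity"))

cache.notify("edge_added")               # structural change drops the rest
after_edge_added = cache.get(("graphblas", "edge_length"))
```

Because each key carries the backend name, one backend's entries never collide with another's, which is the cross-backend property asked for above.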

@eriknw
Contributor Author

eriknw commented Mar 26, 2024

Quick reactions:

> But I guess the first check of our design could be "does it work well for cugraph, graphblas and an imagined scipy_sparse backend"?

I agree with this goal. This has been a principle driving the design of dispatching. And I believe this PR does work well for all three.

> I think the current setup tries to clear the whole cache when any mutation is made.

Currently, yes, for two important reasons:

  • it's the safest option given that we're adding caching right before releasing
  • it's pragmatic and a huge incremental gain

I like the idea of selectively clearing the cache based on what kind of mutation is made, and this PR doesn't prevent us from doing that in the future. If we tried to add selective cache clearing now, I worry it would either delay 3.3 or not get in 3.3. I would rather get this PR in 3.3, then give us plenty of time to discuss improving caching.
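
As a rough illustration of the "clear everything on any mutation" approach, the sketch below uses a hypothetical `MiniGraph` class and a pretend `convert` function (neither is NetworkX code). It also shows why the cache can only stay consistent when mutations go through graph methods: editing an edge's attribute dict directly bypasses the clearing step.

```python
class MiniGraph:
    """Hypothetical undirected graph that clears its whole cache on any mutation."""

    def __init__(self):
        self._adj = {}
        self.cache = {}

    def add_edge(self, u, v, **attrs):
        data = dict(attrs)
        self._adj.setdefault(u, {})[v] = data
        self._adj.setdefault(v, {})[u] = data  # both directions share one dict
        self.cache.clear()                     # safest option: drop everything

    def __getitem__(self, u):
        return self._adj[u]


def convert(G, backend_name):
    """Pretend conversion: snapshot the edges, reusing a cached snapshot if present."""
    if backend_name not in G.cache:
        G.cache[backend_name] = {
            (u, v): dict(d) for u, nbrs in G._adj.items() for v, d in nbrs.items()
        }
    return G.cache[backend_name]


G = MiniGraph()
G.add_edge(0, 1, weight=1.0)
first = convert(G, "toy_backend")

G.add_edge(1, 2, weight=2.0)   # method call clears the cache
second = convert(G, "toy_backend")
fresh = (1, 2) in second       # the new edge is visible

G[0][1]["weight"] = 5.0        # direct mutation: the cache is not cleared
stale = convert(G, "toy_backend")[(0, 1)]["weight"]  # still the old value
```

Selective clearing could later replace the single `self.cache.clear()` call without changing how callers use the graph, which is why the coarse version does not block that follow-up work.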

@dschult
Member

dschult commented Mar 26, 2024

I agree with all those comments, @eriknw.
The goal here is to get a minimal caching interface in for 3.3.
I was responding to what @rlratzel asked about caching once for each graph vs potentially caching more than once. And then I got carried away with looking forward.

I have been (hopefully) more productive too. I went through your list of dispatchable functions (~4 comments up). Here are the ones I think are really just helper functions.

Two groups of such are:

  1. functions ending in "_recursive". They are the same implementation as the version without "_recursive". The signature should be the same. The algorithm is the same. So I don't think there is any advantage to making them dispatchable. (might be no harm either... but could lead to confusion later.)
  2. functions ending in "_numpy". They use a matrix manipulation algorithm rather than graph traversal, but the signature should be the same as the version without "_numpy". Here, the algorithm is not the same, so perhaps there is a reason to keep it dispatchable. But in the spirit of having backends provide the algorithm for a given function signature, it feels to me like these shouldn't be dispatchable. Again, there might not be harm in making them dispatchable, except that somebody writing a backend might try to implement them both.

To be clear, I'm not saying these need to have dispatching removed. I'm saying that these are functions I flagged as not needing dispatching because they either duplicate another function (e.g. _recursive) or are really just helper functions.

My list of functions (including the module info on previous line):

 │   │   ├─ treewidth
 │   │   │   ├─ treewidth_decomp

 │   ├─ asteroidal
 │   │   ├─ create_component_structure

 │   ├─ clique
 │   │   ├─ find_cliques_recursive

 │   ├─ coloring
 │   │   ├─ equitable_coloring
 │   │   │   └─ pad_graph

 │   ├─ planarity
 │   │   ├─ check_planarity_recursive
 │   │   ├─ get_counterexample_recursive

@dschult
Member

dschult commented Mar 26, 2024

I have approved this PR and my recent postings only continue conversation. They don't add to anything needed for this PR.

It sounds like @rlratzel looked through this fairly closely too.

@rlratzel rlratzel left a comment

Latest changes LGTM, thanks! The remaining questions from my prior review were mainly for my own understanding and need not hold up my approval.

@dschult dschult requested a review from rossbar March 28, 2024 18:31
@dschult
Copy link
Member

dschult commented Mar 28, 2024

@rossbar can you take a look at this PR? Some attendees of GTC have requested more info about the caching data shown in Rick and Mridul's talk there. So, it'd be good to get this in.

@jarrodmillman once this is approved and merged, can we make another rc (or a full release)? We probably need to get the numpy issues working for the full release, though. If that starts taking a while (more than 2 more weeks or so?), maybe we can push those into 3.3.1 or something.

Member

@MridulS MridulS left a comment

Nothing blocking here for me! I do want to revisit the backends.py file later; we should probably explain the architecture somewhere.

- Enable caching to be disabled by setting `__networkx_cache__` to None (or delete it).
- Improve warning to say how to manually clear cache.
- Update some `_dispatchable` decorators that were discovered to mutate data.
- Make the residual Graph an implementation detail and not dispatchable.
- Improve dispatch tests that mutate input (but more still needs done).
@MridulS MridulS changed the title from "Cache backend graphs" to "ENH: Cache graphs objects when converting to a backend" Mar 31, 2024
@MridulS
Member

MridulS commented Mar 31, 2024

Hmm, rebasing the PR also didn't fix the benchmark unknown commit error. I'll have a look at the benchmark bits later. Merging this in for now! Thanks everyone!!

@MridulS MridulS merged commit 6e84e1e into networkx:main Mar 31, 2024
39 of 41 checks passed
@jarrodmillman jarrodmillman added this to the 3.3 milestone Mar 31, 2024
cvanelteren pushed a commit to cvanelteren/networkx that referenced this pull request Apr 22, 2024
* Minimal PR to add cache dict to graphs and clear them

* Cache backend graph conversions

* Don't use cache when testing backends

* Better caching

* fix typo

* Don't be silly; be optimistic. Also, update comments

* Add warning when using a cached value

* Make caching more robust

- Enable caching to be disabled by setting `__networkx_cache__` to None (or delete it).
- Improve warning to say how to manually clear cache.
- Update some `_dispatchable` decorators that were discovered to mutate data.
- Make the residual Graph an implementation detail and not dispatchable.
- Improve dispatch tests that mutate input (but more still needs done).

* Add config to control caching

* Use `nx._clear_cache` to clear the cache

* Add note about config being global

* Disable cache for `maximum_branching` internal graph

* DRY

5 participants