Make metapath walk faster by grouping adjacency lists by node type #1399

kjun9 · 2020-04-29T04:59:56Z

Pulled out while working on #934 - currently this demo is a bit unusable and doesn't provide a good enough experience to be converted into a demo notebook, since it takes hours for it to run on a laptop, but with the changes from this PR, it should run within ~ 10 minutes or so depending on the parameters used.

This mainly updates EdgeData to store an inner set of dicts within each adjacency list, so that we group adjacency elements by their node type. This could probably be useful for other heterogeneous algorithms that want to traverse the graph either for random walks or SAGE-type sampling - currently it's just been used for Metapath2Vec specifically. At its current state in this PR, the speedup for Metapath2vec looks pretty great, but graph.neighbours becomes quite a bit slower for homogeneous graphs

Benchmark against develop for metapath walk:

----------------------------------------------------------------------------------------------------------------- benchmark: 2 tests -----------------------------------------------------------------------------------------------------------------
Name (time in us)                                                   Min                    Max                   Mean                StdDev                 Median                   IQR            Outliers         OPS            Rounds  Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_uniformrandommetapathwalk (NOW)                 409.0180 (1.0)       1,415.0450 (1.0)         508.5108 (1.0)         89.4222 (1.0)         480.2570 (1.0)         38.4945 (1.0)       126;171  1,966.5264 (1.0)        1349           1
test_benchmark_uniformrandommetapathwalk (0007_1cca6b6)     11,785.8720 (28.82)    20,027.5200 (14.15)    15,493.4845 (30.47)    1,442.1304 (16.13)    15,570.4430 (32.42)    1,697.6358 (44.10)        15;3     64.5433 (0.03)         61           1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Things do get a bit slower for just getting neighbours without caring about types, and constructing the stellargraph object

---------------------------------------------------------------------------------------- benchmark 'StellarGraph neighbours': 2 tests ----------------------------------------------------------------------------------------
Name (time in us)                                     Min                   Max                  Mean              StdDev              Median                IQR            Outliers         OPS            Rounds  Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_get_neighbours (0009_1cca6b6)     673.0350 (1.0)      1,753.2520 (1.0)        772.2845 (1.0)      112.2834 (1.0)      738.3375 (1.0)      46.6965 (1.0)       120;151  1,294.8595 (1.0)        1244           1
test_benchmark_get_neighbours (NOW)              874.3200 (1.30)     1,957.4830 (1.12)     1,030.1057 (1.33)     156.3577 (1.39)     985.7585 (1.34)     78.7610 (1.69)       96;114    970.7742 (0.75)        920           1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------- benchmark 'StellarGraph creation (time)': 12 tests ------------------------------------------------------------------------------------
Name (time in ms)                                              Min                 Max               Mean             StdDev             Median               IQR            Outliers       OPS            Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_creation[None-0-0] (0011_1cca6b6)            7.0827 (1.0)       10.4290 (1.0)       7.7464 (1.0)       0.5459 (1.01)      7.5999 (1.0)      0.3738 (1.0)         15;10  129.0918 (1.0)         123           1
test_benchmark_creation[None-0-0] (NOW)                     7.2598 (1.03)      11.9819 (1.15)      7.8674 (1.02)      0.5400 (1.0)       7.7635 (1.02)     0.4122 (1.10)         11;5  127.1069 (0.98)        114           1
test_benchmark_creation[None-100-200] (0011_1cca6b6)        7.7215 (1.09)      11.0146 (1.06)      8.6099 (1.11)      0.5706 (1.06)      8.4670 (1.11)     0.4709 (1.26)         18;5  116.1448 (0.90)         95           1
test_benchmark_creation[None-100-200] (NOW)                 8.2406 (1.16)      80.2710 (7.70)      9.9185 (1.28)      7.0318 (13.02)     8.9742 (1.18)     0.4729 (1.26)         1;12  100.8218 (0.78)        104           1
test_benchmark_creation[100-0-0] (0011_1cca6b6)             9.9753 (1.41)      16.8219 (1.61)     10.9606 (1.41)      0.8742 (1.62)     10.7575 (1.42)     0.8113 (2.17)         12;4   91.2356 (0.71)         91           1
test_benchmark_creation[100-0-0] (NOW)                     10.0131 (1.41)      14.2590 (1.37)     10.8740 (1.40)      0.8288 (1.53)     10.5627 (1.39)     0.7808 (2.09)          9;7   91.9623 (0.71)         91           1
test_benchmark_creation[100-100-200] (0011_1cca6b6)        10.3021 (1.45)      14.3739 (1.38)     11.7725 (1.52)      0.8119 (1.50)     11.6448 (1.53)     1.1646 (3.12)         18;2   84.9437 (0.66)         79           1
test_benchmark_creation[100-100-200] (NOW)                 11.6161 (1.64)      27.1939 (2.61)     14.2690 (1.84)      3.7345 (6.92)     12.6178 (1.66)     2.2522 (6.02)        10;10   70.0822 (0.54)         78           1
test_benchmark_creation[None-1000-5000] (0011_1cca6b6)     18.3149 (2.59)     100.4675 (9.63)     21.3902 (2.76)     11.4734 (21.25)    19.6913 (2.59)     1.2455 (3.33)          1;4   46.7505 (0.36)         50           1
test_benchmark_creation[100-1000-5000] (0011_1cca6b6)      20.9809 (2.96)      25.4421 (2.44)     22.9116 (2.96)      1.2696 (2.35)     22.8039 (3.00)     2.0631 (5.52)         16;0   43.6460 (0.34)         37           1
test_benchmark_creation[None-1000-5000] (NOW)              30.8860 (4.36)     114.9587 (11.02)    45.6015 (5.89)     29.9989 (55.56)    32.7245 (4.31)     5.6656 (15.16)         4;4   21.9291 (0.17)         26           1
test_benchmark_creation[100-1000-5000] (NOW)               34.9050 (4.93)     148.0317 (14.19)    56.4356 (7.29)     38.1269 (70.61)    37.4651 (4.93)     8.3243 (22.27)         4;4   17.7193 (0.14)         21           1
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

codeclimate · 2020-04-29T05:00:33Z

Code Climate has analyzed commit 1e4d782 and detected 3 issues on this pull request.

Here's the issue category breakdown:

Category	Count
Complexity	1
Security	2

View more on Code Climate.

huonw · 2020-04-29T06:02:53Z

Unfortunately this seems to duplicate a bunch of the node-data/edge-data interaction work from #1220, unlucky! 🙁

A 29× speed-up is pretty nice!

Things do get a bit slower for just getting neighbours without caring about types, and constructing the stellargraph object

As a potential tweak to the implementation of this, maybe we could take inspiration from #1291 and #1316, expanding EdgeData to have multiple modes of indexing/caching each of which are lazily computed as required:

basic type-less indexing (directed vs. undirected computed separately even, per StellarGraph._edges._edges_in_dict andStellarGraph._edges._edges_out_dict are unused for undirected graphs #1316)
other-node type indexing per this PR
extension: edge type indexing
extension: edge and other-node type indexing

This has the downside of requiring multiple batches of memory, but it may be the right trade-off.

In particular, I'm assuming that the workflow would typically be an interactive session with small-ish graphs for experimenting with different algorithms (thus requiring multiple indexing schemes), where the user has more control and monitoring of memory usage. A real ML pipeline on larger data would then be only using one or two algorithms, and thus only creating and storing the indices that are really needed.

What do you think?

kjun9 · 2020-04-29T06:23:16Z

expanding EdgeData to have multiple modes of indexing/caching each of which are lazily computed as required

That's an interesting idea. So we would build the index automatically when graph.neighbors is called with a non-None other_node_type value?

This has the downside of requiring multiple batches of memory, but it may be the right trade-off

As an alternative, maybe it could be specific generators/walkers that force a graph to build the index, and default to the original behaviour of filtering the adjacency list after obtaining them if the index doesn't exist

class StellarGraph:
    def build_adjacency_index(index_by)
       # do nothing if index exists otherwise build

class UniformRandomMetapathWalk:
    def __init__(self, graph, ...):
        self._graph = graph
    def run(self, ...):
        self._graph.build_adjacency_index("other_node_type")

or something like that?

kieranricardo · 2020-04-29T07:07:24Z

nice speedup!

Unfortunately I think part of this speedup is due to not having to pay the cost of converting the node ids to ilocs in node_type - #1267 gets a 7x speedup on metapath. Also, changing node_type to handle multiple nodes and doing:

neighbour_types = self.graph.node_type(neighbours, use_ilocs=True)
neighbours = [
    n_node
    for n_node, n_type in zip(neighbours, neighbour_types)
    if n_type == metapath[d]
]

gets another 2x speedup, so 15x in total but still shy of your 29x! But maybe this could be a quick way of speeding up metapath in time for the release, what do you think?

That being said, I think this is still a valuable way of storing the edges, and we should definitely implement this as an edge caching option!

kieranricardo · 2020-04-29T07:13:02Z

On the topic of edge caching, for unweighted graphs I think we can get a ~20x speedup (on top of this 29x speedup) if we store the node_ids/node_ilocs instead of edge_ilocs in the adj lists. It looks like 95% of the time in neighbors is spent doing pandas lookups with the edge_ilocs.

In the case of node_ilocs, this wouldn't cost any extra memory, infact if the node_ilocs type is smaller than edge_ilocs this will save memory.

kieranricardo · 2020-04-29T07:14:13Z

@kjun9 can you run the allocation benchmarks for this PR?

huonw · 2020-04-29T07:23:15Z

That's an interesting idea. So we would build the index automatically when graph.neighbors is called with a non-None other_node_type value?

Yeah.

As an alternative, maybe it could be specific generators/walkers that force a graph to build the index, and default to the original behaviour of filtering the adjacency list after obtaining them if the index doesn't exist

I guess that's equivalent in my mind to EdgeData computing them when other_node_type is specified, since that'll happen if (and only if) it's being used by an algorithm that would benefit from the indexing; or maybe I'm missing something?

gets another 2x speedup, so 15x in total but still shy of your 29x! But maybe this could be a quick way of speeding up metapath in time for the release, what do you think?

If you've done that work and measured it, can we have a PR for it?

On the topic of edge caching, for unweighted graphs I think we can get a ~20x speedup (on top of this 29x speedup) if we store the node_ids/node_ilocs instead of edge_ilocs in the adj lists. It looks like 95% of the time in neighbors is spent doing pandas lookups with the edge_ilocs.

I think that edge_ilocs vs. node_ilocs is a separate problem and worth having a separate issue for, to discuss without getting totally distracted from this one. I specifically chose the current approach because it means we can easily get to whatever data we want about the edge, such as the edge weight and potentially edge features (#1326). Although you're correct that it's probably unnecessarily expensive for an unweighted graph without edge features.

pandas lookups with the edge_ilocs

Typo: "numpy lookups"?

@kjun9 can you run the allocation benchmarks for this PR?

I think there's enough above that we can potentially pause effort on this PR for a bit (at least until the other PRs land) and not spent too much time doing comparisons with it until we've done that baseline work, since those comparisons will be invalidated anyway?

kjun9 · 2020-04-29T08:16:01Z

I guess that's equivalent in my mind to EdgeData computing them when other_node_type is specified, since that'll happen if (and only if) it's being used by an algorithm that would benefit from the indexing; or maybe I'm missing something?

I was thinking that building/not-building an index "explicitly" would potentially be useful e.g. an algorithm just requires a single call to graph.neighbors with other_node_type.. but yeah not sure if that's actually a real use-case 🤷

I've adjusted the code to do the building lazily, taking inspiration from #1291 and also thinking a little bit about the extensions

I'll include the new benchmarks here anyway but we can keep in mind the comparisons will be invalidated by the other PRs (still a bit slower due to additional if statements, but not quite as bad, and the allocation was probably much worse before):

------------------------------------------------------------------------------------------------------------- benchmark 'StellarGraph creation': 12 tests --------------------------------------------------------------------------------------------------------------
Name (time in s)                                                                 Min                       Max                      Mean                StdDev                    Median                   IQR            Outliers     OPS            Rounds  Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_allocation_benchmark_creation[None-0-0] (NOW)                       13,575.0000 (1.0)         16,431.0000 (1.00)        14,816.6000 (1.03)       986.0009 (2.05)        14,671.0000 (1.03)     2,192.0000 (inf)           8;0  0.0001 (0.97)         20           1
test_allocation_benchmark_creation[None-0-0] (0014_1cca6b6)              14,223.0000 (1.05)        16,399.0000 (1.0)         14,353.4000 (1.0)        481.5141 (1.00)        14,247.0000 (1.0)          0.0000 (1.0)           1;2  0.0001 (1.0)          20           1
test_allocation_benchmark_creation[100-0-0] (0014_1cca6b6)               15,231.0000 (1.12)        18,503.0000 (1.13)        15,544.2000 (1.08)       699.1328 (1.46)        15,415.0000 (1.08)         0.0000 (1.0)           1;4  0.0001 (0.92)         20           1
test_allocation_benchmark_creation[100-0-0] (NOW)                        15,655.0000 (1.15)        18,279.0000 (1.11)        15,998.6000 (1.11)       578.2964 (1.20)        15,839.0000 (1.11)         0.0000 (1.0)           2;3  0.0001 (0.90)         20           1
test_allocation_benchmark_creation[None-100-200] (0014_1cca6b6)          56,499.0000 (4.16)        58,812.0000 (3.59)        56,780.2500 (3.96)       479.9861 (1.0)         56,683.0000 (3.98)         0.0000 (1.0)           1;2  0.0000 (0.25)         20           1
test_allocation_benchmark_creation[None-100-200] (NOW)                   65,295.0000 (4.81)        68,991.0000 (4.21)        66,511.8000 (4.63)     1,049.2252 (2.19)        66,391.0000 (4.66)     2,192.0000 (inf)           7;0  0.0000 (0.22)         20           1
test_allocation_benchmark_creation[100-100-200] (0014_1cca6b6)           94,571.0000 (6.97)        99,138.0000 (6.05)        95,886.1500 (6.68)     1,013.8430 (2.11)        95,667.0000 (6.71)       548.0000 (inf)           4;4  0.0000 (0.15)         20           1
test_allocation_benchmark_creation[100-100-200] (NOW)                   104,169.0000 (7.67)       110,461.0000 (6.74)       105,515.6000 (7.35)     1,364.7053 (2.84)       105,265.0000 (7.39)       640.0000 (inf)           1;5  0.0000 (0.14)         20           1
test_allocation_benchmark_creation[None-1000-5000] (0014_1cca6b6)       613,899.0000 (45.22)      618,945.0000 (37.74)      614,303.3000 (42.80)    1,093.1993 (2.28)       614,067.0000 (43.10)        0.0000 (1.0)           1;3  0.0000 (0.02)         20           1
test_allocation_benchmark_creation[None-1000-5000] (NOW)                693,529.0000 (51.09)      699,983.0000 (42.68)      694,938.5000 (48.42)    1,450.7666 (3.02)       694,625.0000 (48.76)    1,736.0000 (inf)           1;1  0.0000 (0.02)         20           1
test_allocation_benchmark_creation[100-1000-5000] (0014_1cca6b6)      1,008,299.0000 (74.28)    1,013,701.0000 (61.81)    1,009,657.5000 (70.34)    1,161.5682 (2.42)     1,009,395.0000 (70.85)      548.0000 (inf)           4;4  0.0000 (0.01)         20           1
test_allocation_benchmark_creation[100-1000-5000] (NOW)               1,089,323.0000 (80.24)    1,095,969.0000 (66.83)    1,090,687.3000 (75.99)    1,432.8181 (2.99)     1,090,419.0000 (76.54)      640.0000 (inf)           1;5  0.0000 (0.01)         20           1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------------- benchmark 'StellarGraph neighbours': 2 tests ----------------------------------------------------------------------------------------
Name (time in us)                                     Min                   Max                Mean              StdDev              Median                 IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_get_neighbours (0014_1cca6b6)     700.0500 (1.0)      2,304.0250 (1.05)     770.7208 (1.0)      101.4645 (1.0)      741.4980 (1.0)       51.4912 (1.0)         61;79        1.2975 (1.0)         841           1
test_benchmark_get_neighbours (NOW)              732.0860 (1.05)     2,200.7320 (1.0)      923.6836 (1.20)     246.5478 (2.43)     819.3160 (1.10)     139.9090 (2.72)      119;127        1.0826 (0.83)        886           1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------------------------------------- benchmark: 2 tests -----------------------------------------------------------------------------------------------------------------
Name (time in us)                                                   Min                    Max                   Mean                StdDev                 Median                   IQR            Outliers         OPS            Rounds  Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_uniformrandommetapathwalk (NOW)                 456.3170 (1.0)         849.3380 (1.0)         520.7856 (1.0)         60.3954 (1.0)         498.3265 (1.0)         45.2690 (1.0)         38;33  1,920.1758 (1.0)         342           1
test_benchmark_uniformrandommetapathwalk (0016_1cca6b6)     13,007.4970 (28.51)    19,690.7460 (23.18)    16,295.5201 (31.29)    1,387.3179 (22.97)    16,282.2605 (32.67)    2,086.3395 (46.09)        22;0     61.3666 (0.03)         64           1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

huonw

This is nice. I think we can even extend this idea to handle edge types too, and so get optimised and memory use queries for HinSAGE (since that currently constructs its own type_adjacency_list thing?), and for something like #433.

huonw · 2020-05-01T00:31:50Z

stellargraph/core/element_data.py

+        def _numpyise_inner(d, dtype):
+            return {k: _numpyise(v, dtype) for k, v in d.items()}
+
+        self._edges_index_by_other_node_type = _AdjIndex(


This ends up being a dictionary like {node_id: {node_type: array}}, right?

huonw · 2020-05-01T00:33:00Z

stellargraph/core/element_data.py

+            in_dict.setdefault(tgt, {}).setdefault(src_type, []).append(i)
+            out_dict.setdefault(src, {}).setdefault(tgt_type, []).append(i)
+
+            undirected.setdefault(tgt, {}).setdefault(src_type, []).append(i)


Also, I wonder if it actually makes sense to compute each of these separately (following #1316)? E.g. an algorithm that's using undirected probably won't touch in_dict/out_dict, and vice versa. I thing this might be especially important for this, because there'll be a lot more memory overhead from having many smaller np.ndarray instances (i.e. there'll be more so more of the fixed overheads of each one).

Hmm yeah makes sense. I might try to extend the _AdjIndex type to make it lazy for individual dicts

tests/core/test_graph.py

kjun9 · 2020-05-04T01:27:05Z

I think this now conflicts a lot with #1435 and the new speedups I'm getting are not as huuuuge compared to what's already landed in develop, so I might try to run the notebook for #934 independently of this PR, and this might not have to land

------------------------------------------------------------------------------------------------------------- benchmark: 2 tests -------------------------------------------------------------------------------------------------------------
Name (time in us)                                                  Min                   Max                  Mean              StdDev                Median                 IQR            Outliers         OPS            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_uniformrandommetapathwalk (NOW)                576.2610 (1.0)      1,260.3640 (1.0)        667.4091 (1.0)       97.5010 (1.0)        637.5760 (1.0)       48.6500 (1.0)       104;122  1,498.3313 (1.0)        1018           1
test_benchmark_uniformrandommetapathwalk (0018_9506f93)     1,679.1980 (2.91)     4,186.3280 (3.32)     2,183.5397 (3.27)     266.7055 (2.74)     2,150.2190 (3.37)     245.3337 (5.04)        87;19    457.9720 (0.31)        417           1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

huonw · 2020-05-04T01:39:02Z

A 3× speedup still seems worth it, but yeah, maybe we can leave it till after the 1.0 crunch.

kjun9 · 2020-05-04T03:02:29Z

Hmm maybe the benchmark test doesn't quite provide the full picture.. notebook seems to be much slower than 3x (been running random walks for ~2 hours)

huonw · 2020-05-04T03:17:24Z

Hm, 2 hours with this PR or with develop? #1446 is related too.

kjun9 · 2020-05-04T03:36:19Z

Hm, 2 hours with this PR or with develop? #1446 is related too.

with develop. With this PR, the whole notebook runs within ~10 mins or so

huonw

Looks reasonable, although I'm not sure we've quite hit on the perfect design of the laziness (both before in the current develop and in this PR's current state).

huonw · 2020-05-07T05:12:34Z

stellargraph/data/explorer.py


                    walks.append(
                        list(self.graph.node_ilocs_to_ids(walk))
                    )  # store the walk

        return walks

+    def _augment_and_group_metapaths(self, metapaths, walk_length):


What does "augment" mean here? Maybe:

Suggested change

def _augment_and_group_metapaths(self, metapaths, walk_length):

def _extend_and_group_metapaths(self, metapaths, walk_length):

or even

Suggested change

def _augment_and_group_metapaths(self, metapaths, walk_length):

def _repeat_and_group_metapaths(self, metapaths, walk_length):

extend sounds good to me. augment came from me taking the phrasing from an existing comment that I removed # augment metapath to be length long.

huonw · 2020-05-07T05:16:46Z

tests/core/test_graph.py

@@ -762,7 +762,7 @@ def f():
 @pytest.mark.parametrize("num_nodes,num_edges", [(0, 0), (100, 200), (1000, 5000)])
 # features or not, to capture their cost
 @pytest.mark.parametrize("feature_size", [None, 100])
-@pytest.mark.parametrize("force_adj_lists", [None, "directed", "undirected", "both"])
+@pytest.mark.parametrize("force_adj_lists", [None, "homogeneous", "by_other_node_type"])


This is dramatically changing what this benchmark is measuring; previously it was capturing the cost of using a graph in undirected mode or directed mode (or neither (adjacency matrix only), or both), now it's only capturing the cost of using a graph in either:

adjacency matrix only

both undirected and directed mode for homogeneous

both undirected and directed mode for other_node_type

This benchmark is kinda completely exploding in terms of options, but maybe it instead makes sense to do this as a new parameter like:

@pytest.mark.parametrize("adj_list_directness", ["homogeneous", "by_other_node_type"])

Potentially with a:

if force_adj_lists is None and adj_list_directness != "homogeneous": pytest.skip("only measure force_adj_lists=None once")

However, maybe it's not entirely useful to know the specific cost of the undirected vs. directed adjacencies.

huonw · 2020-05-07T05:18:20Z

tests/data/test_metapath_walker.py

@@ -317,11 +317,11 @@ def test_init_parameters(self):
        assert all(np.array_equal(w1, w2) for w1, w2 in zip(run_1, run_2))

    def test_benchmark_uniformrandommetapathwalk(self, benchmark):
-        g = example_graph_random(n_nodes=50, n_edges=500, node_types=2)
+        g = example_graph_random(n_nodes=50, n_edges=2000, node_types=4)


Out of curiosity, why did this need to become larger? The metapaths got truncated often in practice, despite the theory?

I tried increasing it since I wasn't so convinced that the speedups reported from the benchmark were as good as what we end up seeing in practice (#1399 (comment) vs #1399 (comment) the larger benchmark reports much higher speedup)

Is that the number of edges increasing or the number of head nodes increasing?

huonw · 2020-05-07T05:19:03Z

tests/data/test_metapath_walker.py

        mrw = UniformRandomMetaPathWalk(g)

        # this should be made larger to be more realistic, when it is fast enough
-        nodes = np.arange(0, 5)
+        nodes = np.arange(0, 10)


This is probably now fast enough to switch to being the same as the other ones, e.g.

stellargraph/tests/data/test_uniform_random_walker.py

Line 245 in 284c89d

nodes = np.arange(0, 50)

Suggested change

nodes = np.arange(0, 10)

nodes = np.arange(0, 50)

huonw · 2020-05-07T05:31:50Z

stellargraph/core/element_data.py

+        self.undirected = self.init_index(ins=True, outs=True)
+        self.ins = self.init_index(ins=True, outs=False)
+        self.outs = self.init_index(ins=False, outs=True)


Rather than duplicate the init_index calling and assignment logic, maybe this could be:

Suggested change

self.undirected = self.init_index(ins=True, outs=True)

self.ins = self.init_index(ins=True, outs=False)

self.outs = self.init_index(ins=False, outs=True)

self.lookup(ins=True, outs=True)

self.lookup(ins=True, outs=False)

self.lookup(ins=False, outs=True)

huonw · 2020-05-07T05:39:14Z

stellargraph/core/element_data.py

@@ -303,74 +303,117 @@ def _numpyise(d, dtype):
    return {k: np.array(v, dtype=dtype) for k, v in d.items()}


+class AdjList:
+    def __init__(self, init_index):
+        self.init_index = init_index


Could you talk a bit about the design choice of passing in a function here? I'm not saying it's wrong and I'm not sure there's a best answer at all, but I wonder if there's other versions that are a little clearer?

In particular, it feels a bit funny to have EdgeData store an AdjList object that holds a method from that parent EdgeData. An alternative might be something like

class AdjList: def __init__(self, by_other_node_type, ...): if by_other_node_type: self._init_func = self._init_by_other_node_type else: self._init_func = self._init_plain self.undirected = self.ins = self.outs = None def _init_plain(self, ...): ... def _init_by_other_node_type(self, ...): ...

I guess we'll need to pass more arguments around in this case.

Hmm yeah I wanted AdjList to handle the lookup + lazy initialising in one place, but thought it would be more convenient to define more adjacency-initialiser functions in EdgeData rather than in AdjList since I'm guessing it'll need access to all the type information in EdgeData and NodeData to build the adjacency lists with, so that's how I ended up with this approach.

Not sure if you had something else in mind with your suggestion, but I guess we could pass in the whole edge data and node data into the adjacency list to do whatever it needs to do... but maybe that also looks a bit funny?

class AdjList: def __init__(self, by_other_node_type, edge_data, node_data) def _init_plain(self, ...): def _init_by_other_node_type(self, ...): class EdgeData: def __init__(self, ...): ... self._adj_list = AdjList(by_other_node_type=False, self, node_data) self._adj_list_by_other_node_type = AdjList(by_other_node_type=True, self, node_data)

Yeah, that's a bit funny too.

I thought about this overnight and I think my "concern" here is that the recursiveness of the data structure initialisation introduces risks of attempting to use a value while it's being initialised (e.g. if edge list initialisation happened to look at other edge lists, for some reason).

I guess the fully-separated version would be passing in sources, targets and node_data?

However, this may not be worth it, especially as edge list initialisation is:

limited enough that there doesn't seem to be a reason why it would look at itself

done as a final commit, rather than incremental construction (specifically, the self.undirected properties go from None to "fully ready", rather than being incrementally built up), and so we'll see infinite recursion here rather than invalid results

The main risk might be around delaying garbage collection (since this causes a cycle between objects, and so cPython will have to wait for the actual garbage collector to run, rather than objects being deallocated as soon as they go out of scope), and future refactoring that change either of the points above.

huonw · 2020-05-07T05:43:31Z

stellargraph/core/element_data.py

        super().__init__(shared, type_starts)

        # cache these columns to avoid having to do more method and dict look-ups
        self.sources = self._column(SOURCE)
        self.targets = self._column(TARGET)
        self.weights = self._column(WEIGHT)

+        self.source_types = nodes.type_of_iloc(self.sources)


I think this is going to noticeably increase memory usage, both because:

it's using type names not type ilocs

it's two extra O(num edges) arrays that aren't shared with anything (i.e. they're new) but are stored permanently

The first can be fixed with:

Suggested change

self.source_types = nodes.type_of_iloc(self.sources)

self.source_types = nodes.type_ilocs[self.sources]

but the second is more problematic. Maybe these could be computed lazily too? (And/or, not even stored, just used transiently for initialising the adjacency lists?)

I think the benchmarks in the PR description aren't up to date with the current state? Maybe some new benchmarks could be reported, including the test_allocation_benchmark ones (to try to capture this addition)?

Good point, yeah I don't think we need to store these at all actually. I'll run some new benchmarks afterwards

huonw · 2020-05-07T05:45:04Z

stellargraph/core/element_data.py

@@ -303,74 +303,117 @@ def _numpyise(d, dtype):
    return {k: np.array(v, dtype=dtype) for k, v in d.items()}


+class AdjList:


It's minor, but I think this isn't quite an adjacency list, but actually something like:

Suggested change

class AdjList:

class LazyAdjLists:

This reverts commit 1ce0976.

kjun9 · 2020-05-19T04:01:40Z

Update: I decided to kind of start-over by getting rid of the LazyAdjLists class for now and just add to the existing structure for adjacency lists. So the benchmarks for existing functionality other than methpath2vec walk should remain roughly unaffected, and we're just adding some new benchmarks for the new by_other_node_type parts.

Adjacency list construction times (should be roughly the same):

------------------------------------------------------------------------------------------------------ benchmark 'StellarGraph adjacency lists (time)': 18 tests -------------------------------------------------------------------------------------------------------
Name (time in us)                                                                Min                     Max                    Mean                 StdDev                  Median                    IQR            Outliers         OPS            Rounds  Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_adj_list[directed-100-200] (NOW)                             231.4470 (1.0)          637.8610 (1.0)          258.5698 (1.0)          42.2262 (1.00)         242.1290 (1.0)          13.7610 (1.0)       350;521  3,867.4271 (1.0)        3339           1
test_benchmark_adj_list[directed-100-200] (0042_cc15f17)                    232.1980 (1.00)         687.9200 (1.08)         258.8520 (1.00)         42.0826 (1.0)          242.9040 (1.00)         13.8745 (1.01)      351;519  3,863.2119 (1.00)       3301           1
test_benchmark_adj_list[undirected-100-200] (NOW)                           315.9070 (1.36)         900.4990 (1.41)         353.7579 (1.37)         61.2691 (1.46)         333.7215 (1.38)         26.9920 (1.96)      236;336  2,826.7916 (0.73)       2452           1
test_benchmark_adj_list[undirected-100-200] (0042_cc15f17)                  320.4990 (1.38)         825.2960 (1.29)         358.5158 (1.39)         51.7141 (1.23)         339.3480 (1.40)         23.2142 (1.69)      250;324  2,789.2772 (0.72)       2427           1
test_benchmark_adj_list[directed_by_other_node_type-100-200]                473.1610 (2.04)         977.0970 (1.53)         533.5859 (2.06)         66.9349 (1.59)         505.8120 (2.09)         46.6920 (3.39)      191;161  1,874.1127 (0.48)       1551           1
test_benchmark_adj_list[undirected_by_other_node_type-100-200]              583.8950 (2.52)       1,840.8670 (2.89)         692.0267 (2.68)        136.3381 (3.24)         640.6880 (2.65)        110.0740 (8.00)       118;88  1,445.0309 (0.37)       1122           1
test_benchmark_adj_list[directed-1000-5000] (NOW)                         4,700.7700 (20.31)      7,351.8230 (11.53)      5,140.5051 (19.88)       402.8736 (9.57)       5,016.1920 (20.72)       390.2730 (28.36)        21;9    194.5334 (0.05)        174           1
test_benchmark_adj_list[directed-1000-5000] (0042_cc15f17)                4,733.5760 (20.45)     14,525.9150 (22.77)      5,423.7449 (20.98)     1,129.3189 (26.84)      5,112.1680 (21.11)       474.9890 (34.52)       10;17    184.3745 (0.05)        172           1
test_benchmark_adj_list[undirected-1000-5000] (NOW)                       5,681.0990 (24.55)      7,905.0130 (12.39)      6,238.5514 (24.13)       431.1811 (10.25)      6,130.7970 (25.32)       491.5305 (35.72)        35;7    160.2936 (0.04)        149           1
test_benchmark_adj_list[undirected-1000-5000] (0042_cc15f17)              5,733.0330 (24.77)      8,448.8230 (13.25)      6,145.9833 (23.77)       366.6337 (8.71)       6,023.9110 (24.88)       456.8220 (33.20)        37;2    162.7079 (0.04)        161           1
test_benchmark_adj_list[directed_by_other_node_type-1000-5000]           10,009.7130 (43.25)     89,959.8790 (141.03)    11,933.1857 (46.15)     8,443.4651 (200.64)    10,914.3815 (45.08)       544.4775 (39.57)         1;6     83.7999 (0.02)         88           1
test_benchmark_adj_list[undirected_by_other_node_type-1000-5000]         13,573.0680 (58.64)     97,755.2520 (153.25)    22,233.7995 (85.99)    13,983.3121 (332.28)    17,866.9160 (73.79)     6,815.7660 (495.30)        3;5     44.9766 (0.01)         45           1
test_benchmark_adj_list[directed-20000-100000] (0042_cc15f17)           127,358.4270 (550.27)   226,152.3290 (354.55)   158,161.4499 (611.68)   46,364.9196 (>1000.0)  132,032.3320 (545.30)   73,978.2312 (>1000.0)       2;0      6.3227 (0.00)          7           1
test_benchmark_adj_list[directed-20000-100000] (NOW)                    127,907.2040 (552.64)   226,413.9760 (354.96)   157,614.2974 (609.56)   45,147.8899 (>1000.0)  134,704.9390 (556.34)   71,021.4352 (>1000.0)       2;0      6.3446 (0.00)          7           1
test_benchmark_adj_list[undirected-20000-100000] (NOW)                  176,357.6690 (761.98)   279,860.9100 (438.75)   228,874.5770 (885.16)   47,724.7421 (>1000.0)  224,443.8110 (926.96)   90,528.8710 (>1000.0)       2;0      4.3692 (0.00)          5           1
test_benchmark_adj_list[undirected-20000-100000] (0042_cc15f17)         178,570.7970 (771.54)   272,435.4760 (427.11)   216,208.0132 (836.17)   51,113.7061 (>1000.0)  179,460.4530 (741.18)   93,478.8123 (>1000.0)       2;0      4.6252 (0.00)          5           1
test_benchmark_adj_list[directed_by_other_node_type-20000-100000]       294,411.1100 (>1000.0)  400,097.0160 (627.25)   371,276.0022 (>1000.0)  43,407.9478 (>1000.0)  387,538.8160 (>1000.0)  32,097.7518 (>1000.0)       1;1      2.6934 (0.00)          5           1
test_benchmark_adj_list[undirected_by_other_node_type-20000-100000]     451,506.6990 (>1000.0)  564,240.5210 (884.58)   485,105.0646 (>1000.0)  46,503.0216 (>1000.0)  469,584.9610 (>1000.0)  53,450.1095 (>1000.0)       1;0      2.0614 (0.00)          5           1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Adjacency list allocations:

----------------------------------------------------------------------------------------------------------- benchmark 'StellarGraph adjacency lists (peak)': 12 tests -----------------------------------------------------------------------------------------------------------
Name (time in s)                                                                                Min                       Max                      Mean              StdDev                    Median               IQR            Outliers     OPS            Rounds  Iterations
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_allocation_benchmark_adj_list_peak[directed-100-200] (0036_cc15f17)                28,424.0000 (1.0)         28,724.0000 (1.0)         28,439.0000 (1.0)       67.0820 (1.38)        28,424.0000 (1.0)      0.0000 (1.0)           1;1  0.0000 (1.0)          20           1
test_allocation_benchmark_adj_list_peak[directed-100-200] (NOW)                         28,694.0000 (1.01)        29,012.0000 (1.01)        28,709.9000 (1.01)      71.1070 (1.47)        28,694.0000 (1.01)     0.0000 (1.0)           1;1  0.0000 (0.99)         20           1
test_allocation_benchmark_adj_list_peak[undirected-100-200] (0036_cc15f17)              55,754.0000 (1.96)        55,971.0000 (1.95)        55,764.8500 (1.96)      48.5227 (1.0)         55,754.0000 (1.96)     0.0000 (1.0)           1;1  0.0000 (0.51)         20           1
test_allocation_benchmark_adj_list_peak[undirected-100-200] (NOW)                       56,578.0000 (1.99)        56,795.0000 (1.98)        56,588.8500 (1.99)      48.5227 (1.0)         56,578.0000 (1.99)     0.0000 (1.0)           1;1  0.0000 (0.50)         20           1
test_allocation_benchmark_adj_list_peak[directed_by_other_node_type-100-200]            67,553.0000 (2.38)        68,103.0000 (2.37)        67,580.5000 (2.38)     122.9837 (2.53)        67,553.0000 (2.38)     0.0000 (1.0)           1;1  0.0000 (0.42)         20           1
test_allocation_benchmark_adj_list_peak[undirected_by_other_node_type-100-200]          94,760.0000 (3.33)        95,682.0000 (3.33)        94,806.1000 (3.33)     206.1655 (4.25)        94,760.0000 (3.33)     0.0000 (1.0)           1;1  0.0000 (0.30)         20           1
test_allocation_benchmark_adj_list_peak[directed-1000-5000] (NOW)                      518,237.0000 (18.23)      520,973.0000 (18.14)      518,373.8000 (18.23)    611.7882 (12.61)      518,237.0000 (18.23)    0.0000 (1.0)           1;1  0.0000 (0.05)         20           1
test_allocation_benchmark_adj_list_peak[directed-1000-5000] (0036_cc15f17)             518,273.0000 (18.23)      520,889.0000 (18.13)      518,403.8000 (18.23)    584.9554 (12.06)      518,273.0000 (18.23)    0.0000 (1.0)           1;1  0.0000 (0.05)         20           1
test_allocation_benchmark_adj_list_peak[undirected-1000-5000] (NOW)                    783,131.0000 (27.55)      784,335.0000 (27.31)      783,191.2000 (27.54)    269.2226 (5.55)       783,131.0000 (27.55)    0.0000 (1.0)           1;1  0.0000 (0.04)         20           1
test_allocation_benchmark_adj_list_peak[undirected-1000-5000] (0036_cc15f17)           784,145.0000 (27.59)      785,427.0000 (27.34)      784,209.1000 (27.58)    286.6639 (5.91)       784,145.0000 (27.59)    0.0000 (1.0)           1;1  0.0000 (0.04)         20           1
test_allocation_benchmark_adj_list_peak[directed_by_other_node_type-1000-5000]       1,291,283.0000 (45.43)    1,292,357.0000 (44.99)    1,291,336.7000 (45.41)    240.1537 (4.95)     1,291,283.0000 (45.43)    0.0000 (1.0)           1;1  0.0000 (0.02)         20           1
test_allocation_benchmark_adj_list_peak[undirected_by_other_node_type-1000-5000]     2,045,458.0000 (71.96)    2,046,632.0000 (71.25)    2,045,516.7000 (71.93)    262.5144 (5.41)     2,045,458.0000 (71.96)    0.0000 (1.0)           1;1  0.0000 (0.01)         20           1
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

-------------------------------------------------------------------------------------------------------- benchmark 'StellarGraph adjacency lists (size)': 12 tests ---------------------------------------------------------------------------------------------------------
Name (time in s)                                                                           Min                       Max                      Mean              StdDev                    Median               IQR            Outliers     OPS            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_allocation_benchmark_adj_list[directed-100-200] (0036_cc15f17)                16,930.0000 (1.0)         17,296.0000 (1.0)         16,948.3000 (1.0)       81.8401 (1.69)        16,930.0000 (1.0)      0.0000 (1.0)           1;1  0.0001 (1.0)          20           1
test_allocation_benchmark_adj_list[directed-100-200] (NOW)                         17,082.0000 (1.01)        17,415.0000 (1.01)        17,098.6500 (1.01)      74.4611 (1.53)        17,082.0000 (1.01)     0.0000 (1.0)           1;1  0.0001 (0.99)         20           1
test_allocation_benchmark_adj_list[undirected-100-200] (0036_cc15f17)              27,561.0000 (1.63)        27,778.0000 (1.61)        27,571.8500 (1.63)      48.5227 (1.0)         27,561.0000 (1.63)     0.0000 (1.0)           1;1  0.0000 (0.61)         20           1
test_allocation_benchmark_adj_list[undirected-100-200] (NOW)                       31,179.0000 (1.84)        31,396.0000 (1.82)        31,193.4500 (1.84)      50.3132 (1.04)        31,179.0000 (1.84)     0.0000 (1.0)           2;2  0.0000 (0.54)         20           1
test_allocation_benchmark_adj_list[directed_by_other_node_type-100-200]            40,281.0000 (2.38)        40,807.0000 (2.36)        40,307.3000 (2.38)     117.6172 (2.42)        40,281.0000 (2.38)     0.0000 (1.0)           1;1  0.0000 (0.42)         20           1
test_allocation_benchmark_adj_list[undirected_by_other_node_type-100-200]          49,134.0000 (2.90)        50,056.0000 (2.89)        49,180.1000 (2.90)     206.1655 (4.25)        49,134.0000 (2.90)     0.0000 (1.0)           1;1  0.0000 (0.34)         20           1
test_allocation_benchmark_adj_list[directed-1000-5000] (NOW)                      174,498.0000 (10.31)      177,242.0000 (10.25)      174,635.2000 (10.30)    613.5771 (12.65)      174,498.0000 (10.31)    0.0000 (1.0)           1;1  0.0000 (0.10)         20           1
test_allocation_benchmark_adj_list[directed-1000-5000] (0036_cc15f17)             174,646.0000 (10.32)      177,246.0000 (10.25)      174,776.0000 (10.31)    581.3777 (11.98)      174,646.0000 (10.32)    0.0000 (1.0)           1;1  0.0000 (0.10)         20           1
test_allocation_benchmark_adj_list[undirected-1000-5000] (0036_cc15f17)           331,026.0000 (19.55)      332,302.0000 (19.21)      331,089.8000 (19.54)    285.3223 (5.88)       331,026.0000 (19.55)    0.0000 (1.0)           1;1  0.0000 (0.05)         20           1
test_allocation_benchmark_adj_list[undirected-1000-5000] (NOW)                    331,118.0000 (19.56)      332,274.0000 (19.21)      331,175.8000 (19.54)    258.4895 (5.33)       331,118.0000 (19.56)    0.0000 (1.0)           1;1  0.0000 (0.05)         20           1
test_allocation_benchmark_adj_list[directed_by_other_node_type-1000-5000]         609,068.0000 (35.98)      610,052.0000 (35.27)      609,117.2000 (35.94)    220.0291 (4.53)       609,068.0000 (35.98)    0.0000 (1.0)           1;1  0.0000 (0.03)         20           1
test_allocation_benchmark_adj_list[undirected_by_other_node_type-1000-5000]     1,000,528.0000 (59.10)    1,001,702.0000 (57.92)    1,000,586.7000 (59.04)    262.5144 (5.41)     1,000,528.0000 (59.10)    0.0000 (1.0)           1;1  0.0000 (0.02)         20           1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Metapath walk (I ran the test in this branch against the code in develop, since I tweaked the test slightly to look at how times change as the graph gets larger/more heterogeneous). Looks like the speedup roughly ranges between 2x - 3x?

----------------------------------------------------------------------------------------------------------- benchmark: 36 tests -----------------------------------------------------------------------------------------------------------
Name (time in ms)                                                           Min                 Max                Mean            StdDev              Median               IQR            Outliers       OPS            Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_uniformrandommetapathwalk[10-4-500] (NOW)                 5.8106 (1.0)        9.4860 (1.11)       6.4435 (1.0)      0.5753 (1.38)       6.2735 (1.0)      0.6167 (1.19)         18;5  155.1962 (1.0)         128           1
test_benchmark_uniformrandommetapathwalk[10-3-500] (NOW)                 6.1243 (1.05)       8.5171 (1.0)        6.7221 (1.04)     0.4182 (1.0)        6.6243 (1.06)     0.5196 (1.0)          30;4  148.7619 (0.96)        126           1
test_benchmark_uniformrandommetapathwalk[10-4-1000] (NOW)                6.5450 (1.13)       8.7405 (1.03)       7.2068 (1.12)     0.4718 (1.13)       7.1279 (1.14)     0.6448 (1.24)         34;2  138.7576 (0.89)        111           1
test_benchmark_uniformrandommetapathwalk[10-3-1000] (NOW)                6.8049 (1.17)      11.5971 (1.36)       7.4974 (1.16)     0.6468 (1.55)       7.3635 (1.17)     0.6780 (1.30)         18;5  133.3789 (0.86)        108           1
test_benchmark_uniformrandommetapathwalk[10-3-2000] (NOW)                6.8252 (1.17)      13.1113 (1.54)       7.8623 (1.22)     1.0146 (2.43)       7.5698 (1.21)     0.7570 (1.46)         10;7  127.1900 (0.82)         92           1
test_benchmark_uniformrandommetapathwalk[10-2-1000] (NOW)                6.8725 (1.18)       9.8609 (1.16)       7.6781 (1.19)     0.5818 (1.39)       7.5195 (1.20)     0.6795 (1.31)         23;6  130.2401 (0.84)        111           1
test_benchmark_uniformrandommetapathwalk[10-2-500] (NOW)                 6.8865 (1.19)      11.3546 (1.33)       7.6187 (1.18)     0.6550 (1.57)       7.4407 (1.19)     0.6199 (1.19)         19;6  131.2567 (0.85)        120           1
test_benchmark_uniformrandommetapathwalk[10-4-2000] (NOW)                6.9626 (1.20)       9.6699 (1.14)       7.6233 (1.18)     0.5078 (1.21)       7.5368 (1.20)     0.6490 (1.25)         16;1  131.1767 (0.85)         69           1
test_benchmark_uniformrandommetapathwalk[10-2-2000] (NOW)                6.9762 (1.20)      11.9028 (1.40)       7.7711 (1.21)     0.7376 (1.76)       7.6045 (1.21)     0.7013 (1.35)         10;3  128.6827 (0.83)         95           1
test_benchmark_uniformrandommetapathwalk[10-4-500] (0040_cc15f17)       11.2608 (1.94)      14.6914 (1.72)      12.2768 (1.91)     0.6632 (1.59)      12.1402 (1.94)     0.8321 (1.60)         23;2   81.4544 (0.52)         79           1
test_benchmark_uniformrandommetapathwalk[10-3-500] (0040_cc15f17)       12.6508 (2.18)      27.1551 (3.19)      14.4338 (2.24)     2.5522 (6.10)      13.7170 (2.19)     1.0274 (1.98)          6;7   69.2820 (0.45)         68           1
test_benchmark_uniformrandommetapathwalk[10-4-1000] (0040_cc15f17)      13.8929 (2.39)      19.6450 (2.31)      15.2326 (2.36)     0.9624 (2.30)      15.0950 (2.41)     0.8085 (1.56)         12;2   65.6488 (0.42)         59           1
test_benchmark_uniformrandommetapathwalk[10-2-500] (0040_cc15f17)       15.5431 (2.67)      20.9874 (2.46)      17.2136 (2.67)     1.0308 (2.46)      16.9838 (2.71)     0.9730 (1.87)         13;3   58.0937 (0.37)         57           1
test_benchmark_uniformrandommetapathwalk[10-3-1000] (0040_cc15f17)      16.4838 (2.84)      22.0240 (2.59)      18.1795 (2.82)     1.1191 (2.68)      17.9888 (2.87)     1.4695 (2.83)         13;1   55.0072 (0.35)         49           1
test_benchmark_uniformrandommetapathwalk[10-4-2000] (0040_cc15f17)      17.4925 (3.01)      20.8271 (2.45)      18.9591 (2.94)     0.7609 (1.82)      18.8170 (3.00)     0.8590 (1.65)         14;1   52.7452 (0.34)         45           1
test_benchmark_uniformrandommetapathwalk[10-2-1000] (0040_cc15f17)      18.1114 (3.12)      22.8011 (2.68)      19.3948 (3.01)     1.0059 (2.41)      19.2194 (3.06)     1.2589 (2.42)         13;1   51.5601 (0.33)         47           1
test_benchmark_uniformrandommetapathwalk[10-3-2000] (0040_cc15f17)      18.7091 (3.22)      27.2711 (3.20)      21.0154 (3.26)     1.8865 (4.51)      20.7958 (3.31)     1.7701 (3.41)          9;3   47.5841 (0.31)         44           1
test_benchmark_uniformrandommetapathwalk[10-2-2000] (0040_cc15f17)      20.8818 (3.59)      27.9140 (3.28)      22.6180 (3.51)     1.2499 (2.99)      22.3012 (3.55)     1.0038 (1.93)          7;2   44.2126 (0.28)         39           1
test_benchmark_uniformrandommetapathwalk[50-4-500] (NOW)                24.3835 (4.20)      29.1655 (3.42)      26.3123 (4.08)     1.1441 (2.74)      26.1270 (4.16)     1.4407 (2.77)         11;1   38.0051 (0.24)         36           1
test_benchmark_uniformrandommetapathwalk[50-4-1000] (NOW)               31.0475 (5.34)      36.3336 (4.27)      32.7223 (5.08)     1.1051 (2.64)      32.5729 (5.19)     1.5332 (2.95)          6;1   30.5602 (0.20)         28           1
test_benchmark_uniformrandommetapathwalk[50-3-500] (NOW)                31.9296 (5.50)      38.7202 (4.55)      33.5805 (5.21)     1.5277 (3.65)      33.2236 (5.30)     1.6323 (3.14)          4;2   29.7792 (0.19)         27           1
test_benchmark_uniformrandommetapathwalk[50-4-2000] (NOW)               31.9304 (5.50)      35.6491 (4.19)      34.3374 (5.33)     0.9837 (2.35)      34.6116 (5.52)     1.2645 (2.43)          7;1   29.1228 (0.19)         24           1
test_benchmark_uniformrandommetapathwalk[50-3-1000] (NOW)               32.7119 (5.63)      39.6607 (4.66)      34.7665 (5.40)     1.3730 (3.28)      34.6718 (5.53)     1.2141 (2.34)          6;2   28.7633 (0.19)         26           1
test_benchmark_uniformrandommetapathwalk[50-2-500] (NOW)                33.3727 (5.74)      37.2157 (4.37)      35.0011 (5.43)     0.8976 (2.15)      35.0400 (5.59)     1.2386 (2.38)          8;0   28.5706 (0.18)         27           1
test_benchmark_uniformrandommetapathwalk[50-3-2000] (NOW)               33.4652 (5.76)      37.6337 (4.42)      35.0256 (5.44)     0.9944 (2.38)      34.8666 (5.56)     1.1132 (2.14)          5;2   28.5505 (0.18)         26           1
test_benchmark_uniformrandommetapathwalk[50-2-1000] (NOW)               33.5675 (5.78)      37.9009 (4.45)      35.3018 (5.48)     1.0373 (2.48)      35.3584 (5.64)     1.2893 (2.48)          7;1   28.3271 (0.18)         25           1
test_benchmark_uniformrandommetapathwalk[50-2-2000] (NOW)               34.0837 (5.87)      39.6224 (4.65)      35.8334 (5.56)     1.1626 (2.78)      35.7461 (5.70)     1.1165 (2.15)          5;2   27.9069 (0.18)         24           1
test_benchmark_uniformrandommetapathwalk[50-4-500] (0040_cc15f17)       54.0867 (9.31)      59.4685 (6.98)      55.9395 (8.68)     1.3827 (3.31)      55.5902 (8.86)     1.9714 (3.79)          5;0   17.8764 (0.12)         18           1
test_benchmark_uniformrandommetapathwalk[50-3-500] (0040_cc15f17)       69.6907 (11.99)     76.1763 (8.94)      72.3166 (11.22)    1.8046 (4.31)      72.1359 (11.50)    2.1411 (4.12)          4;0   13.8281 (0.09)         14           1
test_benchmark_uniformrandommetapathwalk[50-4-1000] (0040_cc15f17)      78.8268 (13.57)     84.8072 (9.96)      82.0019 (12.73)    1.8392 (4.40)      82.0519 (13.08)    2.8974 (5.58)          5;0   12.1948 (0.08)         12           1
test_benchmark_uniformrandommetapathwalk[50-2-500] (0040_cc15f17)       83.8906 (14.44)     88.6003 (10.40)     85.3964 (13.25)    1.1761 (2.81)      85.2431 (13.59)    0.9288 (1.79)          2;1   11.7101 (0.08)         12           1
test_benchmark_uniformrandommetapathwalk[50-3-1000] (0040_cc15f17)      85.2421 (14.67)     90.7248 (10.65)     87.8713 (13.64)    1.8624 (4.45)      87.8735 (14.01)    3.1376 (6.04)          5;0   11.3803 (0.07)         11           1
test_benchmark_uniformrandommetapathwalk[50-4-2000] (0040_cc15f17)      90.9771 (15.66)     98.0309 (11.51)     93.9101 (14.57)    2.1096 (5.04)      93.4466 (14.90)    2.8223 (5.43)          5;0   10.6485 (0.07)         11           1
test_benchmark_uniformrandommetapathwalk[50-2-1000] (0040_cc15f17)      92.2929 (15.88)     96.8610 (11.37)     94.0338 (14.59)    1.2186 (2.91)      94.0808 (15.00)    1.2658 (2.44)          3;1   10.6345 (0.07)         11           1
test_benchmark_uniformrandommetapathwalk[50-3-2000] (0040_cc15f17)      96.9583 (16.69)    103.1501 (12.11)     99.7080 (15.47)    2.4264 (5.80)      99.4320 (15.85)    4.4427 (8.55)          4;0   10.0293 (0.06)         10           1
test_benchmark_uniformrandommetapathwalk[50-2-2000] (0040_cc15f17)     107.2951 (18.47)    115.4023 (13.55)    111.2413 (17.26)    2.7680 (6.62)     110.2651 (17.58)    4.7315 (9.11)          4;0    8.9895 (0.06)          9           1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

kjun9 · 2020-05-21T01:49:55Z

I've switched to contiguous arrays like #1296 - took me a little while to understand how to do it.. 😅 I did try to take out a few small functions to share with the existing code for constructing the adjacency lists, but _create_undirected_adj_lists_by_other_node_type currently still looks a bit complicated

New benchmarks after switching to contiguous arrays:

Metapath2vec walk

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_uniformrandommetapathwalk[10-3-500] (NOW)                 9.8704 (1.0)       13.4906 (1.0)       11.0875 (1.0)      0.7382 (1.05)      11.0106 (1.0)       1.0323 (1.25)         29;2  90.1916 (1.0)          89           1
test_benchmark_uniformrandommetapathwalk[10-4-500] (NOW)                11.4017 (1.16)      17.5578 (1.30)      13.4410 (1.21)     1.3345 (1.89)      13.1116 (1.19)      1.7548 (2.12)         16;2  74.3994 (0.82)         60           1
test_benchmark_uniformrandommetapathwalk[10-2-500] (NOW)                11.4081 (1.16)      18.9774 (1.41)      12.9319 (1.17)     1.2247 (1.74)      12.7304 (1.16)      1.1039 (1.33)         17;4  77.3284 (0.86)         76           1
test_benchmark_uniformrandommetapathwalk[10-2-1000] (NOW)               11.4775 (1.16)      15.0860 (1.12)      12.5705 (1.13)     0.7391 (1.05)      12.3893 (1.13)      1.0939 (1.32)         24;1  79.5516 (0.88)         76           1
test_benchmark_uniformrandommetapathwalk[10-3-1000] (NOW)               11.4792 (1.16)      17.8878 (1.33)      13.0479 (1.18)     1.1452 (1.63)      12.7832 (1.16)      1.0229 (1.23)         12;6  76.6408 (0.85)         76           1
test_benchmark_uniformrandommetapathwalk[10-2-2000] (NOW)               11.4887 (1.16)      16.3850 (1.21)      12.8020 (1.15)     0.9154 (1.30)      12.6393 (1.15)      1.2218 (1.47)         18;1  78.1128 (0.87)         69           1
test_benchmark_uniformrandommetapathwalk[10-4-2000] (NOW)               11.7484 (1.19)      21.2716 (1.58)      14.0348 (1.27)     1.8677 (2.65)      13.5541 (1.23)      2.1136 (2.55)         11;3  71.2514 (0.79)         63           1
test_benchmark_uniformrandommetapathwalk[10-3-2000] (NOW)               12.1010 (1.23)      28.2123 (2.09)      19.0184 (1.72)     3.5185 (5.00)      19.4636 (1.77)      3.8575 (4.65)         22;2  52.5807 (0.58)         74           1
test_benchmark_uniformrandommetapathwalk[10-4-1000] (NOW)               13.8436 (1.40)      25.0714 (1.86)      17.4340 (1.57)     1.8478 (2.62)      17.2061 (1.56)      1.6533 (1.99)         12;3  57.3590 (0.64)         50           1
test_benchmark_uniformrandommetapathwalk[10-4-500] (0046_cd41384)       15.8208 (1.60)      24.6789 (1.83)      17.3714 (1.57)     1.6221 (2.30)      16.8680 (1.53)      1.0430 (1.26)          6;5  57.5660 (0.64)         52           1
test_benchmark_uniformrandommetapathwalk[10-3-500] (0046_cd41384)       18.9521 (1.92)      23.6497 (1.75)      20.4179 (1.84)     1.0053 (1.43)      20.2320 (1.84)      1.1920 (1.44)         12;2  48.9767 (0.54)         45           1
test_benchmark_uniformrandommetapathwalk[10-4-1000] (0046_cd41384)      20.2379 (2.05)      24.6014 (1.82)      21.9907 (1.98)     0.9296 (1.32)      21.9236 (1.99)      0.9018 (1.09)         12;2  45.4737 (0.50)         41           1
test_benchmark_uniformrandommetapathwalk[10-2-500] (0046_cd41384)       22.5255 (2.28)      26.8943 (1.99)      24.2128 (2.18)     0.9924 (1.41)      24.2159 (2.20)      1.1919 (1.44)         12;1  41.3005 (0.46)         41           1
test_benchmark_uniformrandommetapathwalk[10-3-1000] (0046_cd41384)      23.1060 (2.34)      26.6813 (1.98)      24.7956 (2.24)     0.9113 (1.29)      24.7760 (2.25)      1.0807 (1.30)         11;0  40.3297 (0.45)         40           1
test_benchmark_uniformrandommetapathwalk[10-2-1000] (0046_cd41384)      24.3782 (2.47)      28.9344 (2.14)      25.9133 (2.34)     1.1302 (1.60)      25.7071 (2.33)      1.2305 (1.48)         13;2  38.5902 (0.43)         36           1
test_benchmark_uniformrandommetapathwalk[10-4-2000] (0046_cd41384)      24.7837 (2.51)      29.9705 (2.22)      26.1244 (2.36)     1.1029 (1.57)      25.9406 (2.36)      0.8373 (1.01)          6;3  38.2784 (0.42)         38           1
test_benchmark_uniformrandommetapathwalk[10-3-2000] (0046_cd41384)      25.5129 (2.58)      31.6947 (2.35)      27.4617 (2.48)     1.4286 (2.03)      27.5091 (2.50)      1.8986 (2.29)         11;1  36.4144 (0.40)         35           1
test_benchmark_uniformrandommetapathwalk[10-2-2000] (0046_cd41384)      27.3576 (2.77)      33.1049 (2.45)      29.2116 (2.63)     1.2305 (1.75)      28.8563 (2.62)      1.1923 (1.44)          6;2  34.2329 (0.38)         32           1
test_benchmark_uniformrandommetapathwalk[50-4-500] (NOW)                44.7374 (4.53)      64.9497 (4.81)      54.5772 (4.92)     8.0176 (11.38)     53.0025 (4.81)     15.4557 (18.65)         8;0  18.3227 (0.20)         16           1
test_benchmark_uniformrandommetapathwalk[50-3-500] (NOW)                52.1458 (5.28)      76.9201 (5.70)      62.0943 (5.60)     9.9459 (14.12)     56.9634 (5.17)     19.1126 (23.06)         5;0  16.1046 (0.18)         16           1
test_benchmark_uniformrandommetapathwalk[50-4-1000] (NOW)               55.4521 (5.62)      80.9966 (6.00)      61.2577 (5.52)     8.9198 (12.66)     56.6902 (5.15)      4.8868 (5.90)          3;3  16.3245 (0.18)         18           1
test_benchmark_uniformrandommetapathwalk[50-3-1000] (NOW)               56.6112 (5.74)      82.8767 (6.14)      62.1955 (5.61)     7.5411 (10.71)     59.3717 (5.39)      4.4207 (5.33)          2;2  16.0783 (0.18)         18           1
test_benchmark_uniformrandommetapathwalk[50-3-2000] (NOW)               57.8079 (5.86)      67.9448 (5.04)      60.4088 (5.45)     2.5104 (3.56)      59.5477 (5.41)      2.6652 (3.22)          3;1  16.5539 (0.18)         16           1
test_benchmark_uniformrandommetapathwalk[50-4-2000] (NOW)               58.0146 (5.88)      83.6341 (6.20)      66.4221 (5.99)     9.3006 (13.20)     61.7044 (5.60)     13.4114 (16.18)         2;0  15.0552 (0.17)         12           1
test_benchmark_uniformrandommetapathwalk[50-2-2000] (NOW)               58.4879 (5.93)      69.9393 (5.18)      61.0031 (5.50)     3.0216 (4.29)      60.4804 (5.49)      2.1000 (2.53)          1;1  16.3926 (0.18)         12           1
test_benchmark_uniformrandommetapathwalk[50-2-1000] (NOW)               58.8696 (5.96)      81.0091 (6.00)      64.9659 (5.86)     7.2315 (10.27)     62.0181 (5.63)      6.8114 (8.22)          3;2  15.3927 (0.17)         16           1
test_benchmark_uniformrandommetapathwalk[50-2-500] (NOW)                59.1837 (6.00)      76.7491 (5.69)      66.4931 (6.00)     6.2903 (8.93)      64.5412 (5.86)     11.6325 (14.04)         6;0  15.0392 (0.17)         16           1
test_benchmark_uniformrandommetapathwalk[50-4-500] (0046_cd41384)       82.9248 (8.40)      87.2365 (6.47)      85.4376 (7.71)     1.2288 (1.74)      85.6426 (7.78)      1.7431 (2.10)          3;0  11.7045 (0.13)         12           1
test_benchmark_uniformrandommetapathwalk[50-3-500] (0046_cd41384)       97.7590 (9.90)     104.6571 (7.76)     100.8204 (9.09)     2.2971 (3.26)     100.0851 (9.09)      3.6840 (4.45)          3;0   9.9186 (0.11)         10           1
test_benchmark_uniformrandommetapathwalk[50-4-1000] (0046_cd41384)     104.3067 (10.57)    110.5014 (8.19)     107.9361 (9.73)     2.0195 (2.87)     108.2626 (9.83)      2.9150 (3.52)          4;0   9.2647 (0.10)         10           1
test_benchmark_uniformrandommetapathwalk[50-3-1000] (0046_cd41384)     117.2997 (11.88)    122.5132 (9.08)     119.6024 (10.79)    1.8246 (2.59)     120.1795 (10.91)     3.0525 (3.68)          3;0   8.3610 (0.09)          9           1
test_benchmark_uniformrandommetapathwalk[50-2-500] (0046_cd41384)      118.2874 (11.98)    120.6687 (8.94)     119.2230 (10.75)    0.7044 (1.0)      119.3571 (10.84)     0.8288 (1.0)           2;0   8.3876 (0.09)          9           1
test_benchmark_uniformrandommetapathwalk[50-4-2000] (0046_cd41384)     123.7497 (12.54)    129.2173 (9.58)     125.8153 (11.35)    1.6711 (2.37)     125.4899 (11.40)     1.6654 (2.01)          2;1   7.9482 (0.09)          8           1
test_benchmark_uniformrandommetapathwalk[50-2-1000] (0046_cd41384)     124.5510 (12.62)    132.2876 (9.81)     128.5747 (11.60)    2.9129 (4.14)     128.6809 (11.69)     4.9707 (6.00)          3;0   7.7776 (0.09)          8           1
test_benchmark_uniformrandommetapathwalk[50-3-2000] (0046_cd41384)     130.0331 (13.17)    134.7002 (9.98)     132.2468 (11.93)    1.5553 (2.21)     131.9357 (11.98)     2.3324 (2.81)          2;0   7.5616 (0.08)          8           1
test_benchmark_uniformrandommetapathwalk[50-2-2000] (0046_cd41384)     141.6583 (14.35)    146.4903 (10.86)    143.9868 (12.99)    1.6506 (2.34)     144.2513 (13.10)     2.3606 (2.85)          2;0   6.9451 (0.08)          7           1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Adjacency list construction times

---------------------------------------------------------------------------------------------------- benchmark 'StellarGraph adjacency lists (time)': 18 tests ----------------------------------------------------------------------------------------------------
Name (time in us)                                                               Min                    Max                   Mean                StdDev                 Median                   IQR            Outliers          OPS            Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_adj_list[undirected-100-200] (0048_cd41384)                  24.5480 (1.0)         294.3800 (1.11)         32.3052 (1.0)         16.4745 (1.05)         27.7660 (1.0)          4.9880 (2.45)     654;1219  30,954.8139 (1.0)       11373           1
test_benchmark_adj_list[undirected-100-200] (NOW)                           30.4880 (1.24)        382.8110 (1.45)         37.2370 (1.15)        15.6660 (1.0)          33.2300 (1.20)         2.0370 (1.0)      749;1577  26,855.0183 (0.87)      11578           1
test_benchmark_adj_list[directed-100-200] (0048_cd41384)                    48.8760 (1.99)      3,947.6520 (14.90)        77.9877 (2.41)        76.9466 (4.91)         63.3635 (2.28)        30.3660 (14.91)     113;190  12,822.5266 (0.41)       3146           1
test_benchmark_adj_list[directed-100-200] (NOW)                             49.5970 (2.02)        264.9040 (1.0)          58.5081 (1.81)        18.5747 (1.19)         52.7800 (1.90)         3.5103 (1.72)      355;716  17,091.6572 (0.55)       4485           1
test_benchmark_adj_list[directed_by_other_node_type-100-200]               170.3110 (6.94)        595.8370 (2.25)        193.0913 (5.98)        37.5174 (2.39)        182.1765 (6.56)        11.4760 (5.63)      237;403   5,178.8963 (0.17)       2838           1
test_benchmark_adj_list[undirected_by_other_node_type-100-200]             190.0620 (7.74)        484.7940 (1.83)        210.4009 (6.51)        31.1211 (1.99)        200.3320 (7.22)        12.8848 (6.33)      175;260   4,752.8310 (0.15)       2055           1
test_benchmark_adj_list[undirected-1000-5000] (0048_cd41384)               490.7640 (19.99)     4,467.5190 (16.86)       560.8511 (17.36)      140.8498 (8.99)        528.7230 (19.04)       55.5720 (27.28)      83;147   1,783.0043 (0.06)       1791           1
test_benchmark_adj_list[undirected-1000-5000] (NOW)                        492.8560 (20.08)     1,419.8710 (5.36)        558.8858 (17.30)       92.5383 (5.91)        526.6060 (18.97)       45.8270 (22.50)     189;232   1,789.2745 (0.06)       1874           1
test_benchmark_adj_list[directed-1000-5000] (0048_cd41384)                 757.4370 (30.86)     1,749.9320 (6.61)        905.4496 (28.03)      156.1302 (9.97)        846.4790 (30.49)      168.5962 (82.77)      150;74   1,104.4237 (0.04)       1021           1
test_benchmark_adj_list[directed-1000-5000] (NOW)                          763.3580 (31.10)     7,888.8580 (29.78)       885.9209 (27.42)      345.1872 (22.03)       824.9655 (29.71)      101.1240 (49.64)       18;72   1,128.7689 (0.04)       1106           1
test_benchmark_adj_list[undirected_by_other_node_type-1000-5000]         1,299.5350 (52.94)     2,612.8160 (9.86)      1,417.8116 (43.89)      147.2268 (9.40)      1,376.2400 (49.57)       85.7830 (42.11)       63;66     705.3123 (0.02)        702           1
test_benchmark_adj_list[directed_by_other_node_type-1000-5000]           1,444.1410 (58.83)     2,396.5890 (9.05)      1,593.9811 (49.34)      141.4838 (9.03)      1,545.4120 (55.66)       95.0855 (46.68)       60;51     627.3600 (0.02)        529           1
test_benchmark_adj_list[undirected-20000-100000] (NOW)                  13,148.9710 (535.64)   26,388.4510 (99.62)    17,896.5498 (553.98)   4,053.7007 (258.76)   17,073.9930 (614.92)   6,290.1110 (>1000.0)      16;0      55.8767 (0.00)         43           1
test_benchmark_adj_list[undirected-20000-100000] (0048_cd41384)         13,165.5930 (536.32)   21,529.8700 (81.27)    13,802.7064 (427.26)   1,054.7727 (67.33)    13,615.2810 (490.36)     477.4575 (234.39)        2;3      72.4496 (0.00)         65           1
test_benchmark_adj_list[directed-20000-100000] (0048_cd41384)           20,073.5320 (817.73)   68,412.7400 (258.25)   24,364.9066 (754.21)   8,489.6125 (541.91)   21,277.4305 (766.31)   3,545.3960 (>1000.0)       4;5      41.0426 (0.00)         46           1
test_benchmark_adj_list[directed-20000-100000] (NOW)                    20,092.9070 (818.52)   22,822.2610 (86.15)    21,363.1302 (661.29)     636.8507 (40.65)    21,381.1560 (770.05)     593.6295 (291.42)       14;5      46.8096 (0.00)         43           1
test_benchmark_adj_list[undirected_by_other_node_type-20000-100000]     26,283.7130 (>1000.0)  32,821.3140 (123.90)   27,349.3274 (846.59)   1,235.1034 (78.84)    27,087.4120 (975.56)     924.2280 (453.72)        2;2      36.5640 (0.00)         34           1
test_benchmark_adj_list[directed_by_other_node_type-20000-100000]       33,165.1700 (>1000.0)  36,872.6090 (139.19)   34,440.3798 (>1000.0)  1,118.2480 (71.38)    34,059.4760 (>1000.0)    847.0395 (415.83)        8;5      29.0357 (0.00)         27           1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Adjacency list allocations

------------------------------------------------------------------------------------------------------- benchmark 'StellarGraph adjacency lists (peak)': 12 tests -------------------------------------------------------------------------------------------------------
Name (time in s)                                                                              Min                     Max                    Mean              StdDev                  Median               IQR            Outliers     OPS            Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_allocation_benchmark_adj_list_peak[undirected-100-200] (NOW)                      7,651.0000 (1.0)        7,915.0000 (1.0)        7,664.2000 (1.0)       59.0322 (inf)        7,651.0000 (1.0)      0.0000 (1.0)           1;1  0.0001 (1.0)          20           1
test_allocation_benchmark_adj_list_peak[undirected-100-200] (0050_cd41384)             7,731.0000 (1.01)       7,995.0000 (1.01)       7,744.2000 (1.01)      59.0322 (inf)        7,731.0000 (1.01)     0.0000 (1.0)           1;1  0.0001 (0.99)         20           1
test_allocation_benchmark_adj_list_peak[undirected_by_other_node_type-100-200]         8,747.0000 (1.14)       9,772.0000 (1.23)       8,798.2500 (1.15)     229.1970 (inf)        8,747.0000 (1.14)     0.0000 (1.0)           1;1  0.0001 (0.87)         20           1
test_allocation_benchmark_adj_list_peak[directed-100-200] (NOW)                        8,752.0000 (1.14)       8,752.0000 (1.11)       8,752.0000 (1.14)       0.0000 (1.0)        8,752.0000 (1.14)     0.0000 (1.0)           0;0  0.0001 (0.88)         20           1
test_allocation_benchmark_adj_list_peak[directed-100-200] (0050_cd41384)               8,912.0000 (1.16)       8,912.0000 (1.13)       8,912.0000 (1.16)       0.0000 (1.0)        8,912.0000 (1.16)     0.0000 (1.0)           0;0  0.0001 (0.86)         20           1
test_allocation_benchmark_adj_list_peak[directed_by_other_node_type-100-200]           8,912.0000 (1.16)       8,912.0000 (1.13)       8,912.0000 (1.16)       0.0000 (1.0)        8,912.0000 (1.16)     0.0000 (1.0)           0;0  0.0001 (0.86)         20           1
test_allocation_benchmark_adj_list_peak[undirected-1000-5000] (NOW)                   64,903.0000 (8.48)      64,967.0000 (8.21)      64,906.2000 (8.47)      14.3108 (inf)       64,903.0000 (8.48)     0.0000 (1.0)           1;1  0.0000 (0.12)         20           1
test_allocation_benchmark_adj_list_peak[undirected-1000-5000] (0050_cd41384)          72,983.0000 (9.54)      73,047.0000 (9.23)      72,986.2000 (9.52)      14.3108 (inf)       72,983.0000 (9.54)     0.0000 (1.0)           1;1  0.0000 (0.11)         20           1
test_allocation_benchmark_adj_list_peak[undirected_by_other_node_type-1000-5000]     106,091.0000 (13.87)    106,891.0000 (13.50)    106,131.0000 (13.85)    178.8854 (inf)      106,091.0000 (13.87)    0.0000 (1.0)           1;1  0.0000 (0.07)         20           1
test_allocation_benchmark_adj_list_peak[directed-1000-5000] (NOW)                    120,428.0000 (15.74)    120,428.0000 (15.22)    120,428.0000 (15.71)      0.0000 (1.0)      120,428.0000 (15.74)    0.0000 (1.0)           0;0  0.0000 (0.06)         20           1
test_allocation_benchmark_adj_list_peak[directed-1000-5000] (0050_cd41384)           125,588.0000 (16.41)    125,588.0000 (15.87)    125,588.0000 (16.39)      0.0000 (1.0)      125,588.0000 (16.41)    0.0000 (1.0)           0;0  0.0000 (0.06)         20           1
test_allocation_benchmark_adj_list_peak[directed_by_other_node_type-1000-5000]       163,517.0000 (21.37)    163,517.0000 (20.66)    163,517.0000 (21.34)      0.0000 (1.0)      163,517.0000 (21.37)    0.0000 (1.0)           0;0  0.0000 (0.05)         20           1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

-------------------------------------------------------------------------------------------------- benchmark 'StellarGraph adjacency lists (size)': 12 tests ---------------------------------------------------------------------------------------------------
Name (time in s)                                                                        Min                    Max                   Mean              StdDev                 Median               IQR            Outliers     OPS            Rounds  Iterations
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_allocation_benchmark_adj_list[directed-100-200] (NOW)                         672.0000 (1.0)       1,706.0000 (1.13)        723.7000 (1.0)      231.2094 (64.62)       672.0000 (1.0)      0.0000 (1.0)           1;1  0.0014 (1.0)          20           1
test_allocation_benchmark_adj_list[directed-100-200] (0050_cd41384)                688.0000 (1.02)      1,706.0000 (1.13)        738.9000 (1.02)     227.6317 (63.62)       688.0000 (1.02)     0.0000 (1.0)           1;1  0.0014 (0.98)         20           1
test_allocation_benchmark_adj_list[undirected-100-200] (0050_cd41384)              840.0000 (1.25)      1,506.0000 (1.0)         873.3000 (1.21)     148.9221 (41.62)       840.0000 (1.25)     0.0000 (1.0)           1;1  0.0011 (0.83)         20           1
test_allocation_benchmark_adj_list[undirected-100-200] (NOW)                       840.0000 (1.25)      1,506.0000 (1.0)         873.3000 (1.21)     148.9221 (41.62)       840.0000 (1.25)     0.0000 (1.0)           1;1  0.0011 (0.83)         20           1
test_allocation_benchmark_adj_list[directed_by_other_node_type-100-200]          1,592.0000 (2.37)      3,278.0000 (2.18)      1,676.3000 (2.32)     377.0011 (105.38)    1,592.0000 (2.37)     0.0000 (1.0)           1;1  0.0006 (0.43)         20           1
test_allocation_benchmark_adj_list[undirected_by_other_node_type-100-200]        2,973.0000 (4.42)      4,560.0000 (3.03)      3,052.3500 (4.22)     354.8640 (99.19)     2,973.0000 (4.42)     0.0000 (1.0)           1;1  0.0003 (0.24)         20           1
test_allocation_benchmark_adj_list[directed-1000-5000] (NOW)                    22,674.0000 (33.74)    22,706.0000 (15.08)    22,675.6000 (31.33)      7.1554 (2.00)     22,674.0000 (33.74)    0.0000 (1.0)           1;1  0.0000 (0.03)         20           1
test_allocation_benchmark_adj_list[directed-1000-5000] (0050_cd41384)           22,690.0000 (33.76)    22,706.0000 (15.08)    22,690.8000 (31.35)      3.5777 (1.0)      22,690.0000 (33.76)    0.0000 (1.0)           1;1  0.0000 (0.03)         20           1
test_allocation_benchmark_adj_list[undirected-1000-5000] (0050_cd41384)         24,844.0000 (36.97)    24,908.0000 (16.54)    24,847.2000 (34.33)     14.3108 (4.00)     24,844.0000 (36.97)    0.0000 (1.0)           1;1  0.0000 (0.03)         20           1
test_allocation_benchmark_adj_list[undirected-1000-5000] (NOW)                  24,844.0000 (36.97)    24,908.0000 (16.54)    24,847.2000 (34.33)     14.3108 (4.00)     24,844.0000 (36.97)    0.0000 (1.0)           1;1  0.0000 (0.03)         20           1
test_allocation_benchmark_adj_list[directed_by_other_node_type-1000-5000]       29,590.0000 (44.03)    29,670.0000 (19.70)    29,594.0000 (40.89)     17.8885 (5.00)     29,590.0000 (44.03)    0.0000 (1.0)           1;1  0.0000 (0.02)         20           1
test_allocation_benchmark_adj_list[undirected_by_other_node_type-1000-5000]     38,888.0000 (57.87)    39,368.0000 (26.14)    38,912.0000 (53.77)    107.3313 (30.00)    38,888.0000 (57.87)    0.0000 (1.0)           1;1  0.0000 (0.02)         20           1
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

huonw

This looks awesome. It's definitely much harder for me to think about this implementation, but it looks like it's not too much slower than the non-node type one, and (based on comparing the benchmarks in the PR description with your latest ones) significantly faster than the old form. 👍

huonw · 2020-05-21T02:32:48Z

stellargraph/core/element_data.py

+    neigh_counts = sum(
+        np.bincount(ilocs, minlength=number_of_nodes) for ilocs in node_ilocs_list
+    )


I think this will allocate len(node_iloc_list) extra arrays, because it's not mutating in place like the old form did (in particular it'll do something like (0 + bincount) + bincount, and each of those + will create a whole new array). We could resolve this with something more complicated like:

Suggested change

neigh_counts = sum(

np.bincount(ilocs, minlength=number_of_nodes) for ilocs in node_ilocs_list

)

neigh_counts = np.bincount(node_ilocs_list[0], minlength=number_of_nodes)

for ilocs in node_ilocs_list[1:]:

neigh_counts += np.bincount(ilocs, minlength=number_of_nodes)

But, based on the benchmarks not really changing, I don't think this is worth it, especially because this would be entirely better using numpy/numpy#7998.

huonw · 2020-05-21T02:36:36Z

stellargraph/core/element_data.py

+        source_types = self.node_data.type_ilocs[self.sources]
+        target_types = self.node_data.type_ilocs[self.targets]
+
+        def _to_directed_adj_list(arr, other_node_types):


The peak memory use may be able to be reduced by creating the ..._types arrays more transiently, so that they don't both exist at once. In particular, I think they can probably moved inside the helper method:

Suggested change

source_types = self.node_data.type_ilocs[self.sources]

target_types = self.node_data.type_ilocs[self.targets]

def _to_directed_adj_list(arr, other_node_types):

def _to_directed_adj_list(node_ilocs, other_node_ilocs):

other_node_types = self.node_data.type_ilocs[other_node_ilocs]

And then call it like _to_directed_adj_list(self.targets, self.sources) (or vice versa) below.

huonw · 2020-05-21T02:39:49Z

stellargraph/core/element_data.py

+        source_types = self.node_data.type_ilocs[self.sources]
+        target_types = self.node_data.type_ilocs[self.targets]
+
+        def _to_directed_adj_list(arr, other_node_types):


If you felt like it, this nested method could become a top level method, like:

def _to_directed_adj_list(self, node_ilocs, other_node_ilocs): ... def _create_directed_adj_lists_by_other_node_type(self): return ( self._to_directed_adj_list(self.targets, self.sources), self._to_directed_adj_list(self.sources, self.targets), )

huonw · 2020-05-21T02:53:33Z

stellargraph/core/element_data.py

+            for other_node_type in np.unique(other_node_types):
+
+                # filter node ilocs based on other node type
+                arr_filtered = arr[other_node_types == other_node_type]


Because of the various whole-array filters here, this loop is O(edges * node types + nodes * node types); it's probably fine, assuming there's not many node types, but we could potentially consider other approaches, e.g.

doing a Pandas groupby, which might be O(edges + nodes * node types), although it'll like have more overhead, e.g.
for other_node_type, flat_filtered in pd.Series(flat).groupby(other_node_types): splits = _get_splits([arr[flat_filtered]], self.number_of_nodes, self._id_index.dtype) index[other_node_type] = FlatAdjacencyList(flat_filtered, splits)

sorting other_node_types (and other bits too) so that index returned by np.unique(other_node_types, return_index=True) gives the first index of each node type, and thus we can do O(1) slicing instead of O(edges) comparisons

Both of these are likely to have more overhead and potentially be a bit fiddly (I don't know if the first one is the right code), so I don't know if they'll make sense. I don't know if we have a good "real world" dataset with many node types to test this, but this may also indicate that it's not actually a concern (knowledge graphs can end up with thousands of edge types... but only one node type).

Thoughts?

The pandas one seems relatively nice and clear, I can try something like that 👍 - I'm still finding it a bit difficult to remember how to wrap my head around the sorting and filtering going on each time I look at it, so I'm inclined to accept a bit of overhead if it makes things easier to follow. I can try something similar for the undirected version too.

huonw · 2020-05-21T02:58:34Z

stellargraph/core/element_data.py

+            for other_node_type in np.unique(other_node_types):
+
+                # filter node ilocs based on other node type
+                arr_filtered = arr[other_node_types == other_node_type]
+                splits = _get_splits(
+                    [arr_filtered], self.number_of_nodes, self._id_index.dtype
+                )
+
+                # filter edge ilocs based on other node type
+                flat_filtered = flat[other_node_types[flat] == other_node_type]


The double indexing here is potentially suboptimal, especially because other_node_types[flat] doesn't change between loops but does have to be recomputed. I wonder if there's another way we could phrase this loop to be slightly more efficient even if we don't completely switch it:

other_node_types = other_node_types[flat] for other_node_type in np.unique(other_node_types): # filter edge ilocs based on other node type flat_filtered = flat[other_node_types == other_node_type] # choose the relevant node ilocs based on the edge ilocs arr_filtered = arr[flat_filtered] splits = _get_splits( [arr_filtered], self.number_of_nodes, self._id_index.dtype )

(Not a GitHub suggestion because the rendering is really ugly.)

This reduces us from doing two O(edges) comparisons to doing one O(edges) one, and one O(selected edges) one. This doesn't change the overall asymptotics of the loop but should reduce the overhead. (Moving the other_node_types[flat] out of the loop could be applied even without changing the order of flat_filtered and arr_filtered.)

huonw · 2020-05-21T03:00:32Z

stellargraph/core/element_data.py

+            self._create_undirected_adj_lists_by_other_node_type()
+        )
+
+    def _create_undirected_adj_lists_by_other_node_type(self):


This function is... impressive. It definitely seems super hard to phrase some of this.

huonw · 2020-05-21T03:10:26Z

stellargraph/core/element_data.py

+            return self._adj_lookup_by_other_node_type(ins=ins, outs=outs).get(
+                other_node_type_iloc, defaultdict(list)
+            )[node_iloc]


I take it the get+defaultdict(list) here is in case there's node types that don't have any incoming edges or any outgoing edges (e.g. all edges might be a->b)? I'm slightly concerned that creating a defaultdict for every call may be adding noticeable overhead, and, also it's returning a list rather than a NumPy array, so consumers may get confused (and/or have to spend more work converting types).

One option might be pre-filling the _edges..._dict_by_other_node_type values with every node type, e.g.:

class EmptyFlatAdjacencyList: def __init__(self, dtype): self.empty = np.array([], dtype=dtype) def __getitem__(self, idx): return empty

Or just creating an O(nodes) FlatAdjacencyList, like FlatAdjacencyList(np.array([], dtype=...), np.zeros(len(nodes) + 1))

Another option would be moving the defaulting to a separate code path, like:

adj_list = self._adj_lookup_by_other_node_type(ins=ins, outs=outs).get(other_node_type_iloc) if adj_list is None: return np.array([], dtype=...) return adj_list[node_iloc]

Ah good point, I did want to also allow completely non-existent node types to return empty lists:

g = StellarGraph(...) # has node types ["A", "B"] g.neighbors(1, other_node_type="X") # should return np.array([])

So I might go with the separate code path approach?

What's the motivation for returning empty, rather than emitting an error? Because this is using ilocs, I suspect the conversion to a type iloc should fail for that case?

huonw · 2020-05-21T03:14:44Z

tests/core/test_graph.py

@@ -1001,14 +1005,30 @@ def f():
 @pytest.mark.parametrize(
    "num_nodes,num_edges", [(100, 200), (1000, 5000), (20000, 100000)]
 )
-@pytest.mark.parametrize("force_adj_lists", ["directed", "undirected"])
+@pytest.mark.parametrize(


In order to capture some of performance of the filtering in the inner loop, I wonder if if it's worth also parameterising this benchmark by the number of node types, e.g.

@pytest.mark.parametrize("n_types", [1, 4, 20])

and passing this through to example_benchmark_graph.

(Potentially with a if force_adj_lists in ("directed", "undirected") and n_types != 1: pytest.skip("...") to avoid spending time on retesting uninteresting things.)

deepakagrawal · 2021-09-24T03:30:02Z

Is this enhancement already in place? How can I start using this version? The current implementation of UniformRandomMetaPathWalk is still very slow.

kevin added 5 commits April 29, 2020 11:14

Group adjacencies by types

b62390d

Fix failing tests

3d8e150

Add other_node_type parameters

ac697b8

Add tests

994ef0a

Add comments

95729a5

kjun9 marked this pull request as ready for review April 29, 2020 05:14

kjun9 requested review from kieranricardo and huonw April 29, 2020 05:15

build extra adj index lazily

7c8ca4d

huonw reviewed May 1, 2020

View reviewed changes

kevin added 5 commits May 3, 2020 18:30

Make each index lazy, make tests better

b55eb69

Merge branch 'develop' into feature/speed-up-metapath-walk

1b56db1

Fix bad merge

26b705e

Fix docstring

91b103d

Add benchmarks

fb35234

This was referenced May 4, 2020

Remove utils directory #1462

Merged

Create a demo notebook to replace script for random walk link prediction #934

Closed

huonw reviewed May 7, 2020

View reviewed changes

kevin added 13 commits May 8, 2020 12:46

Address review comments

6e0f943

make types lazy

55dbbf1

Update benchmark parameter name

e7a299c

Parametrize metapath walker test

337eb5a

Merge branch 'develop' into feature/speed-up-metapath-walk

96b7036

Revert everything except tests

1ce0976

Revert "Revert everything except tests"

46cb9fe

This reverts commit 1ce0976.

don't use type ilocs

4f0376d

Use type ilocs in metapath walk

7f91eca

Merge branch 'develop' into feature/speed-up-metapath-walk

dd851e5

Remove LazyAdjLists

3ba3cd9

Add force_adj_lists options by other node type

de44e93

Merge branch 'develop' into feature/speed-up-metapath-walk

897b4d3

kevin added 2 commits May 20, 2020 12:25

Merge branch 'develop' into feature/speed-up-metapath-walk

ee99248

Merge branch 'develop' into feature/speed-up-metapath-walk

0ec91df

huonw mentioned this pull request May 20, 2020

Ehancement/adj list by type #1232

Closed

kevin added 6 commits May 20, 2020 16:23

WIP convert to contiguous array

7bcd271

Implement adj lists by other node type as continuous arrays

2e08f02

Take out _get_splits

6b102fe

Take out combine

d5b8d3f

Minor fixups

a30b609

Add test case for non-existent type

1e4d782

kjun9 requested a review from huonw May 21, 2020 01:51

huonw reviewed May 21, 2020

View reviewed changes

This was referenced May 28, 2020

Add metapath2vec link prediction demo #1392

Closed

Add metapath2vec link prediction notebook #1614

Merged

	def _augment_and_group_metapaths(self, metapaths, walk_length):
	def _extend_and_group_metapaths(self, metapaths, walk_length):

	def _augment_and_group_metapaths(self, metapaths, walk_length):
	def _repeat_and_group_metapaths(self, metapaths, walk_length):

	self.source_types = nodes.type_of_iloc(self.sources)
	self.source_types = nodes.type_ilocs[self.sources]

		@@ -303,74 +303,117 @@ def _numpyise(d, dtype):
		return {k: np.array(v, dtype=dtype) for k, v in d.items()}


		class AdjList:

-    neigh_counts = sum(
-        np.bincount(ilocs, minlength=number_of_nodes) for ilocs in node_ilocs_list
-    )
+    neigh_counts =  np.bincount(node_ilocs_list[0], minlength=number_of_nodes)
+    for ilocs in node_ilocs_list[1:]:
+        neigh_counts += np.bincount(ilocs, minlength=number_of_nodes)

Make metapath walk faster by grouping adjacency lists by node type #1399

Are you sure you want to change the base?

Make metapath walk faster by grouping adjacency lists by node type #1399

Conversation

kjun9 commented Apr 29, 2020 • edited

codeclimate bot commented Apr 29, 2020 • edited

huonw commented Apr 29, 2020

kjun9 commented Apr 29, 2020

kieranricardo commented Apr 29, 2020 • edited

kieranricardo commented Apr 29, 2020 • edited

kieranricardo commented Apr 29, 2020

huonw commented Apr 29, 2020 • edited

kjun9 commented Apr 29, 2020

huonw left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kjun9 commented May 4, 2020 • edited

huonw commented May 4, 2020

kjun9 commented May 4, 2020

huonw commented May 4, 2020

kjun9 commented May 4, 2020

huonw left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kjun9 May 7, 2020 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kjun9 commented May 19, 2020

kjun9 commented May 21, 2020

Metapath2vec walk

Adjacency list construction times

Adjacency list allocations

huonw left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

huonw May 21, 2020 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

deepakagrawal commented Sep 24, 2021

kjun9 commented Apr 29, 2020 •

edited

codeclimate bot commented Apr 29, 2020 •

edited

kieranricardo commented Apr 29, 2020 •

edited

kieranricardo commented Apr 29, 2020 •

edited

huonw commented Apr 29, 2020 •

edited

kjun9 commented May 4, 2020 •

edited

kjun9 May 7, 2020 •

edited

huonw May 21, 2020 •

edited