Skip to content
This repository has been archived by the owner on Oct 8, 2021. It is now read-only.

Inconsistency about weights #1519

Open
greimel opened this issue Jan 28, 2021 · 21 comments
Open

Inconsistency about weights #1519

greimel opened this issue Jan 28, 2021 · 21 comments

Comments

@greimel
Copy link

greimel commented Jan 28, 2021

I am confused about how the LightGraphs ecosystem handles weighted graphs. I wonder why adjacency_matrix == weights for SimpleWeightedGraphs, while for MetaGraphs

  • the adjacency_matrix is a matrix of {0,1}, even in the presence of weights
  • the weight equals the default weight for non-connected nodes (I would have expected it to be 0)

Is there a reason for this inconsistency?

I am not a mathematician, but an economist. My standard reference for graphs is
Social and Economic Networks by Matthew O. Jackson. And he makes no difference between adjacency matrix and weight matrix.

I am happy to contribute to get rid of this inconsistency, if that means to bring MetaGraphs closer to the behavior of SimpleWeightedGraphs.

Here is some code to show you what I mean

using LightGraphs, SimpleWeightedGraphs, MetaGraphs, GraphDataFrameBridge, DataFrames

begin
	w_g = SimpleWeightedGraph(3)
	add_edge!(w_g, 1, 2, 0.5)
	add_edge!(w_g, 2, 3, 0.8)
	add_edge!(w_g, 1, 3, 2.0)
end

adjacency_matrix(w_g)

# 3×3 SparseMatrixCSC{Float64, Int64} with 6 stored entries:
#  ⋅   0.5  2.0
# 0.5   ⋅   0.8
# 2.0  0.8   ⋅ 

weights(w_g) == adjacency_matrix(w_g)

# true

begin
	df = DataFrame(from = [1, 2, 1], to = [2, 3, 3], weight = [0.5, 0.8, 2.0])
	m_g = MetaGraph(df, :from, :to, weight = :weight)
end

adjacency_matrix(m_g)

# 3×3 SparseMatrixCSC{Int64, Int64} with 6 stored entries:
#  ⋅  1  1
#  1  ⋅  1
#  1  1  ⋅

collect(weights(m_g))

# 3×3 Matrix{Float64}:
#  1.0  0.5  2.0
#  0.5  1.0  0.8
#  2.0  0.8  1.0
@simonschoelly
Copy link
Contributor

I would prefer if the adjacency matrix would not be the same as weights for all graphs - sometimes one is only interested in structural information so in that case a 0/1 matrix is enough.

That the default weights return a matrix full of ones is a bit of a hack for performance - as for algorithms that take a weight parameter one usually just queries the entries of that matrix where there is an edge. There are certainly other ways to do that that would yield the same performance though.

@greimel
Copy link
Author

greimel commented Jan 28, 2021

What about allowing MetaGraphs.graph isa SimpleWeightedGraph and then defaulting to the SimpleWeightedGraphs behavior? (I guess this would be very confusing)

So what about having an alternative type like MetaWeightedGraph or WeightedMetaGraph that behaves more like SimpleWeightedGraphs?

@sbromberger
Copy link
Owner

sbromberger commented Jan 28, 2021

OK, let's see if this helps clear things up:

SimpleGraphs are unweighted. That is, each edge has a unit weight of 1. You. cannot change this - it is a feature/limitation of the graph type.

MetaGraphs support edge weights. The way you do it is that you create and set a value for a weight field for each edge, and then tell the graph that that field represents edge weights.

Adjacency matrices simply tell you whether an edge exists between two given vertices. It has nothing to do with weights at all. Adjacency matrices do not have to be integers: you can tell adjacency_matrix to give you booleans, for example, so that true means the edge exists, and false means there is no edge. The default for SimpleGraphs is int.
The function signature is

adjacency_matrix(g[, T=Int; dir=:out])

So if you want a matrix of bool, you can use adjacency_matrix(g, Bool).

Note that an adjacency matrix will ALWAYS be a sparse matrix, even if weights are defined (as in MetaGraphs) in another data structure (like a dict).

Does this help?

Edited to respond to @greimel - in LightGraphs, we NEVER reference fields of structs - there are always accessors. That is, you should NEVER use MetaGraphs.graph anywhere in your code. It may (eventually will) break on you without warning.

For MetaGraphs, the weights are set (and used) as follows:

julia> g = MetaGraph(3)
{3, 0} undirected Int64 metagraph with Float64 weights defined by :weight (default weight 1.0)

julia> add_edge!(g, 1, 2)
true

julia> add_edge!(g, 2, 3)
true

julia> set_prop!(g, 1, 2, :weight, 5.0)
true

julia> set_prop!(g, 2, 3, :weight, 10.0)
true

julia> diameter(g)
15.0

@greimel
Copy link
Author

greimel commented Jan 28, 2021

Thank you for your responses!

My question was not how to use LightGraphs, but why it works the way it does. Why doesn't adjacency_matrix(::SimpleWeightedGraph) produce a matrix of {0,1} but the weights?

I think this is inconsistent with

Adjacency matrices simply tell you whether an edge exists between two given vertices. It has nothing to do with weights at all.

@greimel
Copy link
Author

greimel commented Jan 28, 2021

For a consistent behavior in the ecosystem, these things would have to change

  1. adjacency_matrix(::SimpleWeightedGraph) returns a matrix of 0 and 1
  2. weights(::MetaGraph) are 0 for unconnected nodes

@greimel
Copy link
Author

greimel commented Jan 28, 2021

I think you are misunderstanding. adjacency_matrix(::SimpleWeightedGraph) DOES NOT return a matrix of Ints.

Moreover, adjacency_matrix(::SimpleWeightedGraph, Bool) gives an error.

see my code snipped from the OP

using LightGraphs, SimpleWeightedGraphs

begin
	w_g = SimpleWeightedGraph(3)
	add_edge!(w_g, 1, 2, 0.5)
	add_edge!(w_g, 2, 3, 0.8)
	add_edge!(w_g, 1, 3, 2.0)
end

adjacency_matrix(w_g)

# 3×3 SparseMatrixCSC{Float64, Int64} with 6 stored entries:
#  ⋅   0.5  2.0
# 0.5   ⋅   0.8
# 2.0  0.8   ⋅ 

@sbromberger
Copy link
Owner

sbromberger commented Jan 28, 2021

Ah, you're talking about SimpleWeightedGraphs. The reason adjacency_matrix returns the weights there is twofold:

  1. it's super efficient.
  2. it's probably incorrect.

That is, we really should be doing a adjacency_matrix(g) .> 0 there for consistency.

@greimel
Copy link
Author

greimel commented Jan 28, 2021

Very good! I am happy to prepare a PR.

What about point two of

For a consistent behavior in the ecosystem, these things would have to change

  1. adjacency_matrix(::SimpleWeightedGraph) returns a matrix of 0 and 1
  2. weights(::MetaGraph) are 0 for unconnected nodes

Do you agree that weights of a MetaGraph should be zero for unconnected nodes?

@sbromberger
Copy link
Owner

sbromberger commented Jan 28, 2021

Do you agree that weights of a MetaGraph should be zero for unconnected nodes?

I'm not convinced yet.

Also - please include benchmarks in your PR: that is, you can do (!iszero).(adjacency_matrix(g)), adjacency_matrix(g) .> 0, etc. Time these against the default behavior so we know how much of a performance loss we're going to get.

@sbromberger
Copy link
Owner

I take back what I said earlier about efficiency. What we should be doing for SWG is adjacency_matrix(g::SimpleWeightedGraph{T}, ::Datatype{T}) where T = copy(g.weights) (or the transform; I forget). I don't recall offhand what we're actually doing and I don't have time to look at it for a while - but if you open up the PR over in SWG, we can discuss when I'm back in office.

@greimel
Copy link
Author

greimel commented Jan 28, 2021

Do you agree that weights of a MetaGraph should be zero for unconnected nodes?

I'm not convinced yet.

Ok, so let me give another try.

Point 1: Simplicity - weights(graph) can completely characterize the graph

In much of the economics literature on networks, a graph is represented as matrix G. (We call G the adjacency matrix, but that's not so important). We define two nodes as connected if G[i,j] > 0.

In LightGraphs this matrix G doesn't exist yet. It is adjacency_matrix(G) .* weights(G) which is a bit annoying. I don't think it would hurt anyone if the weights are 0 for unconnected nodes.

Point 2: Consistency

Also, there still remains an inconsistency. If you want to return the default weight for unconnected nodes, then there should be a default weight for a ::SimpleWeightedGraph, so that weights(g) = weights .* (weights .> 0) .+ default_weight .* (weights .== 0). This matters if you do wg = SimpleWeightedGraph(g::SimpleGraph, 5), then weights(wg) should be equal to 5, even for unconnected nodes, since 5 is the default weight.

@sbromberger
Copy link
Owner

sbromberger commented Jan 28, 2021

I think there may be a misunderstanding about what "default weight" means: it means: IF the edge exists, AND no weights have been defined for the edge, THEN we use this value. It's not "Every entry in the adjacency matrix has this value UNLESS an edge is defined". The latter definition would result in dense matrices for all adjacency matrices, which is not desirable at all.

Edited: but it's possible I'm misunderstanding, since I'm rushing to get out the door and haven't had coffee yet. Sorry. This discussion should probably wait until I'm back in the office (week after next).

@greimel
Copy link
Author

greimel commented Jan 28, 2021

I guess we agree then that the current behavior is a bug. Here's the other part of the code snippet from the OP.

using LightGraphs, MetaGraphs, GraphDataFrameBridge, DataFrames
begin
	df = DataFrame(from = [1, 2, 1], to = [2, 3, 3], weight = [0.5, 0.8, 2.0])
	m_g = MetaGraph(df, :from, :to, weight = :weight)
end

adjacency_matrix(m_g)

# 3×3 SparseMatrixCSC{Int64, Int64} with 6 stored entries:
#  ⋅  1  1
#  1  ⋅  1
#  1  1  ⋅

collect(weights(m_g))

# 3×3 Matrix{Float64}:
#  1.0  0.5  2.0
#  0.5  1.0  0.8
#  2.0  0.8  1.0

The weights are 1.0 for unconnected nodes.

@sbromberger
Copy link
Owner

That sure looks like a bug. But I don't think I've ever used that constructor. You're familiar with GraphDataFrameBridge, right?

Let's pick this up week after next if you don't mind :) I'm seriously late this morning.

@sbromberger
Copy link
Owner

This is a really important discussion to have, though. Thanks for bringing it up.

@jpfairbanks
Copy link
Contributor

I think I understand Seth's point to be distinguishing the weights object from the weighted adjacency matrix. If you do collect(weights(mg)) .* adjacency_matrix(mg) you get the weighted adjacency matrix. A nonbreaking solution could be to introduce a weighted_adjacency_matrix(mg) function.

@greimel
Copy link
Author

greimel commented Jan 28, 2021

I could live with that. But it would keep the inconsistencies between MetaGraphs and SimpleWeightedGraphs, which is not very nice.

Either of the two changes I am proposing is breaking, unfortunately. But I would consider the current behavior as bugs in both cases.

One could

  1. introduce weighted_adjacency_matrix
  2. deprecate weights in favor weighted_adjacency_matrix
  3. if necessary, introduce fast_weights for the current behavior

Changing the behavior of adjacency_matrix in SWG is probably more difficult to handle nicely.

@sbromberger
Copy link
Owner

I'm tempted to make weights private (that is, not part of any API contract). Would that solve the issue?

@sbromberger
Copy link
Owner

Closing this out as it appears that the discussion has run its course.

@greimel
Copy link
Author

greimel commented Jul 6, 2021

I'd appreciate if you could reopen this. I don't think the issue is solved. (It's just that I was busy with other things.)

If you create the same graph, once as a SimpleWeightedGraph and once as a MetaGraph with weights, then all kinds of network statistics give different results. I think this is a bug that should be fixed.

Once I am bitten by this issue again, I'll make that PR (that I announced months ago, sorry!).

@sbromberger sbromberger reopened this Jul 6, 2021
@sbromberger
Copy link
Owner

Happy to reopen. I still think weights needs to be private.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants