How to get a specified node 2-hop subgraph with pyspark #444

Open
Reid00 opened this issue Nov 20, 2023 · 4 comments

Comments


Reid00 commented Nov 20, 2023

Hi guys,

I am new to GraphX/GraphFrames. I read the docs looking for a way to get a 2-hop subgraph and found this:

from graphframes import GraphFrame
from graphframes.examples import Graphs

# Assumes `spark` is an active SparkSession (older releases take a SQLContext here).
g = Graphs(spark).friends()  # Get example graph

# Select subgraph based on edges "e" of type "follow"
# pointing from a younger user "a" to an older user "b".
paths = g.find("(a)-[e]->(b)")\
  .filter("e.relationship = 'follow'")\
  .filter("a.age < b.age")
# "paths" contains vertex info. Extract the edges.
e2 = paths.select("e.src", "e.dst", "e.relationship")
# In Spark 1.5+, the user may simplify this call:
#  e2 = paths.select("e.*")

# Construct the subgraph
g2 = GraphFrame(g.vertices, e2)

But I noticed this only gives a 1-hop subgraph. I tried bfs, but it only follows outgoing edges: starting from a, the code below only returns (a)-[]->(b)-[]->(c). What I actually want is both directions, including (a)<-[]-(b)<-[]-(c) and (a)-[]->(b)<-[]-(c).

    hop = 2  # maximum number of edges in a path
    two_hop_graph = graph.bfs(
        fromExpr="id = 'f625a5b661058ba5082ca508f99ffe1b'",
        toExpr="id <> 'f625a5b661058ba5082ca508f99ffe1b'",
        maxPathLength=hop,
    )
    two_hop_graph.show(truncate=False)
    print(f"count: {two_hop_graph.count()}")

Is there any method to get the subgraph around a specified node id?

Author

Reid00 commented Nov 21, 2023

Hello, could anyone take a look? Thanks!

@tusharg1993

@Reid00 Trying to follow your query. Let me know if I misunderstood. If I understand correctly, you are looking for edges in the graph to be treated as undirected?

In GraphFrames, edges are directed by nature. So if you want edges to be traversed in both directions, you need to add the reciprocal (reversed) edges to the edges DataFrame.
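
To make the idea concrete, here is a minimal plain-Python sketch of adding reciprocal edges. It works on a list of `(src, dst, relationship)` tuples rather than a Spark DataFrame, and the helper name `add_reciprocal_edges` is made up for illustration; it is not part of GraphFrames. In PySpark the same idea would be a union of the edges DataFrame with a copy whose `src` and `dst` columns are swapped.

```python
def add_reciprocal_edges(edges):
    """Return the edges plus a reversed copy of each one, without duplicates.

    `edges` is a list of (src, dst, relationship) tuples. This is a
    hypothetical helper for illustration, not a GraphFrames function.
    """
    seen = set()
    result = []
    for src, dst, rel in edges:
        # Emit the edge itself and its reversed twin, skipping repeats.
        for edge in ((src, dst, rel), (dst, src, rel)):
            if edge not in seen:
                seen.add(edge)
                result.append(edge)
    return result


edges = [("a", "b", "follow"), ("b", "a", "follow"), ("b", "c", "friend")]
undirected = add_reciprocal_edges(edges)
```

Deduplicating matters because a graph that already contains both `a -> b` and `b -> a` would otherwise end up with each of them twice.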

Author

Reid00 commented Dec 21, 2023

Thanks for your reply, and sorry for the confusion. I want to know how to get the 2-hop subgraph of a specified node. I ran some tests with the code above, but the results were all wrong, so I would like to know whether there is any way to do this in GraphFrames.

A 2-hop subgraph looks like the one below:
image


coreyabs-db commented Mar 11, 2024

Hi @Reid00 - I think you can do this with a motif and appropriate filters, for example something like:

hops = g.find("(a)-[e1]->(b); (b)-[e2]->(c)").filter("a.id == 0 and a.id != c.id")
new_edges = hops.select("e1.*").unionAll(hops.select("e2.*")).distinct()
subgraph = GraphFrame(g.vertices, new_edges).dropIsolatedVertices()

bfs won't work here, as you noticed, because it doesn't go beyond the first node in the path that matches the expression (and the first non-target node of course matches your toExpr). As @tusharg1993 mentioned, be sure you have reverse edges for undirected behavior.
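
For reference, the intended result can be sketched in plain Python, independent of Spark. This is only an illustration of what "edges within two undirected hops of a seed node" means, assuming edges are `(src, dst)` tuples; `two_hop_edges` is a hypothetical name, not a GraphFrames API.

```python
from collections import defaultdict


def two_hop_edges(edges, seed):
    """Return the original directed edges that lie on an undirected
    path of at most two edges starting at `seed`.

    Hypothetical helper for illustration only.
    """
    # For every node, list (neighbor, original_edge), ignoring direction.
    incident = defaultdict(list)
    for src, dst in edges:
        incident[src].append((dst, (src, dst)))
        incident[dst].append((src, (src, dst)))

    kept = set()
    for b, e1 in incident[seed]:      # first hop: seed - b
        kept.add(e1)
        for c, e2 in incident[b]:     # second hop: b - c
            if c != seed:             # mirrors the a.id != c.id filter
                kept.add(e2)
    return kept


edges = [("a", "b"), ("c", "b"), ("d", "a"), ("e", "d"), ("c", "x"), ("x", "y")]
subgraph_edges = two_hop_edges(edges, "a")
```

Note that this sketch keeps a first-hop edge even when its far end has no other neighbors, whereas a strict two-edge motif with the `a.id != c.id` filter only returns rows where a second edge exists.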

Hope this helps.
