this plays into #54, since the groups are a little easier to read/parse.
unlike some of the operations for #139, this grouping doesn't mutate edges. in fact, its output might be another kind of complete graph, or it might be a simple list of lists or other non-graph-related structure. apparently this shape is called a star by networkx; see e.g. add_star()
one way this could work, using the data= param for networkx's edges():
query for all the edges that connect to the given node using
g.edges([n, None], data=True)
where g is a MultiGraph in which nodes are docs and edges are matches, and n is the target doc. this returns all the data associated with each edge, and it helpfully expresses each edge with the target node first.
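a minimal sketch of that query, assuming each edge stores its match as a dict under a hypothetical "spans" attribute that maps a doc id to (start, end) bounds. the attribute name, the doc ids, and the bounds are made up for illustration and are not this project's actual edge data. it passes the single node n to edges() rather than a list:

```python
import networkx as nx

# Hypothetical graph: nodes are doc ids, each edge is one match.
# The "spans" attribute and its (start, end) layout are assumptions.
g = nx.MultiGraph()
g.add_edge("a", "n", spans={"a": (2, 14), "n": (4, 15)})
g.add_edge("b", "n", spans={"b": (0, 11), "n": (4, 15)})
g.add_edge("n", "c", spans={"n": (16, 19), "c": (5, 8)})
g.add_edge("a", "b", spans={"a": (2, 14), "b": (0, 11)})  # does not touch n

n = "n"  # the target doc

# Every edge incident to n, as (u, v, data) triples with the edge data attached.
star_edges = list(g.edges(n, data=True))

for u, v, data in star_edges:
    # u is the target doc even for edges that were added as (other, n).
    print(u, v, data["spans"][n])
```

the a-b edge is left out because it doesn't touch n, and the edges that were added as (other, n) still come back with n in the first position.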
group the edges via the sequence bounds in n. in other words, if two or more matches have a Span in n with the same start and end, group them. even if the actual aligned text differs due to spacing, we want to aggregate them all so that we can display the corresponding sequences that aren't in n together.
when we display the group, we'll just use the actual unaligned text of n between both bounds of the Span (or group) once, followed by all the sequences from docs that aren't n.
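the grouping and display steps could be sketched like this in plain python. it takes (u, v, data) triples in the shape that edges(data=True) returns, and it assumes the same hypothetical "spans" attribute (doc id to (start, end)) plus a hypothetical texts dict holding each doc's raw text. the function names are illustrative, not existing code:

```python
from collections import defaultdict

def group_by_target_span(edges, target):
    """Group matches by the (start, end) bounds of their span in `target`."""
    groups = defaultdict(list)
    for u, v, data in edges:
        bounds = data["spans"][target]
        other = v if u == target else u
        # Keep only the non-target side; the target text is shown once per group.
        groups[bounds].append((other, data["spans"][other]))
    return dict(groups)

def render_group(texts, target, bounds, members):
    """Show the unaligned target text once, then each matching sequence."""
    start, end = bounds
    lines = [texts[target][start:end]]
    for other, (o_start, o_end) in members:
        lines.append(f"  {other}: {texts[other][o_start:o_end]}")
    return "\n".join(lines)

# Hypothetical docs and matches, for illustration only.
texts = {
    "n": "the quick brown fox",
    "a": "a quick  brown dog",  # same match, different spacing
    "b": "quick brown",
    "c": "lazy fox",
}
edges = [
    ("n", "a", {"spans": {"n": (4, 15), "a": (2, 14)}}),
    ("n", "b", {"spans": {"n": (4, 15), "b": (0, 11)}}),
    ("n", "c", {"spans": {"n": (16, 19), "c": (5, 8)}}),
]

groups = group_by_target_span(edges, "n")
for bounds, members in groups.items():
    print(render_group(texts, "n", bounds, members))
```

keying on the bounds rather than the aligned text is what lets the match in doc a, which has extra spacing, land in the same group as the match in doc b.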