group matches after extension #122

thatbudakguy · 2020-10-19T15:16:23Z

this plays into #54, since the groups are a little easier to read/parse.

unlike some of the operations for #139, this grouping doesn't mutate edges. in fact, its output might be another kind of complete graph, or it might be a simple list of lists or some other non-graph structure. networkx apparently calls this shape a star; see e.g. add_star()

one way this could work using the data= param for networkx's edges():

query for all the edges that connect to the given node using

g.edges([n, None], data=True)

where g is a MultiGraph in which nodes are docs and edges are matches, and n is the target doc. this returns all the data associated with each edge, and it helpfully expresses every edge with the target node first:

MultiEdgeDataView([(n, other, {"foo": "bar"}), (n, other, {"foo": "bar"}), ...])

group the edges by their sequence bounds in n. in other words, if two matches have Spans in n with the same start and end, group them. even if the actual aligned text differs due to spacing, we want to aggregate them all so that we can display together the corresponding sequences that aren't in n.
when we display the group, we'll use the actual unaligned text of n between the bounds of the Span (or group) just once, followed by all the sequences from docs that aren't n.
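
a minimal sketch of the grouping step described above, not a settled design. it works on plain (u, v, data) triples of the kind g.edges(n, data=True) yields, so it runs without networkx. the "spans" key, its doc -> (start, end) layout, the string doc ids, and the function name group_by_target_span are all hypothetical, made up for illustration; the real edge data may look different.

```python
from collections import defaultdict


def group_by_target_span(edges, target):
    """Group (u, v, data) edge triples by the target doc's (start, end) bounds.

    Assumes (hypothetically) that each edge's data dict has a "spans" entry
    mapping each doc to the (start, end) bounds of its matched sequence.
    """
    groups = defaultdict(list)
    for u, v, data in edges:
        # the edge view lists the queried node first, but check anyway
        other = v if u == target else u
        bounds = tuple(data["spans"][target])
        # keep only the non-target side; the target text is shown once per group
        groups[bounds].append((other, tuple(data["spans"][other])))
    return dict(groups)


# stand-in for the triples returned by g.edges(n, data=True)
edges = [
    ("a", "b", {"spans": {"a": (0, 5), "b": (10, 15)}}),
    ("a", "c", {"spans": {"a": (0, 5), "c": (3, 8)}}),
    ("a", "b", {"spans": {"a": (20, 30), "b": (40, 50)}}),
]
groups = group_by_target_span(edges, "a")
```

each key in groups is one Span of the target doc, and each value lists the other docs with their own bounds, which is the "text of n once, followed by all the other sequences" shape for display.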

thatbudakguy · 2021-02-19T22:07:08Z

deferring this past 2.0 since it's not trivial.

thatbudakguy added the enhancement New feature or request label Oct 19, 2020

thatbudakguy added this to the v2.0 milestone Oct 19, 2020

thatbudakguy removed this from the v2.0 milestone Feb 19, 2021

thatbudakguy added this to the v3.0 milestone Feb 19, 2021
