Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ops.split inconsistent behavior with LineStrings and MultiLineStrings #701

Open
romainfontaine opened this issue Apr 11, 2019 · 6 comments

Comments

@romainfontaine
Copy link

Expected behavior and actual behavior & steps to reproduce the problem

Splitting a linestring A with an intersecting linestring B does not produce the expected output (returns A). Splitting A with a multilinestring containing B and C (C intersecting A as well) produces the desired output (3 segments). This seems inconsistent.

from shapely.ops import split
from shapely.geometry import LineString, MultiLineString

A = LineString([(0, 0), (10, 0)])
B = LineString([(5, 0), (5, 5)])
C = LineString([(1, -1), (1, 1)])
assert (split(A, B).wkt == "GEOMETRYCOLLECTION (LINESTRING (0 0, 10 0))")
# Does not split
# Expected: "GEOMETRYCOLLECTION (LINESTRING (0 0, 5 0), LINESTRING (5 0, 10 0))"

assert (A.intersection(B).wkt == "POINT (5 0)")
# Although A and B do intersect

assert (split(A, MultiLineString([B, C])).wkt ==
        "GEOMETRYCOLLECTION (LINESTRING (0 0, 1 0), LINESTRING (1 0, 5 0), LINESTRING (5 0, 10 0))")
# OK - A is split by both B and C - but A was not split by B in the previous example.

Operating system

Ubuntu 18.04.2 LTS

Shapely version and provenance

Python 3.6.7 - Shapely 1.6.4.post2 (installed from PyPI using pip)

@Jeremiah-England
Copy link
Contributor

Jeremiah-England commented Oct 19, 2020

Ok. I see the problem here.

Visually, those linestrings look like this:

image

When you split the horizontal (A) by the large vertical on top (B) which only touches A, nothing happens. But splitting A by C does work because C crosses A.

Now, when you split A by MultiLineString([B, C]), it splits into three linestrings, not two as you would expect. I agree that this seems inconsistent.

The code causing this is actually pretty accessible (permalink).

    @staticmethod
    def _split_line_with_line(line, splitter):
        """Split a LineString with another (Multi)LineString or (Multi)Polygon"""

        # if splitter is a polygon, pick it's boundary
        if splitter.type in ('Polygon', 'MultiPolygon'):
            splitter = splitter.boundary

        assert(isinstance(line, LineString))
        assert(isinstance(splitter, LineString) or isinstance(splitter, MultiLineString))

        if splitter.crosses(line):
            # The lines cross --> return multilinestring from the split
            return line.difference(splitter)
        elif splitter.relate_pattern(line, '1********'):
            # The lines overlap at some segment (linear intersection of interiors)
            raise ValueError('Input geometry segment overlaps with the splitter.')
        else:
            # The lines do not cross --> return collection with identity line
            return [line]

You can see that the splitter is required to cross line for any splitting to commence. But once that checkpoint is passed, a simple call to difference is made to generate the split geometries. But difference does not discriminate between crossing and merely touching! So if at any point splitter crosses line, then all merely touching points will be split on.

@sgillies
Copy link
Contributor

sgillies commented Nov 9, 2020

@Jeremiah-England thanks for digging into this! It seems like, for lines at least, we could change the crosses test to a touches test and get less surprising results? What do you think about that?

@Jeremiah-England
Copy link
Contributor

Jeremiah-England commented Nov 9, 2020

I think that would work fine, yes. And the behavior makes sense to me.

For a bit, I thought we could filter the goems in a MultiLineString splitter for those crossing line and take the difference of only those. But that would not solve the case where a single LineString splitter hooks back to touch the line it crosses at another point. I think the only way to get consistent behavior here while using difference for the heavy lifting is to split touching linear geometries.

I suspect that maybe crosses was used so the function returned early when only the boundary of line touches the splitter. In fact this case is tested for. But our current case, where the boundary of splitter touches line, is not in the tests, and may have been a blind spot. However, I'm a Shapely native without much experience with other simple feature libraries. And I don't know whether only splitting when crossing is standard behavior for linear geometries.

It was 5 years ago, but if @georgeouzou remembers maybe he could confirm or explain why crosses was used?

We can have our cake and eat it too if we want to return early in the case tested for, but carry on with splitting in this issue's case. That would require using relate directly. But I doubt that there would be a large performance difference either way.

@georgeouzou
Copy link
Contributor

@Jeremiah-England my memory is a little cloudy on this one, but seeing the code again i can confirm that this was a blind spot. The code should also test for the current case to be complete. Could you continue on with the changes?

@Jeremiah-England
Copy link
Contributor

Hey @georgeouzou, thanks for taking the time to look at that!

And, yes, if it's OK with everyone I would like to hammer out this case and submit a PR at some point.

@sgillies
Copy link
Contributor

sgillies commented Mar 3, 2021

@georgeouzou @Jeremiah-England thanks for communicating about the intent of the splitter!

Splitting on touch seems consistent with https://postgis.net/docs/ST_Split.html, which allows a point to split a line. I'm going to merge #1034 now and it will be in 1.8a1.

sgillies pushed a commit that referenced this issue Mar 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants