Section on cleaning geometries in the geometry chapter #811

Open
5 tasks
Robinlovelace opened this issue Jun 23, 2022 · 11 comments

Comments

@Robinlovelace
Collaborator

Robinlovelace commented Jun 23, 2022

Currently this is the only mention of cleaning geometries in the book I believe:

https://github.com/Robinlovelace/geocompr/blob/3579906af69949dbe47fec783b62ad530018ce14/10-gis.Rmd#L255-L263

As @defuneste has flagged this, and anecdotal evidence suggests it's a common issue, I suggest it will be a useful section. Thoughts on best tools for the job? Options include (we can check off which to test/mention):

I'm interested in which works: people look to this book for recommendations, so if we cover something we should ensure it's tested and known to work! I've had so-so experience with st_make_valid(), but it's in {sf} so it should be covered first, then the {spatstat} tools as they are well maintained. pprepr is not on CRAN and seems to be unmaintained.

@defuneste
Contributor

defuneste commented Jun 23, 2022

I am just trying to use some stuff in {spatstat}, so I am slowly reading the docs/book/code. Apparently {polyclip} (https://github.com/baddstats/polyclip) is used to do some "cleaning". It is used here (https://github.com/spatstat/spatstat.geom/blob/d90441de5ce18aeab1767d11d4da3e3914e49bc7/R/window.R#L230-L240).

This is in the owin class and it is probably used to avoid self-intersecting polygons.

I will have to test it a bit to get a better understanding ...

@defuneste
Contributor

I have adapted this web page: http://s3.cleverelephant.ca/invalid.html with a bunch of topological errors (it is from @pramsey and related blog post: https://www.crunchydata.com/blog/waiting-for-postgis-3.2-st_makevalid).

The script is here: https://github.com/defuneste/utile_comme_du_pq/blob/master/erreur_topo.R
It has a lot of dead code and should be cleaned up soon. I could not understand/reproduce all the errors, but I think it is a very nice setup for testing algorithms that "clean geometries". On the negative side, it only includes one or two geometries per error.

Stuff that can be improved (for later):

  • Try to organize errors into categories, i.e. polygon, polygon + hole(s), multipolygon
  • Display vertexes

@defuneste
Contributor

defuneste commented Oct 11, 2022

The Twitter post helped!

  • Ty Frazier (@syntheticpops) also mentioned terra::makeValid() and a terminal approach with ogr2ogr -skipfailures x.shp y.shp
  • @mdsumner mentioned sfdct::ct_triangulate() followed by group_by; this tweet is also very helpful for starting to understand the various approaches to this problem
  • Etienne Racine (@tiennebr) also brought up the classic buffer at 0 m, which we should add to the list
  • New Geographer mentioned v.clean in GRASS, which we already have in chapter 10

My Shiny app is starting to look not too bad. I will add more options and see how I can host it somewhere so it is accessible to others.

edit: few typos

@Robinlovelace
Collaborator Author

This is awesome @defuneste, keep the ideas coming. Hope to implement some of them in time for the 2nd edition!

@defuneste
Contributor

I have tested {prepr} (with one p I think!) and {polyclip} in the small Shiny app here (https://github.com/defuneste/utile_comme_du_pq/tree/master/topo_errors). We get very different results depending on the error and the algorithm/implementation. Even if it is not perfect (we could add some function arguments to the Shiny app), I will try to figure out a way to publish it later. Hosting it myself will probably take me too much time, but in the meantime I can use the free Shiny hosting. What do you think?

How deep do you want to go in geocompr?

I think the minimum should include the two functions from {terra} and {sf} and the classic "hack" of st_buffer(x, 0). Polyclip is probably the least interesting, even if it is quite intuitive to understand how it works.

I will need to read the paper on "constrained triangulation" to understand {prepr}, but the results look good.

Next for me should be to read a bit more on how terra::makeValid() and sf::st_make_valid() work.
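
For anyone following along, the {sf} part of that minimum set (st_make_valid() and the st_buffer(x, 0) hack) can be tried on a single self-intersecting polygon. This is only a minimal sketch: the "bowtie" geometry is a made-up example, not one of the test cases from the script, and the repaired output may differ with the installed GEOS version.

```r
library(sf)

# Hypothetical test case: a "bowtie" whose ring crosses itself at (0.5, 0.5)
bowtie <- st_as_sfc("POLYGON((0 0, 1 1, 1 0, 0 1, 0 0))")

# Check validity and ask GEOS for the reason
st_is_valid(bowtie)
st_is_valid(bowtie, reason = TRUE)

# Two of the candidate repairs discussed in this thread
fixed_sf  <- st_make_valid(bowtie)
fixed_buf <- st_buffer(bowtie, 0)

# Both outputs can be checked again for validity
st_is_valid(fixed_sf)
st_is_valid(fixed_buf)

# The repairs need not agree, so compare geometry type and area
st_geometry_type(fixed_sf)
st_geometry_type(fixed_buf)
st_area(fixed_sf)
st_area(fixed_buf)
```

Comparing type and area is a crude way to spot cases where two repairs disagree; plotting both results is usually more informative.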

@Robinlovelace
Collaborator Author

Look forward to giving this a spin, over the weekend maybe 🙏

@Robinlovelace Robinlovelace self-assigned this Oct 12, 2022
@defuneste
Contributor

defuneste commented Oct 19, 2022

Well I am hosting it! : https://www.branchtwigleaf.com/shinyapps/make-valid-geom/

If it is useful I would happily move it to some geocompsomething, because I think its value is mostly pedagogical.

What I have learned from it:

  • I was surprised at the diversity of results in some cases

  • Even though {terra} and {sf} both use GEOS, they sometimes give slightly different results; my guess is that this comes from different implementation choices. I have no idea which is correct (if either is), but we, the geo communities, should find a way of explaining it.

  • st_buffer(geom, 0) is great but sometimes produces weird results with multipolygons or polygons with holes

  • polyclip should probably not be used outside of cases where you need a window (a similar kind of problem to a buffer, because polyclip is a clipping tool with a bigger polygon). Trouble could also come from my implementation, as it involves a lot of format conversions (sf -> polyclip -> sf)

  • st_repair: not much to say, it seems good, but it is a bit heavy on the dependency side, so not for a basic user. I have still not read the paper

Edit: updating the link!

@defuneste
Contributor

The GRASS documentation about v.clean is great and I should think of a way to add it: https://grass.osgeo.org/grass82/manuals/v.clean.html

@pramsey

pramsey commented Oct 19, 2022

Probably you want the "structure" option for the make valid parameters. That should give a result that is "much like buffer(0)" without the failure modes.
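
A minimal sketch of how the "structure" option might be requested from {sf}, assuming a recent {sf} built against GEOS >= 3.10, where st_make_valid() accepts a geos_method argument. The bowtie polygon is a made-up example, and older setups may not support these arguments.

```r
library(sf)

# Hypothetical invalid input: a self-intersecting "bowtie" polygon
bowtie <- st_as_sfc("POLYGON((0 0, 1 1, 1 0, 0 1, 0 0))")

# Original approach: node all the linework, then rebuild polygons from it
fixed_linework <- st_make_valid(bowtie, geos_method = "valid_linework")

# "Structure" approach: make each ring valid, then merge shells and subtract holes
fixed_structure <- st_make_valid(
  bowtie,
  geos_method = "valid_structure",
  geos_keep_collapsed = FALSE  # drop parts that collapse to a lower dimension
)

# Inspect both results side by side
st_is_valid(fixed_linework)
st_is_valid(fixed_structure)
st_area(fixed_linework)
st_area(fixed_structure)
```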

@defuneste
Contributor

Hi @Robinlovelace, do we have a deadline for this?

I will probably need some time to understand GRASS a bit better before adding it. It could be mentioned in chapter two (explaining the concept of validity, maybe in the same place as inner rings / holes?) or later in chapter 5, but I do not see where.

The link from @pramsey was a good help in understanding the GEOS level (I will still have to try some cases and "draw" them). We still need to understand how {terra}/{sf} use it. It is hard because not everyone will be on GEOS 3.10.

@Robinlovelace
Collaborator Author

Hey @defuneste, no hard deadline but sooner would be better.

@Nowosad Nowosad added this to the 2nd edition milestone Nov 28, 2023

4 participants