Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rnet_breakup_vertices breaks lines into 2-vertex linestrings when lines overlap #458

Open
Robinlovelace opened this issue Jan 26, 2021 · 8 comments

Comments

@Robinlovelace
Copy link
Member

This could be considered a bug or a feature request depending on how you look at it. The outcome I'm looking for is shown in the bottom map below, produced by overline():

# Aim: highlight potential issue with rnet_breakup_vertices

library(stplanr)
library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 7.0.0
r = routes_fast_sf[2:5, ]
plot(r)  
#> Warning: plotting the first 9 out of 16 attributes; use max.plot = 16 to plot
#> all

rnet = rnet_breakup_vertices(r)
nrow(rnet)
#> [1] 112
summary(n_vertices(rnet)) # most lines only have 2 linestrings
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#>       2       2       2       3       2      64
plot(rnet)
#> Warning: plotting the first 9 out of 16 attributes; use max.plot = 16 to plot
#> all

p1 = line2points(rnet)
mapview::mapview(rnet) +
  mapview::mapview(p1)

rnet_overline = overline(r, "length")
p2 = line2points(rnet_overline)
mapview::mapview(rnet_overline) +
  mapview::mapview(p2)

Created on 2021-01-26 by the reprex package (v0.3.0)

@Robinlovelace
Copy link
Member Author

The code added here provides a solution to the problem when dealing with segments: cyipt/actdev@b024959

Could there be a way to adapt the code @agila5 so that it ignores lines that have overlapping segments? Context - this could be useful: cyipt/actdev#43

@agila5
Copy link
Collaborator

agila5 commented Jan 29, 2021

Hi @Robinlovelace. I just had a look and this problem and I found a partial solution (documented in the following reprex). Please have a look and add any comment.

# packages
library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
library(stplanr)
library(sfnetworks)
library(tidygraph)

# data
r = routes_fast_sf[2:5, ]
plot(r["ID"], lwd = 3)

# It's not clear from the previous plot, but I think that the main challange
# with this type of data is the presence of overlapping lines (which create
# several problem for the following algorithms). I think that the overlapping
# lines can be removed as follows:
r <- r %>% geo_projected(st_union) %>% st_cast("LINESTRING")

# Then we can use sfnetworks function to perform the following steps
r_sfn <- as_sfnetwork(r, directed = FALSE)
par(mar = rep(0.1, 4))
plot(r_sfn)

# Subdivide and smooth the edges (which is more or less the same as
# rnet_breakup_vertices + contraction of edges)
r_clean <- r_sfn %>% 
  convert(to_spatial_subdivision, .clean = TRUE) %>% 
  convert(to_spatial_smooth, .clean = TRUE) %>% 
  activate("edges") %>% 
  mutate(ID = 1:n())
#> Warning: to_spatial_subdivision assumes attributes are constant over geometries

plot(st_as_sf(r_clean)["ID"], lwd = 3)

Created on 2021-01-29 by the reprex package (v0.3.0)

The main problem is that the st_union() approach strips all fields from the input data, which is a problem for the use-cases you mentioned, right? If so, I think that this type of data (i.e. overlapping linestrings) could represent the starting point for a new morpher in sfnetworks.

@Robinlovelace
Copy link
Member Author

The main problem is that the st_union() approach strips all fields from the input data, which is a problem for the use-cases you mentioned, right? If so, I think that this type of data (i.e. overlapping linestrings) could represent the starting point for a new morpher in sfnetworks.

Correct and sounds like a plan. Another possibility I was thinking of 'exploding' the data into 2-vertex linestring represented as simple data frames with start xy and end xy points (that could be optionally switched to allow merging of identical geometries going in opposite directions). It could look a bit like this:

l1m = matrix(ncol = 2, byrow = TRUE, c(
  0, 0,
  1, 0,
  2, 1
))
l1 = sf::st_linestring(l1m)
l2m = matrix(ncol = 2, byrow = TRUE, c(
  0, 0,
  1, 0,
  2, -1
))
l2 = sf::st_linestring(l2m)

# plot(l1, lwd = 3, col = "grey")
# plot(l2, add = TRUE)

l = sf::st_sf(data.frame(11:12), geometry = sf::st_sfc(l1, l2))
plot(l, lwd = 2:1)

l_exploded = sfheaders::sf_to_df(l)

matrix_widen = function(l1) {
  if(inherits(l1, "sfc"))
    l1m = sfheaders::sf_to_df(l1)
  l1_start = l1m[1:(nrow(l1) - 1), ]
  l1_end = l1m[(2:nrow(l1)), ]
  cbind(l1_start, l1_end)
}

ml = lapply(l$geometry, matrix_widen)
ml

to find overlapping segments.

@Robinlovelace
Copy link
Member Author

Robinlovelace commented Jan 31, 2021

Heads-up @agila5 I've made progress on this. Check and please try to reproduce the code below.

A few questions/considerations:

  • How to 'line merge' the 'exploded' 2-vertex linestrings into segments between graph nodes (should be easy if it's viable to run rnet_breakup_vertices() on the exploded result)?
  • How to prevent identical exploded linestrings that go in opposite directions (I guess that's easy, e.g. by adding an if(lsfm$x[1] > lsfm$x[nrow(lsfm)]) reverse the order)?
  • How to make it faster and should the functions have arguments that allow different objects to be returned, e.g. dfs/matrices? The approach seems slower than the current implementation of overline() but I wonder how it performs on large networks and how it could be optimised, e.g. with data.table - heads-up @mem48 on that.
  • How to enable it to return any combination of values/function summaries, e.g. with DT[, keyby = .(V4, V1), .(sumV2 = sum(V2))] in data.table or DF %>% group_by(V4, V1) %>% summarise(sumV2 = sum(V2)).
  • Where to put it - I'm thinking of splitting out functions that add extra functionality to sf objects, like geo_projected() into a more modular package, e.g. called sfextras. Thoughts?
l1m = matrix(ncol = 2, byrow = TRUE, c(
  0, 0,
  1, 0,
  2, 1
))
l2m = matrix(ncol = 2, byrow = TRUE, c(
  0, 0,
  1, 0,
  2, -1
))
l1 = sf::st_linestring(l1m)
l2 = sf::st_linestring(l2m)
l = sf::st_sf(data.frame(v = 11:12), geometry = sf::st_sfc(l1, l2))
l_exploded = sfheaders::sf_to_df(l)

lsf = l$geometry

matrix_widen = function(lsf) {
  if(inherits(lsf, "sf"))
    lsfm = sfheaders::sf_to_df(lsf)
  if(inherits(lsf, "sfc"))
    lsfm = sfheaders::sfc_to_df(lsf)
  lsf_start = lsfm[1:(nrow(lsfm) - 1), c("x", "y")]
  lsf_end = lsfm[(2:nrow(lsfm)), c("x", "y")]
  lsfmw = cbind(lsf_start, lsf_end)
  names(lsfmw) = c("xs", "yx", "xe", "ye")
  lsfmw
}

matrix_widen(l[1, ])
matrix_widen(l[2, ])
lapply(1:nrow(l), function(i) matrix_widen(l$geometry[i]))
ml = purrr::map_dfr(1:nrow(l), function(i) matrix_widen(l$geometry[i]), .id = "id")
ml

l$id = as.character(1:nrow(l))
res = dplyr::inner_join(ml, sf::st_drop_geometry(l))
ra = dplyr::summarise(dplyr::group_by(res, xs, yx, xe, ye), v = sum(v))
geom_agg = sf::st_sfc(lapply(1:nrow(ra), function(i)
  sf::st_linestring(matrix(as.numeric(ra[i, 1:4]), ncol = 2, byrow = TRUE)))
)
rnet = sf::st_sf(ra, geometry = geom_agg)
plot(rnet["v"])

# with larger dataset 
# l = stplanr::routes_fast_sf[2:5, ]

system.time({
  
l = stplanr::routes_fast_sf

l$v = 1:nrow(l)
ml = purrr::map_dfr(1:nrow(l), function(i) matrix_widen(l$geometry[i]), .id = "id")
ml

l$id = as.character(1:nrow(l))
res = dplyr::inner_join(ml, sf::st_drop_geometry(l))
ra = dplyr::summarise(dplyr::group_by(res, xs, yx, xe, ye), v = sum(v))
ram = as.matrix(ra[1:4])
geom_agg = sf::st_sfc(lapply(1:nrow(ra), function(i)
 sf::st_linestring(matrix(as.numeric(ram[i, 1:4]), ncol = 2, byrow = TRUE)))
)
rnet = sf::st_sf(ra, geometry = geom_agg, crs = sf::st_crs(l))

})

system.time(rneto <- stplanr::overline(l, "v"))
plot(rnet["v"])
plot(rneto["v"])

nrow(rnet)
nrow(rneto)
rneto2 = stplanr::overline(rnet, "v")
plot(rneto2)
nrow(rneto2)
l1m = matrix(ncol = 2, byrow = TRUE, c(
  0, 0,
  1, 0,
  2, 1
))
l2m = matrix(ncol = 2, byrow = TRUE, c(
  0, 0,
  1, 0,
  2, -1
))
l1 = sf::st_linestring(l1m)
l2 = sf::st_linestring(l2m)
l = sf::st_sf(data.frame(v = 11:12), geometry = sf::st_sfc(l1, l2))
l_exploded = sfheaders::sf_to_df(l)

lsf = l$geometry

matrix_widen = function(lsf) {
  if(inherits(lsf, "sf"))
    lsfm = sfheaders::sf_to_df(lsf)
  if(inherits(lsf, "sfc"))
    lsfm = sfheaders::sfc_to_df(lsf)
  lsf_start = lsfm[1:(nrow(lsfm) - 1), c("x", "y")]
  lsf_end = lsfm[(2:nrow(lsfm)), c("x", "y")]
  lsfmw = cbind(lsf_start, lsf_end)
  names(lsfmw) = c("xs", "yx", "xe", "ye")
  lsfmw
}

matrix_widen(l[1, ])
#>   xs yx xe ye
#> 1  0  0  1  0
#> 2  1  0  2  1
matrix_widen(l[2, ])
#>   xs yx xe ye
#> 1  0  0  1  0
#> 2  1  0  2 -1
lapply(1:nrow(l), function(i) matrix_widen(l$geometry[i]))
#> [[1]]
#>   xs yx xe ye
#> 1  0  0  1  0
#> 2  1  0  2  1
#> 
#> [[2]]
#>   xs yx xe ye
#> 1  0  0  1  0
#> 2  1  0  2 -1
ml = purrr::map_dfr(1:nrow(l), function(i) matrix_widen(l$geometry[i]), .id = "id")
ml
#>   id xs yx xe ye
#> 1  1  0  0  1  0
#> 2  1  1  0  2  1
#> 3  2  0  0  1  0
#> 4  2  1  0  2 -1

l$id = as.character(1:nrow(l))
res = dplyr::inner_join(ml, sf::st_drop_geometry(l))
#> Joining, by = "id"
ra = dplyr::summarise(dplyr::group_by(res, xs, yx, xe, ye), v = sum(v))
#> `summarise()` has grouped output by 'xs', 'yx', 'xe'. You can override using the `.groups` argument.
geom_agg = sf::st_sfc(lapply(1:nrow(ra), function(i)
  sf::st_linestring(matrix(as.numeric(ra[i, 1:4]), ncol = 2, byrow = TRUE)))
)
rnet = sf::st_sf(ra, geometry = geom_agg)
plot(rnet["v"])

# with larger dataset 
# l = stplanr::routes_fast_sf[2:5, ]

system.time({
  
l = stplanr::routes_fast_sf

l$v = 1:nrow(l)
ml = purrr::map_dfr(1:nrow(l), function(i) matrix_widen(l$geometry[i]), .id = "id")
ml

l$id = as.character(1:nrow(l))
res = dplyr::inner_join(ml, sf::st_drop_geometry(l))
ra = dplyr::summarise(dplyr::group_by(res, xs, yx, xe, ye), v = sum(v))
ram = as.matrix(ra[1:4])
geom_agg = sf::st_sfc(lapply(1:nrow(ra), function(i)
 sf::st_linestring(matrix(as.numeric(ram[i, 1:4]), ncol = 2, byrow = TRUE)))
)
rnet = sf::st_sf(ra, geometry = geom_agg, crs = sf::st_crs(l))

})
#> Joining, by = "id"
#> `summarise()` has grouped output by 'xs', 'yx', 'xe'. You can override using the `.groups` argument.
#>    user  system elapsed 
#>   0.958   0.012   0.983

system.time(rneto <- stplanr::overline(l, "v"))
#>    user  system elapsed 
#>   0.117   0.004   0.123
plot(rnet["v"])

plot(rneto["v"])

nrow(rnet)
#> [1] 1274
nrow(rneto)
#> [1] 81
rneto2 = stplanr::overline(rnet, "v")
#> 2021-01-31 10:25:17 constructing segments
#> 2021-01-31 10:25:17 building geometry
#> 2021-01-31 10:25:17 simplifying geometry
#> 2021-01-31 10:25:17 aggregating flows
#> 2021-01-31 10:25:17 rejoining segments into linestrings
plot(rneto2)
nrow(rneto2)
#> [1] 81

Created on 2021-01-31 by the reprex package (v0.3.0)

@Robinlovelace
Copy link
Member Author

Hey @agila5 revisiting this after a looong time. Above I see you said

this type of data (i.e. overlapping linestrings) could represent the starting point for a new morpher in sfnetworks.

Still worth it? Could approach this on the sfnetworks side if so.

@agila5
Copy link
Collaborator

agila5 commented Jan 21, 2023

this type of data (i.e. overlapping linestrings) could represent the starting point for a new morpher in sfnetworks.

I think so. The idea is to merge overlapping linestrings creating a "unique" segment that somehow combines the attributes of the overlapping segments, right?

@Robinlovelace
Copy link
Member Author

I think so. The idea is to merge overlapping linestrings creating a "unique" segment that somehow combines the attributes of the overlapping segments, right?

Exactly 👍

Well up for contributing to this, may need a few pointers on the sfnetworks side tho!

@agila5
Copy link
Collaborator

agila5 commented Jan 21, 2023

If you want, I'm happy to review a PR (or even discuss about the implementation but I'm busy until Thursday/Friday of the next week).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants