Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Aggregation with the stars object partially fails (incomplete), but when changing stars to sf this works very well #317

Open
alexyshr opened this issue Sep 4, 2020 · 5 comments

Comments

@alexyshr
Copy link

alexyshr commented Sep 4, 2020

Dear all,

Using the function aggregate, I want to get the maximum value of every numerical attribute of a stars object (h_nh_c.stnoxy) by an sf object (mun$geometry), but the result is incomplete (704 polygons are out). If I change the stars object to sf (h_nh_c.sf), the aggregate will return a value for all the polygons. Is this a normal behaviour of aggregate with stars?

Best

Alexys

# get vector and raster data
myurl <- "http://geocorp.co/wind/wind_result.zip"
filename <- "wind_result.zip"
download.file(myurl, filename, mode = "wb")
unzip(filename)
library(stringr)
library(stars)
#> Warning: package 'stars' was built under R version 3.6.3
#> Loading required package: abind
#> Loading required package: sf
#> Warning: package 'sf' was built under R version 3.6.3
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
# Create stars object from tif files and change names
rasterfiles <- list.files(path = "./", pattern = "*.*.tif$")
rasterfiles <- paste0("./", rasterfiles)
rasterfiles <- str_sort(rasterfiles, numeric = TRUE)
h_nh_c.st <- read_stars(rasterfiles, quiet = TRUE)
mynames <- names(h_nh_c.st)
start <- str_locate(mynames, "_")
stop <- str_locate(mynames, "\\.")
mynames <- str_sub(mynames, start = start[, 1] + 1, end = stop[, 1] - 1)
h_nh_c.st <- setNames(h_nh_c.st, mynames)
# Create sf object (municipalities) from shapefile
mun <- st_read("./MUNICIPIOS_WGS84.shp")
#> Reading layer `MUNICIPIOS_WGS84' from data source `E:\Thesis\othercode\stars_and_sf\MUNICIPIOS_WGS84.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 1126 features and 3 fields
#> geometry type:  POLYGON
#> dimension:      XY
#> bbox:           xmin: -79.0369 ymin: -4.228051 xmax: -66.84933 ymax: 12.45824
#> geographic CRS: WGS 84
# Change the dimensions of stars object
# Convert to stars without x and y dimensions, only geometry dimension
h_nh_c.stnoxy <- st_xy2sfc(h_nh_c.st, as_points = FALSE, na.rm = TRUE)
# Plot input data (stars)
plot(h_nh_c.stnoxy["c_700"], reset = FALSE, border = NA)
plot(st_geometry(mun), add = TRUE)

# Convert stars to sf
h_nh_c.sf <- st_as_sf(h_nh_c.stnoxy)
# Plot input data (sf)
plot(h_nh_c.sf[, "c_700"], reset = FALSE, border = NA)
plot(st_geometry(mun), add = TRUE)

# Perfom aggregate stars (wgs84)
h_nh_c.stmaxmun <- aggregate(h_nh_c.stnoxy, mun$geometry, max, exact = TRUE)
#> although coordinates are longitude/latitude, st_intersects assumes that they are planar
# Plot stars result (aggregate - max)
plot(h_nh_c.stmaxmun["c_700"])

# Check stars aggregate result
(sum(is.na(h_nh_c.stmaxmun[["c_700"]])))
#> [1] 704
# Perfom aggregate sf (wgs84)
h_nh_c.sfmaxmun <- aggregate(h_nh_c.sf, mun$geometry, max, exact = TRUE)
#> although coordinates are longitude/latitude, st_intersects assumes that they are planar
# Plot sf result (aggregate - max)
plot(h_nh_c.sfmaxmun["c_700"])

# Check sf aggregate result
(sum(is.na(h_nh_c.sfmaxmun$c_700)))
#> [1] 0
@eroubenoff
Copy link

This thread is quite stale, but I am now having the same issue. Did you ever figure out a solution? Very puzzling, but thanks to your thread was able to convert to SF as a work around.

@edzer
Copy link
Member

edzer commented Apr 24, 2023

Yes, great issue. In the aggregate.stars docs it says "Note: each pixel is assigned to only a single group (in the order the groups occur) so non-overlapping spatial features and temporal windows are recommended." - with each pixel, it meant to say, each geometry in x, and by group it meant element of by - this can certainly be improved. The question is whether we want to retain behaviour as is, or change; if elements of by are small, what you want is closer to what st_extract does: extract the polygonized pixel value underneath the small polygon. This finishes but takes much longer than aggregate over the sf route. I

@eroubenoff
Copy link

eroubenoff commented Apr 24, 2023

Thank you for your response here— yes, I understand. I would argue that changing the behavior to allow each pixel to be 'assigned' to multiple geometries probably is more in line with many use cases when by is an sfc column. May not be the case when by is another raster grid, etc.

Does setting exact=TRUE effectively result in the same behavior? I wasn't able to get this to work— was getting an error saying something to the effect of "cannot convert object to a Raster*". This went away when I read in the raster using read_stars instead of read_ncdf, but then other issues arose.

In the end I decided to do the aggregation like:

br_bb <- st_bbox(br_shp)
r <- read_ncdf(f, crs = st_crs(br_shp))
r <- st_crop(r, br_bb)
r_poly <- st_as_sf(r, as_points=FALSE)
r_poly <- units::drop_units(r_poly) 

r_new <- br_shp %>% 
  select(muni_code) %>%
  st_join(r_poly) %>% 
  mutate(area = units::drop_units(st_area(.))) %>%
  st_drop_geometry() %>%
  group_by(muni_code) %>%
  summarize(across(everything(), ~weighted.mean(., area)))

Where br_shp is a multipolygon and f is a path to a .nc raster. Then, I attribute joined r_new with br_shp on muni_code.

@edzer
Copy link
Member

edzer commented Apr 24, 2023

I would argue that changing the behavior to allow each pixel to be 'assigned' to multiple geometries probably is more in line with many use cases when by is an sfc column.

That disaggregates pixels, rather than aggregating them.

Does setting exact=TRUE effectively result in the same behavior?

It only works if x is a raster stars object, but it doesn't emit a warning if it isn't and gets ignored.

@edzer
Copy link
Member

edzer commented Apr 24, 2023

In @alexyshr 's example above,

st_interpolate_aw(h_nh_c.stnoxy, mun$geometry, FALSE)

also works and gives you a proper mean or sum, similar to how exactextractr would do but for the polygon-polygon case; it can't give you the max or any other aggregate, though.

edzer added a commit that referenced this issue Apr 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants