biig update for vapour/warp #24

Open
mdsumner opened this issue Oct 13, 2021 · 3 comments

Comments

@mdsumner
Member

mdsumner commented Oct 13, 2021

NOTES

  • keep deps out of raadfiles, we can't have raster/sp etc.
  • netcdf package is probably unavoidable, but keep the actual running code out of the functions if possible. Can we set up tests that trigger when things have changed, and just store params in the current functions? That is faster (no iterating over so many files to get their unchanging time steps etc.) and puts the hard work in a formal test, making netcdf only a Suggests dep
  • I think we end up with various columns: fullname as now, but also vrt string, /vsi forms, and in future vrt:// forms. When/if are these templated on other column values, or do we just stamp it out as the current data? (I think we just stamp it out, no templates)
  • do we have to care if the data with the file names is redundant? i.e. the file provides it?
  • what about varnames, do we end up with file functions for every var/sds? (probably)
  • next is GEOID ... working down the files in R/

These are the different kinds of problems the files present (still needs updating from the notes above, see new VRTDataset capability coming to vapour)

  • missing projection (usually OGC:CRS84 but because .nc folks hate projections ...)
  • inconsistent longitude convention (Pacific, then Atlantic in same data series)
  • wrong extent (because some .nc degen axis problem)
  • no format support (because raw binary)
  • pick variables, or pair variables (sds, in same or different files)
  • data needing flip (so you flip the resolution sign, then warp)
  • possibly mdim for some .nc that are slow (because of orientation vs. scanline/y-down)
  • no projection/extent, only lonlat arrs (because .nc folks hate projections)
  • multiple inputs, either tiles or (potentially) overlapping sources to cascade/coalesce together

AMPS, CFA, CMIP,

see this tweet, I don't yet know how to nominate the X/Y arrays for the warper, but it's totally doable (probably by sending options into a VRTDataset)

i.e. OSGeo/gdal@8adcb58

this tweet does the barycentric interpolation from the longlat arrays (for truly curvilinear data like CMIP, some ROMS)
https://twitter.com/mdsumner/status/1469151164952285188?s=20&t=ft6taP_h8fhB0QUc9Q2cEw - I don

AMPS is really an "assign the grid" problem, more basic VRT

FAST ICE

  • two modes the cylindrical and polar forms

LEADS

these files are stereographic but flipped (y-up), so we need a VRT wrapper that first flips by warping, and then warps from that. Can the files columns encode that stuff (with ymax < ymin)?

... this works, note the wonky a_ullr (y-up)

gdal_translate NETCDF:"/rdsi/PUBLIC/raad/data/store.pangaea.de/Publications/ReiserF-etal_2020/Antarctic_Relleads_2003_2019.nc":LeadFrequency -of VRT -a_ullr -3950000 -3950000 3950000 4350000  -a_srs EPSG:3031 leads.vrt

## leads.vrt can now be used directly by raster/stars/terra/vapour-warper
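
A rough sketch of the second step (warping from the flipped VRT), assuming the GDAL command-line tools are installed and leads.vrt was written by the command above. The output name leads_northup.vrt is made up for illustration, and the -te values simply reuse the corners passed to -a_ullr. Whether the warper picks a sensible default pixel size for this file is untested here.

```shell
#!/bin/sh
# Sketch only: re-grid the y-up VRT onto a conventional north-up grid.
src="leads.vrt"            # written by the gdal_translate call above
dst="leads_northup.vrt"    # hypothetical output name

# -te is xmin ymin xmax ymax, reusing the corner values given to -a_ullr
warp_cmd="gdalwarp -of VRT -t_srs EPSG:3031 -te -3950000 -3950000 3950000 4350000 $src $dst"

# Only run when gdalwarp and the source VRT are actually available
if command -v gdalwarp >/dev/null 2>&1 && [ -f "$src" ]; then
  $warp_cmd
else
  echo "would run: $warp_cmd"
fi
```

The result is again a VRT, so nothing is copied; it only records the flip-then-warp chain for whatever reads it next.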

NCEP2 winds

These need expanding: currently we only get the first date of each year, so raadtools must be expanding the list

  • structured as one file per year for 6-hourly/daily data, and a single file for monthly

NSIDC ice binary files

a VRT string, see https://github.com/mdsumner/NSIDC-seaice/blob/master/nt_20130114_f17_v01_s.vrt
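
For raw binary there is no driver to lean on, so the VRT has to describe the byte layout itself. Below is a minimal sketch of what such a VRT string might look like, not a copy of the linked file. The grid size (316 x 332), 300-byte header, Byte data type, 25 km pixels, EPSG:3412 and the .bin file name are all assumptions based on the usual NSIDC southern polar stereographic layout and should be checked against the real file.

```shell
#!/bin/sh
# Sketch only: write a VRT that describes a headered raw binary grid.
# All sizes, offsets and the CRS below are assumptions, see the text above.
vrt="nt_sketch.vrt"   # hypothetical output name

cat > "$vrt" <<'EOF'
<VRTDataset rasterXSize="316" rasterYSize="332">
  <SRS>EPSG:3412</SRS>
  <GeoTransform>-3950000, 25000, 0, 4350000, 0, -25000</GeoTransform>
  <VRTRasterBand dataType="Byte" band="1" subClass="VRTRawRasterBand">
    <SourceFilename relativeToVRT="1">nt_20130114_f17_v01_s.bin</SourceFilename>
    <ImageOffset>300</ImageOffset>
    <PixelOffset>1</PixelOffset>
    <LineOffset>316</LineOffset>
  </VRTRasterBand>
</VRTDataset>
EOF

echo "wrote $vrt"
```

ImageOffset skips the header, PixelOffset is the bytes per pixel, and LineOffset is the bytes per row (316 one-byte pixels).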

ALTIMETRY

these are easy, but we need two sets of extents, sometimes 180 and sometimes 360 (see NOTES above)

ARGO files

probably forget forever and follow Dewey

BSOSE files

complicated like AMPS but maybe not too bad, I think we need raster GCPs for this one

CAFE

easy, lon180 same as oisst_daily

FSLE

easy, lon180

SST

  • oisst_daily_files(), oisst_monthly_files() - straightforward, add $extent and $projection and we have enough for gdalio::vrt
  • ghrsst_daily_files() also works fine but is slow because the files themselves are slow to read
    monthly files are complicated by the need for the mask, so we need a bit extra there

GEOID

these are fine, ADF files (will benefit from multiple sources - tiles - to the warper)
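
One way the "multiple sources" idea could look outside R is a gdalbuildvrt mosaic, sketched below. The tile paths and the output name are invented for illustration (the real ones would come from the raadfiles listing), and this is only a stand-in for passing several sources to the warper directly.

```shell
#!/bin/sh
# Sketch only: collect several tiles into one VRT source.
# Tile paths and the mosaic name are hypothetical.
tiles="geoid/tile_a/w001001.adf geoid/tile_b/w001001.adf"
mosaic="geoid_mosaic.vrt"

build_cmd="gdalbuildvrt $mosaic $tiles"

# Check that every tile exists before trying to run anything
have_all=1
for f in $tiles; do
  [ -f "$f" ] || have_all=0
done

if command -v gdalbuildvrt >/dev/null 2>&1 && [ "$have_all" -eq 1 ]; then
  $build_cmd
else
  echo "would run: $build_cmd"
fi
```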

AMSR

these are HDF or TIF

  • HDF will need y-res flip and assign crs+extent, then warp to re-align
  • GTIFF probably fine as is

NCEP2

(not done, just notes)

  • NCEP2 files are structured as one file per year for 6-hourly/daily
  • and a single file for monthly
@raymondben
Member

FWIW NCEP2 files are structured as one file per year for 6-hourly/daily data, and a single file for monthly

@mdsumner
Member Author

mdsumner commented Aug 30, 2022

a lot of this has moved on: we can currently cover much of it with vapour_vrt, and the rest may be best covered by additions to GDAL.

see updated workflow here for an example of how vapour_vrt can help us: AustralianAntarcticDivision/raadtools#126

to be even more flexible, we need the "vrt://" support that is a future improvement in GDAL itself.
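
To make the idea concrete, here is a small sketch of how such a connection string could be stamped out from a file name plus options. The helper vrt_dsn and the file name sst.nc are hypothetical, and the option names (a_srs, a_ullr) are assumed to mirror the gdal_translate flags; which options a given GDAL version accepts needs checking against its documentation.

```shell
#!/bin/sh
# Sketch only: build a vrt:// connection string from a dsn and key=value options.
# vrt_dsn is a hypothetical helper, not part of GDAL or raadfiles.
vrt_dsn() {
  dsn="$1"
  shift
  query=""
  for kv in "$@"; do
    if [ -z "$query" ]; then
      query="$kv"
    else
      query="$query&$kv"
    fi
  done
  printf 'vrt://%s?%s\n' "$dsn" "$query"
}

vrt_dsn "sst.nc" "a_srs=OGC:CRS84" "a_ullr=-180,90,180,-90"
# prints: vrt://sst.nc?a_srs=OGC:CRS84&a_ullr=-180,90,180,-90
```

A string like this could sit in a files column next to fullname, with no VRT file on disk at all.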

@mdsumner
Member Author

mdsumner commented Nov 20, 2022

the vapour_vrt way is now quite stable, I'm experimenting with:

  • raadfiles more strict on only returning actual files (no band expansion); monthly oisst broke this rule, and probably others do too
  • this raadfiles/raadtools divide should remain for bands, vars (sds) (and anything else that isn't 1:1 with a file)
  • (but review notes above ...)
  • the divide would be: 'fullname' and 'date' come from raadfiles (ufullname, band, and variable selection are all done in raadtools; monthly oisst has a single nominal date, just the first one, but arguably it doesn't belong here either, see how that plays out)
  • that might mean 'vrt_u', 'vrt_sst' type names, but currently I'm working with 'vrt_dsn'; if we can stay general without special names I think that's always better
  • we can avoid use of 'lon180' in this package, those VRT files will be replaced by vapour_vrt output (or calls) in raadtools
  • to avoid special names we need raadtools to have special functions, one function per variable, so "read_u" etc. would be used under the hood for "read_mag" (for example)
  • probably never ever return u,v; that seems like a mistake no one ever wants (because u can be a brick in time+depth)
  • in that sense raadtools lives below raster/terra/stars and might make delivery to those an option

a lot of this has been stalled because of vapour dramas on CRAN, but it is now looking stable (notably the UBSAN checks and the stricter -Wconversion compiler warnings from ARM are now cleaned up)
