Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add functions that can extract and update phenological status based on MODIS data and enable monthly SDA #3249

Open
wants to merge 33 commits into
base: develop
Choose a base branch
from

Conversation

Qianyuxuan
Copy link
Collaborator

Description

  1. Add a function to extract MODIS phenological data under "data.remote" package based on user-defined sites' locations and dates.
  2. Add functions that can parameterize leaf-on and leaf-off dates for SIPNET model with MODIS phenological data.
  3. Enable SDA with monthly observations.
  4. Add a pecan.xml template for running single-site SDA.
  5. Fix bugs in Dongchen's new block-based SDA modules "Analysis_sda_block.R".

Motivation and Context

Review Time Estimate

  • Immediately
  • Within one week
  • When possible

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My change requires a change to the documentation.
  • My name is in the list of CITATION.cff
  • I have updated the CHANGELOG.md.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

modules/assim.sequential/R/Analysis_sda_block.R Outdated Show resolved Hide resolved
modules/assim.sequential/R/sda.enkf_MultiSite.R Outdated Show resolved Hide resolved
@mdietze
Copy link
Member

mdietze commented Mar 25, 2024

@Qianyuxuan there's a bunch of errors coming up in the build that suggest that you're missing namespaces on functions

check base/workflow
  0 warnings found in base/workflow.
  1 notes found in base/workflow.
  ── R CMD check comparison ──────────────────────────── PEcAn.workflow 1.7.2 ────
  Status: BROKEN
  
  ── Fixed
  
  ✔ checking CRAN incoming feasibility ... WARNING
  ✔ checking DESCRIPTION meta-information ... NOTE
  
  ── Newly failing
  
  ✖ checking R code for possible problems ... NOTE
  
  R check of base/workflow reports the following new problems. Please fix these and resubmit:
  Error: Please fix these and resubmit.
  Execution halted
  checking R code for possible problems ... NOTE
  qsub_parallel: no visible global function definition for
    ‘makeSOCKcluster’
  qsub_parallel: no visible global function definition for
    ‘registerDoSNOW’
  qsub_parallel: no visible global function definition for
    ‘txtProgressBar’
  qsub_parallel : progress: no visible global function definition for
    ‘setTxtProgressBar’
  qsub_parallel: no visible global function definition for ‘%dopar%’
  qsub_parallel: no visible global function definition for ‘foreach’
  qsub_parallel: no visible binding for global variable ‘run’
  qsub_parallel: no visible global function definition for
    ‘remote.execute.cmd’
  qsub_parallel: no visible binding for global variable ‘host’
  qsub_parallel: no visible binding for global variable ‘folder’
  qsub_parallel: no visible global function definition for ‘stopCluster’
  Undefined global functions or variables:
    %dopar% folder foreach host makeSOCKcluster registerDoSNOW
    remote.execute.cmd run setTxtProgressBar stopCluster txtProgressBar

@Qianyuxuan
Copy link
Collaborator Author

Qianyuxuan commented Mar 25, 2024

@mdietze
Sorry I just realized the files "qsub_parallel.R" and "merge_job_files.R" were placed under "/base/workflow" by mistake long ago. They should be under "/base/remote" and they were there now as committed by Dongchen. I will delete these files under the workflow.

@github-actions github-actions bot removed the Base label Mar 25, 2024
@mdietze
Copy link
Member

mdietze commented Mar 25, 2024

check modules/data.land
  2 warnings found in modules/data.land.
  2 notes found in modules/data.land.
  Error: Please fix these and resubmit.
  R check of modules/data.land returned new problems:
  checking Rd \usage sections ... WARNING: Undocumented arguments in documentation object 'prepare_pools' 'settings'

@mdietze
Copy link
Member

mdietze commented Mar 25, 2024

check models/sipnet
  2 warnings found in models/sipnet.
  2 notes found in models/sipnet.
  Error: Please fix these and resubmit.
  Execution halted
  R check of models/sipnet returned new problems:
  checking Rd \usage sections ... WARNING: Undocumented arguments in documentation object 'write.config.SIPNET' 'defaults' 'trait.values' 'settings' 'run.id' 'inputs' 'IC' 'restart' 'spinup' 'obs_time' 'update_phenology'

prior.distns <- post.distns
}
if (exists("trait.mcmc") & exists("post.distns")) {
post.distns <- post.distns[!(row.names(post.distns) %in% names(trait.mcmc)),]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to be undoing recent changes from #3220 -- is that intentional?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function was adapted from Alexis Helgeson's codes to read "basal soil respiration rate" from "trait.mcmc.Rdata" as we found basal soil respiration rate seems too high from "post.distns.Rdata" for NEON sites. I am not sure whether she put forward a PR for this change. But maybe we can revert it for the develop branch as the changes are specific for our study?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Qianyuxuan so @infotroph isn't objecting to using trait.mcmc or post.distns, but that when you resolved you merge conflict you undid the effort to load the files into a specific envir and then use that envir name (distns). I think you can easily adapt your code to include the previous changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, correct -- loading files into a specific envir is the important part.
Also, to make sure I'm reading right: Am I correct that all the changes in this file are specific to basal soil respiration, they're simply making sure trait.mcmc and post.distns are available to be processed elsewhere?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Qianyuxuan you still need to fix this change before this PR can go in.

Comment on lines +307 to +308
#The command "cdo" in mergeNC will automatically rename "time_bounds" to "time_bnds". However, "time_bounds" is used
#in read_restart codes later. So we need to read the new NetCDF file and convert the variable name back.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a reasonable fix for this PR. Long term it might be worth being consistent about calling this time_bnds throughout, which I think is better aligned with the CF standard: ORNL DAAC lists time_bnds as required, and time_bnds appears* in the conventions themseves whereas time_bounds does not. Can we consider a followup patch to make read_restart use time_bnds?

*Though admittedly time_bnds only appears in an non-normative example. The CF conventions are messy!

mdietze and others added 7 commits May 15, 2024 21:19
Co-authored-by: Chris Black <chris@ckblack.org>
Co-authored-by: Chris Black <chris@ckblack.org>
Co-authored-by: Chris Black <chris@ckblack.org>
Co-authored-by: Chris Black <chris@ckblack.org>
Co-authored-by: Chris Black <chris@ckblack.org>
Co-authored-by: Chris Black <chris@ckblack.org>
@@ -240,8 +240,9 @@ template <- PEcAn.settings::Settings(list(
type = "SIPNET",
revision = "ssr",
delete.raw = FALSE,
binary = model_binary,
jobtemplate = "~/sipnet_geo.job"
binary = "/usr2/postdoc/istfer/SIPNET/trunk//sipnet_if",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason the variable model_binary got replaced with a specific path rather than just setting the model_binary variable to the path you wanted?

@@ -77,11 +78,18 @@ prepare_pools <- function(nc.path, constants = NULL){
PEcAn.logger::logger.error("TotLivBiom is less than sum of AbvGrndWood and roots; will use default for leaf biomass")
}
}
#Initial LAI at the beginning of year is set as 0 for deciduous forests and grasslands
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This solution is model specific and the approach you're using to implement it will only work in a sitegroup run (pft.site doesn't exist for single-site runs). However, prepare_pools is a general, across-model function. Furthermore, filtering for one specific PFT name doesn't work, even within SIPNET, as there are other evergreen PFTs in the world other than boreal.coniferous. I'd recommend moving this hack to write.configs.SIPNET where you can more reliably detect the PFT being used, the start date of the run, and the phenological state of the system

##' @param constants list of constants; must include SLA in m2 / kg C if providing LAI for leaf carbon
##' @return list of pool values in kg C / m2 with generic names
##' @author Anne Thomas
prepare_pools <- function(nc.path, constants = NULL){
prepare_pools <- function(nc.path, settings, constants = NULL){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

per comment below about relocating your LAI hack, please remove settings as an argument and remake the documentation

@@ -198,6 +198,8 @@ get.ensemble.samples <- function(ensemble.size, pft.samples, env.samples,
##' @param write.to.db logical: Record this run in BETY?
##' @param restart In case this is a continuation of an old simulation. restart needs to be a list with name tags of runid, inputs, new.params (parameters), new.state (initial condition), ensemble.id (ensemble id), start.time and stop.time.See Details.
##' @param rename Decide if we want to rename previous output files, for example convert from sipnet.out to sipnet.2020-07-16.out.
##' @param time obervation timepoints
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

write.ensemble.configs is not a SDA-specific function in PEcAn, but one that is used throughout the code to write configs. The two variables you've added aren't general, and aren't documented in a way that would make sense to a general user. I'm also surprised that they can't be passed in as part of the settings. please update

RENAME = rename)#for restart from previous model runs, not sharing the same outdir
RENAME = rename,
obs_time = time,
update_phenology = update_phenology)#for restart from previous model runs, not sharing the same outdir
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

per previous comment, please remove obs_time and update_phenology, otherwise you will have also update every single model's write.restart function too. Also, please don't relocate a comment about one variable (RENAME) to now point to a different variable (update_phenology), as this is going to confuse the heck out of users.

General recommendation -- before you add/remove function arguments, please think about what functions are model-specific and analysis-specific (e.g. SDA, PDA, uncertainty analysis) versus those that are used generally throughout the system.

##' @param prefix prefix to read the output files
##' @param delete.raw Flag to remove sipnet.out files, FALSE = do not remove files TRUE = remove files
##'
##' @export
##' @author Shawn Serbin, Michael Dietze
model2netcdf.SIPNET <- function(outdir, sitelat, sitelon, start_date, end_date, delete.raw = FALSE, revision, prefix = "sipnet.out",
overwrite = FALSE, conflict = FALSE) {
overwrite = FALSE, conflict = TRUE) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to change the default here rather than just changing the value being passed in when your running the sub-annual SDA specifically?

} ## end loop over PFTS

#update LeafOnday and LeafOffDay
if (update_phenology){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason that update_phenology needs to be a global variable rather than being a setting in settings$model$leaf_phenology or something similar?


#update LeafOnday and LeafOffDay
if (update_phenology){
leaf_pheno_outdir <- settings$model$leaf_phenology$outdir ## read from settings
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. the leaf phenology data is an INPUT and thus should be referenced like all other inputs, e.g. settings$run$inputs$leaf_phenology.
  2. As such leaf_phenology should be registered with BETY as a new optional input to SIPNET.
  3. The SIPNET run is not writing anything new to leaf_phenology$outdir and thus it shouldn't be called outdir but should use id, path, or source like all other inputs, depending on whether you're specifying a database ID, a file path, or the name of the input source that is then processed within do.conversions.
  4. It's worth noting that prescribed phenology is not a SIPNET-specific feature, but something that could be valuable to any model, which is an additional argument for moving it to inputs instead of model

##read data
leafphdata <- utils::read.csv(paste0(leaf_pheno_outdir,"/leaf_phenology.csv"))
leafOnDay <- leafphdata$leafonday[leafphdata$year==obs_time & leafphdata$site_id==settings$run$site$id]
leafOffDay<- leafphdata$leafoffday[leafphdata$year==obs_time & leafphdata$site_id==settings$run$site$id]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. why does this have to be obs_time instead of start.date?
  2. I would expect obs_time to be in a datetime format, and yet you're matching it against year? Seems like they're a missing extraction of year
  3. how does SDA update the phenological status if the phenology is prescribed?
  4. If you're adding a whole new class of inputs to the PEcAn system (prescribed phenology) there needs to be updates to the general documentation explaining any new tags, new sources, new file formats, etc. Indeed, there needs to be updates explaining how to run the sub-annual SDA more generally as well.

##' @author Qianyu Li
##'

phenology_MODIS_extract<- function(settings,run_parallel = TRUE,ncores = NULL){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. since this is an input processing function I'd recommend tweaking the function name and arguments to be more inline with existing input processing functions (e.g. starting with extract or download)
  2. function doesn't seem to be called anywhere from within the existing workflow. I suggest adding to do.conversions
  3. in keeping with the current philosophy of making PEcAn modules more independent, I'd recommend shifting the site_info bit into do.conversions itself and making the arguments to this function be (outdir, site_info, start_date, end_date, run_parallel, ncores). In other words, don't pass in the overall settings object, just pass the info you need.
  4. Since this functionality is generally useful, not just in SDA, I'd recommend that the code you move to do.conversion grab start_date and end_date from the overall settings, not settings$state.data.assimilation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants