Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hf landscape updates [WIP] #3193

Open
wants to merge 11 commits into
base: develop
Choose a base branch
from
Open
129 changes: 72 additions & 57 deletions base/workflow/R/run.write.configs.R
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@
#' @param posterior.files Filenames for posteriors for drawing samples for ensemble and sensitivity
#' analysis (e.g. post.distns.Rdata, or prior.distns.Rdata)
#' @param overwrite logical: Replace output files that already exist?
#' @param use.existing.samples FALSE (default), TRUE, or filename
#'
#' @details The default value for \code{posterior.files} is NA, in which case the
#' most recent posterior or prior (in that order) for the workflow is used.
Expand All @@ -24,7 +25,7 @@
#' @author David LeBauer, Shawn Serbin, Ryan Kelly, Mike Dietze
run.write.configs <- function(settings, write = TRUE, ens.sample.method = "uniform",
posterior.files = rep(NA, length(settings$pfts)),
overwrite = TRUE) {
overwrite = TRUE,use.existing.samples=FALSE) {
tryCatch({
con <- PEcAn.DB::db.open(settings$database$bety)
on.exit(PEcAn.DB::db.close(con), add = TRUE)
Expand All @@ -34,68 +35,82 @@ run.write.configs <- function(settings, write = TRUE, ens.sample.method = "unifo
conditionMessage(e))
})

## Which posterior to use?
for (i in seq_along(settings$pfts)) {
## if posterior.files is specified us that
if (is.na(posterior.files[i])) {
## otherwise, check to see if posteriorid exists
if (!is.null(settings$pfts[[i]]$posteriorid)) {
#TODO: sometimes `files` is a 0x0 tibble and other operations with it fail.
files <- PEcAn.DB::dbfile.check("Posterior",
settings$pfts[[i]]$posteriorid,
con, settings$host$name, return.all = TRUE)
pid <- grep("post.distns.*Rdata", files$file_name) ## is there a posterior file?
if (length(pid) == 0) {
pid <- grep("prior.distns.Rdata", files$file_name) ## is there a prior file?
}
if (length(pid) > 0) {
posterior.files[i] <- file.path(files$file_path[pid], files$file_name[pid])
} ## otherwise leave posteriors as NA
}
## otherwise leave NA and get.parameter.samples will look for local
} else {
## does posterior.files point to a directory instead of a file?
if(utils::file_test("-d",posterior.files[i])){
pfiles = dir(posterior.files[i],pattern = "post.distns.*Rdata",full.names = TRUE)
if(length(pfiles)>1){
pid = grep("post.distns.Rdata",pfiles)
if(length(pid > 0)){
pfiles = pfiles[grep("post.distns.Rdata",pfiles)]
} else {
PEcAn.logger::logger.error(
"run.write.configs: could uniquely identify posterior files within",
posterior.files[i])
}
posterior.files[i] = pfiles
}
}
## also, double check PFT outdir exists
if (is.null(settings$pfts[[i]]$outdir) || is.na(settings$pfts[[i]]$outdir)) {
## no outdir
settings$pfts[[i]]$outdir <- file.path(settings$outdir, "pfts", settings$pfts[[i]]$name)
}
} ## end else
} ## end for loop over pfts

## Sample parameters
model <- settings$model$type
scipen <- getOption("scipen")
options(scipen = 12)

PEcAn.uncertainty::get.parameter.samples(settings, posterior.files, ens.sample.method)
samples.file <- file.path(settings$outdir, "samples.Rdata")
if (file.exists(samples.file)) {
samples <- new.env()
load(samples.file, envir = samples) ## loads ensemble.samples, trait.samples, sa.samples, runs.samples, env.samples
trait.samples <- samples$trait.samples
ensemble.samples <- samples$ensemble.samples
sa.samples <- samples$sa.samples
runs.samples <- samples$runs.samples
## env.samples <- samples$env.samples
# PEcAn.uncertainty::get.parameter.samples(settings, posterior.files, ens.sample.method)
# samples.file <- file.path(settings$outdir, "samples.Rdata")
# if (file.exists(samples.file)) {
# samples <- new.env()
# load(samples.file, envir = samples) ## loads ensemble.samples, trait.samples, sa.samples, runs.samples, env.samples
# trait.samples <- samples$trait.samples
# ensemble.samples <- samples$ensemble.samples
# sa.samples <- samples$sa.samples
# runs.samples <- samples$runs.samples
# ## env.samples <- samples$env.samples
Comment on lines +45 to +51
Copy link
Member

@infotroph infotroph Mar 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guessing this is temporarily commented out while developing the replacement, but I strongly recommend all replacement load calls copy the pattern shown here of loading all Rdata files into their own environment -- not only does it keep the package checks from complaining about variables with no visible binding, it means there's no ambiguity about what names were loaded from which file.

The hack of reading the file into an environment and then immediately assigning back out to separate variables was just to minimize lines changed while quieting the check note. I'd support a further refactor to use all of these directly from thee loading environment instead (e.g. write all references to sa.samples as samples$sa.samples)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be honest, this was me trying to resolve a merge conflict and not having time to go back and check the code in more detail (it's been months since I worked on this PR). My plan had been to take a closer look at this function and once I had pulled in the latest code

# } else {
# PEcAn.logger::logger.error(samples.file, "not found, this file is required by the run.write.configs function")
# }

## Which posterior to use?
if(is.logical(use.existing.samples)){
if(use.existing.samples & file.exists(file.path(settings$outdir, "samples.Rdata"))){
load(file.path(settings$outdir, "samples.Rdata")) ## loads ensemble.samples, trait.samples, sa.samples, runs.samples, env.samples
} else {
for (i in seq_along(settings$pfts)) {
## if posterior.files is specified us that
if (is.na(posterior.files[i])) {
## otherwise, check to see if posteriorid exists
if (!is.null(settings$pfts[[i]]$posteriorid)) {
#TODO: sometimes `files` is a 0x0 tibble and other operations with it fail.
files <- PEcAn.DB::dbfile.check("Posterior",
settings$pfts[[i]]$posteriorid,
con, settings$host$name, return.all = TRUE)
pid <- grep("post.distns.*Rdata", files$file_name) ## is there a posterior file?
if (length(pid) == 0) {
pid <- grep("prior.distns.Rdata", files$file_name) ## is there a prior file?
}
if (length(pid) > 0) {
posterior.files[i] <- file.path(files$file_path[pid], files$file_name[pid])
} ## otherwise leave posteriors as NA
}
## otherwise leave NA and get.parameter.samples will look for local
} else {
## does posterior.files point to a directory instead of a file?
if(utils::file_test("-d",posterior.files[i])){
pfiles = dir(posterior.files[i],pattern = "post.distns.*Rdata",full.names = TRUE)
if(length(pfiles)>1){
pid = grep("post.distns.Rdata",pfiles)
if(length(pid > 0)){
pfiles = pfiles[grep("post.distns.Rdata",pfiles)]
} else {
PEcAn.logger::logger.error(
"run.write.configs: could uniquely identify posterior files within",
posterior.files[i])
}
posterior.files[i] = pfiles
}
}
## also, double check PFT outdir exists
if (is.null(settings$pfts[[i]]$outdir) || is.na(settings$pfts[[i]]$outdir)) {
## no outdir
settings$pfts[[i]]$outdir <- file.path(settings$outdir, "pfts", settings$pfts[[i]]$name)
}
} ## end else
} ## end for loop over pfts

## Sample parameters
PEcAn.uncertainty::get.parameter.samples(settings, posterior.files, ens.sample.method)
load(file.path(settings$outdir, "samples.Rdata")) ## loads ensemble.samples, trait.samples, sa.samples, runs.samples, env.samples

## end use.existing.samples == FALSE
} ## is.logical(use.existing.samples)
} else {
PEcAn.logger::logger.error(samples.file, "not found, this file is required by the run.write.configs function")
### use.existing.samples must be a filename
load(use.existing.samples)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a doubletake here ("wait, why is it calling load on a bool?") even after reading at the top of the function that use.existing.samples could be a filepath.

Possible alternate design: Call this param existing.sample.file, have the argument always be a path, and make the default value be NULL (behaving like FALSE does now). Then passing existing.sample.file = "samples.Rdata" would behave like TRUE does now with no need for a special case.

}

## remove previous runs.txt
if (overwrite && file.exists(file.path(settings$rundir, "runs.txt"))) {
PEcAn.logger::logger.warn("Existing runs.txt file will be removed.")
Expand Down
16 changes: 13 additions & 3 deletions base/workflow/R/runModule.run.write.configs.R
Original file line number Diff line number Diff line change
Expand Up @@ -2,16 +2,21 @@
#'
#' @param settings a PEcAn Settings or MultiSettings object
#' @param overwrite logical: Replace config files if they already exist?
#' @param use.existing.samples FALSE (default), TRUE, or filename
#' @return A modified settings object, invisibly
#' @export
runModule.run.write.configs <- function(settings, overwrite = TRUE) {
runModule.run.write.configs <- function(settings, overwrite = TRUE, use.existing.samples = TRUE) {

if (PEcAn.settings::is.MultiSettings(settings)) {
if (overwrite && file.exists(file.path(settings$rundir, "runs.txt"))) {
PEcAn.logger::logger.warn("Existing runs.txt file will be removed.")
unlink(file.path(settings$rundir, "runs.txt"))
}
return(PEcAn.settings::papply(settings, runModule.run.write.configs, overwrite = FALSE))
return(PEcAn.settings::papply(
settings,
runModule.run.write.configs,
overwrite = overwrite,
use.existing.samples = use.existing.samples))
} else if (PEcAn.settings::is.Settings(settings)) {
write <- settings$database$bety$write
# double check making sure we have method for parameter sampling
Expand All @@ -27,7 +32,12 @@ runModule.run.write.configs <- function(settings, overwrite = TRUE) {
}) %>%
unlist()

return(PEcAn.workflow::run.write.configs(settings, write, ens.sample.method, posterior.files = posterior.files, overwrite = overwrite))
return(PEcAn.workflow::run.write.configs(settings,
write,
ens.sample.method,
posterior.files = posterior.files,
overwrite = overwrite,
use.existing.samples = use.existing.samples))
} else {
stop("runModule.run.write.configs only works with Settings or MultiSettings")
}
Expand Down
5 changes: 4 additions & 1 deletion base/workflow/man/run.write.configs.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 7 additions & 1 deletion base/workflow/man/runModule.run.write.configs.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 2 additions & 2 deletions models/sipnet/R/model2netcdf.SIPNET.R
Original file line number Diff line number Diff line change
Expand Up @@ -103,7 +103,7 @@ sipnet2datetime <- function(sipnet_tval, base_year, base_month = 1,
##' @export
##' @author Shawn Serbin, Michael Dietze
model2netcdf.SIPNET <- function(outdir, sitelat, sitelon, start_date, end_date, delete.raw = FALSE, revision, prefix = "sipnet.out",
overwrite = FALSE, conflict = FALSE) {
overwrite = FALSE, conflict = TRUE) {

### Read in model output in SIPNET format
sipnet_out_file <- file.path(outdir, prefix)
Expand Down Expand Up @@ -323,4 +323,4 @@ model2netcdf.SIPNET <- function(outdir, sitelat, sitelon, start_date, end_date,
}
} # model2netcdf.SIPNET
#--------------------------------------------------------------------------------------------------#
### EOF
### EOF
2 changes: 1 addition & 1 deletion models/sipnet/man/model2netcdf.SIPNET.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.