This repository has been archived by the owner on Oct 12, 2023. It is now read-only.

Connecting to existing cluster fails to load imageName in code #372

Open
NewbieScriptWriter opened this issue Oct 27, 2020 · 0 comments

Before submitting a bug please check the following:

  • [yes] Start a new R session
  • [yes] Check your credentials file
  • [yes] Install the latest doAzureParallel package
  • [yes] Submit a minimal, reproducible example
  • [yes] run sessionInfo()

When we connect to an existing cluster with getCluster(), jobs fail to run. The likely cause is that "imageName" is not populated: the verbose output shows "imageName":{} under "containerSettings":

{"id":"1","commandLine":"/bin/bash -c "set -e; set -o pipefail; Rscript --no-save --no-environ --no-restore --no-site-file --verbose $AZ_BATCH_JOB_PREP_WORKING_DIR/worker.R 1 10 0 stop > $AZ_BATCH_TASK_ID.txt; wait"","userIdentity":{"autoUser":{"scope":"pool","elevationLevel":"admin"}},"environmentSettings"........removed code for privacy........{"filePattern":"../stdout.txt","destination":{"container":{"path":"stdout/1-........removed code for privacy........constraints":{"maxTaskRetryCount":3},"exitConditions":{"default":{"dependencyAction":"satisfy"}},"containerSettings":{"imageName":{},"containerRunOptions":"--rm"}}

Error in curl::curl_fetch_memory(url, handle = handle) :
Failure when receiving data from the peer

If we first create a new cluster with makeCluster() and then connect to the existing cluster, the same code runs fine and the verbose output shows "imageName":"rocker/tidyverse:3.6.3":

{"id":"1","commandLine":"/bin/bash -c "set -e; set -o pipefail; Rscript --no-save --no-environ --no-restore --no-site-file --verbose $AZ_BATCH_JOB_PREP_WORKING_DIR/worker.R 1 1 0 stop > $AZ_BATCH_TASK_ID.txt; wait"","userIdentity":{"autoUser":{"scope":"pool","elevationLevel":"admin"}},"environmentSettings"........removed code for privacy........{"filePattern":"../stdout.txt","destination":{"container":{"path":"stdout/1-........removed code for privacy........constraints":{"maxTaskRetryCount":3},"exitConditions":{"default":{"dependencyAction":"satisfy"}},"containerSettings":{"imageName":"rocker/tidyverse:3.6.3","containerRunOptions":"--rm"}}
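The difference between the two request bodies is consistent with the image name being NULL on the R side when the task is built. The sketch below is only an illustration of that symptom, not the package's actual code: it assumes the body is serialized with jsonlite (which appears among the loaded namespaces in the sessionInfo() output), and the two lists are hypothetical stand-ins for the "containerSettings" part of the task.

```r
library(jsonlite)

# Hypothetical stand-ins for the "containerSettings" element of the task body.
# In the failing case the image name is assumed to be NULL.
missing_image <- list(imageName = NULL, containerRunOptions = "--rm")
present_image <- list(imageName = "rocker/tidyverse:3.6.3",
                      containerRunOptions = "--rm")

# By default jsonlite encodes a NULL value as an empty object, {}.
json_missing <- as.character(toJSON(missing_image, auto_unbox = TRUE))
json_present <- as.character(toJSON(present_image, auto_unbox = TRUE))

cat(json_missing, "\n")
cat(json_present, "\n")
```

If this assumption holds, the "imageName":{} in the failing request would mean the container image was never set on the object returned by getCluster(), rather than being set to an empty string.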

Steps we follow to reproduce the issue:

Precondition: the cluster already exists and has several idle nodes.

#------------------# Load the doAzureParallel library
library(doAzureParallel)
#------------------# Logging on
setVerbose(TRUE)
setHttpTraffic(TRUE)
#------------------# Set your credentials
setCredentials("credentials.json")
#------------------# Get existing cluster
cluster <- getCluster("TestCluster_2020", verbose = TRUE)
#------------------# Register the cluster as your parallel backend
registerDoAzureParallel(cluster)

#------------------# Test simulation inputs
mean_change = 1.001
volatility = 0.01
opening_price = 100

getClosingPrice <- function() {
  days <- 1825 # ~ 5 years
  movement <- rnorm(days, mean = mean_change, sd = volatility)
  path <- cumprod(c(opening_price, movement))
  closingPrice <- path[days]
  return(closingPrice)
}

#------------------# PARALLEL Test simulation
opt <- list(chunkSize = 10)
start_p <- Sys.time()
closingPrices_p <- foreach(i = 1:10, .combine = 'c', .options.azure = opt) %dopar% {
  replicate(10, getClosingPrice())
}
end_p <- Sys.time()

hist(closingPrices_p)

difftime(end_p, start_p, units = "mins")
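To rule out the simulation itself as the cause, the same workload can be run locally without any Azure backend. This is a minimal sketch in base R: the inputs and getClosingPrice() are copied from the script above, while the seed and the name closingPrices_local are introduced here only for illustration.

```r
# Same inputs and function as in the reproduction script.
mean_change <- 1.001
volatility <- 0.01
opening_price <- 100

getClosingPrice <- function() {
  days <- 1825 # ~ 5 years
  movement <- rnorm(days, mean = mean_change, sd = volatility)
  path <- cumprod(c(opening_price, movement))
  path[days]
}

set.seed(42) # hypothetical seed, only to make the local run repeatable

# Sequential equivalent of the foreach loop: 10 iterations x 10 replicates.
closingPrices_local <- unlist(
  lapply(1:10, function(i) replicate(10, getClosingPrice()))
)

length(closingPrices_local) # 100 values
```

If this runs cleanly, the failure is confined to task submission against the cluster returned by getCluster() and is not caused by the simulation code.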

The only way we have found to run the code against an existing cluster is to create a "throw-away" cluster first and then switch to the existing cluster for execution:

#------------------# Create your cluster in Azure, passing it your cluster config file.
#throw-away cluster#
cluster <- makeCluster("cluster.json")
#------------------# Get existing cluster
cluster <- getCluster("TestCluster_2020", verbose = TRUE)
#------------------# Register the cluster as your parallel backend
registerDoAzureParallel(cluster)

#############################

sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] doAzureParallel_0.8.0 iterators_1.0.12 foreach_1.5.0

loaded via a namespace (and not attached):
[1] compiler_3.6.3 prettyunits_1.1.1 bitops_1.0-6 remotes_2.2.0 tools_3.6.3 testthat_2.3.2 digest_0.6.25 pkgbuild_1.1.0 pkgload_1.1.0 jsonlite_1.7.1 memoise_1.1.0 rlang_0.4.7 cli_2.0.2
[14] rstudioapi_0.11 curl_4.3 withr_2.3.0 httr_1.4.2 desc_1.2.0 fs_1.5.0 devtools_2.3.2 rprojroot_1.3-2 glue_1.4.2 R6_2.4.1 processx_3.4.4 fansi_0.4.1 sessioninfo_1.1.1
[27] callr_3.4.4 magrittr_1.5 backports_1.1.10 ps_1.3.4 codetools_0.2-16 ellipsis_0.3.1 usethis_1.6.3 assertthat_0.2.1 mime_0.9 rAzureBatch_0.7.0 RCurl_1.98-1.2 crayon_1.3.4 rjson_0.2.20
