This repository has been archived by the owner on Oct 12, 2023. It is now read-only.

object 'results' not found #334

Open
5 tasks done
ericchansen opened this issue Nov 28, 2018 · 8 comments

Comments

@ericchansen

ericchansen commented Nov 28, 2018

Before submitting a bug please check the following:

  • Start a new R session
  • Check your credentials file
  • Install the latest doAzureParallel package
  • Submit a minimal, reproducible example
  • run sessionInfo()

Updates

EDIT (11/29/2018) - Added additional examples, corrected a typo and improved formatting.

Description

Can someone please explain why I get the error object 'results' not found? Full code below.

Ideally, I need to return two objects from inside the loop: a data frame that gets row-bound, and a list that needs to become a list of lists. In the example below, I'm only returning the data frame (I'll add the list as additional output once this issue is resolved).
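As an aside, the two-object return described above can be handled by returning a named list from each iteration and separating the pieces afterwards. A minimal base-R sketch of that post-processing (no Azure calls; all names here are illustrative, not from the package):

```r
# Each iteration returns a named list holding both outputs.
iteration <- function(t) {
  list(
    df  = data.frame(x = runif(2), trial = rep(t, 2)),
    lst = runif(3)
  )
}

# Simulate three iterations locally; a foreach job with
# enableCloudCombine = FALSE would hand back the same list shape.
raw <- lapply(1:3, iteration)

# Separate the combined result into the two desired objects.
combined_df   <- do.call(rbind, lapply(raw, `[[`, "df"))  # row-bound data frame
list_of_lists <- lapply(raw, `[[`, "lst")                 # list of lists
```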

Instructions to reproduce the problem (if applicable)

Example 1

remove.packages("rAzureBatch")
remove.packages("doAzureParallel")
devtools::install_github("Azure/rAzureBatch", force = TRUE)
devtools::install_github("Azure/doAzureParallel", force = TRUE)

library(doAzureParallel)
setVerbose(TRUE)

setCredentials(file.path(getwd(), "credentials.json"))
cluster <- makeCluster(file.path(getwd(), "cluster.json"), fullName=TRUE)
registerDoAzureParallel(cluster)
getDoParWorkers()

# my_results <- foreach(t = 1:3) %do% { # Works.
# my_results <- foreach(t = 1:3, .combine = 'rbind') %do% { # Works.
# my_results <- foreach(t = 1:3, .combine = 'rbind', .options.azure = list(autoDeleteJob = FALSE)) %dopar% { # Works.
# my_results <- foreach(t = 1:3, .options.azure = list(enableCloudCombine = FALSE, autoDeleteJob = FALSE)) %dopar% { # object 'results' not found
my_results <- foreach(t = 1:3, .combine = 'rbind', .options.azure = list(enableCloudCombine = FALSE, autoDeleteJob = FALSE)) %dopar% { # object 'results' not found
  my_results_df <- data.frame("x" = runif(2), "trial" = replicate(2, t))
  my_results_list <- runif(3)
  return(my_results_df)
}

sessionInfo()

Example 2 featuring superfluous use of setAutoDeleteJob(FALSE)

library(doAzureParallel)
setVerbose(TRUE)
setAutoDeleteJob(FALSE)

setCredentials(file.path(getwd(), "credentials.json"))
cluster <- makeCluster(file.path(getwd(), "cluster.json"), fullName=TRUE)
registerDoAzureParallel(cluster)
getDoParWorkers()

# my_results <- foreach(t = 1:3, .options.azure = list(enableCloudCombine = FALSE, autoDeleteJob = FALSE)) %dopar% { # object 'results' not found
# my_results <- foreach(t = 1:3, .options.azure = list(enableCloudCombine = FALSE)) %dopar% { # object 'results' not found
# my_results <- foreach(t = 1:3, .combine = 'rbind', .options.azure = list(enableCloudCombine = FALSE)) %dopar% { # object 'results' not found
my_results <- foreach(t = 1:3, .combine = 'rbind', .options.azure = list(enableCloudCombine = FALSE, autoDeleteJob = FALSE)) %dopar% { # object 'results' not found
  my_results_df <- data.frame("x" = runif(2), "trial" = replicate(2, t))
  my_results_list <- runif(3)
  return(my_results_df)
}

Output from sessionInfo()

R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server >= 2012 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] doAzureParallel_0.7.2 iterators_1.0.9       foreach_1.4.5        
 [4] RevoUtilsMath_10.0.1  RevoUtils_10.0.7      RevoMods_11.0.0      
 [7] MicrosoftML_9.3.0     mrsdeploy_1.1.3       RevoScaleR_9.3.0     
[10] lattice_0.20-35       rpart_4.1-11         

loaded via a namespace (and not attached):
 [1] codetools_0.2-15       CompatibilityAPI_1.1.0 digest_0.6.13         
 [4] rAzureBatch_0.6.2      mime_0.5               bitops_1.0-6          
 [7] grid_3.4.3             R6_2.2.2               jsonlite_1.5          
[10] httr_1.3.1             curl_3.1               rjson_0.2.15          
[13] tools_3.4.3            RCurl_1.95-4.9         yaml_2.1.16           
[16] compiler_3.4.3         mrupdate_1.0.1       

Output from error

==============================================================================
Id: job20181128173916
chunkSize: 1
enableCloudCombine: FALSE
errorHandling: stop
wait: TRUE
autoDeleteJob: TRUE
==============================================================================
Submitting tasks (3/3)
Waiting for tasks to complete. . .
| Progress: 100.00% (3/3) | Running: 0 | Queued: 0 | Completed: 3 | Failed: 0 |
Tasks have completed. 
Error in e$fun(obj, substitute(ex), parent.frame(), e$data) : 
  object 'results' not found
Called from: e$fun(obj, substitute(ex), parent.frame(), e$data)
@Pullarg

Pullarg commented Nov 29, 2018

Have a look at issue 284. I have just been running into this myself, and it seems that using setAutoDeleteJob(FALSE) together with .options.azure = list(enableCloudCombine = FALSE) will solve your issue. The link has more details, but basically you merge the results yourself by reading from blob storage directly.

@ericchansen
Author

@Pullarg I tried my_results <- foreach(t = 1:3, .options.azure = list(enableCloudCombine = FALSE, autoDeleteJob = FALSE)) %dopar% { # object 'results' not found, which you can see in my original post. That should behave the same way as using setAutoDeleteJob(FALSE). That being said, I tested several variations with setAutoDeleteJob(FALSE) anyway (code below). All resulted in the same error (also shown below).

Error message

==============================================================================
Id: job20181129155730
chunkSize: 1
enableCloudCombine: FALSE
errorHandling: stop
wait: TRUE
autoDeleteJob: FALSE
==============================================================================
Submitting tasks (3/3)
Waiting for tasks to complete. . .
| Progress: 100.00% (3/3) | Running: 0 | Queued: 0 | Completed: 3 | Failed: 0 |
Tasks have completed. 
Error in e$fun(obj, substitute(ex), parent.frame(), e$data) : 
  object 'results' not found
Called from: e$fun(obj, substitute(ex), parent.frame(), e$data)

Sample code

library(doAzureParallel)
setVerbose(TRUE)
setAutoDeleteJob(FALSE)

setCredentials(file.path(getwd(), "credentials.json"))
cluster <- makeCluster(file.path(getwd(), "cluster.json"), fullName=TRUE)
registerDoAzureParallel(cluster)
getDoParWorkers()

# my_results <- foreach(t = 1:3, .options.azure = list(enableCloudCombine = FALSE, autoDeleteJob = FALSE)) %dopar% { # object 'results' not found
# my_results <- foreach(t = 1:3, .options.azure = list(enableCloudCombine = FALSE)) %dopar% { # object 'results' not found
# my_results <- foreach(t = 1:3, .combine = 'rbind', .options.azure = list(enableCloudCombine = FALSE)) %dopar% { # object 'results' not found
my_results <- foreach(t = 1:3, .combine = 'rbind', .options.azure = list(enableCloudCombine = FALSE, autoDeleteJob = FALSE)) %dopar% { # object 'results' not found

  my_results_df <- data.frame("x" = runif(2), "trial" = replicate(2, t))
  my_results_list <- runif(3)
  return(my_results_df)
}

@brnleehng
Collaborator

Hi @ericchansen

If you remove the enableCloudCombine flag, you will get your results. The object 'results' not found error occurs because no file is found on Azure Storage containing the merged result (the RDS file holding all the tasks' results, since enableCloudCombine is disabled). I will add better error handling for this case.

Below: This example works

my_results <- foreach(t = 1:3, .combine = 'rbind') %dopar% {
  my_results_df <- data.frame("x" = runif(2), "trial" = replicate(2, t))
  my_results_list <- runif(3)
  return(my_results_df)
}

my_results

@ericchansen
Author

@brnleehng Yep, that's what I've been doing.

I suppose I don't understand the use case for enableCloudCombine = FALSE. How should we be using this option?

The documentation doesn't have any clear examples beyond what's mentioned here. Looking at that example, I'd expect it to trigger this error as well.

@Pullarg

Pullarg commented Nov 30, 2018

@brnleehng I need to return a list rather than bind rows. Is there any way to skip the merge step entirely? I am getting the same failure. I would like to perform the merge on the head node (non-cloud side) by reading from the storage account directly.

Update:
Dumb question; I just needed to turn the result back into a list. After getting the rbind result, I can convert from a data.frame to a list using split(rbind.df, seq(nrow(rbind.df))).

@brnleehng
Collaborator

Hi @ericchansen

The use case for enableCloudCombine = FALSE is to avoid merging all your results onto one VM while the other VMs sit idle (unless you are using autoscale). There are also cases where your tasks produce so many large files that the merge task runs out of memory, causing your job to fail.

Hi @Pullarg,
You should use the getJobResult function to download all the results locally; it will merge them into a list for you.

> getJobResult("job20181205211937")
Getting job results...
enableCloudCombine is set to FALSE, we will merge job result locally
[[1]]
[1] 2

[[2]]
[1] 3

[[3]]
[1] 4
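If each task returns a data frame, the list that getJobResult hands back can then be combined locally. A minimal base-R sketch of that local merge (the simulated list shape is illustrative):

```r
# getJobResult returns one list element per task; simulate that shape here.
results <- list(data.frame(x = 1), data.frame(x = 2), data.frame(x = 3))

# Row-bind the per-task data frames into one, as .combine = 'rbind' would.
merged <- do.call(rbind, results)
```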

Thanks,
Brian

@angusrtaylor
Contributor

I'm getting the same error even with enableCloudCombine = FALSE. In my code, I am not returning any results from the %dopar% block. Instead, I am just writing my result dataframe to disk. My code runs correctly but the error still appears. Is there a way to avoid this error when the code intentionally does not return a result?

@brnleehng
Collaborator

Can you add NULL at the end of the %dopar% block?

I'm looking into fixing the enableCloudCombine = FALSE path.
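The NULL suggestion can be simulated locally in base R: each "task" writes its output to disk and ends with an explicit NULL, so nothing is expected back from the combine step (paths and names here are illustrative):

```r
out_dir <- tempdir()

# Each "task" saves its result and returns NULL explicitly, so the
# combine step never looks for a merged 'results' object.
results <- lapply(1:3, function(t) {
  result_df <- data.frame(x = runif(2), trial = rep(t, 2))
  saveRDS(result_df, file.path(out_dir, paste0("result_", t, ".rds")))
  NULL
})
```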
