segfault in R3.4.3 on Mac #23

Open
vnijs opened this issue Feb 26, 2018 · 8 comments

vnijs commented Feb 26, 2018

@JackStat FYI: Running caret with metric = "ROC" consistently produces a segfault on macOS Sierra with R 3.4.3. The segfault message and example code are shown below. I have been able to reproduce it on multiple datasets and with several estimation algorithms. The traceback points to ModelMetrics_auc_. The example runs fine if I supply my own function to calculate AUC.

I hope you'll be able to take a look at this. Please let me know if you need more information. Thanks.

cc-ing @topepo

 *** caught segfault ***
address 0x18, cause 'memory not mapped'

Traceback:
 1: .Call("ModelMetrics_auc_", PACKAGE = "ModelMetrics", actual,     predicted, ranks)
 2: auc_(actual, predicted, ranks)
 3: ModelMetrics::auc(ifelse(data$obs == lev[2], 0, 1), data[, lvls[1]])
 4: ctrl$summaryFunction(testOutput, lev, method)
 5: evalSummaryFunction(y, wts = weights, ctrl = trControl, lev = classLevels,     metric = metric, method = method)
 6: train.default(x, y, weights = w, ...)
 7: train(x, y, weights = w, ...)
 8: train.formula(vs ~ ., data = dat, method = "ranger", trControl = ctrl,     tuneGrid = grid, metric = "ROC", verbose = FALSE)
 9: train(vs ~ ., data = dat, method = "ranger", trControl = ctrl,     tuneGrid = grid, metric = "ROC", verbose = FALSE)

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: 

Example code:

## loading libraries
library(ranger)
library(caret)

dat <- mtcars
dat$vs <- factor(ifelse(dat$vs == 1, "yes", "no"))
sapply(dat, class)

ranger(
  vs ~ ., 
  data = dat, 
  probability = TRUE, 
  num.trees = 50, 
  mtry = 3
)

set.seed(1234)
grid <- expand.grid(mtry = 3:4, splitrule = "gini", min.node.size = 1)
ctrl <- trainControl(
  method = "cv", 
  number = 5, 
  classProbs = TRUE, 
  summaryFunction = twoClassSummary, 
  verboseIter = TRUE
)

result <- train(
  vs ~ ., 
  data = dat,
  method = "ranger", 
  trControl = ctrl, 
  tuneGrid = grid, 
  metric = "ROC", 
  verbose = FALSE
)
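
The workaround mentioned in the report (supplying a custom AUC function instead of relying on ModelMetrics) might look roughly like the sketch below. It is not the code used in the original report: `auc_rank` and `rocSummaryR` are hypothetical names, and the rank-based (Mann-Whitney) formula is just one common way to compute AUC in plain R. It assumes, as in `twoClassSummary`, that the first factor level is the event of interest.

```r
# Hypothetical helper: AUC from the rank-sum (Mann-Whitney) statistic, base R only.
# actual: 0/1 vector (1 = event); predicted: scores where higher means "more likely event".
auc_rank <- function(actual, predicted) {
  n_pos <- as.numeric(sum(actual == 1))
  n_neg <- as.numeric(sum(actual == 0))
  if (n_pos == 0 || n_neg == 0) return(NA_real_)  # AUC is undefined with a single class
  r <- rank(predicted)  # ties receive average ranks by default
  (sum(r[actual == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hypothetical summary function with the signature caret expects:
# `data` has columns `obs`, `pred`, and one probability column per class level.
rocSummaryR <- function(data, lev = NULL, model = NULL) {
  actual <- as.integer(data$obs == lev[1])  # assumption: first level is the event
  c(ROC = auc_rank(actual, data[[lev[1]]]))
}
```

It would be passed as `summaryFunction = rocSummaryR` in `trainControl()`, keeping `classProbs = TRUE` and `metric = "ROC"`.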

Session Info:

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] caret_6.0-78    ggplot2_2.2.1   lattice_0.20-35 ranger_0.9.0   

loaded via a namespace (and not attached):
 [1] tidyselect_0.2.3   purrr_0.2.4        reshape2_1.4.3     kernlab_0.9-25     splines_3.4.3      colorspace_1.3-2  
 [7] stats4_3.4.3       yaml_2.1.16        survival_2.41-3    prodlim_1.6.1      rlang_0.2.0.9000   ModelMetrics_1.1.0
[13] pillar_1.1.0       withr_2.1.1        foreign_0.8-69     glue_1.2.0         bindrcpp_0.2       foreach_1.4.3     
[19] bindr_0.1.0.9000   plyr_1.8.4         dimRed_0.1.0       lava_1.6           robustbase_0.92-8  stringr_1.3.0     
[25] timeDate_3042.101  munsell_0.4.3      gtable_0.2.0       recipes_0.1.2      codetools_0.2-15   psych_1.7.8       
[31] parallel_3.4.3     class_7.3-14       DEoptimR_1.0-8     broom_0.4.3        Rcpp_0.12.15       scales_0.5.0      
[37] ipred_0.9-6        CVST_0.2-1         mnormt_1.5-5       stringi_1.1.6      dplyr_0.7.4        RcppRoll_0.2.2    
[43] ddalpha_1.3.1.1    grid_3.4.3         tools_3.4.3        magrittr_1.5       lazyeval_0.2.1     tibble_1.4.2      
[49] tidyr_0.8.0        DRR_0.0.3          pkgconfig_2.0.1    MASS_7.3-48        Matrix_1.2-12      lubridate_1.7.2   
[55] gower_0.1.2        assertthat_0.2.0   iterators_1.0.8    R6_2.2.2           rpart_4.1-12       sfsmisc_1.1-0     
[61] nnet_7.3-12        nlme_3.1-131       compiler_3.4.3    

Owner

JackStat commented Mar 7, 2018

Strange. I was not able to reproduce the error on High Sierra.

> ## loading libraries
> library(ranger)
> library(caret)
Loading required package: lattice
Loading required package: ggplot2
> 
> dat <- mtcars
> dat$vs <- factor(ifelse(dat$vs == 1, "yes", "no"))
> sapply(dat, class)
      mpg       cyl      disp        hp      drat        wt      qsec        vs        am 
"numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"  "factor" "numeric" 
     gear      carb 
"numeric" "numeric" 
> 
> ranger(
+     vs ~ ., 
+     data = dat, 
+     probability = TRUE, 
+     num.trees = 50, 
+     mtry = 3
+ )
Ranger result

Call:
 ranger(vs ~ ., data = dat, probability = TRUE, num.trees = 50,      mtry = 3) 

Type:                             Probability estimation 
Number of trees:                  50 
Sample size:                      32 
Number of independent variables:  10 
Mtry:                             3 
Target node size:                 10 
Variable importance mode:         none 
OOB prediction error:             0.07048678 
> 
> set.seed(1234)
> grid <- expand.grid(mtry = 3:4, splitrule = "gini", min.node.size = 1)
> ctrl <- trainControl(
+     method = "cv", 
+     number = 5, 
+     classProbs = TRUE, 
+     summaryFunction = twoClassSummary, 
+     verboseIter = TRUE
+ )
> 
> result <- train(
+     vs ~ ., 
+     data = dat,
+     method = "ranger", 
+     trControl = ctrl, 
+     tuneGrid = grid, 
+     metric = "ROC", 
+     verbose = FALSE
+ )
+ Fold1: mtry=3, splitrule=gini, min.node.size=1 
- Fold1: mtry=3, splitrule=gini, min.node.size=1 
+ Fold1: mtry=4, splitrule=gini, min.node.size=1 
- Fold1: mtry=4, splitrule=gini, min.node.size=1 
+ Fold2: mtry=3, splitrule=gini, min.node.size=1 
- Fold2: mtry=3, splitrule=gini, min.node.size=1 
+ Fold2: mtry=4, splitrule=gini, min.node.size=1 
- Fold2: mtry=4, splitrule=gini, min.node.size=1 
+ Fold3: mtry=3, splitrule=gini, min.node.size=1 
- Fold3: mtry=3, splitrule=gini, min.node.size=1 
+ Fold3: mtry=4, splitrule=gini, min.node.size=1 
- Fold3: mtry=4, splitrule=gini, min.node.size=1 
+ Fold4: mtry=3, splitrule=gini, min.node.size=1 
- Fold4: mtry=3, splitrule=gini, min.node.size=1 
+ Fold4: mtry=4, splitrule=gini, min.node.size=1 
- Fold4: mtry=4, splitrule=gini, min.node.size=1 
+ Fold5: mtry=3, splitrule=gini, min.node.size=1 
- Fold5: mtry=3, splitrule=gini, min.node.size=1 
+ Fold5: mtry=4, splitrule=gini, min.node.size=1 
- Fold5: mtry=4, splitrule=gini, min.node.size=1 
Aggregating results
Selecting tuning parameters
Fitting mtry = 3, splitrule = gini, min.node.size = 1 on full training set

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.3

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] caret_6.0-78    ggplot2_2.2.1   lattice_0.20-35 ranger_0.9.0   

loaded via a namespace (and not attached):
 [1] tidyselect_0.2.4      purrr_0.2.4           reshape2_1.4.3        kernlab_0.9-25       
 [5] splines_3.4.3         colorspace_1.3-2      stats4_3.4.3          yaml_2.1.17          
 [9] survival_2.41-3       prodlim_1.6.1         rlang_0.2.0           e1071_1.6-8          
[13] ModelMetrics_1.1.9001 pillar_1.2.1          withr_2.1.1           foreign_0.8-69       
[17] glue_1.2.0            bindrcpp_0.2          foreach_1.4.4         bindr_0.1            
[21] plyr_1.8.4            dimRed_0.1.0          lava_1.6              robustbase_0.92-8    
[25] stringr_1.3.0         timeDate_3043.102     munsell_0.4.3         gtable_0.2.0         
[29] recipes_0.1.2         codetools_0.2-15      psych_1.7.8           parallel_3.4.3       
[33] class_7.3-14          DEoptimR_1.0-8        broom_0.4.3           Rcpp_0.12.15         
[37] scales_0.5.0          ipred_0.9-6           CVST_0.2-1            mnormt_1.5-5         
[41] stringi_1.1.6         dplyr_0.7.4           RcppRoll_0.2.2        ddalpha_1.3.1.1      
[45] grid_3.4.3            tools_3.4.3           magrittr_1.5          lazyeval_0.2.1       
[49] tibble_1.4.2          tidyr_0.8.0           DRR_0.0.3             pkgconfig_2.0.1      
[53] MASS_7.3-47           Matrix_1.2-12         data.table_1.10.4-3   lubridate_1.7.3      
[57] gower_0.1.2           assertthat_0.2.0      iterators_1.0.9       R6_2.2.2             
[61] rpart_4.1-11          sfsmisc_1.1-2         nnet_7.3-12           nlme_3.1-131         
[65] compiler_3.4.3 

Owner

JackStat commented Mar 7, 2018

Are you seeing the issue on any other operating system? Also, can you try cloning the package and building it? I am curious whether it builds without any issues.

Author

vnijs commented Mar 7, 2018

Thanks for checking @JackStat! From your sessionInfo(), however, it looks like you are running ModelMetrics_1.1.9001 rather than ModelMetrics_1.1.0 from CRAN. I tried installing the version you are using from GitHub but got the following error:

auc_.cpp:2:10: fatal error: 'omp.h' file not found
#include <omp.h>
         ^~~~~~~
1 error generated.
make: *** [auc_.o] Error 1
ERROR: compilation failed for package ‘ModelMetrics’
* removing ‘/Library/Frameworks/R.framework/Versions/3.4/Resources/library/ModelMetrics’
* restoring previous ‘/Library/Frameworks/R.framework/Versions/3.4/Resources/library/ModelMetrics’
Installation failed: Command failed (1)
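
Because the two sessions above differ in their ModelMetrics version (1.1.0 versus 1.1.9001), it can help to confirm which build is actually installed before and after a reinstall. A small sketch using only base R and utils; `installed_version` is a hypothetical helper name, not part of any package:

```r
# Hypothetical helper: report the installed version of a package without loading it.
installed_version <- function(pkg) {
  # system.file() returns "" when the package is not installed
  if (!nzchar(system.file(package = pkg))) return(NA_character_)
  as.character(utils::packageVersion(pkg))
}

installed_version("ModelMetrics")
```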

Owner

JackStat commented Mar 7, 2018

OK, this likely sounds scarier than it is. Can you follow this guide and try again?

https://thecoatlessprofessor.com/programming/openmp-in-r-on-os-x/#after-3-4-0

Author

vnijs commented Mar 7, 2018

I installed clang 4.0.0 and was able to build the package from GitHub and run the code successfully. I already had gfortran 6.3 installed. Interestingly, ModelMetrics 1.1.0 from CRAN now also works.

Is it feasible to use ModelMetrics (and caret) without installing clang and gfortran? Thanks!

Owner

JackStat commented Mar 9, 2018

I will do some research on this and see if there is a viable alternative to OpenMP. Threading is part of the standard library as of C++11 (std::thread), but I need some time to figure that out.

Author

vnijs commented Mar 11, 2018

Thanks @JackStat. Would it be possible to have ModelMetrics fall back to an alternative calculation method (e.g., regular R code) if OpenMP is not available?

jyoungs commented Nov 2, 2018

Maybe obvious, but others who land here may just need to install OpenMP (brew install libomp).
