`pml()` segfault: memory not mapped #144

iferres · 2023-01-18T15:03:29Z

Hi, I'm having this issue with pml():

library(magrittr)
library(phangorn)

tree <- readRDS("tree.RDS")
dat <- readRDS("phydat.RDS") 

pml <- phangorn::pml(tree, data = dat, k = 8)    

 *** caught segfault ***  
address 0x8cc5f80, cause 'memory not mapped'

Traceback:
 1: pml.fit(tree, data, bf, shape = shape, k = k, Q = Q, levels = attr(data,     "levels"), inv = inv, rate = rate, g = g, w = w, eig = eig,     INV = INV, ll.0 = ll.0, llMix = llMix, wMix = wMix, site = TRUE,     ASC = ASC) 
 2: phangorn::pml(tree, data = dat, k = 8)

I'm sending you the files through WeTransfer to reproduce the error (https://we.tl/t-SrejR8GFvG). The phydat.RDS file is quite big (about 41 MB). If I subset it before computing pml, the error disappears.

sessionInfo()            
R version 4.2.2 Patched (2022-11-10 r83330)           
Platform: x86_64-pc-linux-gnu (64-bit)                           
Running under: Debian GNU/Linux bookworm/sid                               
                                      
Matrix products: default                 
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.21.so   
    
locale:                                                                            
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C            
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C    
 [9] LC_ADDRESS=C               LC_TELEPHONE=C   
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C 

attached base packages:   
[1] stats     graphics  grDevices utils     datasets  methods   base 
   
other attached packages:        
[1] phangorn_2.10.0 ape_5.6-2       magrittr_2.0.3   
 
loaded via a namespace (and not attached):   
 [1] Rcpp_1.0.9       quadprog_1.5-8   lattice_0.20-45  codetools_0.2-18  
 [5] grid_4.2.2       nlme_3.1-160     rlang_1.0.6      cli_3.4.1 
 [9] Matrix_1.5-1     generics_0.1.3   fastmatch_1.1-3  igraph_1.3.5  
[13] parallel_4.2.2   compiler_4.2.2   pkgconfig_2.0.3


KlausVigo · 2023-01-18T19:10:08Z

Dear @iferres,

With this data set, the problem is that it runs out of memory. More precisely, it cannot allocate enough memory.
Your data are pretty big: dat takes about 400 MB in memory.

> dat
664 sequences with 381968 character and 153762 different site patterns.
The states are a c g t
> object.size(dat)
410636904 bytes
> 153762 * 664 * 4
[1] 408391872

Here 153762 is the number of site patterns, 664 is the number of sequences, and 4 is the number of bytes per integer.
However, to run pml you need to allocate much more memory (~26 GB) in your case:

153762 * 664 * 8 * 4  *  8
26137079808

Here 153762 is the number of site patterns, 664 the number of sequences, 8 the number of rate classes, 4 the number of states, and 8 the number of bytes per double. So maybe iqtree or RAxML can handle your data set.
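As a rough illustration, the arithmetic above can be wrapped in a small helper to try other values of k or other data sizes. This is only a sketch: `estimate_pml_bytes()` is a hypothetical name, not a phangorn function, and it assumes the storage scales exactly as patterns x sequences x rate classes x states x bytes per double, as in the estimate above.

```r
# Hypothetical helper (NOT part of phangorn): rough size in bytes of the
# likelihood storage, following the back-of-the-envelope estimate above.
estimate_pml_bytes <- function(n_patterns, n_seqs, k = 1, n_states = 4,
                               bytes_per_value = 8) {
  # as.numeric() forces double arithmetic, so integer inputs
  # (e.g. 153762L) cannot overflow in the product
  as.numeric(n_patterns) * n_seqs * k * n_states * bytes_per_value
}

# Values reported in this thread: 153762 patterns, 664 sequences, k = 8
bytes <- estimate_pml_bytes(153762, 664, k = 8)
format(bytes, scientific = FALSE)  # "26137079808"

# The same estimate in decimal gigabytes
bytes / 1e9
```

Under the same assumption, the estimate is linear in k, so halving the number of rate classes halves the figure. Whether that is enough for pml() to run on this data set is not something this sketch can tell.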

Maybe we can discuss offline and brainstorm how to handle such data sets.
While I should one day allow longer vectors, this is not a trivial change.
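One way to read the remark about longer vectors, sketched under a loud assumption: if all of these values had to be addressed through a single ordinary (non-long) vector, the number of elements, not just the number of bytes, would matter. The check below is plain arithmetic on the figures from this thread; it does not inspect how phangorn actually allocates memory.

```r
# ASSUMPTION: all values would sit in one vector indexed by a 32-bit integer.
# Element count: site patterns x sequences x rate classes x states
n_values <- as.numeric(153762) * 664 * 8 * 4

# Largest length of a non-long vector in R (2^31 - 1)
limit <- .Machine$integer.max

n_values > limit   # TRUE: 3267134976 elements exceed 2147483647

# Largest k for which the product stays below the limit,
# again only under the single-vector assumption
max_k <- floor(limit / (as.numeric(153762) * 664 * 4))
max_k              # 5
```

If that assumption holds, adding RAM alone would not help, because the limit is on the index range rather than on available memory. This is a guess at the mechanism, not a confirmed diagnosis.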

Kind regards,
Klaus

iferres · 2023-01-18T20:45:02Z

Thank you very much for your quick response, Klaus!

I see. However, I ran it on my desktop, which has 64 GB of RAM, and then on a server with 1 TB of RAM. It fails on both of them. (The funny thing is that this is actually a 1/5 subset of my real dataset, which is a core genome concatenated alignment of 664 organisms and about 1000 genes 😅 ).

I'm a user-level phylogeneticist, so I'm not sure I could help with this, but I'm open to trying to improve it.

Regards,
Ignacio

lemmonquiche mentioned this issue Jun 22, 2023

pml segfault: memory not mapped #149

Open
