Clarification about Block Size and Estimating Memory Usage #704
YasirKusay
started this conversation in
General
Replies: 1 comment
-
It does not mean that this much memory will always be used.
Yes, that makes sense with a bit of memory left for tolerance (it is not a hard upper limit).
-
Hello,
According to your paper, -c refers to the number of chunks the seed index is divided into during processing, and -b refers to the maximum number of index/query sequence letters to load and compare at a time. You give an estimate of the total memory usage as 2(B + 8 × B/C + const).
I have a few questions regarding this. I have a 495 MB FASTA file with 2,243,452 sequences that I aligned against a 98 GB index. When using the parameters -b 6 and -c 1, with 60 GB of memory, the program (as expected) ran out of memory, because 2(6e9 + 8 × 6e9/1) = 108e9 bytes = 108 GB.
However, I have run other alignments with smaller query files (against the same index database, with the same parameters) without any issues, so I am confused about how the block size actually works. Could you explain --block-size in more detail? For example, I would like to know how it handles loading both the query and the index.
Also, as a general rule, should I specify the memory for my job as calculated via the 2(B + 8B/C) formula?
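For reference, here is a small sketch of the estimate I am using. The formula is the one quoted from the paper; the function name, the GB units, and the zero default for the constant term are my own assumptions:

```python
def estimate_memory_gb(block_size_gb: float, index_chunks: int,
                       const_gb: float = 0.0) -> float:
    """Rough peak-memory estimate from the paper's formula 2(B + 8*B/C + const).

    block_size_gb -- the -b value (block size, in billions of letters ~ GB)
    index_chunks  -- the -c value (number of index chunks)
    const_gb      -- constant overhead term; unknown here, assumed 0
    """
    return 2 * (block_size_gb + 8 * block_size_gb / index_chunks + const_gb)

# My failing run: -b 6 -c 1 gives 2 * (6 + 48) = 108 GB, well over my 60 GB job.
print(estimate_memory_gb(6, 1))   # 108.0
# A larger -c shrinks the index term, e.g. -b 6 -c 4: 2 * (6 + 12) = 36 GB.
print(estimate_memory_gb(6, 4))   # 36.0
```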