
High RAM usage when using meta.retrieval prevents usage on low-end machines #40

barrel0luck opened this issue May 29, 2019 · 4 comments


@barrel0luck

The severity of this issue depends on the number of genomes/CDS/RNA files to be downloaded.
As the process goes on, the amount of RAM used by the R session progressively increases, to the point where the entire system slows down.
On a system with 8 GB of RAM (a low-end system), I've managed to download ~1600 files successfully, but I need to restart the system once the process is done because everything is extremely slow afterwards. So far I haven't managed to download more than that, as the system becomes unresponsive.
I suspect some variable grows in size as the process goes on and could (maybe) easily be cleaned up after each download to reduce RAM usage.
Also note that the RAM usage intermittently decreases, i.e., it isn't monotonically increasing, but over a long period it grows substantially, eventually overwhelming the system.
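
One hedged way to confirm this, using only base R (this monitoring helper is illustrative, not part of the original report): log the session's memory footprint between retrieval steps. gc() both triggers garbage collection and returns a matrix whose second column reports megabytes in use.

# Minimal monitoring sketch (illustrative, base R only):
log_mem <- function(label) {
  used_mb <- sum(gc()[, 2])  # total Mb used by Ncells + Vcells
  message(sprintf("[%s] R session using ~%.0f Mb", label, used_mb))
}

log_mem("before retrieval")
# ... run one retrieval step here ...
log_mem("after retrieval")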

@HajkD

HajkD commented May 30, 2019

Hi @barrel0luck,

Many thanks for contacting me and for making me aware of this issue.

Would you mind sharing a small example where this occurs? This will make my life much easier when troubleshooting.

Your help is very much appreciated.

Many thanks,
Hajk

@barrel0luck

Sure! And thanks for developing this awesome package! I hope you can maintain it for a long time!

Here's the code you can use to reproduce the issue on a low-end system (no biggie):
This should download ~1600 files:

library(biomartr)  # provides meta.retrieval() and clean.retrieval()
library(magrittr)  # provides the %>% pipe

meta.retrieval(kingdom = "bacteria", group = "Gammaproteobacteria", db = "refseq", type = "rna", reference = FALSE) %>%
  clean.retrieval()

This should download a much larger number of files (I'm not sure exactly how many; it has failed for me so far):

meta.retrieval(kingdom = "bacteria", group = "Gammaproteobacteria", db = "genbank", type = "rna", reference = FALSE) %>%
  clean.retrieval()
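
A possible workaround sketch while this is investigated (not from the thread, so treat it as an assumption): fetch one organism at a time with biomartr's getRNA() and force garbage collection between downloads. Here `organisms` is assumed to be a character vector of species names you have prepared in advance.

library(biomartr)

for (org in organisms) {
  # download the RNA sequences for a single organism into one folder
  getRNA(db = "refseq", organism = org, path = "refseq_rna")
  # release memory held by the finished iteration before starting the next
  gc()
}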

@HajkD

HajkD commented May 31, 2019

Perfect! Thank you so much :-)

I will have a look at it now.

Cheers,
Hajk

@barrel0luck

barrel0luck commented May 31, 2019

I should note that I'm using R on Fedora Linux. However, if the issue is in the code itself, e.g., one or more variables growing with each iteration of a loop, then it should be reproducible on other OSes as well...
I think the problem arises because the R session loads and keeps everything in RAM.
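
If the retained memory really is held by the R session itself, one hedged workaround (not discussed in this thread; it assumes the callr package is available) is to run each retrieval in a throwaway subprocess, so the operating system reclaims all of its memory as soon as the subprocess exits.

library(callr)

# Hypothetical sketch: run meta.retrieval() in a fresh R subprocess
# via callr::r(); `group` is passed through to biomartr.
callr::r(function(group) {
  biomartr::meta.retrieval(
    kingdom = "bacteria",
    group = group,
    db = "refseq",
    type = "rna",
    reference = FALSE
  )
}, args = list(group = "Gammaproteobacteria"))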
