mem_mb for Clumpify rule #6
I haven't had this issue yet. Do you think it could be related to the cluster execution? For my latest runs, setting the Java VM memory to the combined size of all input files worked fine.
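Roughly, that looks like this in the rule; the file paths, the 1000 MB floor, and the output layout below are placeholders rather than values from this workflow:

```python
rule clumpify:
    input:
        r1="reads/{sample}_R1.fastq.gz",
        r2="reads/{sample}_R2.fastq.gz",
    output:
        r1="clumped/{sample}_R1.fastq.gz",
        r2="clumped/{sample}_R2.fastq.gz",
    resources:
        # input.size_mb is the combined size of all input files in MB;
        # the 1000 MB floor is a placeholder, not a value from this workflow
        mem_mb=lambda wildcards, input: max(int(input.size_mb), 1000),
    shell:
        "clumpify.sh -Xmx{resources.mem_mb}m "
        "in={input.r1} in2={input.r2} "
        "out={output.r1} out2={output.r2} subs=2"
```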
I can't think of a reason it would be specific to the cluster, except that on the cluster memory usage is enforced. If you run everything on one machine it isn't (unless you use all of that machine's memory), so if the jobs aren't all running at the same time, maybe you just get away with it? Have you checked how much memory these jobs use in your case?
Perhaps it could be related to this: https://www.mail-archive.com/slurm-dev@schedmd.com/msg09340.html
Did a few more tests today on these files:
I first requested an interactive session, and then ran Clumpify in a Singularity container as follows:
The output of this was:
So in this case GNU time says that the max RSS is 4.8 GB. After this run, SLURM seems to agree that the max RSS was 4.8 GB, so I don't think the problem is that SLURM is incorrectly measuring the memory used.
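For anyone reproducing this kind of check, the measurement pattern was roughly the following; the image name, file names, and flags below are placeholders, not the exact command used above:

```bash
# Measure the peak memory of Clumpify inside the container with GNU time;
# the image name, file names, and flags are placeholders.
/usr/bin/time -v \
    singularity exec bbmap.sif \
    clumpify.sh in=reads_R1.fastq.gz in2=reads_R2.fastq.gz \
        out=clumped_R1.fastq.gz out2=clumped_R2.fastq.gz subs=2
# "Maximum resident set size (kbytes)" in the GNU time report is the max RSS,
# which should roughly match what SLURM accounting reports afterwards.
```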
Trying with some different files, I ran this command:
I immediately get this output:
Clumpify has already allocated itself 65413MB, which seems like a lot, but if I try to allocate more by adding the argument
This runs ok:
So in this case Clumpify crashed if I gave it less than 70 GB, but then it only had a max RSS of 17.8 GB. And afterwards, when I checked the memory usage, SLURM agreed that the max RSS was 17.8 GB.
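As an aside on the allocation itself: the BBTools wrapper scripts accept an -Xmx argument and pass it through to the JVM, so the heap ceiling can be set explicitly rather than letting Clumpify auto-detect it. A minimal example (the file names are placeholders):

```bash
# Cap the JVM heap explicitly instead of letting clumpify.sh auto-detect it;
# the file names are placeholders.
clumpify.sh -Xmx70g \
    in=reads_R1.fastq.gz in2=reads_R2.fastq.gz \
    out=clumped_R1.fastq.gz out2=clumped_R2.fastq.gz subs=2
```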
Clumpify seems to sometimes use a ton of memory (usually I use subs=2). For example, when I have input files like so:
and I run the clumpify rule on a cluster and use sacct to check how the job went, I see this:

This particular job didn't even finish, but it had already reached a MaxVMSize of ~33 GB and a MaxRSS of ~18 GB.
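For reference, the sort of sacct query that produces those columns (the job ID is a placeholder):

```bash
# Ask SLURM accounting for the peak memory of a job; 12345678 is a placeholder ID.
sacct -j 12345678 --format=JobID,JobName,Elapsed,MaxRSS,MaxVMSize,State
```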
Sometimes I also see errors in the clumpify rule like so:
However, when I check the sequence and quality of the read in question, they're the same length. I assume this is a memory-related issue.
I've yet to work out the best way to calculate how much memory to give these jobs. I suppose it depends on the size of the input files and the number of substitutions?
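One hedged option, rather than predicting the peak exactly, is to derive the request from the combined input size and let Snakemake escalate it on retries (recent Snakemake supports a retries directive and attempt-aware resource callables). The constants below are guesses, not tuned values:

```python
# Sketch of an attempt-aware resource callable; the 2000 MB overhead
# and the doubling-per-retry factor are guesses, not tuned values.
def clumpify_mem_mb(wildcards, input, attempt):
    base = int(input.size_mb) + 2000   # combined input size in MB plus JVM overhead
    return base * 2 ** (attempt - 1)   # double the request on each retry

# Then, inside the clumpify rule:
#     retries: 2                       # or --restart-times on older Snakemake
#     resources:
#         mem_mb=clumpify_mem_mb
```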