Can't use multiple threads via -t #171

Open
ChungLamYu opened this issue Nov 17, 2023 · 3 comments

@ChungLamYu
Screenshot 2023-11-17 143034

I use the -t 8 option when running ragtag.py correct, but it seems to use only one CPU core. Is there an error in my command?

My command: ragtag.py correct heterodera_glycines.PRJNA381081.WBPS18.genomic.fa Sk43r400.contig -t 8

@mmpust
mmpust commented Dec 12, 2023

Hey, did you solve this? I am having the same issue. Thanks!

@taprs
taprs commented Jan 11, 2024

Hi! Maybe I can comment on this, although I have a similar issue myself.

ragtag.py correct -h says that the -t parameter is only passed to minimap2. In the log fragment shared by @ChungLamYu, minimap2 was not invoked because its output already existed from an earlier run, so multithreading brought no benefit.
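
A quick way to check whether this applies to your run is to look for the alignment file from a previous run. This is only a sketch: the file name ragtag.correct.asm.paf is taken from the log further down in this comment, while the function name paf_status and the directory you pass to it are my own placeholders.

```shell
# Hypothetical helper: report whether a whole-genome alignment from an
# earlier run is already present in a RagTag output directory.
# $1: the output directory (the value given to -o; an assumption on my side)
paf_status() {
    if [ -s "$1/ragtag.correct.asm.paf" ]; then
        # A non-empty PAF exists, so minimap2 is probably skipped and -t has no visible effect.
        echo "present"
    else
        # No usable PAF, so minimap2 should be run and -t should matter.
        echo "absent"
    fi
}

# Example call; "workdir" matches the -o value used in my log.
paf_status workdir
```

If it prints "present", moving that file aside before rerunning should make minimap2 run again, as far as I can tell.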

In my case, though, when running ragtag.py correct with read files provided, the second minimap2 run, which maps the reads, seems to ignore the -t parameter too:

+ ragtag.py correct --mm2-params '-x asm20 -t64' -R reads.fastq.gz -T corr -o workdir --gff annotation.gff3 reference.fa draft.fa
Wed Jan 10 18:29:20 2024 --- VERSION: RagTag v2.1.0
Wed Jan 10 18:29:20 2024 --- CMD: ragtag.py correct --mm2-params -x asm20 -t64 -R reads.fastq.gz -T corr -o workdir --gff annotation.gff3 reference.fa draft.fa
Wed Jan 10 18:29:20 2024 --- WARNING: Without '-u' invoked, some component/object AGP pairs might share the same ID. Some external programs/databases don't like this. To ensure valid AGP format, use '-u'.
Wed Jan 10 18:29:20 2024 --- INFO: Mapping the query genome to the draft genome
Wed Jan 10 18:29:20 2024 --- INFO: Running: minimap2 -x asm20 -t64 reference.fa draft.fa > workdir/ragtag.correct.asm.paf 2> workdir/ragtag.correct.asm.paf.log
Wed Jan 10 18:38:17 2024 --- INFO: Finished running : minimap2 -x asm20 -t64 reference.fa draft.fa > workdir/ragtag.correct.asm.paf 2> workdir/ragtag.correct.asm.paf.log
Wed Jan 10 18:38:17 2024 --- INFO: Reading whole genome alignments
Wed Jan 10 18:38:19 2024 --- INFO: Filtering and merging alignments
Wed Jan 10 18:38:22 2024 --- INFO: Validating putative query breakpoints via read alignment
Wed Jan 10 18:38:22 2024 --- INFO: Aligning reads to query sequences
Wed Jan 10 18:38:22 2024 --- INFO: Running: minimap2 -ax asm20 -t 1 draft.fa reads.fastq.gz > workdir/ragtag.correct.reads.sam 2> workdir/ragtag.correct.reads.sam.log

How would I use many threads for the second minimap2 run? I also noticed that the second run respects my -x asm20 setting, which I don't want. Can I set two different sets of parameters for the two mappings in a read-guided ragtag.py correct run? Another way to solve this would be to accept a custom BAM file as an alternative to having ragtag.py correct map the reads itself.

@taprs
taprs commented Jan 12, 2024

Update: multithreading seems to work better if I provide the -t 64 argument outside of the --mm2-params string. The log then shows the first minimap2 run without an explicit -t, but I guess it uses many cores because it finished much faster. The read mapping then explicitly has -t 64.
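
For reference, the adjusted invocation looks roughly like the sketch below. The file names are the same as in my earlier log, and the variables threads, mm2_params and cmd are only there for illustration. It prints the command as a dry run rather than executing it.

```shell
# Thread count goes to RagTag's own -t option, not into --mm2-params.
threads=64
# Only the minimap2 preset stays inside --mm2-params.
mm2_params='-x asm20'

# Assemble the command line as a string so it can be inspected first.
cmd="ragtag.py correct --mm2-params '$mm2_params' -t $threads -R reads.fastq.gz -T corr -o workdir --gff annotation.gff3 reference.fa draft.fa"

# Dry run: show the command instead of running it.
echo "$cmd"
# eval "$cmd"   # uncomment to actually run it
```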

Much better now, but I still don't feel that I can fully control the parameters of the two mapping routines, so it would be cool to improve that!
