Parameter choice for big, repetitive genomes #230

Open
sighe opened this issue Feb 14, 2021 · 1 comment

sighe commented Feb 14, 2021

Thank you for developing a fast, high-quality assembler and for following up on the issues here. We are currently working on animal genomes of >4 Gb with a high abundance (>60%) of repetitive elements.

Here is our experience with our latest species, which has a 4.7 Gb genome sequenced with PacBio CLR reads. Following issue #218, we tried the parameter sets '-l 6000 -m 200' and '-l 6000 -m 600' in addition to the default. The results did not differ much, but both runs with '-l 6000' produced a total assembly size 0.2 Gb larger than the default, probably as expected.

Do you have any recommendations on parameter settings for such large, repetitive genomes? Should we use '-R -s' and '--aln-dovetail -1', as you suggested in #218? In particular, is there a recommended value for '-s'?

ruanjue (Owner) commented Feb 16, 2021

Add -R, and try --aln-dovetail -1 or --aln-dovetail 1024, together with -l 6000. To avoid recomputing the alignments, you can load them quickly with --load-alignments, then increase -s and -l when building the assembly graph.
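
For readers who want to try this, the two-pass workflow might be sketched as below. This is only an illustration under stated assumptions, not a tested recipe: the read file name, thread count, output prefixes, and the `sq` preset are hypothetical; the `-s` and second-pass `-l` values are placeholders, since no specific values are recommended in this thread; and the first pass is assumed to write its alignments to `pass1.alignments.gz`. The script only prints the commands and does not run wtdbg2.

```shell
#!/bin/sh
# Dry run: builds and prints two wtdbg2 command lines; nothing is executed.
READS="reads.fa.gz"   # hypothetical input file
THREADS=32            # hypothetical thread count
PRESET="sq"           # assumption: Sequel CLR; the thread does not name the instrument
GSIZE="4.7g"          # genome size taken from the question
S_VALUE="0.1"         # placeholder: no -s value is recommended in the thread
L_VALUE="8000"        # placeholder: some value above the first pass's -l 6000

# Pass 1: full alignment with the suggested options (-R, --aln-dovetail -1, -l 6000).
CMD1="wtdbg2 -x $PRESET -g $GSIZE -t $THREADS -i $READS -R --aln-dovetail -1 -l 6000 -fo pass1"

# Pass 2: reload the stored alignments, then raise -s and -l for graph building.
# Assumption: pass 1 wrote its alignments to pass1.alignments.gz.
CMD2="wtdbg2 -x $PRESET -g $GSIZE -t $THREADS -i $READS -R --aln-dovetail -1 --load-alignments pass1.alignments.gz -s $S_VALUE -l $L_VALUE -fo pass2"

printf '%s\n' "$CMD1" "$CMD2"
```

Because the second pass skips the alignment step, several `-s` and `-l` combinations can be compared at much lower cost than rerunning the whole assembly each time.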
