How do you assemble chromosomes X and Y? #625

Open
zuodabin opened this issue Mar 18, 2024 · 3 comments

Comments

@zuodabin

Dear author, how can the X and Y chromosomes be assembled? What parameters and data need to be added?

@DustinSokolowski

DustinSokolowski commented Mar 18, 2024

Hey,

Not an author, but I'm pretty familiar with genome assembly and annotation. There shouldn't be any extra parameters needed to assemble X. X chromosome contigs/scaffolds will be represented by some scaffold number in the maternal haplotype, just like any other chromosome. Sex chromosomes undergo far fewer chromosomal rearrangements, so X should be relatively easy to identify. The easiest options are:

  1. If there is a relatively closely related species with a well-assembled X (for example, if you're assembling a rodent, you can use mouse), you can align your scaffolds to that assembly and pull out whatever matches its X. Since the X chromosome signal is typically robust, I haven't ever needed to tune parameters or test specific tools for this. Honestly, minimap2 + D-GENIES to visualize the genome-wide dot plots typically does the trick for me.
  2. Annotate the genes on your scaffolds. The X chromosome scaffolds should show a huge spike of X chromosome genes (e.g., 85%+ of their genes are on the X chromosome in other species). Again, this is a super robust signal, so most popular annotation tools work very well. Personally, I found that TOGA does a good job of assigning gene symbols and identifying transcripts, so you get a nice two-for-one with genome annotation, but I've identified X chromosomes with TOGA, GeMoMa, embl annotations, Biser2, etc.
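Both options boil down to a per-scaffold tally, which can be sketched roughly as below. This is only an illustration, not anything hifiasm or the tools above provide: it assumes the scaffolds were aligned as the query against the related species' assembly as the target, with minimap2 writing its default PAF output, and that the reference X is named `chrX`. The function names, the MAPQ cutoff, and the 0.5 fraction threshold are all hypothetical choices. Overlapping alignments are counted twice, so treat the fractions as a rough screen rather than a measurement.

```python
from collections import defaultdict


def aligned_bases_by_target(paf_lines, min_mapq=20):
    """Sum aligned query bases per (scaffold, reference chromosome) from PAF lines."""
    totals = defaultdict(lambda: defaultdict(int))
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        # PAF columns: 1 query name, 3 query start, 4 query end,
        # 6 target name, 12 mapping quality
        query, q_start, q_end = fields[0], int(fields[2]), int(fields[3])
        target, mapq = fields[5], int(fields[11])
        if mapq >= min_mapq:
            totals[query][target] += q_end - q_start
    return totals


def x_candidates(paf_lines, x_name="chrX", min_fraction=0.5, min_mapq=20):
    """Option 1: return {scaffold: fraction of its aligned bases hitting x_name}."""
    hits = {}
    for scaffold, per_target in aligned_bases_by_target(paf_lines, min_mapq).items():
        total = sum(per_target.values())
        if total == 0:
            continue
        fraction = per_target.get(x_name, 0) / total
        if fraction >= min_fraction:
            hits[scaffold] = fraction
    return hits


def x_gene_fraction(gene_assignments, x_name="chrX"):
    """Option 2: fraction of annotated genes per scaffold whose ortholog is on x_name.

    gene_assignments is an iterable of (scaffold, ortholog_chromosome) pairs,
    however your annotation tool lets you derive them.
    """
    counts = defaultdict(lambda: [0, 0])  # scaffold -> [X genes, all genes]
    for scaffold, ortholog_chrom in gene_assignments:
        counts[scaffold][1] += 1
        if ortholog_chrom == x_name:
            counts[scaffold][0] += 1
    return {s: x / n for s, (x, n) in counts.items()}


if __name__ == "__main__":
    # Made-up PAF records, for illustration only.
    demo = [
        "scaffold_7\t1000\t0\t800\t+\tchrX\t5000\t100\t900\t780\t800\t60",
        "scaffold_7\t1000\t800\t1000\t+\tchr3\t9000\t0\t200\t150\t200\t60",
        "scaffold_2\t2000\t0\t1500\t-\tchr3\t9000\t500\t2000\t1400\t1500\t60",
    ]
    print(x_candidates(demo))  # {'scaffold_7': 0.8}
```

For a real run you would pass an open PAF file handle instead of the demo list. A dot plot is still worth looking at, since a single fraction per scaffold hides where on the scaffold the X-matching sequence sits.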

The Y chromosome, on the other hand, is notoriously tricky to assemble due to its repetitive and heterochromatic nature, not to mention the regions that look like the X chromosome. Again, no special parameters are needed, but folks who want a complete Y chromosome assembly typically pull down the Y chromosome prior to sequencing. In theory, you should be able to find some Y chromosome fragments in the paternal haplotype, but working with those fragments can be tricky.

@zuodabin
Author

zuodabin commented Mar 18, 2024 via email

@chhylp123
Owner

I agree with @DustinSokolowski. Generally, there are no specific parameters for chrX and chrY, since hifiasm does not make any chromosome-specific assumptions when doing de novo genome assembly. ChrX and chrY might be fragmented at the contig level, but it should be easy to get scaffold-level chromosomes for chrX and chrY.
