2-color cortex? #43

Open
yannickwurm opened this issue Oct 21, 2016 · 3 comments

Comments

@yannickwurm

Hello,
as discussed briefly with Zam, it would be pretty neat if cortex could handle two layers of colors:

  • one to identify the sample, as already exists
  • and one for 10x Genomics Chromium-like information (about linkage between molecules: with their tech, molecules of ~150 kb are tagged with individual barcodes and sequenced at low-ish coverage to provide linkage and phase information over greater distances than normally possible).
    This should reduce ambiguity (bubbles) in assembly, and allow phase-resolved assembly and genotyping in multi-ploid species.
@iqbal-lab

And to follow up @yannickwurm - McCortex already does this (not sure if I was clear) - it allows longer-range paths to be stored. What it does not have is a way to use colour 2 to improve the assembly of colour 1. Anyway - the best person to ask is @noporpoise

@noporpoise
Member

Hi @yannickwurm,

Yes, this should be possible. Such information would be most useful when you already have a long contig and are extending it. You'd need to assemble for a while to figure out which 10X Genomics fragment(s) you're actually on. It might be easier to map 10X Genomics reads onto contigs for scaffolding after assembly. There was an interesting paper from Serafim Batzoglou's group on improving mapping by using 10X-style information [1].

To add the information into the graph, a multicolour approach would work, but you'd need a new colour per sample, which would be memory intensive. A low-memory implementation could use a Bloom filter to store unitig => fragment membership. It would also require a statistical model to make junction choices. I'm afraid it's not something I could do at the moment. Certainly an interesting idea though.
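
To make the Bloom filter idea a bit more concrete, here is a rough Python sketch of storing unitig => fragment membership as (unitig, barcode) pairs. Nothing here is McCortex code or its API: `PairBloomFilter`, `shared_barcodes`, the unitig IDs and the barcode strings are all made up for illustration, and the filter size and hash count are arbitrary assumptions. It also leaves out the statistical model for junction choices entirely.

```python
import hashlib


class PairBloomFilter:
    """Bloom filter over (unitig_id, barcode) pairs (hypothetical helper)."""

    def __init__(self, num_bits=1 << 20, num_hashes=4):
        # Sizes are illustrative assumptions, not tuned for real data.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, unitig_id, barcode):
        # Double hashing: derive num_hashes bit positions from one digest.
        key = f"{unitig_id}\t{barcode}".encode()
        digest = hashlib.blake2b(key, digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:], "little") | 1  # keep the step odd
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, unitig_id, barcode):
        for pos in self._positions(unitig_id, barcode):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, pair):
        unitig_id, barcode = pair
        return all(
            self.bits[pos >> 3] & (1 << (pos & 7))
            for pos in self._positions(unitig_id, barcode)
        )


def shared_barcodes(bloom, unitig_a, unitig_b, barcodes):
    """Barcodes the filter reports for both unitigs (may include false positives)."""
    return [bc for bc in barcodes
            if (unitig_a, bc) in bloom and (unitig_b, bc) in bloom]


# Toy data: unitigs 1 and 2 were both hit by reads carrying barcode "BC1".
bloom = PairBloomFilter()
for unitig_id, barcode in [(1, "BC1"), (2, "BC1"), (3, "BC2")]:
    bloom.add(unitig_id, barcode)

candidates = shared_barcodes(bloom, 1, 2, ["BC1", "BC2"])
```

A Bloom filter never gives false negatives, so a barcode that really covers both unitigs will always show up in `candidates`; the price of the small memory footprint is occasional false positives, which is one reason a statistical model would still be needed before using this to pick a branch at a junction.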

[1] Read clouds uncover variation in complex regions of the human genome

@winni2k

winni2k commented Oct 23, 2017

This sounds like something that might be easy to implement in CortexJDK?


4 participants