-
Notifications
You must be signed in to change notification settings - Fork 572
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merge alignments into existing BAM/CRAM/SAM #1979
base: develop
Are you sure you want to change the base?
Conversation
Your develop is way behind. Please update so you can then rebase it. A lot of the test failures are elsewhere in samtools, but have been fixed already. |
Regarding the functionality, I'm tackling this from a different direction and this is ongoing work that's likely to land in a PR soon.
However, starting from that point, we have name-collated aligned data with enough information in the primary alignment to correct MM/ML problems in the secondary alignments. We already usually run fixmate at this point to fix up aligner glitches, but potentially also to cache some useful information for a subsequent duplicate removal step. So adding MM/ML to fixmate here also feels like a natural step. This avoids having to use the original FASTQ or uBAM to try and patch the aligner output. That strategy - a lift over from another file - is still valid and useful in other cases, obviously headers, but potentially some tags also. See https://gitlab.com/german.tischler/biobambam2/-/blob/master/src/programs/bamauxmerge.1?ref_type=heads for an example tool which does this already, lifting tags from one file to another. Samtools having this functionality may well be a useful addition. |
Thanks James. I've just been looking at your fixmate_MM branch and I think I might be able to reuse a few of your functions to trim MM tags. Maybe you can put them into htslib? Part of my motivation is that basemods aren't the only AUX tag that my mapper hasn't copied appropriately. |
Also are MN tags specified anywhere? I can't find them on hts-spec's samtags? |
Yes, as a PR that really needs some traction: |
I'd like them in htslib at some point, but right now I'm just trying to get my head around the nightmare of fixmate main loop! Fundamentally it's broken for what we need to do. As soon as it finds the mate, it clears all "prev"ious meta-data. Hence if we have a primary pair and a secondary aligned pair then there's no way to lift over MM information from the primaries to the secondaries. I've no idea why this code is so convoluted, but I'm thinking of just rewriting it to buffer up all reads with the same name (assuming not "*") and process in batches of name-by-name. It's cleaner and simpler. Attempting to toggle between two The alternative (perhaps safe) option is to leave fixmate as a "it works, don't touch!" thing and write a new processing command. Tempting. |
No description provided.