Add view -X flag to drop all aux tags #871
base: develop
I don't think this belongs in htslib really, but it could make use of the htslib feature to give hints to the decoder. On BAM there's little that can be done (except maybe not bothering to do validation?), but with CRAM it's possible to tell the decoder to ignore blocks in the file - don't bother decompressing them and no need to serialise all the tags together. Eg:
(Clearly we ought to add |
Maybe there's also an argument for adapting how the current However, input-fmt-option is only a hint. When reading a BAM record we'll have read all the data, so the fields are already there. When reading CRAM, if the data is necessary for decoding other fields (e.g. we must know POS and RNAME to decode SEQ) then it'll be in the structures, but otherwise it'll be given a placeholder value (*, 0, etc). Perhaps what we want, though, is a required_fields equivalent for output-fmt-option, which goes beyond an optimisation hint to become a statement of what will be stored. At that point it's essentially a crude columnar filter. (Crude because it's all or nothing as far as tags go, barring RG.) Thoughts anyone? |
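To make the required_fields idea concrete, here is a minimal sketch of how a field mask that leaves out the aux tags could be computed. The bit values are assumed to mirror the `sam_fields` enum in htslib's `hts.h` and should be checked against the htslib version in use; the helper name `fields_mask` is hypothetical and exists only for this illustration.

```python
# Bit values assumed to mirror htslib's `enum sam_fields` (verify against hts.h).
SAM_FIELDS = {
    "QNAME": 0x0001,
    "FLAG":  0x0002,
    "RNAME": 0x0004,
    "POS":   0x0008,
    "MAPQ":  0x0010,
    "CIGAR": 0x0020,
    "RNEXT": 0x0040,
    "PNEXT": 0x0080,
    "TLEN":  0x0100,
    "SEQ":   0x0200,
    "QUAL":  0x0400,
    "AUX":   0x0800,  # all aux tags except RG
    "RGAUX": 0x1000,  # the RG aux tag
}


def fields_mask(*drop):
    """Hypothetical helper: OR together every field bit not named in `drop`."""
    mask = 0
    for name, bit in SAM_FIELDS.items():
        if name not in drop:
            mask |= bit
    return mask


if __name__ == "__main__":
    # Keep the 11 mandatory columns, drop every aux tag including RG.
    print(f"required_fields=0x{fields_mask('AUX', 'RGAUX'):x}")
```

Under these assumptions the resulting value would be passed on the command line as `--input-fmt-option required_fields=<mask>`. As noted above, on input this is only a hint, so it does not by itself guarantee what ends up in the output.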
Is there documentation somewhere for what the --input-* and --output-* args take? I keep finding random examples scattered around but no exhaustive doc. |
Does #516 with an empty whitelist do the same? |
Thanks @EvanTheB - I'd totally forgotten about that aging PR! We should discuss it and make a decision as it's plenty mature by now. :-) As for the options arguments, they're in the samtools man page under "GLOBAL OPTIONS". Quite a lot are CRAM only, or only apply on input or output, but this is described in the text. We just added level (compression level) there too, so I'll update the man page. |
I do not know if you want this feature, and it is implemented in the wrong place.
I added a -X flag to drop all aux tags. I use this for compression when I just want to save the fastq-ish data.
I think I should have added the code to htslib; if you want it, I am happy to modify it so it works like that.
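As an illustration of the intended effect only (this is not the C change in this PR), dropping all aux tags from a SAM text record amounts to keeping the 11 mandatory columns. The function name `drop_aux_tags` and the sample record are made up for this sketch.

```python
def drop_aux_tags(sam_line: str) -> str:
    """Return a SAM line with everything after the 11 mandatory columns removed.

    Header lines (starting with '@') are passed through unchanged apart from
    stripping the trailing newline.
    """
    line = sam_line.rstrip("\n")
    if line.startswith("@"):
        return line
    fields = line.split("\t")
    # Columns 1-11 are QNAME..QUAL; anything after that is an aux tag.
    return "\t".join(fields[:11])


if __name__ == "__main__":
    # Made-up record with two aux tags (NM and RG).
    record = "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0\tRG:Z:grp1"
    print(drop_aux_tags(record))
```

What remains is the read name, alignment columns, sequence and qualities, which is the "fastq-ish" subset described above.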