Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Output .bam #29

Open
JPFinnigan opened this issue May 23, 2017 · 0 comments
Open

Output .bam #29

JPFinnigan opened this issue May 23, 2017 · 0 comments

Comments

@JPFinnigan
Copy link

Spoke w/ @timodonnell via email. Asked to post an issue

varlens-reads can be used to subset an input .bam containing only those reads that overlap specific coordinate flags (including coordinates provided by .vcf files). A newly created .bam can be refined to include only those reads that overlap a locus, and meet certain additional filters such as strandedness or MapQ scores

Per the documentation:

    --min-mapping-quality 30 \
    --locus 22:46932040-46932050 \
    --out /tmp/result.bam

I'm wondering whether it would be possible to also add the capability to varlens (or Isovar for that matter) to subset an input .bam to only those reads overlapping a specified locus, and containing a specified base at said locus. For example

$ varlens-reads test/data/CELSR1/bams/bam_0.bam \
--locus 22:46932040-46932041 \
--is-'A'
--out /tmp/result.bam  

Output result.bam would consist of all reads that overlap Chr22:46932040-46932041 (irrespective of start/end sites), and contain the specified base 'A' at this position. If possible, the process could also accept coordinates and bases supplied by a.vcf. However, in when a .vcf is suppled the input result.bam contains those only reads that contain the ALT base, at loci specified by the .vcf

The use-case I have in mind is to generate allele-specific coverage tracks for a particular locus (or set of loci) to compare across treatment conditions. The purpose being facile comparison of--for example--the locus of a putative neoantigen locus pre-/post-treatement showing an immuno-editing event.

The closest example of what I'm thinking of would be the typical CHIP-Seq peak data visualized using GenomeBrowser (e.g. see below), where comparison of multiple pileup "tracks" each corresponding to treatment groups, where a gain/loss of a peak in one/more tracks is indicative of some biologic process.

http://www.bloodjournal.org/content/bloodjournal/127/24/2991/F2.large.jpg?width=800&height=600&carousel=1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant