What are the statistics affected by alfred qc --bed? #17

Open
evanbiederstedt opened this issue Jun 27, 2019 · 2 comments

@evanbiederstedt

Hi Tobias

I haven't dug closely into the source code for this, so apologies if this question is a bit lazy:

Which metrics are affected by the optional --bed flag for alfred qc?

I suspect this affects things like the calculated target coverage, but I'm not sure what else.

Given a standard WGS normal BAM at 40x, what would you expect the difference to be between including the target BED and excluding it?

Thank you for the help

@tobiasrausch
Owner

Hi,

The optional BED file of target regions will not affect the whole-genome statistics. In the Alfred web app, you will get the same statistics in the "Summary stats" tab and, in addition, some summary statistics for the BED file, such as the fraction of reads in a BED target region.

For the GC content, you will still have the sample and reference GC distribution but in addition the GC distribution for the BED file.
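
To illustrate what a GC distribution restricted to the BED file means, here is a minimal Python sketch. It is not Alfred's actual implementation: the function names, the per-interval binning, and the in-memory `reference` dictionary are all assumptions made for illustration.

```python
from collections import Counter


def gc_fraction(seq):
    """GC fraction over A/C/G/T bases only; None if no such bases are present."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return None
    return (seq.count("G") + seq.count("C")) / acgt


def gc_distribution(reference, intervals, bins=10):
    """Histogram of per-interval GC content, normalised to fractions.

    reference: dict mapping chromosome name -> sequence string (hypothetical input)
    intervals: iterable of (chrom, start, end), 0-based half-open as in BED
    """
    counts = Counter()
    for chrom, start, end in intervals:
        gc = gc_fraction(reference[chrom][start:end])
        if gc is None:
            continue  # interval contains only N or other ambiguous bases
        # Clamp so that GC == 1.0 falls into the last bin.
        counts[min(int(gc * bins), bins - 1)] += 1
    total = sum(counts.values())
    return [counts[b] / total if total else 0.0 for b in range(bins)]
```

Calling `gc_distribution` once on the BED intervals and once on genome-wide windows would give two histograms that can be compared side by side, which is the idea behind the additional GC curve described above.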

Most importantly you will get 2 additional plots related to the input BED file:
(1) Target coverage distribution (Fraction of targets above a coverage level 1x, 2x, 3x, ....)
(2) The on-target rate (Fraction of reads on target at 0bp extension, 25bp extension, ...)
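
The two BED-specific metrics can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Alfred's code: reads and targets are plain `(chrom, start, end)` tuples, a read counts as on-target if it overlaps an extended target by at least one base, and "above a coverage level" is taken to mean mean target coverage greater than or equal to that level. All function names are hypothetical.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def on_target_rate(reads, targets, extension=0):
    """Fraction of reads overlapping any target padded by `extension` bp on each side."""
    if not reads:
        return 0.0
    hits = 0
    for chrom, r_start, r_end in reads:
        if any(
            chrom == t_chrom
            and overlaps(r_start, r_end, max(0, t_start - extension), t_end + extension)
            for t_chrom, t_start, t_end in targets
        ):
            hits += 1
    return hits / len(reads)


def mean_target_coverage(reads, target):
    """Mean per-base coverage of one target: overlapping read bases / target length."""
    chrom, t_start, t_end = target
    covered = sum(
        max(0, min(r_end, t_end) - max(r_start, t_start))
        for r_chrom, r_start, r_end in reads
        if r_chrom == chrom
    )
    return covered / (t_end - t_start)


def fraction_targets_above(coverages, level):
    """Fraction of targets whose mean coverage is at least `level`."""
    if not coverages:
        return 0.0
    return sum(1 for c in coverages if c >= level) / len(coverages)


# Toy data, invented for this example.
targets = [("chr1", 100, 200), ("chr1", 500, 600)]
reads = [("chr1", 90, 140), ("chr1", 150, 250), ("chr1", 210, 260), ("chr1", 1000, 1050)]

rates = {ext: on_target_rate(reads, targets, ext) for ext in (0, 25)}
coverages = [mean_target_coverage(reads, t) for t in targets]
```

Evaluating `on_target_rate` over a range of extensions and `fraction_targets_above` over levels 1x, 2x, 3x, and so on yields the two curves described in the list above. For a whole-genome sample, both are of limited use because the reads are not enriched for any target set.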

In short, a BED file makes sense for all targeted assays (whole-exome sequencing, Haloplex, PCR amplicon sequencing, ...)

Best, Tobias

@tobiasrausch tobiasrausch self-assigned this Jun 28, 2019
@evanbiederstedt
Author

Hi @tobiasrausch

Thank you very much for the prompt reply.

> For the GC content, you will still have the sample and reference GC distribution but in addition the GC distribution for the BED file.
>
> Most importantly you will get 2 additional plots related to the input BED file:
> (1) Target coverage distribution (Fraction of targets above a coverage level 1x, 2x, 3x, ....)
> (2) The on-target rate (Fraction of reads on target at 0bp extension, 25bp extension, ...)

Ah, this makes a great deal of sense. I also see a bit better now how this works in the source code.

Thank you for the help! I'm still not entirely sure how this relates to tools for targeted assays which require both target and baits intervals.
