Compare annotation percentages #11

Open
IRECG opened this issue Jul 15, 2019 · 5 comments

Comments

@IRECG

IRECG commented Jul 15, 2019

Hello again,
I have performed MBD-seq experiments with two groups: cases and controls. After annotating the DMRs, I get the distribution shown in the plot below. I have compared the proportion of each annotation between groups, assuming that each proportion should be more or less equal between groups. I find statistically significant differences between them in all categories, which I can explain by the biological differences between the groups. But I also wonder whether it would be possible to compare against an "expected" distribution. I have been reading the manual and the paper, and I don't know whether the function drawGenomePool does something similar or whether there is another way to do it. Thanks
image

@jeffbhasin
Owner

Hi Irene,
Yes, it is possible to add an expected/null distribution to an annotation bar plot. The procedure is to use drawGenomePool() to obtain a GenomicRanges object containing a background set. Then run this null set through goldmine() using the same settings as for the DMR data. The proportions from this annotation form a "genomic background" group that can be added to the plot. This is what we've done in our own studies. It is also possible to run a statistical test, such as binom.test() or Fisher's exact test.
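As a rough illustration of the testing step, here is a minimal base R sketch. It assumes the per-category counts have already been tallied from the annotation of the DMRs and of the background set. The category names and all counts are invented for the example and are not goldmine output.

```r
# Hypothetical counts of regions per annotation category (invented numbers).
# In practice these would be tallied from the annotation of each set.
dmr_counts <- c(promoter = 300, exon = 150, intron = 900, intergenic = 650)
bg_counts  <- c(promoter = 100, exon = 120, intron = 1000, intergenic = 780)

# Expected proportions taken from the genomic background
bg_prop <- bg_counts / sum(bg_counts)

# Exact binomial test per category: does the observed DMR fraction
# differ from the background fraction?
binom_p <- sapply(names(dmr_counts), function(k) {
  binom.test(dmr_counts[[k]], sum(dmr_counts), p = bg_prop[[k]])$p.value
})

# Fisher's exact test per category on a 2x2 table
# (in category vs. not, DMRs vs. background)
fisher_p <- sapply(names(dmr_counts), function(k) {
  tab <- matrix(c(dmr_counts[[k]], sum(dmr_counts) - dmr_counts[[k]],
                  bg_counts[[k]],  sum(bg_counts)  - bg_counts[[k]]),
                nrow = 2)
  fisher.test(tab)$p.value
})

# Observed vs. expected proportions with multiple-testing adjusted p-values
result <- data.frame(observed    = dmr_counts / sum(dmr_counts),
                     expected    = bg_prop,
                     binom_padj  = p.adjust(binom_p, method = "BH"),
                     fisher_padj = p.adjust(fisher_p, method = "BH"))
result
```

binom.test() treats the background proportion as fixed, while the Fisher's exact test version treats the background as a second sample, so the latter may be the more cautious choice when the background set is not much larger than the DMR set.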

Jeff

@IRECG
Author

IRECG commented Jul 24, 2019

Hi Jeff,
I've tried drawGenomePool() and I have a question about the query I should use. Should it be the one with all of my DMRs (including those hypermethylated in controls and in cases)? Or should I use two queries, each with its own length? I mean, if I do it with my "total" query I am not really comparing similar lengths, because the number of DMRs I have in the two patterns is quite different.
Irene

@jeffbhasin
Owner

It would be possible to have two nulls, one for the hypo-DMRs and one for the hyper-DMRs. This could get confusing when plotted on a bar graph because it would need 4 bars. One way to plot it then is to show the enrichment of each query set (hyper- or hypo-DMRs) over its own respective null as a fold change or odds ratio.
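The enrichment of each set over its own null could be computed along these lines. This is only a sketch in base R for a single annotation category. The data frame layout, column names, and counts are all hypothetical.

```r
# Hypothetical counts for one annotation category (e.g. promoters).
# All numbers are invented for illustration.
counts <- data.frame(
  set     = c("hyper", "hypo"),
  dmr_in  = c(120, 40),   # DMRs overlapping the category
  dmr_n   = c(800, 500),  # all DMRs in the set
  null_in = c(60, 35),    # null regions overlapping the category
  null_n  = c(800, 500)   # all null regions drawn for the set
)

# Fold change: DMR proportion divided by the proportion in the set's own null
counts$fold_change <- (counts$dmr_in / counts$dmr_n) /
                      (counts$null_in / counts$null_n)

# Odds ratio: odds of falling in the category for DMRs vs. the null
odds <- function(k, n) k / (n - k)
counts$odds_ratio <- odds(counts$dmr_in, counts$dmr_n) /
                     odds(counts$null_in, counts$null_n)

counts[, c("set", "fold_change", "odds_ratio")]
```

Plotting one enrichment value per set and category keeps the figure at two bars per category instead of four, with 1 as the reference line for no enrichment.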

However, I have generally just combined my hyper- and hypo-DMRs and used the combined set as the query to generate the genomic null set. The reason is that I tried with them separate and didn't see a difference. The genomic background was sampled either way, regardless of changes in length distribution. I also found that my hyper- and hypo-DMRs generally had the same length distribution, even if the total number of DMRs was different. Thus, it may be valid and simpler to just treat all DMRs together to generate a genomic null that can be compared to both hyper- and hypo-DMRs.
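One way to check whether the two sets really share a length distribution before pooling them is sketched below in base R. The widths here are simulated stand-ins. With real data they would be the region widths in bp of each DMR set (for GRanges objects, width() returns them).

```r
set.seed(1)

# Simulated region widths in bp (invented data, not real DMRs).
# The two sets differ in size but are drawn from the same distribution.
hyper_width <- round(rlnorm(800, meanlog = 6, sdlog = 0.5))
hypo_width  <- round(rlnorm(300, meanlog = 6, sdlog = 0.5))

# Side-by-side quantiles give a quick view of both distributions
width_quantiles <- rbind(hyper = quantile(hyper_width),
                         hypo  = quantile(hypo_width))
width_quantiles

# Two-sample Wilcoxon rank-sum test for a shift between the distributions
width_p <- wilcox.test(hyper_width, hypo_width)$p.value
width_p
```

If the quantiles look similar and the test gives no evidence of a shift, a single pooled null is probably adequate. A clear difference would be an argument for drawing a separate null per set.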

@IRECG
Author

IRECG commented Jul 25, 2019

Thanks for your answer. So when you talk about length, you mean the width of the DMRs and not the number of them? I'm not sure I've understood it properly.
I would also like to know whether I can present these proportions as the expected ones...

@jeffbhasin
Owner

jeffbhasin commented Jul 25, 2019 via email
