Unexpected Large Confidence Interval of LogFC #142

Open
JiaxinLi-lipluszn opened this issue Oct 13, 2020 · 2 comments

Comments

@JiaxinLi-lipluszn

Hi! I'm using MAST to identify sex-biased genes (genes differentially expressed between males and females) in a dataset with both case and control samples.

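library(MAST)
library(data.table)

# Fit the hurdle model; sca is the SingleCellAssay object
zlmCond <- zlm(~ sex + cngeneson + diagnosis + region + Capbatch + Seqbatch + RNA.Integrity.Number + RNA.mitochondr..percent + RNA.ribosomal.percent + age,
               sca, method = "bayesglm", ebayes = FALSE, silent = TRUE)

# Likelihood-ratio test for the sexM coefficient
summaryCond <- summary(zlmCond, doLRT = 'sexM')
summaryDt <- summaryCond$datatable

# Combine the hurdle p-values with the logFC estimates and their confidence intervals
fcH <- merge(summaryDt[contrast == 'sexM' & component == 'H', .(primerid, `Pr(>Chisq)`)],
             summaryDt[contrast == 'sexM' & component == 'logFC', .(primerid, coef, ci.hi, ci.lo)], by = 'primerid')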

When I ran the analysis on the whole dataset, MAST behaved as expected: for marker genes on the sex chromosomes, such as XIST, the logFC has the correct sign. For example, XIST is an X-chromosome gene expressed only in females, so MAST reports a negative logFC with a small FDR.

However, when I split the dataset into two (one with only control samples, the other with only case samples), strange things happened. I ran nearly the same analysis on each subset (I dropped the diagnosis term from the model because each subset now has only one level of diagnosis), but the marker genes showed abnormal logFC values and confidence intervals.

For example, in the case group, XIST has a large positive logFC, which would mean it is identified as male-biased. The confidence interval is also very wide and spans zero (ci.hi > 0 and ci.lo < 0). I'm sure I haven't simply misread the direction, because in the same analysis Y-chromosome genes have positive logFC. It is also not a data-quality problem: I made a scatter plot of the per-cell expression levels, and XIST clearly shows the expected female-biased expression pattern in both the case group and the control group.

I've been stuck on this for a long time because I'm not sure how the confidence interval of the logFC is calculated. Any help would be appreciated!

@gfinak
Member

gfinak commented Oct 13, 2020

How do you propose we help you?
You haven't posted any data, figures, or any other information about how many cells, how many samples, or the design of your experiment.
Not sure what we can do for you unless you can share some data and maybe a reproducible example.

@gfinak
Member

gfinak commented Nov 19, 2020

@JiaxinLi-lipluszn do you have anything actionable we can follow up on? Did you solve your problem, answer your question? Do you have a reproducible example we can work with to answer your question?
