no SNPs in common between prior studies and the conventional GWAS #7
Comments
Hello, My guess is that most of the prior GWASs you're using (in the Z-matrix) have quite small sample sizes, so too few SNPs reach the threshold used to select instruments for MR and the package can't use these studies for the analysis. You could try two things:
Thank you.
Dear Ninon, Thanks very much for the information. The sample sizes of the GWAS summary statistics are actually large, ranging from 62k to hundreds of thousands. Based on previous studies, some of the phenotypes have a causal effect on the focal phenotype (the conventional GWAS), so there should be some SNPs left after thresholding.
Is a p-value column included in the ZMatrix? I didn't see one in your example files. Do the GWAS summary statistics have to be imputed (e.g. with ssimp) before running bGWAS? I didn't impute them, which resulted in missing values for some SNPs in the full ZMatrix. Do you think this caused the issue? I tried to impute with ssimp, but an error occurred. I'm still working on the imputation to see whether it solves the issue.
Many thanks,
Hello, Thank you for the clarifications. Indeed, it seems that there should be (enough) instruments to perform the analysis.
Dear Ninon, Thanks very much for your comments. I finally got bGWAS running without an error. The previous issue was indeed due to the incorrect format of the input GWAS summary statistics files.
Will imputation make a big difference to the results? Currently I run bGWAS without imputing the GWAS summary statistics, so there are just over 1m SNPs in ZMatrix_Full.csv.gz and about 91k SNPs in ZMatrix_MR.csv.gz. However, no significant SNP was detected (please find attached the log file bGWAS_logfile.log), which does not seem right to me. Besides "MR_threshold" and "MR_ninstruments", which parameters would you recommend adjusting to get reliable results?
Many thanks,
There was also a warning message at the end of the analysis. Regards,
Hi Ninon, I tried keeping only the SNPs common to all GWAS summary statistics in ZMatrix_Full.csv.gz, and bGWAS ran successfully without any error or warning. The previous failure might be due to the large number of missing Z-scores in the full matrix. I'll try to impute the summary statistics to see if things can be improved. Regards,
Hi Patrick, I am glad that you (finally) managed to perform the analysis. Looking at the log file, it seems that when you used your ZMatrix with NAs, the missing values were not correctly replaced by 0 before estimating the prior. That would explain why the correlation between prior effects and observed effects is NA (though it does not explain the warning message) and also why you only have 291,355 SNPs left (when trying
Thank you.
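The check discussed above, measuring how much of the Z-matrix is missing and replacing missing Z-scores with 0, can be sketched as follows. This is only an illustration in Python, not the package's R code: the miniature matrix, the column names, the `NA` marker, and all helper names are assumptions made for the example.

```python
import csv
import io

# Hypothetical miniature Z-matrix: one row per SNP, one Z-score column per
# prior study. "NA" marks a SNP absent from that study's summary statistics.
ZMATRIX_CSV = """rs,study_A,study_B,study_C
rs1,1.2,NA,-0.4
rs2,NA,NA,2.1
rs3,0.3,-1.7,0.9
rs4,2.5,0.1,NA
"""


def read_zscores(text, id_column="rs"):
    """Return {snp_id: [z or None, ...]}, parsing "NA" or empty cells as None."""
    reader = csv.DictReader(io.StringIO(text))
    study_columns = [c for c in reader.fieldnames if c != id_column]
    table = {}
    for row in reader:
        table[row[id_column]] = [
            None if row[c] in ("NA", "") else float(row[c]) for c in study_columns
        ]
    return table


def missing_fraction(table):
    """Fraction of Z-score cells that are missing, over all SNPs and studies."""
    cells = [z for zs in table.values() for z in zs]
    return sum(z is None for z in cells) / len(cells)


def fill_missing_with_zero(table):
    """Return a copy of the table in which every missing Z-score is 0.0."""
    return {snp: [0.0 if z is None else z for z in zs] for snp, zs in table.items()}


zscores = read_zscores(ZMATRIX_CSV)
filled = fill_missing_with_zero(zscores)
print(f"fraction missing before: {missing_fraction(zscores):.3f}")
print(f"fraction missing after:  {missing_fraction(filled):.3f}")
```

Checking the fraction again after the replacement is a cheap way to confirm that no missing value survived the conversion before the matrix is saved and reused.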
Hi Ninon, Thanks very much for your comments. Sorry for the late update; I had been working on other things. The run started with 291,355 SNPs during "calculation of Bayes Factors and p-values". Would you expect more SNPs, given a total of 12,465,486 SNPs in the full ZMatrix? I tried your scripts as suggested. Around 15% of the values in the full ZMatrix were missing.
To check whether the issue was actually caused by the incorrect replacement of NAs by 0, I converted the NAs to 0 as suggested and saved the new ZMatrix. I then ran bGWAS with the newly saved full ZMatrix. The correlation between prior and observed effects was still NA, and the same warning message was still produced ("In stats::cor(all.priors[, 6:7]) : the standard deviation is zero"). Below is the second part of the log file:
Out-of-sample R-squared for MR instruments across all chromosomes is NaN
Out-of-sample squared correlation for MR instruments across all chromosome is NA
Correlation between prior and observed effects for all SNPs is NA
Correlation between prior and observed effects for SNPs with GWAS p-value < 0.001 is NA
The file CoefficientsByChromosome.csv has been successfully written.
Computing observed Bayes Factor for all SNPs...Done!
Computing BF p-values...using a distribution approach:
Estimating p-values for posterior effects...Done!
Estimating p-values for direct effects...Done!
Selecting significant SNPs according to p-values...0 SNPs left
Selecting significant SNPs according to p-values...4 SNPs left
Pruning significant SNPs...distance : 500Kb
Selecting significant SNPs according to p-values...0 SNPs left
Here are the scripts used for the above analysis:
# keep only z-scores columns
full %>%
# fraction of missing values
sum(is.na(Zscores))/(nrow(Zscores)*ncol(Zscores))
# use the newly saved full matrix in bGWAS
Do you have any idea what might be causing the issues?
Many thanks,
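The warning quoted above ("the standard deviation is zero") is what R's `cor` reports when one of its input columns is constant: the correlation is then undefined and the result is NA. The sketch below reproduces that behaviour in Python with a hand-written Pearson correlation. The vectors are made-up values, and whether a constant column of prior effects is what actually occurred in this run is an assumption, not something the log confirms.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns NaN when either input has zero variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A constant column has no spread, so the correlation is undefined.
        return math.nan
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Hypothetical observed effects and two hypothetical sets of prior effects.
observed = [1.8, -0.6, 2.4, 0.3]
constant_prior = [0.0, 0.0, 0.0, 0.0]  # e.g. every prior estimate is identical
varying_prior = [0.9, -0.2, 1.1, 0.4]

print(pearson(constant_prior, observed))  # undefined: prints nan
print(pearson(varying_prior, observed))   # a finite value between -1 and 1
```

If the real prior-effect column turns out to be constant, the NA correlations in the log would follow directly, and the question becomes why no prior was estimated rather than why the correlation failed.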
I just ran bGWAS with a ZMatrix_Full that includes only the SNPs common to all traits (7,954,416 SNPs, vs. 10,754,303 in the union of SNPs across all traits). The correlation between prior and observed effects was no longer NA (0.1277 for all SNPs and 0.4098 for SNPs with GWAS p-value < 0.001), and the analysis started with 7,954,416 SNPs, so I assume bGWAS ran successfully. Still, no significant SNP was detected according to the BF p-values, although more than a dozen SNPs were genome-wide significant based on the p-values of the direct or posterior effects.
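The restriction to common SNPs described above amounts to taking the intersection of the SNP sets of all traits, as opposed to their union. A minimal sketch in Python, with hypothetical trait names and SNP identifiers:

```python
# Hypothetical SNP identifiers available for each trait.
snps_by_trait = {
    "trait_A": {"rs1", "rs2", "rs3", "rs5"},
    "trait_B": {"rs1", "rs3", "rs4", "rs5"},
    "trait_C": {"rs1", "rs2", "rs3", "rs4"},
}


def common_and_union(snps_by_trait):
    """Return (SNPs present in every trait, SNPs present in at least one trait)."""
    sets = list(snps_by_trait.values())
    common = set.intersection(*sets)
    union = set.union(*sets)
    return common, union


common, union = common_and_union(snps_by_trait)
print(sorted(common))           # ['rs1', 'rs3']
print(len(common), len(union))  # 2 5
```

A matrix built from the intersection has no missing Z-scores by construction, at the cost of dropping every SNP that any single trait lacks.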
Dear author,
Another issue came up while I was using bGWAS (as in the title and below). I have checked the two ZMatrix files manually. There are a large number of SNPs in common among the GWAS studies.
Adding data from the conventional GWAS :
"allEC_unmunged"
Done!
0 SNPs in common between prior studies and the conventional GWAS
Thresholding...
0 SNPs left after thresholding
breast cancer - Age of Menarche - PCOS - uterine fibroids - epithelial ovarian cancer - T2D - HDL cholesterol - hypertension - Pulse wave Arterial Stiffness index - Impedance of whole body - High light scatter reticulocyte percentage - Creatinine (enzymatic) in urine - Alanine aminotransferase (U/L) - C-reactive protein (mg/L) - SHBG (nmol/L) - Ankle spacing width - Length of menstrual cycle - Vascular/heart problems diagnosed by doctor: Heart attack : removed (less than 2 instrument after thresholding)
0 studies left after thresholding
Pruning MR instruments...
distance : 500Kb
It then stayed in this state for hours without progressing.
I read the original code, but didn't find a clue.
Do you have any idea what's going on?
Regards,
xuemin
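When a log reports 0 SNPs in common even though the files visibly share SNPs, one thing worth ruling out is a formatting mismatch between the identifier columns. The sketch below, in Python, compares two ID lists exactly and after trimming whitespace and lower-casing, and lists IDs that do not look like rsIDs. The identifiers, the function name, and the normalisation rules are all assumptions for illustration; this is not how bGWAS matches SNPs internally.

```python
import re

# Pattern for a plain rsID such as "rs123" (lower-case prefix, digits only).
RSID = re.compile(r"^rs\d+$")


def overlap_report(prior_ids, gwas_ids):
    """Compare two SNP ID lists exactly and after simple normalisation."""
    prior, gwas = set(prior_ids), set(gwas_ids)

    def norm(snp_id):
        return snp_id.strip().lower()

    return {
        "exact_overlap": len(prior & gwas),
        "overlap_after_normalising": len(
            {norm(s) for s in prior} & {norm(s) for s in gwas}
        ),
        "gwas_ids_not_rsid": sorted(s for s in gwas if not RSID.match(s)),
    }


# Hypothetical IDs: the GWAS file uses upper case, stray whitespace,
# and a chromosome:position identifier.
prior_ids = ["rs123", "rs456", "rs789"]
gwas_ids = ["RS123", " rs456", "1:12345"]

report = overlap_report(prior_ids, gwas_ids)
for key, value in report.items():
    print(key, value)
```

An exact overlap of 0 combined with a non-zero overlap after normalising would point to the input format rather than to the data themselves.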