Paintor Pipeline. #2
Comments
Hi James! Many thanks for your interest in our framework. I've been meaning to set something similar up but haven't had the time. I really appreciate the pipeline! Please do link it to the repo. Regarding the NLopt failure, it has occasionally failed on me as well. What is the size of your locus?
Hi, I have added a reference to your repo and the citation to my README.md. NLopt fails at different loci depending on how large I make the window etc.; I assume this is because the algorithm cannot converge or something similar. I will find a specific failure today and upload it to this thread. In my analysis pipeline I basically just run each locus independently first, and if I get a failure running http://permalink.gmane.org/gmane.science.analysis.nlopt.general/278 I have yet to determine the exact failure, but it looks like -10 was returned. Cheers, James.
I have added two test files to the following dropbox folder to illustrate the failure. test1 and test1.LD contain a locus that is 200kb; I am using every SNP with a MAF > 2% in the European population. The region contains 1057 SNPs; is this a problem? Other regions with more SNPs do not fail. This is run with test2 and test2.LD contain a locus that is 100kb and contains fewer SNPs (601). This locus has a set of very high Z-scores (it is a very significant locus); could that be driving this problem? https://www.dropbox.com/sh/quekc6qjn3nttrc/AAB4aqBLBY4IpvfLz1Iqlh1Na?dl=0 Cheers, James
Hi James, I noticed you have a lot of zero-valued Z-scores in your file (almost 1/3). Is there a particular reason why you left these in? I would definitely recommend removing them, as (A) they will never be causal and (B) you will see a large boost in speed by reducing the number of SNPs (and probably more numerical stability).
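A minimal sketch of the filtering suggested above, for anyone scripting it. It assumes the SNP IDs, Z-scores, and LD matrix are already loaded into plain Python lists in the same SNP order; the function name `drop_zero_z` and the toy data are hypothetical, and reading or writing the actual locus and LD files is left out. The important detail is that the LD matrix has to lose the same rows and columns as the Z-score list, otherwise the two files no longer line up.

```python
def drop_zero_z(snps, zscores, ld):
    """Remove SNPs whose Z-score is exactly zero, keeping the LD matrix in sync.

    snps    : list of SNP identifiers
    zscores : list of floats, same order as snps
    ld      : square list-of-lists correlation matrix, same order as snps
    """
    # Indices of SNPs to keep (non-zero Z-score)
    keep = [i for i, z in enumerate(zscores) if z != 0.0]
    kept_snps = [snps[i] for i in keep]
    kept_z = [zscores[i] for i in keep]
    # Subset both rows and columns of the LD matrix with the same indices
    kept_ld = [[ld[i][j] for j in keep] for i in keep]
    return kept_snps, kept_z, kept_ld


# Toy example (made-up values, for illustration only)
snps = ["rs1", "rs2", "rs3"]
zscores = [2.5, 0.0, -1.2]
ld = [
    [1.0, 0.3, 0.1],
    [0.3, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
kept_snps, kept_z, kept_ld = drop_zero_z(snps, zscores, ld)
```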
Hi Gleb, Thanks, yes, I will remove all those zeros. I have also worked out another problem, I believe on my end. Sometimes I find the haplotype R value is 100% different from the genotype R value I calculate in plink, I think that this has created Do you know of a tool that will easily calculate the R value from haplotypes instead of genotypes? Plink is no good here afaik. Will fix these things and get back to you. Thanks for your great support.
Hi James, Yes, I would say the most challenging thing is getting your LD to match up correctly. If the genotypes and haplotypes are based on the same sample, it is strange that the correlation coefficient you obtain from haplotypes is different from the one you obtain from genotypes; mathematically they should be the same. Maybe you're having an issue with phasing? Are you trying to calculate LD from a reference panel? If so, you need to make sure that the effect alleles for the Z-score calculation match the effect allele (i.e. the 1 allele) in the reference panel.
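For reference, a rough sketch of computing the correlation coefficient directly from phased haplotypes. It assumes each haplotype is a list of 0/1 allele codes with one column per SNP; the helper name `pearson_r` and the toy haplotypes are hypothetical, and this is not taken from PAINTOR or PLINK. It also shows why allele coding matters: swapping which allele is coded as 1 at one SNP negates r.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of 0/1 allele codes."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A monomorphic SNP has no defined correlation
        return float("nan")
    return cov / sqrt(var_x * var_y)


# Toy phased haplotypes: one row per haplotype, one column per SNP (made-up data)
haps = [
    [0, 0],
    [1, 1],
    [0, 1],
    [1, 1],
]
snp_a = [h[0] for h in haps]
snp_b = [h[1] for h in haps]

r = pearson_r(snp_a, snp_b)

# Recode SNP A so the other allele is the "1" allele: r changes sign
r_flipped = pearson_r([1 - a for a in snp_a], snp_b)
```

A sign difference between two LD estimates is therefore a hint that the two sources disagree about which allele is coded as 1, rather than evidence that the underlying LD differs.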
Ahhh, got it: PLINK is flipping alleles from the reference to minor/major (I need to reset them). Will just have to script that into my analysis. For anyone using plink to calculate the LD this is likely going Great help. Thanks
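One way the allele reset could be scripted on the Z-score side is sketched below. It assumes each SNP comes with the effect/other allele used for the Z-score and the 1/0 allele used in the LD reference; the function `align_z` and its argument names are hypothetical. Strand flips and ambiguous A/T or C/G SNPs are deliberately not handled, so mismatches are returned as `None` for the caller to drop or inspect.

```python
def align_z(z, effect_allele, other_allele, ref_one, ref_zero):
    """Return the Z-score oriented to the reference panel's '1' allele.

    Returns z unchanged if the alleles already agree, -z if they are swapped,
    and None if the allele pair does not match the reference at all.
    """
    if (effect_allele, other_allele) == (ref_one, ref_zero):
        return z
    if (effect_allele, other_allele) == (ref_zero, ref_one):
        # Effect allele is the reference '0' allele: flip the sign
        return -z
    # Different alleles (or a strand issue): leave the decision to the caller
    return None


# Toy checks (made-up alleles, for illustration only)
same = align_z(2.0, "A", "G", "A", "G")
swapped = align_z(2.0, "G", "A", "A", "G")
mismatch = align_z(2.0, "A", "T", "A", "G")
```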
@theboocock @gkichaev - hey gentlemen, I am running v2.1 despite it being deprecated for specific reasons ... I am getting the following error: terminate called after throwing an instance of 'std::runtime_error'. This appears to happen when I am running v2.1 with annotations only, whether I am trying to get the marginal enrichment for each annotation or running the final model with annotated data. When I run it on unannotated variants, the algorithm runs to completion. Is the best solution here to trim the average locus size (I am submitting ~90 loci) or to reduce the number of loci in the run?
@vlaufer, my suggestion would be to trim. Maybe filter out variants that have very low Z-scores (say in the meta analysis). --Gleb
@gkichaev yeah - this makes a lot of sense, in particular in light of your reasoning in the PAINTOR3 manuscript. I'll give it a try, and thanks.
Hi,
I have been writing a pipeline to help with PAINTOR analyses, automating the annotation collection, LD calculation, etc.: all the parts that make PAINTOR hard to use for your average biologist. This should give it a far wider reach as a method.
I don't know if this is of interest to you guys, but I will post the link to the repo anyways.
https://github.com/smilefreak/fine_mapping_pipeline
Thanks for creating this software.