Paintor Pipeline. #2

theboocock · 2015-06-11T00:01:59Z

Hi,

I have been writing a pipeline to help with PAINTOR analyses. It automates annotation collection, LD calculation, and the other steps that make PAINTOR hard to use for your average biologist. This should give the method a far wider reach.

I don't know if this is of interest to you guys, but I will post the link to the repo anyways.
https://github.com/smilefreak/fine_mapping_pipeline
Thanks for creating this software.

gkichaev · 2015-06-11T16:14:03Z

Hi James!

Many thanks for your interest in our framework. I've been meaning to set something similar up but haven't had the time. I really appreciate the pipeline!! Please do link it to the repo.

Regarding the NLopt failure, it has occasionally failed on me as well. What is the size of your locus?

theboocock · 2015-06-11T21:55:38Z

Hi,

I have added a reference to your repo and the citation to my README.md.

NLopt fails at different loci depending on how large I make the window, among other things. I assume this is because the algorithm cannot converge or something similar. I will find a specific failure today and upload it to this thread.

In my analysis pipeline I basically just run each locus independently first, and if PAINTOR fails on a locus I delete that locus from input.files. I investigated the exact error by following this thread on NLopt:

http://permalink.gmane.org/gmane.science.analysis.nlopt.general/278

I am yet to determine the exact failure but it looks like -10 was returned.
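The run-each-locus-then-drop-failures step described above could look roughly like the following. This is a minimal sketch, not the pipeline's actual code: `build_command` is a hypothetical callable that returns the argument list for a single-locus run (the real PAINTOR arguments are deliberately not shown), and input.files is assumed to hold one locus name per line.

```python
import subprocess
from pathlib import Path


def locus_succeeds(command):
    """Return True if the command exits with status 0."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode == 0


def keep_passing_loci(loci, build_command, input_files_path):
    """Run each locus on its own and rewrite input.files with the survivors.

    build_command is a hypothetical callable mapping a locus name to the
    argv list for a single-locus run; the caller must supply the real
    arguments. Assumes input.files lists one locus name per line.
    """
    passing = [locus for locus in loci if locus_succeeds(build_command(locus))]
    Path(input_files_path).write_text("".join(f"{locus}\n" for locus in passing))
    return passing
```

A run that aborts on an uncaught exception exits with a non-zero status, so it is treated as a failure and left out of the rewritten file.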

Cheers James.

theboocock · 2015-06-11T23:45:11Z

I have added two test files to the following dropbox folder to illustrate the failure.

test1 and test1.LD contain a locus that is 200kb. I am using every SNP with a MAF > 2% in the European population. The region contains 1057 SNPs; is this a problem? Other regions with more SNPs do not fail. This is run with -c 1, otherwise the runtime is unreasonable.

test2 and test2.LD contain a locus that is 100kb and contains fewer SNPs (601). This locus has a set of very high Z-scores, as it is a very significant locus. Could that be driving this problem?

https://www.dropbox.com/sh/quekc6qjn3nttrc/AAB4aqBLBY4IpvfLz1Iqlh1Na?dl=0

Cheers James

gkichaev · 2015-06-16T02:04:03Z

Hi James,

I noticed you have a lot of zero-valued z-scores in your file (almost 1/3). Is there a particular reason why you left these in? I would definitely recommend removing them, as (A) they will never be causal and (B) you will see a large boost in speed by reducing the number of SNPs (and probably more numerical stability).
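As a rough illustration of that filtering step, here is a minimal sketch. It assumes the locus has already been loaded into plain Python lists and that the LD matrix rows and columns are in the same order as the SNPs, so the matching rows and columns are dropped together with the z-scores. The function name and data layout are illustrative only, not part of PAINTOR.

```python
def drop_zero_z(snp_ids, z_scores, ld):
    """Drop SNPs whose z-score is exactly zero, plus their LD rows/columns.

    snp_ids and z_scores are parallel lists; ld is a square list of lists
    in the same SNP order (an assumption of this sketch).
    """
    keep = [i for i, z in enumerate(z_scores) if z != 0.0]
    kept_ids = [snp_ids[i] for i in keep]
    kept_z = [z_scores[i] for i in keep]
    kept_ld = [[ld[i][j] for j in keep] for i in keep]
    return kept_ids, kept_z, kept_ld
```

Filtering the LD matrix with the same index list keeps the two files aligned, which matters because a mismatch between them would silently pair z-scores with the wrong correlations.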

theboocock · 2015-06-16T02:13:52Z

Hi Gleb,

Thanks, yes I will remove all those zeros.

I believe I have also worked out another problem on my end.

Sometimes I find the haplotype R value is 100% different from the genotype R value I calculate in PLINK. I think this has created many downstream problems for me when using these fine-mapping tools. It is definitely a problem because the Z-scores are going in the wrong direction relative to the genotype R value. I am not sure how this affects the results, but it is something I am solving now; it definitely makes a difference in the results of the newly published method CAVIARBF.

Do you know of a tool that will easily calculate the R value from haplotypes instead of genotypes? PLINK is no good here, afaik.
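For reference, the quantity in question is just the Pearson correlation between two SNPs across phased haplotypes. A minimal sketch, assuming each SNP is given as a list of 0/1 allele codes with one entry per haplotype (this is not a replacement for a dedicated tool, and the function name is made up for illustration):

```python
import math


def haplotype_r(hap_a, hap_b):
    """Pearson correlation between two SNPs across phased haplotypes.

    hap_a and hap_b are equal-length sequences of 0/1 allele codes, one
    entry per haplotype (two per diploid individual).
    """
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("haplotype vectors must be non-empty and equal length")
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(a * b for a, b in zip(hap_a, hap_b)) / n
    denom = math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    if denom == 0:
        raise ValueError("a monomorphic SNP has undefined correlation")
    return (p_ab - p_a * p_b) / denom
```

Note that recoding one SNP (swapping which allele is called 1) flips the sign of r without changing its magnitude, so a sign disagreement between two LD estimates points to allele coding rather than to the data.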

Will fix these things and get back to you.

Thanks for your great support.
Also, the trans-ethnic fine-mapping you have added looks very interesting.
Thanks James.

gkichaev · 2015-06-16T02:36:45Z

Hi James,

Yes, I would say the most challenging thing is getting your LD to match up correctly.

If the genotypes and haplotypes are based on the same sample, it is strange that the correlation coefficient you obtain from haplotypes differs from the one you obtain from genotypes; mathematically they should be the same. Maybe you're having an issue with phasing?

Are you trying to calculate LD from a reference panel? If so, you need to make sure that the effect alleles for the z-score calculation match the effect allele (i.e., the 1 allele) in the reference panel.
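A minimal sketch of that check, assuming you have the effect and other allele for each z-score and the alleles coded 1 and 0 in the reference panel. The function and parameter names are hypothetical, and strand flips (e.g. A/G reported as T/C) are not handled here.

```python
def align_z_to_reference(z, effect_allele, other_allele, ref_allele_1, ref_allele_0):
    """Return z signed with respect to the reference panel's 1 allele.

    Returns None when the allele pair does not match the panel, so the
    caller can drop the SNP. Strand flips are not handled in this sketch.
    """
    if (effect_allele, other_allele) == (ref_allele_1, ref_allele_0):
        return z  # already coded the same way as the panel
    if (effect_allele, other_allele) == (ref_allele_0, ref_allele_1):
        return -z  # same alleles, opposite coding: flip the sign
    return None  # alleles disagree with the panel
```

For example, a z-score of 1.5 reported for effect allele G at a SNP where the panel codes A as 1 and G as 0 would come back as -1.5.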

theboocock · 2015-06-16T03:06:07Z

Ahhh, got it: PLINK is flipping alleles from the reference coding to minor/major, so I need to reset them.

Will just have to script that into my analysis. For anyone using PLINK to calculate the LD, this is likely going to be a problem.

Great help.

Thanks

ghost · 2017-10-11T19:37:51Z

@theboocock @gkichaev - hey gentlemen, I am running v2.1 despite it being deprecated for specific reasons ...

I am getting the following error:

terminate called after throwing an instance of 'std::runtime_error'
what(): nlopt failure

This appears to happen when I am running v2.1 with annotations only, whether I am trying to get the marginal enrichment for each annotation or running the final model with annotated data. When I run it on unannotated variants, the algorithm runs to completion.

Is the best solution here to trim the average locus size (I am submitting ~90 loci) or to reduce the number of loci in the run?

gkichaev · 2017-10-11T23:43:01Z

@vlaufer. My suggestion would be to trim. Maybe filter out variants that have very low Z-scores (say in the meta analysis).

--Gleb

ghost · 2017-10-12T01:48:49Z

@gkichaev yeah - this makes a lot of sense in particular in light of your reasoning in the PAINTOR3 manuscript.

I'll give it a try, and thanks.
