Problem computing cdf for chiseq, input is NaN, will exit #607

Open
stuartwillis opened this issue Nov 21, 2023 · 3 comments

Comments

@stuartwillis

I'm trying to run the hybrid score test with -doAsso 5 and received the error "Problem computing cdf for chiseq, input is NaN, will exit". Notably, the score test (-doAsso 2) and latent genotype (-doAsso 4) models both run on the same data. I would guess the issue lies with my data, but I'm not sure how to filter to avoid it. I have tried both the conda and GitHub versions of ANGSD. The command line is below. Any thoughts? TIA

/home/swillis/bin/angsd/angsd -b ocean-summer-NOR-male-bam.txt -out OUT_doAssoc2/ocean-summer-NOR-male-pheno-fork-assoc -ref /data/genomes/Ots/Otsh_v2.0/Ots2.0_LGnamed.fasta -sites OUTp/Ots_AAM_angsd.ALL.majorminor.txt -rf OUTp/Ots_AAM_angsd.ALL.regions.txt -uniqueOnly 1 -remove_bads 1 -only_proper_pairs 1 -trim 0 -C 50 -baq 1 -minMapQ 20 -minQ 20 -doCounts 1 -minInd 10 -setMinDepthInd 3 -GL 1 -nThreads 8 -doMajorMinor 3 -SNP_pval 0.000001 -doSnpStat 1 -sb_pval 0.001 -doHWE 1 -maxHetFreq 0.95 -doMaf 1 -minMaf 0.05 -doPost 1 -skipTriallelic 0.05 -doAsso 5 -minHigh 1 -minCount 10 -yQuant ocean-summer-NOR-male-pheno-fork.txt -cov ocean-summer-NOR-male-cov.txt -Pvalue 1 |& tee log_doAssoc5_ocean-summer-NOR-male-pheno-fork.txt

@auNathalie

Hi,
Did you find the cause of the issue or a solution to it?
I'm running into the same error, although with a different ANGSD option.

Best,
Nathalie

@ANGSD
Owner

ANGSD commented Feb 6, 2024

Without access to the data it is difficult to know what causes it.
I am afraid it will take some trial and error to figure out which position is causing it and why it is happening.

The program reports the current chr and position up to which it has successfully completed an analysis for a region of the genome. I would relaunch the program from the last known good position and print progress for every chunk with -howoften 1.

I would continue until I know the exact position that is causing the issue and then look more closely at the sequencing data for that position. My guess is that a not so widely used combination of filters causes a lack of data, which in effect turns into NaN in the analysis. But of course I don't know.
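
As a rough sketch of that relaunch (not a tested recipe): the chromosome name and position below are placeholders, and using -r with an open-ended chr:start- region in place of the -rf regions file is an assumption that should be checked against the ANGSD help output. The snippet only assembles and prints the command rather than running it, and shows just the options relevant to the relaunch; the remaining filters from the original command would be appended unchanged.

```shell
#!/bin/sh
# Placeholders (hypothetical): use the last chromosome and position that ANGSD
# reported before it exited with the NaN error.
LAST_CHR="chr01"
LAST_POS=1000000

# Assumption: -r takes a single region written as chr:start-
# (from start to the end of that chromosome).
REGION="${LAST_CHR}:${LAST_POS}-"

# Dry run: build and print the command instead of executing it.
# -howoften 1 asks for a progress report on every chunk, as suggested above.
CMD="angsd -b ocean-summer-NOR-male-bam.txt -r ${REGION} -howoften 1 -doAsso 5"
echo "$CMD"
```

Once the failing chunk is known, the same idea can be repeated with a narrower region (chr:start-stop) until a single position is left to inspect.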

Sorry for not being more helpful and for the late reply.

@stuartwillis
Author

Thanks, I will give that a try. It just seems odd that the constituent models (-doAsso 2 and 4) would run fine while the hybrid gets tripped up. I suppose if the hybrid is just a time saver over the latent genotype model, then there's no major concern other than time...
