error loading tabix index for building a region specific tree #31

Open
ccastane9 opened this issue Dec 9, 2020 · 2 comments

@ccastane9

Hi, I am trying to build a tree for a specific region of a chromosome, but I get the error below saying that the tabix index could not be loaded. My .vcf file and its corresponding tabix index (.vcf.gz.tbi) are in the same working folder, and I was able to build a tree from my .vcf file.

(vcf-kit) [ccastane9@andersserver-01 FKBP6_home]$ vk phylo tree nj ECA13_260.vcf 13:11230000-11700000 > ECA13_tree_11230000_11700000_260.newick
[E::idx_find_and_load] Could not retrieve index file for 'ECA13_260.vcf'
Traceback (most recent call last):
  File "/home/ccastane9/miniconda3/envs/vcf-kit/lib/python3.7/site-packages/vcfkit/phylo.py", line 104, in <module>
    main()
  File "/home/ccastane9/miniconda3/envs/vcf-kit/lib/python3.7/site-packages/vcfkit/phylo.py", line 57, in main
    for line in variant_set:
  File "cyvcf2/cyvcf2.pyx", line 442, in __call__
AssertionError: error loading tabix index for b'ECA13_260.vcf'

@danielecook
Contributor

@ccastane9 you have to bgzip the VCF file and index it with bcftools for this to work.

bcftools view -O z your_vcf.vcf > out.vcf.gz
bcftools index out.vcf.gz

Then the command should work.
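
A minimal sketch of the two steps above, wrapped in a shell function so that a missing input file is caught before anything is written. The function name `bgzip_and_index` and the output file name are hypothetical; the `bcftools` and `vk` commands are the ones already shown in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the two bcftools commands from the comment above.
# Usage: bgzip_and_index input.vcf output.vcf.gz
bgzip_and_index() {
    local in_vcf="$1" out_vcf="$2"
    if [ ! -f "$in_vcf" ]; then
        echo "input VCF not found: $in_vcf" >&2
        return 1
    fi
    # Write a bgzip-compressed copy of the VCF (-O z = compressed VCF output).
    bcftools view -O z "$in_vcf" > "$out_vcf" || return 1
    # Build an index next to the compressed file. bcftools writes a .csi
    # index by default; `bcftools index -t` writes a .tbi index instead.
    bcftools index "$out_vcf"
}

# Example call, followed by the region query from the original post,
# now pointed at the compressed file (output name is an assumption):
# bgzip_and_index ECA13_260.vcf ECA13_260.vcf.gz
# vk phylo tree nj ECA13_260.vcf.gz 13:11230000-11700000 > ECA13_tree_11230000_11700000_260.newick
```

The key point is that the region query has to be run on the compressed `.vcf.gz` file, not on the original plain-text `.vcf`.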

@ccastane9
Author

I believe this worked, but now I get an error saying that there are no genotypes in my region of interest (roughly 3.6 Mb). That can't be right, because the VCF file contains roughly 104,000 variants in this region. The message is below:

vk phylo tree nj ECA13_260_vcfkit.vcf.gz I:8091163-11699996 > newtree.newick
no intervals found for b'ECA13_260_vcfkit.gz' at I8091163-11699996
no genotypes

I also tried running the file without specific coordinates, and that was unsuccessful as well. It didn't report that there were no genotypes as above; the tree output was simply empty.
I believe I ran the commands as described above:
bcftools view -O z [input.vcf] > [output.vcf.gz]
bcftools index [output.vcf.gz] --> this generated a file.vcf.gz.csi

Thanks for all of the help!
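
One thing that may be worth ruling out, offered only as an assumption since the thread does not establish the cause: the first command queried chromosome `13` while the second queried `I`, and a region query can only match a chromosome name that actually appears in the VCF. A minimal sketch for checking this, using only standard shell tools; the function names `vcf_chroms` and `region_chrom_in_vcf` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the distinct chromosome names used by the records
# of a gzip/bgzip-compressed VCF (first column of every non-header line).
vcf_chroms() {
    gzip -dc "$1" | grep -v '^#' | cut -f1 | sort -u
}

# Hypothetical helper: succeed only if the chromosome part of a region string
# such as "13:11230000-11700000" appears among the VCF's chromosome names.
region_chrom_in_vcf() {
    local vcf="$1" region="$2"
    local chrom="${region%%:*}"   # text before the first ':'
    vcf_chroms "$vcf" | grep -qxF -- "$chrom"
}

# Example call (file name and region taken from the post above):
# region_chrom_in_vcf ECA13_260_vcfkit.vcf.gz I:8091163-11699996 \
#     && echo "chromosome found" || echo "chromosome not in VCF"
```

If the chromosome is reported as missing, running `vcf_chroms` on the file shows which names the region string would need to use.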
