Crash with single GT allele value - UnicodeDecodeError: 'ascii' codec can't decode byte 0x81 in position 1: ordinal not in range(128) #291

Open
davmlaw opened this issue Dec 22, 2023 · 2 comments

Comments

@davmlaw
Contributor

davmlaw commented Dec 22, 2023

If a VCF has a single GT allele value (a haploid call), calling v.format("GT") in cyvcf2 crashes with:

UnicodeDecodeError: 'ascii' codec can't decode byte 0x81 in position 1: ordinal not in range(128)

The VCF spec says:

Haploid calls, e.g. on Y, male non-pseudoautosomal X, or mitochondrion, are indicated by having only one allele value

Example file:

single_gt.vcf.gz

Test code (using the VCF above):

In [1]: from cyvcf2 import VCF

In [2]: reader = VCF("./single_gt.vcf.gz")

In [3]: v = next(iter(reader))

In [4]: v.format("GT")
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
Cell In [4], line 1
----> 1 v.format("GT")

File /usr/local/lib/python3.10/dist-packages/cyvcf2/cyvcf2.pyx:1353, in cyvcf2.cyvcf2.Variant.format()

UnicodeDecodeError: 'ascii' codec can't decode byte 0x81 in position 1: ordinal not in range(128)
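The message itself can be reproduced without cyvcf2, which may help when reading the traceback. The sketch below decodes a hypothetical two-byte buffer as ASCII; the buffer contents are an assumption chosen for illustration, not bytes taken from single_gt.vcf.gz. It only shows that any byte above 0x7F at index 1 produces this exact error, which would be consistent with the GT values being stored in a binary form rather than as text.

```python
# Hypothetical raw buffer: one ASCII-range byte followed by 0x81.
# These bytes are illustrative only, not read from the attached VCF.
raw = bytes([0x02, 0x81])

try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    # ASCII covers 0x00-0x7F only, so 0x81 at index 1 cannot be decoded.
    message = str(exc)
    print(message)
```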

@brentp
Owner

brentp commented Dec 22, 2023

Hi Dave,
Don't use format for GT. Use variant.genotype for an object or variant.genotypes for an array.

@davmlaw
Contributor Author

davmlaw commented Jan 2, 2024

Thanks - will do that as a workaround.

I usually use the specific methods (which work fine); in this case I wanted to store all FORMAT fields as JSON.
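A minimal sketch of that workaround, assuming the record exposes a FORMAT list of field names, a genotypes list, and a format(name) method as cyvcf2's Variant does. The helper name format_fields_as_json is hypothetical. GT is routed through genotypes as suggested above, and every other field goes through format().

```python
import json


def format_fields_as_json(variant):
    """Collect every FORMAT field of one record into a JSON string.

    `variant` is assumed to behave like a cyvcf2 Variant: a `FORMAT` list
    of field names, a `genotypes` list, and a `format(name)` method.
    """
    fields = {}
    for name in variant.FORMAT:
        if name == "GT":
            # Avoid variant.format("GT"); use the dedicated accessor instead.
            fields[name] = [list(gt) for gt in variant.genotypes]
        else:
            values = variant.format(name)
            # Array-like results expose tolist(); plain values pass through.
            fields[name] = values.tolist() if hasattr(values, "tolist") else values
    return json.dumps(fields)
```

With cyvcf2 this might be called as format_fields_as_json(v) on the v from the test code above. Missing numeric values may need extra handling before serialising, since json.dumps writes NaN as a bare token that strict JSON parsers reject.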
