missing data is ignored with variant files but not .Rtab #157

Open
julibeg opened this issue Jun 3, 2021 · 4 comments

@julibeg
Contributor

julibeg commented Jun 3, 2021

Missing genotypes in variant files are ignored:

pyseer/pyseer/input.py

Lines 485 to 486 in 2e27979

elif sample in d and np.isnan(d[sample]) and str(haplotype) != '.':
    del d[sample]

However, in .Rtab files they are treated as missing data and the fit fails later on:

pyseer/pyseer/input.py

Lines 423 to 424 in 2e27979

elif present == ".":
    d[sample] = np.nan

Is this intended? For now, I have replaced `d[sample] = np.nan` with `continue` so that genes with a few missing entries can still be fitted.
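To make the difference concrete, here is a minimal sketch of the two behaviours for a single .Rtab row. It is not pyseer's actual code: `rtab_row_to_dict` and its `skip_missing` flag are hypothetical names used only for illustration, and `math.nan` stands in for `np.nan` so the snippet needs nothing beyond the standard library.

```python
import math


def rtab_row_to_dict(samples, values, skip_missing=True):
    """Map sample names to presence/absence values for one Rtab row.

    Hypothetical helper for illustration only, not part of pyseer.
    """
    d = {}
    for sample, present in zip(samples, values):
        if present == ".":
            if skip_missing:
                # Workaround described above: leave the sample out entirely
                continue
            # Behaviour quoted from input.py: keep the sample as NaN
            d[sample] = math.nan
        else:
            d[sample] = int(present)
    return d


samples = ["s1", "s2", "s3"]
values = ["1", ".", "0"]

skipped = rtab_row_to_dict(samples, values, skip_missing=True)
kept = rtab_row_to_dict(samples, values, skip_missing=False)
```

With `skip_missing=True` the sample `s2` is simply absent from the dictionary, which mirrors how missing genotypes in variant files end up being dropped. With `skip_missing=False` it stays in as NaN and is passed on to whatever consumes the dictionary.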

@mgalardini
Owner

I think this is because we do not really expect missing values in an .Rtab file, whereas they can be quite common in VCF files. I think we could implement your proposed change to be more consistent.

Just out of curiosity, did the .Rtab file you were using come from panaroo/roary? If so, I was not aware that it could contain missing values.

mgalardini self-assigned this Jun 3, 2021

@julibeg
Contributor Author

julibeg commented Jun 3, 2021

Makes sense.

No, it was a custom .Rtab file.

@mgalardini
Owner

Ok, that makes sense. If you would like to open a PR, we could merge this change. If you know how to add unit tests, that would also be great. If not, I can do it once the change is merged.
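As a rough idea of what such a unit test could look like, here is a sketch using the standard `unittest` module. `parse_rtab_line` is a hypothetical stand-in for the real loader, which is not reproduced here, and the tab-separated layout with a leading `Gene` column is an assumption about the input file.

```python
import unittest


def parse_rtab_line(header, line):
    """Hypothetical stand-in for the Rtab row parser under test."""
    samples = header.rstrip("\n").split("\t")[1:]
    name, *values = line.rstrip("\n").split("\t")
    # Entries marked "." are skipped rather than stored as NaN
    return name, {s: int(v) for s, v in zip(samples, values) if v != "."}


class TestRtabMissing(unittest.TestCase):
    def test_missing_entries_are_skipped(self):
        header = "Gene\ts1\ts2\ts3\n"
        name, d = parse_rtab_line(header, "geneA\t1\t.\t0\n")
        self.assertEqual(name, "geneA")
        self.assertEqual(d, {"s1": 1, "s3": 0})
        self.assertNotIn("s2", d)
```

Saved in a test module, this can be run with `python -m unittest`. A real test would call the project's own loader on a small fixture file instead of the stand-in function.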

@julibeg
Contributor Author

julibeg commented Jun 6, 2021

will do
