Add functionality to use chewBBACA's v3 new hashed output #15
chewBBACA recently got an update that includes a feature to output hashed sequences instead of allele numbers. Since cgmlst-dists has specific filters for the input data, it is incompatible with this new format.
This PR addresses the issue by adding a -H parameter to the command line interface to indicate the new hashed sequence format. When the parameter is supplied, a 64-bit integer is calculated from the hexadecimal hash and is used instead of the allele number when calculating distances.
Furthermore, when any allele in the pairwise comparison is of the form "-" or "NA", it is ignored and does not add to the distance.
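
For illustration only, here is a minimal Python sketch of the behaviour described above. It is not the code in this PR (cgmlst-dists itself is written in C), and the names `hash_to_u64` and `distance` are hypothetical. The description does not say how the 64-bit integer is derived from the hash, so the sketch assumes the low 64 bits of the hexadecimal value are kept.

```python
# Tokens treated as missing data, as stated in the description above.
MISSING = {"-", "NA"}

# Mask that keeps only the low 64 bits of an integer.
U64_MASK = 0xFFFFFFFFFFFFFFFF


def hash_to_u64(hex_hash: str) -> int:
    """Reduce a hexadecimal hash string to an unsigned 64-bit integer.

    Assumption: the low 64 bits are kept. The actual reduction used
    by the PR may differ.
    """
    return int(hex_hash, 16) & U64_MASK


def distance(profile_a, profile_b) -> int:
    """Count loci whose hashed alleles differ, skipping missing calls."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci")
    diff = 0
    for a, b in zip(profile_a, profile_b):
        # A comparison involving "-" or "NA" does not add to the distance.
        if a in MISSING or b in MISSING:
            continue
        if hash_to_u64(a) != hash_to_u64(b):
            diff += 1
    return diff


if __name__ == "__main__":
    sample_1 = ["a3f1", "-", "ff"]
    sample_2 = ["a3f1", "0b", "fe"]
    # Locus 1 matches, locus 2 is skipped, locus 3 differs.
    print(distance(sample_1, sample_2))
```

Reducing a longer hash to 64 bits makes collisions between distinct alleles theoretically possible, although this is unlikely at typical allele counts.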