Remove max_count/max_lineage 'voting' logic from usher_parsing #521

Merged
merged 1 commit into from
May 15, 2023

AngieHinrichs
Member

Finally getting around to something I've been meaning to do since #492: removing the logic that overrides usher's tie-breaker with the plurality of lineage placements when a sequence has multiple placements in different lineages. For example, usher might find 3 equally parsimonious placements (EPPs), one in BA.5 and two in BA.5.2. Initially I thought that would mean the sequence more likely belongs in BA.5.2. But with amplicon dropout problems increasing over time, sometimes it simply means that the sequence happens to have Ns in positions that allow it to be placed in different parts of BA.5.2, even if it doesn't necessarily have the BA.5.2-defining mutation. The more uncertain the placement is, the more speculative the "voting" is, and the better usher's tie-breaker seems to do (I think it favors the branch with more descendants, usually the more basal branch).
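To make the difference concrete, here is a minimal sketch of the two strategies applied to the BA.5 / BA.5.2 example above. It is not the code in usher_parsing: the function names are hypothetical, and it assumes the placements arrive as a plain list of lineage names with usher's own tie-breaker choice listed first.

```python
from collections import Counter

def assign_by_vote(placement_lineages):
    """Removed behavior (illustrative): pick the lineage with the most EPPs."""
    counts = Counter(placement_lineages)
    # most_common(1) returns [(lineage, count)] for the top-voted lineage
    max_lineage, max_count = counts.most_common(1)[0]
    return max_lineage

def assign_by_tiebreaker(placement_lineages):
    """New behavior (illustrative): keep usher's own choice.

    Assumption: the first entry is the placement usher's tie-breaker selected.
    """
    return placement_lineages[0]

# Three equally parsimonious placements: one in BA.5, two in BA.5.2
placements = ["BA.5", "BA.5.2", "BA.5.2"]

voted = assign_by_vote(placements)
kept = assign_by_tiebreaker(placements)
print(voted, kept)
```

With voting, the two BA.5.2 placements outnumber the single BA.5 placement, so the sequence is called BA.5.2 even though the extra placements may only reflect Ns. Without voting, the call stays with whatever placement usher itself preferred.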

I tested this on GISAID seqs with IDs in the range EPI_ISL_15340000-15349999 and it behaved as expected, leaving most assignments unchanged but no longer assigning the lineage with the most EPPs in several cases.
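A comparison like this can be sketched as a diff of two lineage reports, one from before the change and one from after. The snippet below is only an illustration, not the script used for the test: the helper name is hypothetical, the rows are made-up placeholders rather than real GISAID records, and it assumes each report is a CSV with `taxon` and `lineage` columns.

```python
import csv
import io

def changed_assignments(before_csv, after_csv):
    """Return {taxon: (old_lineage, new_lineage)} for taxa whose lineage changed."""
    before = {row["taxon"]: row["lineage"] for row in csv.DictReader(before_csv)}
    after = {row["taxon"]: row["lineage"] for row in csv.DictReader(after_csv)}
    return {
        taxon: (before[taxon], after[taxon])
        for taxon in before
        if taxon in after and before[taxon] != after[taxon]
    }

# Placeholder reports; real runs would open the two report files instead
before = io.StringIO("taxon,lineage\nseq1,BA.5.2\nseq2,BA.5.2\nseq3,XBB.1.5\n")
after = io.StringIO("taxon,lineage\nseq1,BA.5.2\nseq2,BA.5\nseq3,XBB.1.5\n")

diff = changed_assignments(before, after)
for taxon, (old, new) in diff.items():
    print(f"{taxon}: {old} -> {new}")
```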

@rmcolq feel free to review the changes or not, depending on time / interest. I will merge it in a couple of days if I don't hear otherwise.

After this is merged, may I tag a pre-release?

If the next pangolin-data release does not include the pangoLEARN *.joblib files then it will require pangolin v4.3, so I think it would be better to release pangolin v4.3 at least a day before the next pangolin-data release (which is still probably at least a week away). I don't anticipate any problems from using pangolin v4.3 with the current release of pangolin-data.

@AngieHinrichs AngieHinrichs merged commit 09e78b1 into cov-lineages:master May 15, 2023
2 checks passed