B.1.118 gets called as B.1 #49

Open
KatSteinke opened this issue Nov 21, 2023 · 5 comments

@KatSteinke

KatSteinke commented Nov 21, 2023

With pangolin-data 1.23.1, one of our positive controls, which has consistently been called as B.1.118, suddenly gets called as B.1.
We're running pangolin 4.3 in UShER placement mode; the relevant versions are:

  - constellations==0.1.12
  - pangolin==4.3
  - pangolin-data==1.23.1
  - scorpio==0.3.17
  - tabulate<0.9.0
  - usher==0.6.3

Given it's a positive control I should be able to share the sequence if needed, but it looks like this might be a general issue with B.1.118 sequences - UCSC UShER gives the same results for a bunch of B.1.118 genomes from GISAID, while COG-UK (still on 1.22) gives B.1.118 - kudos to Ammar Aziz over on the µbioinfo slack for digging into it.

@rmcolq
Contributor

rmcolq commented Nov 21, 2023

There was a comment on the release of 1.23:
*** NOTE: the v1.23 tree provokes a corner-case bug in usher-sampled prior to version 0.6.3 that causes some lineage A samples to be assigned to A.* sublineages or even B or B.* sublineages. If you will be running pangolin on early 2020 sequences that may be lineage A, then it is highly recommended to use the assignment cache (install by running pangolin --add-assignment-cache, run pangolin on input sequences with --use-assignment-cache) and to update the usher package in your pangolin environment to 0.6.3 as soon as it is released.
Are you using the assignment cache mode?

@KatSteinke
Author

KatSteinke commented Nov 21, 2023

We’re not - we don’t have any A lineages among our control samples, and I’d understood the instructions in the notes as a workaround until UShER 0.6.3 was available, so I assumed it wasn’t relevant now that that version is out. I’ll try it with assignment cache mode as soon as I can and see how it looks.

@rmcolq
Contributor

rmcolq commented Nov 21, 2023

Perhaps @AngieHinrichs can clarify whether this is still the problem?

@KatSteinke
Author

KatSteinke commented Nov 21, 2023

The issue seems to persist after running --add-assignment-cache and then running with --use-assignment-cache.

 pangolin /path/to/positive_control.consensus.fasta --outfile /path/to/pangolin-assignment.csv --threads 6 --analysis-mode usher --use-assignment-cache

in a fresh conda env with the specs given above results in the following output:

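| taxon | lineage | conflict | ambiguity_score | scorpio_call | scorpio_support | scorpio_conflict | scorpio_notes | version | pangolin_version | scorpio_version | constellation_version | is_designated | qc_status | qc_notes | note |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| positive_control | B.1 | 0.0 |  |  |  |  |  | PUSHER-v1.23.1 | 4.3 | 0.3.17 | v0.1.12 | False | pass | Ambiguous_content:0.02 | Usher placements: B.1(1/1) |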
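
A check like the one described here (a positive control that should always get the same lineage) can be automated against the pangolin report. The following is only a minimal sketch: it assumes the `--outfile` report is a comma-separated file with the `taxon` and `lineage` columns shown above, and the function name `find_lineage_mismatches` and the inline example data are hypothetical, not part of pangolin. The position of the blank fields in the example row is also an assumption.

```python
import csv
import io


def find_lineage_mismatches(report, expected):
    """Compare called lineages against expected ones for control samples.

    report   -- an open text file (or file-like object) with a pangolin CSV report
    expected -- dict mapping taxon name to the lineage it should be called as
    Returns a dict {taxon: (expected_lineage, observed_lineage)} for mismatches.
    """
    mismatches = {}
    for row in csv.DictReader(report):
        taxon = row["taxon"]
        if taxon in expected and row["lineage"] != expected[taxon]:
            mismatches[taxon] = (expected[taxon], row["lineage"])
    return mismatches


# Example data mirroring the report above (assumed CSV layout).
example_report = io.StringIO(
    "taxon,lineage,conflict,ambiguity_score,scorpio_call,scorpio_support,"
    "scorpio_conflict,scorpio_notes,version,pangolin_version,scorpio_version,"
    "constellation_version,is_designated,qc_status,qc_notes,note\n"
    "positive_control,B.1,0.0,,,,,,PUSHER-v1.23.1,4.3,0.3.17,v0.1.12,"
    "False,pass,Ambiguous_content:0.02,Usher placements: B.1(1/1)\n"
)

print(find_lineage_mismatches(example_report, {"positive_control": "B.1.118"}))
# prints {'positive_control': ('B.1.118', 'B.1')}
```

Running something like this after every pangolin-data update would flag a changed call for a control before it reaches routine reporting.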

@AngieHinrichs
Member

Thanks for reporting this. I will fix it in the next release.

Due to a recent reshuffling of the order in which mutations are annotated on successive branches, B.1.118 is now annotated on a small branch within the larger branch where it should be annotated. That small branch carries two extra mutations, one of which is absent from most samples.

In previous versions, B.1.118 was also annotated on a branch that had the two extra mutations, but the extra mutations were placed on a larger branch, and one of them was then reverted to reference on a sub-branch that covered most of the samples. The effect was that B.1.118 samples without the extra mutation(s) were placed on the branch where B.1.118 was annotated (with a reversion of the mutation that shouldn't have been in the path in the first place). Now, with an arguably better structure / order of mutations, B.1.118 is annotated only on a sub-branch, so samples without the extra mutations are no longer placed under the annotation, and I need to fix the annotation.
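
To make the explanation above concrete, here is a toy model of how the same sample can get a different call when only the tree structure and the annotated branch change. It is a sketch under strong assumptions and not UShER's placement algorithm: the mutation names (`c1`, `c2`, `e1`, `e2`), the tree shapes (including the sibling branch that holds most samples in the new tree), and the "closest mutation set" placement rule are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    parent: "Node | None" = None
    gained: frozenset = frozenset()   # mutations added on this branch
    lost: frozenset = frozenset()     # mutations reverted to reference here
    lineage: "str | None" = None      # lineage annotated on this branch, if any

    def path_mutations(self):
        """Mutations accumulated from the root down to this node."""
        inherited = self.parent.path_mutations() if self.parent else frozenset()
        return (inherited | self.gained) - self.lost

    def nearest_lineage(self):
        """Lineage of the closest annotated node at or above this one."""
        node = self
        while node is not None:
            if node.lineage:
                return node.lineage
            node = node.parent
        return None


def call_lineage(nodes, sample):
    """Place the sample on the node with the most similar mutation set."""
    best = min(nodes, key=lambda n: len(n.path_mutations() ^ sample))
    return best.nearest_lineage()


core, e1, e2 = frozenset({"c1", "c2"}), frozenset({"e1"}), frozenset({"e2"})
sample = core | e2  # typical sample: has e2, lacks e1

# Old structure: extras sit on the annotated branch, e1 is reverted below it.
root = Node("root", lineage="B.1")
big = Node("big", root, gained=core | e1 | e2, lineage="B.1.118")
most = Node("most", big, lost=e1)
old_tree = [root, big, most]

# New structure: the annotation sits on a small sub-branch with both extras.
root2 = Node("root", lineage="B.1")
big2 = Node("big", root2, gained=core)
small = Node("small", big2, gained=e1 | e2, lineage="B.1.118")
most2 = Node("most", big2, gained=e2)
new_tree = [root2, big2, small, most2]

print(call_lineage(old_tree, sample))  # prints B.1.118
print(call_lineage(new_tree, sample))  # prints B.1

big2.lineage = "B.1.118"               # annotate the larger branch instead
print(call_lineage(new_tree, sample))  # prints B.1.118
```

In this toy setup the sample is placed sensibly in both trees; only the position of the B.1.118 annotation decides whether the nearest annotated ancestor is B.1.118 or B.1.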
