
Chronic BA.2.12.1 in Chile led to 7 subsequent infections, in multiple forms. [9 seqs] #1625

Closed
Sinickle opened this issue Feb 7, 2023 · 16 comments
Labels
BA.2; monitor (currently too small, watch for future developments); Saltation (appears on long branch length with no intermediates)

Comments

@Sinickle

Sinickle commented Feb 7, 2023

I don't think this can be designated at the moment, but I believe it may be spreading, and wanted to raise some awareness on it.

There are two older sequences (EPI_ISL_15792351 Oct 26, EPI_ISL_16653618 Jan 08) which appear to be from the same patient, and show a standard progression for a chronic sequence.
There are two sequences from two different patients (EPI_ISL_16834974, EPI_ISL_16833893), neither of which are the chronic patient, both sampled on Jan 26.

I've noted the chronic patient as "1" and the other two as "2" and "3".
-click for USHER link-
image

I believe this chronic patient's infection must be parental to the other two patients' genomes. However, exactly how closely they are all related to one another isn't clear. This is made worse by the many non-accounting (ambiguous) nucleotides, especially in the chronic patient's sequences. I suspect this is because those samples contain at least two distinct genome types, which would explain the large difference between the two infected patients. I've made a grid of the mutations relative to BA.2.12.1 in these 4 sequences, here.

For the following section, I treat a non-accounting nucleotide as potentially any base, so that position can't be considered unique to any sequence.

All 4 sequences have the mutations:
C11767T, C13197T, C20233T, G21604A, T22054G, G22770A, C23673T, T24869C.

EPI_ISL_16834974 (red 2 on the tree) seems to have the following unique mutations:
T13114C, G15451T, C19029T, A19600C, C21762T, G21901T, C23520T, C27002T, T28297C

EPI_ISL_16833893 (red 3 on the tree) seems to have the following unique mutations:
G22899A, A25470C

I think that patient 3 is well described by the chronic patient's second sequence, but patient 2 is not as well described by either of the chronic patient's sequences. Also, despite USHER showing pat.2 as not being a descendant of the second chronic sequence, they have two unique similarities.
The chronic patient's second sequence shares the following similarities with patient 2:

  • C16293T is unaccounted in the chronic sequence, is in pat.2, and not present in the other 2 sequences.
  • G22899T is unaccounted in the chronic sequence, is in pat.2, and not present in the other 2 sequences.
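The uniqueness rule used above (a non-accounting position could be any base, so it blocks a mutation from counting as unique) can be sketched in a few lines of Python. This is only an illustration: the function names, the dictionary layout, and the sequence labels are assumptions, and only a handful of the mutations quoted in this comment are included.

```python
def position(mutation):
    """Extract the genome coordinate from a label such as 'C16293T'."""
    return int(mutation[1:-1])


def shared_and_unique(calls, ambiguous):
    """Return (shared, unique) mutations across a set of sequences.

    calls:     sequence name -> set of called mutations
    ambiguous: sequence name -> set of positions with no usable base call
    A mutation is unique only if exactly one sequence carries it and no
    other sequence is ambiguous at that position.
    """
    names = set(calls)
    shared = set()
    unique = {name: set() for name in names}
    for mut in set().union(*calls.values()):
        carriers = {n for n in names if mut in calls[n]}
        maybe = {n for n in names if position(mut) in ambiguous.get(n, set())}
        if carriers == names:
            shared.add(mut)
        elif len(carriers) == 1 and not (maybe - carriers):
            unique[next(iter(carriers))].add(mut)
    return shared, unique


# Reduced, illustrative subset of the lists quoted above (not the full data).
calls = {
    "chronic_2": {"C11767T", "T22054G"},
    "patient_2": {"C11767T", "T22054G", "T13114C", "C16293T"},
    "patient_3": {"C11767T", "T22054G", "A25470C"},
}
ambiguous = {"chronic_2": {16293}}  # 16293 is unaccounted in chronic 2

shared, unique = shared_and_unique(calls, ambiguous)
print(sorted(shared))
print({name: sorted(muts) for name, muts in unique.items()})
```

With this rule, C16293T is not reported as unique to patient 2, because the second chronic sequence is unaccounted at that position.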

Additionally, I think patients 2 and 3 being sampled on the same day seems too coincidental for them not to be related...

All 3 patients are in the same region, but separate cities.

I'll highlight the spike mutations...
Chronic 1: T95?, S112?, W152?, N164K, D339?, K417?, R403K, G446S, L452?, T478?, E484?, F486?, Q493Q, L1103F
Chronic 2: T95?, S112P, W152?, P139S, N164K, D339E, R403K, G446?, L452K, N460?, T478R, E484?, F486?, Q493Q, V503?, P621?, D1084?, F1103L
Patient 2: T95I, K113N, W152C, N164K, R403K, G446I, A484V, F486P, R493Q, A653V, F1103L
Patient 3: S112P, P139S, W152C, N164K, D339E, R403K, K417?, N440?, G446D, L452K, N460K, K478R, F486L, R493Q, P621S, D1084E, F1103L

Notably, the non-accounting spike mutations in patient 3 are ones that are fairly frequently mutated as well (K417T, N440R). These two mutations also came up in #1624 , along with a mutation at D339.
F486P is notable in patient 2 of course. G446I is quite odd as a 2 nt mutation which seems to be very detrimental according to JBloom's numbers.

There's a lot going on with these sequences, so feel free to comment your own analysis or point out if I missed anything!

Also, thanks to @ryhisner who found a few of the sequences, talked through it with me and also created the USHER tree used here!

@oobb45729

Another long branch with S:P621S after this one https://cov-spectrum.org/explore/World/AllSamples/AllTimes/variants?variantQuery=S%3A621S%26S%3A642G%26S%3A681R& and this one https://cov-spectrum.org/explore/World/AllSamples/AllTimes/variants?variantQuery=S%3A621S%26S%3A222D& ?
What does it do?
It is rising even when the main lineage that carries it (XAY) is not counted.
https://cov-spectrum.org/explore/World/AllSamples/AllTimes/variants?variantQuery=%21nextcladePangoLineage%3AXAY*%26S%3AP621S&
P621S

@thomasppeacock added the BA.2 and monitor (currently too small, watch for future developments) labels Feb 7, 2023
@corneliusroemer added the Saltation (appears on long branch length with no intermediates) label Feb 7, 2023
@corneliusroemer
Contributor

I was just investigating this one as I had also come across it. Good to know that we now catch these things independently pretty quickly - it means we're unlikely to miss things.

@ryhisner

ryhisner commented Feb 16, 2023

There's a new sequence in this lineage uploaded today, and it's from a different individual than the previous sequences. EPI_ISL_16942017

Once again, it has quite a few different mutations than the others. All the sequences have come from the Chilean region of Coquimbo, but they have come from three different cities: Coquimbo (1 seq), Vicuna (2 seq, 1 patient), and La Serena (2 seq, 2 patients).
https://nextstrain.org/fetch/raw.githubusercontent.com/ryhisner/jsons/main/Chile_BA.2.12.1_2023-2-16_subtreeAuspice1_genome_2335e_e6ddd0.json
image

@FedeGueli
Contributor

Bad news: it seems the first second-generation BA.2.12.1-based variant has arisen. @ryhisner, how many samples are there from that area in 2023? Just to estimate the prevalence there.

@ryhisner

Oh, wow, only 45 sequences from Coquimbo in 2023 so far, so this is a substantial percentage.

@Sinickle
Author

Sinickle commented Feb 17, 2023

Okay, I've updated the chart to have a new page which compares/contrasts the 4 previous sequences with this new one.
From what I can see, it really does appear to be a second descendant of the chronic patient's first sequence, and a sibling of patient 2's sequence.

There is only one reversion compared to Chronic 1, at 21987. All other spots are at worst ambiguous.
Only other stand-out peculiarity is that at 16293, patient 2 shares the mutation, but it is absent in chronic 1, and ambiguous in chronic 2.

I think, though, that chronic 2 is a slightly better match than chronic 1 up to 16293. Both other patients' sequences are a poor fit.

Here are the differences.
I'll abbreviate to chronic 1 (C1), chronic 2 (C2), new sequence (NS)
13458 - ambiguous in C1, not mutated in C2, not mutated in NS.
16293 - not mutated in C1, ambiguous in C2, mutated in NS
17528 - not mutated in C1, ambiguous in C2, not mutated in NS

Then at 21986, C1 is ambiguous, C2 is mutated, and NS is not.

I don't think it's strong enough to make a conclusion that there was a recombination event though.
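The site-by-site comparison above can be written down as a small compatibility check. This is a rough sketch, not a recombination test: the state labels and the `conflicts` helper are assumptions made for illustration, ambiguous calls are treated as compatible with anything, and only the four positions listed above are encoded.

```python
REF, MUT, AMB = "not mutated", "mutated", "ambiguous"

# States at the four positions listed above (C1, C2 = chronic 1/2, NS = new sequence).
sites = {
    13458: {"C1": AMB, "C2": REF, "NS": REF},
    16293: {"C1": REF, "C2": AMB, "NS": MUT},
    17528: {"C1": REF, "C2": AMB, "NS": REF},
    21986: {"C1": AMB, "C2": MUT, "NS": REF},
}


def conflicts(parent, child, sites):
    """Positions where both calls are unambiguous and disagree."""
    return sorted(
        pos
        for pos, state in sites.items()
        if AMB not in (state[parent], state[child])
        and state[parent] != state[child]
    )


for parent in ("C1", "C2"):
    print(parent, "vs NS conflicts at:", conflicts(parent, "NS", sites))
```

On these four positions each chronic sequence conflicts with the new sequence exactly once, C1 at 16293 and C2 at 21986, which is the pattern described above: C2 fits better in the first part of the genome and C1 at 21986.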

Chronic 1: T95?, S112?, W152?, N164K, D339?, K417?, R403K, G446S, L452?, T478?, E484?, F486?, Q493Q, L1103F
Chronic 2: T95?, S112P, W152?, P139S, N164K, D339E, R403K, G446?, L452K, N460?, T478R, E484?, F486?, Q493Q, V503?, P621?, D1084?, F1103L
Patient 2: T95I, K113N, W152C, N164K, R403K, G446I, A484V, F486P, R493Q, A653V, F1103L
Patient 3: S112P, P139S, W152C, N164K, D339E, R403K, K417?, N440?, G446D, L452K, N460K, K478R, F486L, R493Q, P621S, D1084E, F1103L
Patient 4: T95I, S151I, W152C, N164K, D339E, R403K, G446S, L452K, K478R, A484V, F486P, R493Q, G496S, H505Y, A575S, D936H, F1103L

H505Y is a strange one. This has been very well conserved since Omicron emerged.

@Sinickle changed the title from "Chronic BA.2.12.1 in Chile led to 2 subsequent infections, in 2 forms. [4 seqs]" to "Chronic BA.2.12.1 in Chile led to 3 subsequent infections, in 3 forms. [5 seqs]" on Feb 17, 2023
@Sinickle
Author

Sinickle commented Feb 22, 2023

Today two more were uploaded, and they're also strange...
They're both from La Serena, like patient 2 (P2) and 4 (P4).
I've updated the chart again here: https://docs.google.com/spreadsheets/d/19y7g9XSHTn1JhQcggJvQowFp45IbKjkgznT3pXECa9g/edit#gid=228803138

Honestly, no straightforward picture emerges from the differences between them all. I think the end of the genome looks pretty similar between P6 and P5, but most of the genome before that definitely does not. And even at the end, they share some similarities with the other sequences that they don't share with each other.

I'd love to provide more explanation but I just can't make sense of what it could be. These new sequences have a lot more similarities to P2 and P4 than the tree suggests.
tree
image
For example, 16293 is mutated in P2, P4, P5, P6, but not P3. Same for 21846, 22898, and 24869.

Chronic 1: T95?, S112?, W152?, N164K, D339?, K417?, R403K, G446S, L452?, T478?, E484?, F486?, Q493Q, L1103F
Chronic 2: T95?, S112P, W152?, P139S, N164K, D339E, R403K, G446?, L452K, N460?, T478R, E484?, F486?, Q493Q, V503?, P621?, D1084?, F1103L
Patient 2: T95I, K113N, W152C, N164K, R403K, G446I, A484V, F486P, R493Q, A653V, F1103L
Patient 3: S112P, P139S, W152C, N164K, D339E, R403K, K417?, N440?, G446D, L452K, N460K, K478R, F486L, R493Q, P621S, D1084E, F1103L
Patient 4: T95I, S151I, W152C, N164K, D339E, R403K, G446S, L452K, K478R, A484V, F486P, R493Q, G496S, H505Y, A575S, D936H, F1103L
Patient 5: F4L, T95I, W152C, N164K, D339E, R346T, R403K, G446S, L452K, K478R, F486L, R493Q, D936H, F1103L
Patient 6: T95I, S112P, P139S, W152C, N164K, D339E, R403K, G446S, L452K, N460K, K478R, F486L, R493Q, Y796H, F1103L

Since the first collection date, these variants make up 4/16 of uploaded sequences in La Serena, and 5/21 of those in Coquimbo as a whole.
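To put the 4/16 and 5/21 figures in perspective, the shares can be computed along with a rough uncertainty range. The sketch below uses a standard Wilson score interval; choosing that interval (and the 95% level) is an assumption added here for illustration, not part of the original tally, and it says nothing about sampling bias in which cases get sequenced.

```python
import math


def share(k, n):
    """Fraction of n uploaded sequences that belong to the lineage."""
    return k / n


def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p = k / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


for place, k, n in [("La Serena", 4, 16), ("Coquimbo region", 5, 21)]:
    low, high = wilson_interval(k, n)
    print(f"{place}: {k}/{n} = {share(k, n):.1%} "
          f"(95% Wilson interval {low:.1%} to {high:.1%})")
```

With so few sequences the intervals are wide, so the shares are best read as an order of magnitude rather than a precise prevalence.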

@Sinickle changed the title from "Chronic BA.2.12.1 in Chile led to 3 subsequent infections, in 3 forms. [5 seqs]" to "Chronic BA.2.12.1 in Chile led to 5 subsequent infections, in multiple forms. [7 seqs]" on Feb 22, 2023
@FedeGueli
Contributor

FedeGueli commented Feb 22, 2023

Thanks @Sinickle, to me this is another BA.4/BA.5 event.
There is a bulk of common defining mutations: S:N164K, S:G339E, S:R403K, S:F1103L, Orf3a:T208N, Orf1b:P2256L, Orf1a:T4311A, Orf1a:I2230T. Things deeper in the tree will become clearer in the coming days, but we have no time to wait: given the highly concerning profile, I urge designating it from the root:
Starting from BA.2.12.1 > S:N164K (T22054G) > C11767T > Orf1b:P2256L (C20233T), S:F1103L (T24869C)

Tree:
https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_27d90_5e8e40.json?c=gt-nuc_22018,22579&label=id:node_8173180

Gisaid query: Spike_F1103L,Spike_N164K

cc @corneliusroemer @thomasppeacock @InfrPopGen @AngieHinrichs a fast designation of the broader lineage will help a lot in tracking its further evolution.
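The nucleotide/amino-acid pairings quoted in this thread, such as S:N164K (T22054G), can be cross-checked by converting a genome coordinate to a spike codon number. A minimal sketch follows, assuming Wuhan-Hu-1 (NC_045512.2) coordinates with the S coding sequence spanning 21563 to 25384; the function name `spike_codon` is made up for this example.

```python
S_START = 21563  # first nucleotide of the spike (S) coding sequence
S_END = 25384    # last nucleotide of S, stop codon included


def spike_codon(nt_position):
    """Map a reference genome coordinate to (codon number, position in codon)."""
    if not S_START <= nt_position <= S_END:
        raise ValueError(f"{nt_position} is outside the S coding sequence")
    offset = nt_position - S_START
    return offset // 3 + 1, offset % 3 + 1


for label, pos in [("T22054G", 22054), ("G22770A", 22770), ("T24869C", 24869)]:
    codon, codon_pos = spike_codon(pos)
    print(f"{label}: spike codon {codon}, codon position {codon_pos}")
```

Under these assumptions T22054G falls in codon 164, G22770A in codon 403, and T24869C in codon 1103, which matches S:N164K, S:R403K, and the Spike_F1103L term in the GISAID query.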

@oobb45729

It seems that the S:15-23del is real.
Aside from B.1.427/B.1.429, S:W152C is not common. Up till now, over 99% of sequences that have S:W152C also have S:S13I.
The reason is laid down in this paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9835956/
In WT, the signal peptide cleavage is between S13 and Q14, and C15 and C136 form a disulfide bond.
However, after the S13I mutation, the signal peptide cleavage is between C15 and V16, so C136 no longer forms a disulfide bond with C15. W152C enables a new disulfide bond between C136 and C152.
This lineage features W152C, which indicates that C15 might be lost. That is what S:15-23del would do.

@ryhisner

@oobb45729, could Y248C restore the disulfide bond lost through C136F? Recently two XBB.1 sequences from Japan—EPI_ISL_16972732, EPI_ISL_16972733—had both C136F and Y248C, both rare mutations. There was also a May 2022 sequence from Romania—EPI_ISL_14433737—that had C136F and Y248C (along with eight other private spike mutations).

@oobb45729

It could be, as Y248 is not far from C15.

@Sinickle
Author

Sinickle commented Mar 3, 2023

Despite a large upload from Chile today (241 sequences in 2023) with 13 from Coquimbo, there were no BA.2.12.1 sequences.

@ryhisner

ryhisner commented Mar 22, 2023

There are now two additional sequences in this lineage. The metadata indicate that both are new patients who have not been sequenced before.

One of the new sequences (EPI_ISL_17270470) features 6-7 AA mutations that are genuinely new to the tree: ORF1a:V665A, ORF1a:L1459F, ORF1a:T3506A, S:G339H, S:N481del, S:G482S, and S:D1259N. S:339 is all over the place in this branch. Maybe it's all due to low-quality sequencing, but if you take the sequences seriously, we've seen G339, D339, E339, and now H339. It also features S:A484V and ORF1b:G662C, both of which have only been found in one other sequence here.

The other sequence (EPI_ISL_17270618) has several mutations new to the lineage as well—ORF1a:V38A, ORF1a:Q1519H, S:N450D, S:N481D, S:D867G. This sequence also has ∆243-244 and ORF9b:I5T, which have only been found in one other sequence in this branch.

What's strange is that ∆243-244 and ORF9b:I5T (in EPI_ISL_17270618) and S:A484V and ORF1b:G662C (in EPI_ISL_17270470)—the mutations mentioned above as having only been found in one other sequence—were all in the same other sequence: EPI_ISL_16834974, which is the topmost sequence in the tree pictured below. It seems clear there must be a close connection between these three sequences, yet they appear quite distant from one another in the tree.

The tree for this one continues to get crazier. Much of the apparent diversity is actually due to poor coverage and artifactual reversions, but beneath all that I think there are still substantial differences. The red arrows below point to the two new sequences. EPI_ISL_17270470, EPI_ISL_17270618
https://nextstrain.org/fetch/raw.githubusercontent.com/ryhisner/jsons/main/BA.2.12.1_Chile_2023_3_22_subtreeAuspice1_genome_e61_b48b80.json
image

@Sinickle changed the title from "Chronic BA.2.12.1 in Chile led to 5 subsequent infections, in multiple forms. [7 seqs]" to "Chronic BA.2.12.1 in Chile led to 7 subsequent infections, in multiple forms. [7 seqs]" on Mar 22, 2023
@Sinickle changed the title from "Chronic BA.2.12.1 in Chile led to 7 subsequent infections, in multiple forms. [7 seqs]" to "Chronic BA.2.12.1 in Chile led to 7 subsequent infections, in multiple forms. [9 seqs]" on Mar 23, 2023
@silcn

silcn commented Mar 23, 2023

The crazy diversity reminds me of C.1.2, where the originating patient appears to have transmitted 8 distinct lineages, some of which are related by recombination. Somewhat surprised no-one did a detailed analysis when it was still circulating.

@Sinickle
Author

I've updated the chart again.

I agree with Ryan's points, but will also add that there are oddities that separate them from EPI_ISL_16834974, AKA Patient 2, AKA the topmost sequence.

T15009C is not present in that sequence, but is present in both of the new sequences, as well as in Patient 4 (EPI_ISL_16942017) and Patient 6 (EPI_ISL_16989160).

C23013T and T23019C appear in the new sequence EPI_ISL_17270470, and otherwise just in Patient 2 and Patient 4. These spots are ambiguous in the chronic patient's sequences.

T22942G is missing in Patient 2, but present in Patient 3 and Patient 6, and ambiguous in one of the chronic patient's sequences. It is also present in the new sequence EPI_ISL_17270618.

It does seem like these sequences could be related by recombination, but not in a straightforward manner.

Chronic 1: T95?, S112?, W152?, N164K, D339?, K417?, R403K, G446S, L452?, T478?, E484?, F486?, Q493Q, L1103F
Chronic 2: T95?, S112P, W152?, P139S, N164K, D339E, R403K, G446?, L452K, N460?, T478R, E484?, F486?, Q493Q, V503?, P621?, D1084?, F1103L
Patient 2: T95I, K113N, W152C, N164K, R403K, G446I, A484V, F486P, R493Q, A653V, F1103L
Patient 3: S112P, P139S, W152C, N164K, D339E, R403K, K417?, N440?, G446D, L452K, N460K, K478R, F486L, R493Q, P621S, D1084E, F1103L
Patient 4: T95I, S151I, W152C, N164K, D339E, R403K, G446S, L452K, K478R, A484V, F486P, R493Q, G496S, H505Y, A575S, D936H, F1103L
Patient 5: F4L, T95I, W152C, N164K, D339E, R346T, R403K, G446S, L452K, K478R, F486L, R493Q, D936H, F1103L
Patient 6: T95I, S112P, P139S, W152C, N164K, D339E, R403K, G446S, L452K, N460K, K478R, F486L, R493Q, Y796H, F1103L
Patient 7: T95I, W152C, D339H, R403K, K440N, G446S, (481del, G482S), A484V, F486P, R493Q, F1103L, D1259N
Patient 8: T95I, W152C, N164K, D339G, R403K, G446S, N450D, L452K, N460K, N481D, F486L, R493Q, D867G, F1103L
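To see which patients carry which spike mutation without scanning the lists by eye, the listings for Patients 2 to 8 can be turned into a lookup from mutation to carriers. This is a sketch only: the short labels P2 to P8, the parsing rules (entries ending in "?" are skipped, parentheses are stripped), and the variable names are assumptions, and the chronic sequences are left out because most of their entries are ambiguous.

```python
from collections import defaultdict

# Spike listings copied from the comment above.
SPIKE = {
    "P2": "T95I, K113N, W152C, N164K, R403K, G446I, A484V, F486P, R493Q, A653V, F1103L",
    "P3": "S112P, P139S, W152C, N164K, D339E, R403K, K417?, N440?, G446D, L452K, "
          "N460K, K478R, F486L, R493Q, P621S, D1084E, F1103L",
    "P4": "T95I, S151I, W152C, N164K, D339E, R403K, G446S, L452K, K478R, A484V, "
          "F486P, R493Q, G496S, H505Y, A575S, D936H, F1103L",
    "P5": "F4L, T95I, W152C, N164K, D339E, R346T, R403K, G446S, L452K, K478R, "
          "F486L, R493Q, D936H, F1103L",
    "P6": "T95I, S112P, P139S, W152C, N164K, D339E, R403K, G446S, L452K, N460K, "
          "K478R, F486L, R493Q, Y796H, F1103L",
    "P7": "T95I, W152C, D339H, R403K, K440N, G446S, (481del, G482S), A484V, "
          "F486P, R493Q, F1103L, D1259N",
    "P8": "T95I, W152C, N164K, D339G, R403K, G446S, N450D, L452K, N460K, N481D, "
          "F486L, R493Q, D867G, F1103L",
}


def parse(listing):
    """Split a comma-separated listing, dropping ambiguous '?' entries."""
    tokens = (token.strip().strip("()") for token in listing.split(","))
    return {token for token in tokens if token and not token.endswith("?")}


profiles = {name: parse(text) for name, text in SPIKE.items()}

carriers = defaultdict(set)  # mutation -> set of patients carrying it
for name, mutations in profiles.items():
    for mutation in mutations:
        carriers[mutation].add(name)

in_all = sorted(m for m, who in carriers.items() if len(who) == len(profiles))
print("In every patient:", in_all)
print("F486P carriers:", sorted(carriers["F486P"]))
print("N460K carriers:", sorted(carriers["N460K"]))
```

Going by the listings as written, only W152C, R403K, R493Q, and F1103L appear in all seven patients, while F486P (P2, P4, P7) and N460K (P3, P6, P8) split the patients into different groups.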

@InfrPopGen
Contributor

The discussion above is really interesting, but as this does not appear to have spread widely, persisted, or spawned other lineages of importance so far, this can be closed for now. Of course, please request re-opening if the situation changes.


8 participants