JN.1.39 + T111C [5' UTR] (469 seq, Apr 6) #2554

Open
ryhisner opened this issue Apr 6, 2024 · 13 comments

@ryhisner

ryhisner commented Apr 6, 2024

Description
Sub-lineage of: JN.1.39 (JN.1 + G2782T)
Earliest sequence: 2023-11-27 – USA, Ohio — EPI_ISL_18609351; England — EPI_ISL_18598897
Most recent sequence: 2024-3-25 – China, Fujian — EPI_ISL_19025051
Continents circulating: North America (242), Asia (108), Europe (85), Oceania (16), Africa (15), South America (3)
Top Countries circulating:
North America (2 countries)—USA (226), Canada (16)
Asia (12 countries)—Indonesia (30), China (15), Oman (14), South Korea (13), Singapore (12), Japan (11)
Europe (14 countries)—UK (33), Sweden (12), France (10)
Africa (2 countries)—Nigeria (13), South Africa (2)
South America (1 country)—Brazil (3)
Oceania (2 countries)—Australia (14), New Zealand (2)
Number of Sequences: 469
GISAID Nucleotide Query: T111C, G2782T, -A12T, -C5512T, -C21762T
CovSpectrum Query: Nextcladepangolineage:JN.1* & [2-of: T111C, G2782T] & [exactly-0-of: C5512T, C21762T]
Substitutions on top of JN.1.39:
5' UTR: T111C
Nucleotide: T111C
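
The CovSpectrum query above boils down to a set filter: keep sequences that carry both T111C and G2782T and neither C5512T nor C21762T. Below is a minimal sketch of that logic in Python, assuming each sequence is already restricted to JN.1* and represented as a set of nucleotide-substitution strings. The `matches_proposal` helper and the toy records are hypothetical and are not part of GISAID or CovSpectrum.

```python
# Mutations taken from the CovSpectrum query in the description.
REQUIRED = {"T111C", "G2782T"}      # [2-of: T111C, G2782T]
EXCLUDED = {"C5512T", "C21762T"}    # [exactly-0-of: C5512T, C21762T]


def matches_proposal(mutations):
    """Return True if every REQUIRED mutation and no EXCLUDED mutation is present."""
    muts = set(mutations)
    return REQUIRED <= muts and not (EXCLUDED & muts)


# Toy records (hypothetical, for illustration only).
toy = {
    "seq_a": {"T111C", "G2782T"},                       # fits the proposal
    "seq_b": {"T111C", "G2782T", "C5512T", "C21762T"},  # JN.1.33-like
    "seq_c": {"G2782T"},                                # JN.1.39 without T111C
}

hits = [name for name, muts in toy.items() if matches_proposal(muts)]
print(hits)
```

Only `seq_a` passes: `seq_b` is rejected by the exclusion list and `seq_c` lacks T111C.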

USHER Tree
https://nextstrain.org/fetch/raw.githubusercontent.com/ryhisner/jsons2/main/JN.1.39_T111C.json?c=gt-nuc_111&gmax=1111&label=id:node_6957831

image

Evidence
I find it intriguing that there are two large JN.1* lineages with the synonymous G2782T and the 5' UTR mutation T111C. The other lineage with both T111C and G2782T is JN.1.33, which also has the synonymous C5512T and S:A67V (C21762T). Both lineages appear to have a modest but consistent growth advantage (~10-15% weekly) over baseline JN.1* despite not having S:R346T, S:F456S, or S:T572I, the three major mutations that clearly confer growth advantages at the moment.

I suppose there are three possibilities here:

  1. The Usher tree is confused and these two lineages are actually related.
  2. The co-occurrence of these two mutations is coincidental and the modest growth advantages (10-15%) are not real but can be put down to unrepresentative sampling and founder effects.
  3. There's some sort of connection between these two nucleotide mutations, maybe in secondary RNA structure, that somehow confers a slight benefit for JN.1.
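
For readers unfamiliar with what a "weekly growth advantage" means, here is a rough sketch of one common definition: the per-week multiplicative change in the odds of a lineage's share relative to everything else. This is an assumption about the metric, not necessarily how the 10-15% figures in this issue were derived, and the shares in the example are made-up numbers.

```python
def weekly_growth_advantage(share_start, share_end, weeks):
    """Per-week relative change in the odds share / (1 - share).

    share_start, share_end: lineage proportion (0 < share < 1) at two time points.
    weeks: time between the two points, in weeks.
    """
    odds_start = share_start / (1 - share_start)
    odds_end = share_end / (1 - share_end)
    return (odds_end / odds_start) ** (1 / weeks) - 1


# Hypothetical example: a lineage going from 1% to 2% of sequences in 6 weeks.
adv = weekly_growth_advantage(0.01, 0.02, 6)
print(f"{adv:.1%} per week")
```

With small counts, a doubling like this can easily come from sampling noise or a founder effect, which is the concern raised in possibility 2.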

There has been no connection between these two mutations before. In fact, if you search GISAID for T111C and G2782T, the only sequences returned are from JN.1 and three Bat-CoV sequences collected in Yunnan, China, in 2020 (RmYN05, RmYN08, RsYN04).

T111C is on SL4 in the 5' UTR. T111 (U111 in the RNA) is paired with G101, forming a weak, non-Watson-Crick (wobble) base pair. T111C would create a much stronger C-G base pair, which could conceivably affect the stability of SL4 and perhaps have some unknown effect on viral fitness.
image
5' UTR Image above is from: https://pubmed.ncbi.nlm.nih.gov/33636127/
Sun L, Li P, Ju X, et al. In vivo structural characterization of the SARS-CoV-2 RNA genome identifies host proteins vulnerable to repurposed drugs. Cell. 2021;184(7):1865-1883.e20. doi:10.1016/j.cell.2021.02.008
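
To make the base-pairing argument concrete, here is a small sketch that classifies an RNA base pair as Watson-Crick or G-U wobble. The `pair_type` helper is hypothetical; it only labels the pair type and says nothing about the actual folding energy of SL4.

```python
WATSON_CRICK = {frozenset("AU"), frozenset("GC")}
WOBBLE = {frozenset("GU")}


def pair_type(base_a, base_b):
    """Classify a pair of bases; T is treated as U because the genome is RNA."""
    a, b = (x.upper().replace("T", "U") for x in (base_a, base_b))
    pair = frozenset((a, b))
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair in WOBBLE:
        return "wobble"
    return "mismatch"


print(pair_type("G", "T"))  # G101 with reference T111 -> wobble
print(pair_type("G", "C"))  # G101 with T111C          -> Watson-Crick
```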

I aligned the nucleotide sequences for the 19 Bat-CoV sequences I could find on GISAID, and six of them had T111C (RsYN03, RsYN04, RsYN05, RsYN07, RsYN08, RsYN09). SARS-CoV-1 and murine hepatitis virus (MHV) also have T111C.

Genomes

Genomes EPI_ISL_18598897, EPI_ISL_18609351, EPI_ISL_18640194, EPI_ISL_18641344, EPI_ISL_18668153, EPI_ISL_18673408, EPI_ISL_18673415, EPI_ISL_18673443, EPI_ISL_18673459, EPI_ISL_18673461, EPI_ISL_18673464, EPI_ISL_18673469, EPI_ISL_18673506, EPI_ISL_18673545, EPI_ISL_18673559, EPI_ISL_18685150, EPI_ISL_18690159, EPI_ISL_18690695, EPI_ISL_18691057, EPI_ISL_18697903, EPI_ISL_18699715, EPI_ISL_18700764, EPI_ISL_18703530, EPI_ISL_18705205, EPI_ISL_18705264, EPI_ISL_18711419, EPI_ISL_18717639, EPI_ISL_18721865, EPI_ISL_18721872, EPI_ISL_18725663, EPI_ISL_18730171, EPI_ISL_18732843-18732844, EPI_ISL_18733930, EPI_ISL_18754905, EPI_ISL_18763459, EPI_ISL_18763857, EPI_ISL_18763868, EPI_ISL_18765704, EPI_ISL_18770476, EPI_ISL_18770484, EPI_ISL_18770839, EPI_ISL_18775021, EPI_ISL_18775724, EPI_ISL_18778094, EPI_ISL_18779414, EPI_ISL_18779431, EPI_ISL_18779505, EPI_ISL_18779671, EPI_ISL_18779692, EPI_ISL_18781154, EPI_ISL_18781196, EPI_ISL_18782741, EPI_ISL_18782813, EPI_ISL_18782820, EPI_ISL_18785068, EPI_ISL_18785672, EPI_ISL_18785695, EPI_ISL_18792753, EPI_ISL_18794289, EPI_ISL_18796593, EPI_ISL_18796695, EPI_ISL_18801605, EPI_ISL_18806595, EPI_ISL_18806612, EPI_ISL_18806732, EPI_ISL_18808433, EPI_ISL_18809074, EPI_ISL_18809519, EPI_ISL_18810849, EPI_ISL_18810964, EPI_ISL_18814982, EPI_ISL_18815006, EPI_ISL_18815384, EPI_ISL_18815420, EPI_ISL_18815423-18815424, EPI_ISL_18816773, EPI_ISL_18818182, EPI_ISL_18818750, EPI_ISL_18821806, EPI_ISL_18823781, EPI_ISL_18824080, EPI_ISL_18825509, EPI_ISL_18825536, EPI_ISL_18826697, EPI_ISL_18826828, EPI_ISL_18827100, EPI_ISL_18827694, EPI_ISL_18827710, EPI_ISL_18827759, EPI_ISL_18827918, EPI_ISL_18828248, EPI_ISL_18831501, EPI_ISL_18831505, EPI_ISL_18831507-18831508, EPI_ISL_18831698, EPI_ISL_18831766, EPI_ISL_18832582, EPI_ISL_18835463, EPI_ISL_18835660, EPI_ISL_18835674, EPI_ISL_18836159, EPI_ISL_18838674, EPI_ISL_18838715, EPI_ISL_18839765, EPI_ISL_18842099, EPI_ISL_18846849, EPI_ISL_18852111, EPI_ISL_18852272, EPI_ISL_18853330, 
EPI_ISL_18853599, EPI_ISL_18856226, EPI_ISL_18858394, EPI_ISL_18858483, EPI_ISL_18859187, EPI_ISL_18859474, EPI_ISL_18859609, EPI_ISL_18860088, EPI_ISL_18860107, EPI_ISL_18860200, EPI_ISL_18861910, EPI_ISL_18863248, EPI_ISL_18863546, EPI_ISL_18864176, EPI_ISL_18864206, EPI_ISL_18868123, EPI_ISL_18869153, EPI_ISL_18870229, EPI_ISL_18870396, EPI_ISL_18871160, EPI_ISL_18872220, EPI_ISL_18872572, EPI_ISL_18872858, EPI_ISL_18877514, EPI_ISL_18878306, EPI_ISL_18878508, EPI_ISL_18879832, EPI_ISL_18879849, EPI_ISL_18879851-18879852, EPI_ISL_18879861, EPI_ISL_18879871, EPI_ISL_18879888, EPI_ISL_18879912, EPI_ISL_18880359, EPI_ISL_18880488, EPI_ISL_18880626, EPI_ISL_18881887, EPI_ISL_18882893-18882894, EPI_ISL_18884244, EPI_ISL_18885315, EPI_ISL_18885478, EPI_ISL_18886334, EPI_ISL_18886354, EPI_ISL_18886389, EPI_ISL_18886408, EPI_ISL_18886536, EPI_ISL_18887542, EPI_ISL_18892276, EPI_ISL_18892401, EPI_ISL_18893021, EPI_ISL_18895583, EPI_ISL_18895774, EPI_ISL_18900748, EPI_ISL_18901832, EPI_ISL_18902164, EPI_ISL_18902186, EPI_ISL_18903232, EPI_ISL_18903636, EPI_ISL_18903683, EPI_ISL_18907506, EPI_ISL_18907509, EPI_ISL_18907540, EPI_ISL_18907576, EPI_ISL_18907579, EPI_ISL_18907598, EPI_ISL_18907601, EPI_ISL_18907603, EPI_ISL_18907648, EPI_ISL_18907691, EPI_ISL_18907697, EPI_ISL_18907703, EPI_ISL_18907865, EPI_ISL_18907873, EPI_ISL_18907905, EPI_ISL_18907907, EPI_ISL_18907938, EPI_ISL_18907962, EPI_ISL_18907976, EPI_ISL_18908569, EPI_ISL_18909386, EPI_ISL_18909536, EPI_ISL_18910549, EPI_ISL_18912663, EPI_ISL_18912735, EPI_ISL_18913053, EPI_ISL_18913545, EPI_ISL_18914646, EPI_ISL_18915614, EPI_ISL_18915880, EPI_ISL_18916642-18916644, EPI_ISL_18916647, EPI_ISL_18916667, EPI_ISL_18916680-18916681, EPI_ISL_18917371, EPI_ISL_18918222, EPI_ISL_18918366, EPI_ISL_18918393, EPI_ISL_18919387, EPI_ISL_18919506, EPI_ISL_18920092-18920093, EPI_ISL_18920293, EPI_ISL_18921100, EPI_ISL_18921168, EPI_ISL_18921559, EPI_ISL_18921838, EPI_ISL_18921913, EPI_ISL_18921948, EPI_ISL_18922394, 
EPI_ISL_18923350, EPI_ISL_18923635, EPI_ISL_18924093, EPI_ISL_18927287, EPI_ISL_18927477, EPI_ISL_18927559, EPI_ISL_18927730, EPI_ISL_18928567, EPI_ISL_18928774, EPI_ISL_18928949, EPI_ISL_18930545, EPI_ISL_18930557-18930560, EPI_ISL_18930583, EPI_ISL_18930672, EPI_ISL_18931373, EPI_ISL_18931446, EPI_ISL_18931727, EPI_ISL_18931729, EPI_ISL_18931735, EPI_ISL_18931746, EPI_ISL_18931758, EPI_ISL_18932343, EPI_ISL_18932528, EPI_ISL_18932657, EPI_ISL_18935614, EPI_ISL_18936054, EPI_ISL_18937006, EPI_ISL_18937168, EPI_ISL_18937400, EPI_ISL_18937586, EPI_ISL_18939942, EPI_ISL_18940294, EPI_ISL_18940629, EPI_ISL_18942615, EPI_ISL_18942668, EPI_ISL_18942671, EPI_ISL_18942740, EPI_ISL_18944081, EPI_ISL_18944109, EPI_ISL_18944149, EPI_ISL_18946273-18946274, EPI_ISL_18946314, EPI_ISL_18946318, EPI_ISL_18946369, EPI_ISL_18946481, EPI_ISL_18946514, EPI_ISL_18948387, EPI_ISL_18948504, EPI_ISL_18949261, EPI_ISL_18949274, EPI_ISL_18950316, EPI_ISL_18950946, EPI_ISL_18952122, EPI_ISL_18954443, EPI_ISL_18954544, EPI_ISL_18954611, EPI_ISL_18954706, EPI_ISL_18955768, EPI_ISL_18956004, EPI_ISL_18956352, EPI_ISL_18956462, EPI_ISL_18956489, EPI_ISL_18956700, EPI_ISL_18956772, EPI_ISL_18957080, EPI_ISL_18957148, EPI_ISL_18957455, EPI_ISL_18958990-18958992, EPI_ISL_18959055, EPI_ISL_18959126, EPI_ISL_18959174, EPI_ISL_18959998, EPI_ISL_18960035, EPI_ISL_18960810, EPI_ISL_18960942, EPI_ISL_18961010, EPI_ISL_18961057, EPI_ISL_18961185, EPI_ISL_18961263, EPI_ISL_18963777, EPI_ISL_18963787, EPI_ISL_18964019, EPI_ISL_18964231, EPI_ISL_18964487, EPI_ISL_18964750, EPI_ISL_18964760, EPI_ISL_18964769, EPI_ISL_18964846, EPI_ISL_18965284, EPI_ISL_18965945, EPI_ISL_18966519, EPI_ISL_18966932, EPI_ISL_18966938, EPI_ISL_18967258, EPI_ISL_18967290, EPI_ISL_18967321, EPI_ISL_18967398, EPI_ISL_18967679, EPI_ISL_18967681, EPI_ISL_18967728, EPI_ISL_18967731, EPI_ISL_18967787, EPI_ISL_18967793, EPI_ISL_18967852, EPI_ISL_18968327, EPI_ISL_18968571, EPI_ISL_18968604, EPI_ISL_18968876, EPI_ISL_18968958, 
EPI_ISL_18970561, EPI_ISL_18970785, EPI_ISL_18971333, EPI_ISL_18972168, EPI_ISL_18972438, EPI_ISL_18972693, EPI_ISL_18972698, EPI_ISL_18972701-18972703, EPI_ISL_18972716, EPI_ISL_18973651, EPI_ISL_18973777, EPI_ISL_18974049, EPI_ISL_18974222, EPI_ISL_18974271, EPI_ISL_18974653, EPI_ISL_18974655, EPI_ISL_18974659, EPI_ISL_18975322, EPI_ISL_18976602, EPI_ISL_18976604, EPI_ISL_18977970, EPI_ISL_18978012, EPI_ISL_18979360, EPI_ISL_18979459, EPI_ISL_18979639, EPI_ISL_18981821, EPI_ISL_18981983, EPI_ISL_18981985, EPI_ISL_18982345, EPI_ISL_18982454, EPI_ISL_18982516, EPI_ISL_18983131, EPI_ISL_18983555, EPI_ISL_18985129, EPI_ISL_18985147, EPI_ISL_18985166, EPI_ISL_18985279, EPI_ISL_18985286, EPI_ISL_18985293, EPI_ISL_18985307, EPI_ISL_18985313, EPI_ISL_18985335, EPI_ISL_18985394, EPI_ISL_18985442-18985443, EPI_ISL_18986082, EPI_ISL_18986092, EPI_ISL_18986584, EPI_ISL_18987172, EPI_ISL_18988431, EPI_ISL_18988433, EPI_ISL_18989584, EPI_ISL_18990059, EPI_ISL_18992465, EPI_ISL_18993922, EPI_ISL_18994082, EPI_ISL_18994511, EPI_ISL_18995378, EPI_ISL_18998035, EPI_ISL_18998070, EPI_ISL_18998873, EPI_ISL_18998896, EPI_ISL_18999139, EPI_ISL_18999993, EPI_ISL_19000198, EPI_ISL_19000432, EPI_ISL_19001281, EPI_ISL_19001583, EPI_ISL_19002546, EPI_ISL_19002641, EPI_ISL_19003690, EPI_ISL_19003704, EPI_ISL_19004887, EPI_ISL_19004889, EPI_ISL_19004912, EPI_ISL_19004926, EPI_ISL_19006232, EPI_ISL_19006721, EPI_ISL_19006744, EPI_ISL_19006812, EPI_ISL_19008045, EPI_ISL_19008247, EPI_ISL_19009679, EPI_ISL_19009965, EPI_ISL_19012307, EPI_ISL_19012429, EPI_ISL_19015361, EPI_ISL_19015396, EPI_ISL_19016001, EPI_ISL_19016651, EPI_ISL_19016667, EPI_ISL_19016791, EPI_ISL_19017357, EPI_ISL_19018044, EPI_ISL_19018183, EPI_ISL_19018267, EPI_ISL_19019132, EPI_ISL_19019180, EPI_ISL_19019350, EPI_ISL_19019549, EPI_ISL_19021182, EPI_ISL_19021185, EPI_ISL_19021187-19021188, EPI_ISL_19021207, EPI_ISL_19021212, EPI_ISL_19021969, EPI_ISL_19021984, EPI_ISL_19022508, EPI_ISL_19022534, EPI_ISL_19022873, 
EPI_ISL_19023113, EPI_ISL_19024326, EPI_ISL_19024331, EPI_ISL_19025051, EPI_ISL_19025164, EPI_ISL_19025953, EPI_ISL_19027995, EPI_ISL_19028278, EPI_ISL_19028540-19028541, EPI_ISL_19030177, EPI_ISL_19030430, EPI_ISL_19030566, EPI_ISL_19032769, EPI_ISL_19032771
@aviczhl2
Contributor

aviczhl2 commented Apr 6, 2024

The Usher tree is likely confused. Some JN.1.33 seqs have no coverage at position 111, so Usher categorizes them as not having T111C and therefore places T111C after S:A67V.

However, no JN.1.33 seq has 111T, suggesting that the seqs with missing coverage at 111 do have T111C and that JN.1.33 is actually a sub-branch of JN.1.39, or a recombinant involving JN.1.39 as its 5' parent.
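
The coverage argument can be checked with a simple tally of the base at one site across a set of reference-aligned sequences. This is a minimal sketch, assuming the sequences are already aligned to reference coordinates; the `tally_site` helper and the toy strings (which use position 3 rather than 111 to stay short) are hypothetical.

```python
from collections import Counter


def tally_site(sequences, position):
    """Count bases at a 1-based reference position.

    Anything other than A/C/G/T (N, gaps, or a sequence too short to reach
    the position) is counted as 'no coverage'.
    """
    counts = Counter()
    for seq in sequences:
        base = seq[position - 1].upper() if len(seq) >= position else "N"
        counts[base if base in "ACGT" else "no coverage"] += 1
    return counts


# Toy alignment: two sequences with C at the site, two without coverage.
toy = ["AAC", "AAN", "AAC", "AA"]
counts = tally_site(toy, 3)
print(dict(counts))
```

If a tally like this over all JN.1.33 sequences shows only C and "no coverage" at position 111, and never T, that is consistent with the uncovered sequences also carrying T111C.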

@ryhisner
Author

ryhisner commented Apr 6, 2024

But T111C doesn't have anything to do with why Usher puts these in separate trees. It separates them because all JN.1.33 have C5512T and no JN.1.39 + T111C have C5512T, and it appears that C5512T comes before G2782T because there are many sequences that have C5512T but not G2782T.
image

A CovSpectrum search for

Nextcladepangolineage:JN.1* & [3-of: T2781T, G2782G, A2783A] & [3-of: A5511A, C5512T, G5513G]

returns 350 sequences from 33 different countries, for example.

@aviczhl2
Contributor

aviczhl2 commented Apr 7, 2024

But T111C doesn't have anything to do with why Usher puts these in separate trees. It separates them because all JN.1.33 have C5512T and no JN.1.39 + T111C have C5512T, and it appears that C5512T comes before G2782T because there are many sequences that have C5512T but not G2782T.

There is one sequence with G2782T, T111C, C5512T but S:67A, EPI_ISL_18982930 from China.

The sequence suggests the correct order should be JN.1.39->T111C->C5512T->S:A67V.
C5512T was acquired either through convergent evolution or via recombination with the C5512T branch of JN.1, and S:A67V was acquired afterwards.

@FedeGueli
Contributor

cc @corneliusroemer @AngieHinrichs could you look at this? For some weeks @aviczhl2 has been raising the issue of T111C + G2782T being split between JN.1.33 and JN.1.39; to me it is hard to reach a consensus on this.

@ryhisner
Author

ryhisner commented Apr 8, 2024

There is one sequence with G2782T, T111C, C5512T but S:67A, EPI_ISL_18982930 from China.

The sequence suggests the correct order should be JN.1.39->T111C->C5512T->S:A67V. C5512T was acquired either through convergent evolution or via recombination with the C5512T branch of JN.1, and S:A67V was acquired afterwards.

Mutations next to deletions are frequently misread, so I'd be surprised if there aren't numerous sequences with G2782T, T111C, C5512T, and S:67A. But the fact that there are 350 sequences with C5512T but without G2782T, T111C, or S:A67V makes it clear these two lineages are very unlikely to be directly related unless it's through recombination.

@aviczhl2
Contributor

aviczhl2 commented Apr 8, 2024

Mutations next to deletions are frequently misread, so I'd be surprised if there aren't numerous sequences with G2782T, T111C, C5512T, and S:67A. But the fact that there are 350 sequences with C5512T but without G2782T, T111C, or S:A67V makes it clear these two lineages are very unlikely to be directly related unless it's through recombination.

Yeah, I also think it is likely a recombinant. China is not submitting many sequences, so a lineage with S:67A is likely to have very few seqs because it cannot compete with S:67V. I just want to point out that the correct order of the mutations should be JN.1->G2782T->T111C->C5512T->S:A67V, as there are no seqs with 5512T+[exactly-1-of: 2782T, 111C, S:67V].
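
The `[exactly-1-of: ...]` condition in that query can be written out as a small predicate. This is a sketch of the logic only, assuming each sequence is a set of mutation strings; `n_of`, `exactly_k_of`, and `is_intermediate` are hypothetical helper names, and the toy records do not correspond to real sequences.

```python
def n_of(present, candidates):
    """Number of candidate mutations found in the sequence."""
    return sum(1 for m in candidates if m in present)


def exactly_k_of(present, candidates, k):
    """True if exactly k of the candidate mutations are present."""
    return n_of(present, candidates) == k


def is_intermediate(muts):
    """5512T plus exactly one of 2782T, 111C, S:67V."""
    return "C5512T" in muts and exactly_k_of(
        muts, ["G2782T", "T111C", "S:A67V"], 1
    )


print(is_intermediate({"C5512T"}))                     # False: zero of three
print(is_intermediate({"C5512T", "G2782T", "T111C"}))  # False: two of three
print(is_intermediate({"C5512T", "G2782T"}))           # True: exactly one
```

The claim in the comment is that no real sequence satisfies `is_intermediate`, i.e. the three mutations never appear one at a time on a C5512T background.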

@ryhisner
Author

ryhisner commented Apr 8, 2024

Yeah, but there are hundreds of sequences with C5512T and not G2782T. C5512T had to come first on the JN.1.33 branch. There are zero sequences on the JN.1.39 branch that have C5512T.

@aviczhl2
Contributor

aviczhl2 commented Apr 8, 2024

There are zero sequences on the JN.1.39 branch that have C5512T.

That's because they are placed in JN.1.33.

@ryhisner
Author

ryhisner commented Apr 8, 2024

They don't have S:A67V either though. And even if that's the case, it still doesn't explain how there are hundreds of sequences with C5512T but without T111C, G2782T, or S:A67V.

@aviczhl2
Contributor

aviczhl2 commented Apr 8, 2024

They don't have S:A67V either though. And even if that's the case, it still doesn't explain how there are hundreds of sequences with C5512T but neither T111C nor S:A67V.

  1. There is a JN.1+C5512T branch (which is where the "hundreds of seqs" come from).
  2. There is a JN.1+G2782T+T111C branch.
  3. There is a JN.1+T111C,G2782T,C5512T branch, and Usher places it as JN.1+C5512T+G2782T,T111C. Branch 3 is likely a recombinant of 1 and 2, and if we don't consider recombinants, 3 should more likely be placed under 2.
  4. S:A67V is a sub-branch of 3.
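
The disagreement above can be framed with the classic four-gamete test: if all four combinations of presence/absence are observed for two sites, no single tree can explain them without a repeated mutation (convergence) or recombination. The sketch below applies that test to toy sequences standing in for branches 1 to 3; it is an illustration swapped in here, not what Usher does internally, and `four_gamete_conflict` is a hypothetical helper.

```python
def four_gamete_conflict(sequences, site_a, site_b):
    """True if all four presence/absence combinations of two mutations occur."""
    observed = {(site_a in muts, site_b in muts) for muts in sequences}
    return len(observed) == 4


# Toy stand-ins for the branches described above (not real sequences).
toy = [
    set(),                          # baseline JN.1
    {"C5512T"},                     # branch 1
    {"G2782T", "T111C"},            # branch 2
    {"C5512T", "G2782T", "T111C"},  # branch 3
]

print(four_gamete_conflict(toy, "C5512T", "G2782T"))  # True
print(four_gamete_conflict(toy, "G2782T", "T111C"))   # False
```

The conflict between C5512T and G2782T is why one of them must have arisen twice or been brought together by recombination; the test alone cannot say which site is the homoplasic one, which is exactly the point under dispute.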

@aviczhl2
Contributor

image

There is a large S:R346T branch under this now, also with an S:F456L sub-branch.

@FedeGueli
Contributor

image

There is a large S:R346T branch under this now, also with an S:F456L sub-branch.

it is branch 51 of sars-cov-2-variants/lineage-proposals#1089

@FedeGueli
Contributor

Could the 456L branch with 1104L be a recombinant?
