Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hgvs_c to hgvs_p projection throws IndexError for "NM_001253909.2:c.416_417insGTG" #707

Open
holtgrewe opened this issue Oct 21, 2023 · 2 comments · May be fixed by #708
Open

hgvs_c to hgvs_p projection throws IndexError for "NM_001253909.2:c.416_417insGTG" #707

holtgrewe opened this issue Oct 21, 2023 · 2 comments · May be fixed by #708
Labels
bug Something isn't working

Comments

@holtgrewe
Copy link

Said variant is a (disruptive) inframe_insertion in the stop codon with stop_retained effect. This corresponds to SPDI GRCh37:10:5139789:A:AGTG.

Based on the current main branch, the following test fails:

diff --git a/tests/test_hgvs_variantmapper.py b/tests/test_hgvs_variantmapper.py
index 63a61ce..dcd5bb4 100644
--- a/tests/test_hgvs_variantmapper.py
+++ b/tests/test_hgvs_variantmapper.py
@@ -39,6 +39,12 @@ class Test_VariantMapper_Exceptions(unittest.TestCase):
         cls.vm = hgvs.variantmapper.VariantMapper(cls.hdp)
         cls.hp = hgvs.parser.Parser()

+    def test_map_stop_retained(self):
+        hgvs_c = "NM_001253909.2:c.416_417insGTG"
+        var_c = self.hp.parse_hgvs_variant(hgvs_c)
+        with self.assertRaises(HGVSInvalidVariantError):
+            self.vm.c_to_p(var_c)
+
     def test_gcrp_invalid_input_type(self):
         hgvs_g = "NC_000007.13:g.36561662C>T"
         hgvs_c = "NM_001637.3:c.1582G>A"

The exception backtrace is as follows:

============================================================================================= test session starts ==============================================================================================
platform linux -- Python 3.10.6, pytest-7.4.2, pluggy-1.3.0 -- /home/holtgrem_c/Development/hgvs/venv/bin/python
cachedir: .pytest_cache
HGVS_CACHE_MODE: learn
HGVS_SEQREPO_URL: None
UTA_DB_URL: postgresql://anonymous:anonymous@uta.biocommons.org:/uta/uta_20210129
hgvs.easy.hdp=postgresql://anonymous:anonymous@uta.biocommons.org:/uta/uta_20210129
hgvs.easy.hdp.seqfetcher.source='bioutils.seqfetcher (network fetching)'
rootdir: /home/holtgrem_c/Development/hgvs
configfile: pytest.ini
plugins: recording-0.13.0, cov-4.1.0, anyio-4.0.0
collected 1 item

tests/test_hgvs_variantmapper.py::Test_VariantMapper_Exceptions::test_map_stop_retained FAILED                                                                                                           [100%]

=================================================================================================== FAILURES ===================================================================================================
_____________________________________________________________________________ Test_VariantMapper_Exceptions.test_map_stop_retained _____________________________________________________________________________
self = <test_hgvs_variantmapper.Test_VariantMapper_Exceptions testMethod=test_map_stop_retained>

    def test_map_stop_retained(self):
        hgvs_c = "NM_001253909.2:c.416_417insGTG"
        var_c = self.hp.parse_hgvs_variant(hgvs_c)
        with self.assertRaises(HGVSInvalidVariantError):
>           self.vm.c_to_p(var_c)

tests/test_hgvs_variantmapper.py:46:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
src/hgvs/variantmapper.py:447: in c_to_p
    var_p = builder.build_hgvsp()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    def build_hgvsp(self):
        """Compare two amino acid sequences; generate an hgvs tag from the output

        :return list of variants in sequence order
        :rtype list of dict
        """

        variants = []

        if not self._is_ambiguous and len(self._alt_seq) > 0:
            do_delins = True
            if self._ref_seq == self._alt_seq:
                # Silent p. variant
                start = self._alt_data.variant_start_aa
                if start - 1 < len(self._ref_seq):
                    deletion = self._ref_seq[start - 1]
                    insertion = deletion
                else:
                    start = ""
                    deletion = ""
                    insertion = ""
                    self._is_frameshift = False
                variants.append({"start": start, "ins": insertion, "del": deletion})
                do_delins = False
            elif self._is_substitution:
                if len(self._ref_seq) == len(self._alt_seq):
                    diff_pos = [
                        (i, self._ref_seq[i], self._alt_seq[i])
                        for i in range(len(self._ref_seq))
                        if self._ref_seq[i] != self._alt_seq[i]
                    ]
                    if len(diff_pos) == 1:
                        (start, deletion, insertion) = diff_pos[0]
                        variants.append({"start": start + 1, "ins": insertion, "del": deletion})
                        do_delins = False

            if do_delins:
                if self._alt_data.is_frameshift:
                    start = self._alt_data.variant_start_aa - 1
                    while self._ref_seq[start] == self._alt_seq[start]:
                        start += 1
                    insertion = list(self._alt_seq[start:])
                    deletion = list(self._ref_seq[start:])
                    variants.append({"start": start + 1, "ins": insertion, "del": deletion})

                else:  # non-frameshifting delins or dup
                    # get size diff from diff in ref/alt lengths
                    start = self._alt_data.variant_start_aa - 1
                    delta = len(self._alt_seq) - len(self._ref_seq)
>                   while self._ref_seq[start] == self._alt_seq[start]:
E                   IndexError: string index out of range

src/hgvs/utils/altseq_to_hgvsp.py:102: IndexError

Also see here:

The VEP prediction is correct * => ** plus inframe_insertion&stop_retained_variant.

#Uploaded_variation Location Allele Consequence IMPACT SYMBOL Gene Feature_type Feature BIOTYPE EXON INTRON HGVSc HGVSp cDNA_position CDS_position Protein_position Amino_acids Codons Existing_variation DISTANCE STRAND FLAGS SYMBOL_SOURCE HGNC_ID MANE_SELECT MANE_PLUS_CLINICAL TSL APPRIS REFSEQ_MATCH REFSEQ_OFFSET GIVEN_REF USED_REF BAM_EDIT SIFT PolyPhen AF CLIN_SIG SOMATIC PHENO PUBMED MOTIF_NAME MOTIF_POS HIGH_INF_POS MOTIF_SCORE_CHANGE TRANSCRIPTION_FACTORS
. 10:5139789-5139789 GTG inframe_insertion,stop_retained_variant MODERATE AKR1C3 8644 Transcript NM_001253909.2 protein_coding 3/3 - - - 447-448 416-417 139 /* taa/taGTGa - - 1 - EntrezGene - - - - - - - - - - - - - - - - - - - - - -
Copy link

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the stale Issue is stale and subject to automatic closing label Nov 28, 2023
Copy link

github-actions bot commented Dec 5, 2023

This issue was closed because it has been stalled for 7 days with no activity.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Dec 5, 2023
@reece reece removed stale Issue is stale and subject to automatic closing closed-by-stale labels Dec 8, 2023
@reece reece reopened this Dec 8, 2023
@jsstevenson jsstevenson added the bug Something isn't working label Dec 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants