Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dup should have an alt when available #638

Open
nh13 opened this issue Jun 23, 2022 · 5 comments
Open

Dup should have an alt when available #638

nh13 opened this issue Jun 23, 2022 · 5 comments
Labels
resurrected stale Issue is stale and subject to automatic closing

Comments

@nh13
Copy link

nh13 commented Jun 23, 2022

Related to #581.

If we have DUMMY:c.2_4dupATC, then I was expecting the Dup class to have both a ref AND alt defined. Currently, it only has ref defined.

The HGVS recommendation is not to include the duplicated bases, so in that case the current code has ref be None, but when the duplicated bases are included ref is properly stored. To save space, alt could just be:

@property
def alt(self):
    return self.ref * 2

What to folks think?

@reece
Copy link
Member

reece commented Jun 25, 2022

I'm open to this idea, but it opens up a much larger issue for a few design principles:

  • Distinguish parsing (syntactic correctness) from validation. It is intentionally possible (and practically necessary) to be able to parse and emit syntactically correct but semantically broken HGVS expressions because they occur in the wild. For example, ref mismatches and invalid location ranges are parsed but invalid.
  • An implication of the above is that an hgvs object instance might be incorrect or incomplete (e.g., missing an optional ref sequence).
  • Data lookups are expensive (especially if remote sources are used), and so should not be implicit. Validation, reference correction, etc are expected to be explicit operations.
  • And, relevant for this issue, don't implicitly fetch data in order to format an expression.

For this proposal, we need to consider what happens when ref is not provided. (And, in fact, excluding the ref sequence is actually preferred by many on the HGVS committee because it's redundant information.)

>>> v = parse("DUMMY:c.2_4dupATC")
>>> v.posedit.edit.ref
'ATC'
>>> v = parse("DUMMY:c.2_4dup")
>>> v.posedit.edit.ref
''

Clearly, ref * 2 will be plainly wrong in this case, and it's a significant (and perhaps even majority) use case.

Variants do have a fill_ref() method to make this easier, like so:

>>> v = parse("NM_000314.4:c.2_4dup")
>>> v.fill_ref(hdp)
SequenceVariant(ac=NM_000314.4, type=c, posedit=2_4dup, gene=None)
>>> v.posedit.edit.ref
'TGA'

I don't have an easy solution to handle the proposed alt method without either 1) adopting a new implicit behavior and setting a global data provider, or 2) requiring that the user fill_ref() before calling alt, which means that we also need to handle the case when they don't do so.

Copy link

github-actions bot commented Dec 8, 2023

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the stale Issue is stale and subject to automatic closing label Dec 8, 2023
Copy link

This issue was closed because it has been stalled for 7 days with no activity.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Dec 17, 2023
@reece reece reopened this Feb 19, 2024
@reece reece added resurrected and removed stale Issue is stale and subject to automatic closing labels Feb 19, 2024
@reece
Copy link
Member

reece commented Feb 19, 2024

This issue was closed by stalebot. It has been reopened to give more time for community review. See biocommons coding guidelines for stale issue and pull request policies. This resurrection is expected to be a one-time event.

Copy link

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the stale Issue is stale and subject to automatic closing label May 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
resurrected stale Issue is stale and subject to automatic closing
Projects
None yet
Development

No branches or pull requests

2 participants