This is in regard to Sequence Alignment/Map Optional Fields Specification (2022-08-17).
The base modifications (MM) field allows modifications to be either short codes or a ChEBI ID. Short codes are constrained to [a-z]+ (i.e., lowercase letters), but the table of "standard common types" lists ambiguity codes that do not match this (i.e., uppercase letters).
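| Unmodified base | Code | Abbreviation | Name | ChEBI |
| --- | --- | --- | --- | --- |
| C | C | | Ambiguity code; any C mod | |
| T | T | | Ambiguity code; any T mod | |
| U | U | | Ambiguity code; any U mod | |
| A | A | | Ambiguity code; any A mod | |
| G | G | | Ambiguity code; any G mod | |
| N | N | | Ambiguity code; any mod | |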
The short codes are modified bases, so "m" and "h" are 5mC and 5hmC. It doesn't make sense to have a base modification from a nucleotide to an ambiguity code, so I'm not sure I follow this.
We don't support ambiguity codes in the unmodified base component, so we couldn't do MM:Z:Y+h,4; for example, as it wouldn't make sense. "N" covers this case anyway with the different counting regime.
I'm referring to the Code column of the standard common types table under the MM description. It defines codes that are uppercased, but the MM field pattern does not allow it: MM:Z:([ACGTUN][-+]([a-z]+|[0-9]+)[.?]?(,[0-9]+)*;)*. I referred to the short code portion as [a-z]+ originally.
The description for ML gives an example of using an ambiguous modification:
"For example MM:Z:C+C,10; ML:B:C,229 indicates a C call with a probability of 90% of having some form of unspecified modification."
Note that it uses C as the modification code, which does not match ([a-z]+|[0-9]+).
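To make the mismatch concrete, here is a minimal sketch that checks a few MM values against the pattern quoted above using Python's `re` module. The helper name `is_valid_mm` and the sample values other than `MM:Z:C+C,10;` are my own illustrations, not taken from the specification.

```python
import re

# MM field pattern as quoted earlier in this thread, including the "MM:Z:" prefix.
MM_PATTERN = re.compile(r"MM:Z:([ACGTUN][-+]([a-z]+|[0-9]+)[.?]?(,[0-9]+)*;)*")


def is_valid_mm(field: str) -> bool:
    """Return True if the whole field matches the quoted MM pattern.

    Hypothetical helper for illustration only.
    """
    return MM_PATTERN.fullmatch(field) is not None


examples = [
    "MM:Z:C+m,5,12,0;",  # lowercase short code
    "MM:Z:C+76792,2;",   # numeric ID in place of a short code
    "MM:Z:C+C,10;",      # the example from the ML description
    "MM:Z:N+N,3;",       # ambiguity code "N" used as the modification code
]

for field in examples:
    print(f"{field!r}: {is_valid_mm(field)}")
```

Under this pattern the first two values are accepted and the last two are rejected, which is the inconsistency being reported: the uppercase codes in the table, and the spec's own ML example, cannot be expressed in a field that conforms to the pattern.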