Get sequences command give me an error #328

Answered by padix-key
rpestana94 asked this question in Q&A

'J' is currently not a symbol in the amino acid alphabet. Hence, neither a NucleotideSequence nor a ProteinSequence can be created from the sequences in your FASTA file. There are two possible solutions to this issue, both using the low-level API of FastaFile, which returns strings instead of Sequence objects:

1. Read the sequences as strings, replace the unsupported symbol with an appropriate substitute, and create a ProteinSequence from each string

sequences = {header : ProteinSequence(seq_str.replace("J", "L")) for header, seq_str in fasta_file.items()}

2. Read the sequences as strings and create GeneralSequence objects with a custom alphabet

alphabet = LetterAlphabet(ProteinSequence.alphabet.get_symbols() + ["J"])
sequences
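
Before choosing between the two options, it can help to check which symbols in the file are actually unsupported. The following is a minimal, library-independent sketch that works on plain header-to-string mappings (for example, what `dict(fasta_file.items())` would give). The helper names `find_unsupported_symbols` and `replace_symbols` are hypothetical and not part of biotite, and the symbol set is an assumption about the protein alphabet that should be checked against `ProteinSequence.alphabet` in your installed version.

```python
# Hypothetical helpers, not part of biotite. They operate on plain
# header -> sequence-string mappings.

# ASSUMPTION: the protein alphabet holds the 20 standard residues plus
# B, Z, X and the stop symbol '*'. Verify against ProteinSequence.alphabet.
ASSUMED_PROTEIN_SYMBOLS = frozenset("ACDEFGHIKLMNPQRSTVWYBZX*")


def find_unsupported_symbols(sequences, allowed=ASSUMED_PROTEIN_SYMBOLS):
    """Map each header to the sorted list of its symbols not in `allowed`."""
    report = {}
    for header, seq_str in sequences.items():
        unsupported = sorted(set(seq_str) - allowed)
        if unsupported:
            report[header] = unsupported
    return report


def replace_symbols(sequences, replacements):
    """Return a new mapping with symbols substituted per `replacements`."""
    table = str.maketrans(replacements)
    return {
        header: seq_str.translate(table)
        for header, seq_str in sequences.items()
    }


# Made-up example data, for illustration only.
raw = {"seq1": "MKJLV", "seq2": "ACDEF"}
report = find_unsupported_symbols(raw)       # which headers contain odd symbols
cleaned = replace_symbols(raw, {"J": "L"})   # option 1: substitute 'J' with 'L'
```

The cleaned strings could then be passed to ProteinSequence as in option 1. If the report lists symbols that have no sensible substitute, the custom-alphabet route of option 2 is probably the better fit.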

Answer selected by padix-key