Translate gene #205

Open
wants to merge 13 commits into
base: dev

Conversation

jpjarnoux
Member

@jpjarnoux jpjarnoux commented Mar 28, 2024

This PR makes two changes:

  1. Adds the ability to write the translated sequences of all genes, using MMseqs2, through the --genes_prot option. It works like the other options of the ppanggolin fasta command.
  2. In MSA, PPanGGOLiN used to raise an error when a gene sequence length was not divisible by 3. To be more flexible, the last one or two nucleotides are now removed when the sequence length is not a multiple of 3, and a warning is issued to inform the user.
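As an illustration of the second point, here is a minimal sketch of the trimming behaviour. The function name `trim_to_codon_length`, the `gene_id` parameter, and the logger name are hypothetical and are not taken from the PPanGGOLiN code base; only the rule itself (drop the trailing one or two nucleotides and warn) comes from the description above.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("trim_sketch")


def trim_to_codon_length(sequence: str, gene_id: str = "unknown") -> str:
    """Return the sequence with its trailing 1 or 2 nucleotides removed
    when its length is not a multiple of 3; otherwise return it unchanged."""
    remainder = len(sequence) % 3
    if remainder == 0:
        return sequence
    # Warn instead of raising, so the caller can continue with the trimmed sequence.
    logger.warning(
        "Gene %s has a length of %d, which is not a multiple of 3; "
        "removing the last %d nucleotide(s).",
        gene_id, len(sequence), remainder,
    )
    return sequence[:-remainder]
```

For example, an 8-nucleotide sequence such as `ATGAAATA` would be cut back to `ATGAAA` before translation, while a sequence whose length is already a multiple of 3 passes through untouched.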

@axbazin axbazin self-requested a review April 9, 2024 07:16
@jpjarnoux
Member Author

MMseqs2 `createdb` must be forced to mode 1, so `translate_genes` has to set the mode explicitly.

ppanggolin/formats/writeSequences.py
@@ -43,6 +135,29 @@ def write_gene_sequences_from_annotations(genes_to_write: Iterable[Gene], file_o
file_obj.flush()


def translate_genes(sequences: TextIO, tmpdir: Path, threads: int = 1, code: int = 11) -> Path:
Member

This is very similar to `translate_with_mmseqs` in the `align` module. Maybe we could replace `translate_with_mmseqs` with this function as well, to centralize translation in ppanggolin?

Member

And you already mentioned it, but removing genetic_codes.py is definitely a good idea if we don't use it anymore.
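To make the centralization idea concrete, here is a rough sketch of a single helper that both modules could share. It only builds the MMseqs2 command lines and does not run them. The function name `build_translation_commands`, the database names, and the output file name are hypothetical; the parameters mirror the `translate_genes` signature shown in the diff. The `createdb` mode discussed earlier in this thread is deliberately left out, because the thread does not say which flag it refers to.

```python
from pathlib import Path
from typing import List


def build_translation_commands(fasta: Path, tmpdir: Path,
                               threads: int = 1, code: int = 11) -> List[List[str]]:
    """Build the MMseqs2 commands that turn a nucleotide FASTA into a protein FASTA.

    All file and database names below are placeholders chosen for this sketch.
    """
    nuc_db = tmpdir / "nucleotides_db"
    aa_db = tmpdir / "aminoacids_db"
    aa_fasta = tmpdir / "proteins.faa"
    return [
        # Create a nucleotide sequence database (--dbtype 2 means nucleotides).
        ["mmseqs", "createdb", str(fasta), str(nuc_db), "--dbtype", "2"],
        # Translate it with the requested genetic code.
        ["mmseqs", "translatenucs", str(nuc_db), str(aa_db),
         "--translation-table", str(code), "--threads", str(threads)],
        # Export the translated database back to FASTA.
        ["mmseqs", "convert2fasta", str(aa_db), str(aa_fasta)],
    ]
```

Each command list could then be passed to `subprocess.run(cmd, check=True)` from one place, so that `fasta` and `align` go through the same translation path.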

2 participants