This Python package provides a probabilistic model for classifying nucleotide sequences in metagenome samples. It was developed as a framework that helps researchers reconstruct individual genomes from such datasets using custom workflows, and that lets developers integrate the model into their own programs.
- Free software: GPLv3 license
- Source code: https://github.com/fungs/mglex
- Documentation: https://mglex.readthedocs.io

Features:
- Integrates nucleotide composition, multi-sample coverage and taxonomic annotation
- Learns a model in linear time with respect to the number of input sequences
- Classifies novel sequences in linear time
- Calculates likelihoods and p-values
- Calculates probabilistic distances between genome bins
MGLEX is a Python 3 package; it does not run under Python 2. It depends on:
- NumPy
- SciPy (for a few functions)
- docopt
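To check these requirements before installing, a small script along the following lines can help. It is only a sketch: the helper names `missing_modules` and `python_is_supported` are made up for illustration and are not part of MGLEX, and the import names `numpy`, `scipy` and `docopt` are assumed to match the dependencies listed above.

```python
import importlib.util
import sys

# Import names assumed for the dependencies listed above.
REQUIRED_MODULES = ("numpy", "scipy", "docopt")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_supported(version_info=sys.version_info):
    """Return True for Python 3 or later; MGLEX does not run under Python 2."""
    return version_info[0] >= 3


# Report the state of the current interpreter.
print("Python version supported:", python_is_supported())
print("Missing modules:", ", ".join(missing_modules()) or "none")
```

An empty list from `missing_modules()` only means the modules can be located; it does not check their versions.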
We show how to install MGLEX under Debian and Ubuntu; the steps are similar on other platforms.
You can simply install the requirements as system packages (on Debian and Ubuntu these are python3-numpy, python3-scipy and python3-docopt).
We recommend creating a Python virtual environment for MGLEX. To do so, install the venv package for your Python version (e.g. the Debian package python3.4-venv) if it is not already included, or use virtualenv. Creating the environment with access to the system site packages (the --system-site-packages option of venv) lets it use the installed system packages.
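The same step can also be done from Python with the standard-library `venv` module. The following is a minimal sketch, not an official MGLEX script: the function name `create_mglex_env` and any directory name you pass to it are hypothetical.

```python
import os
import venv


def create_mglex_env(env_dir, with_pip=True):
    """Create a virtual environment that can also see system-wide packages."""
    builder = venv.EnvBuilder(
        system_site_packages=True,    # make installed system packages visible
        symlinks=(os.name != "nt"),   # same default as the venv command line
        with_pip=with_pip,            # bootstrap pip into the new environment
    )
    builder.create(env_dir)
    return env_dir
```

For example, `create_mglex_env("my-mglex-environment")` creates the environment in a directory of that name. Bootstrapping pip relies on the `ensurepip` module, which on Debian and Ubuntu is typically provided by the venv package mentioned above; pass `with_pip=False` if it is unavailable.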
MGLEX is available from the Python Package Index, and we recommend installing it with pip.
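If you prefer to script the installation, pip can be invoked through the running interpreter so that the package ends up in the active environment. This is a sketch under assumptions: the distribution name `mglex` is inferred from the project's repository name, and the helper names `pip_install_command` and `install_mglex` are hypothetical.

```python
import subprocess
import sys


def pip_install_command(package="mglex", python=sys.executable):
    """Build the pip command line for the given interpreter."""
    # "mglex" is an assumed distribution name, taken from the repository name.
    return [python, "-m", "pip", "install", package]


def install_mglex():
    """Run pip for the current interpreter; raises CalledProcessError on failure."""
    subprocess.run(pip_install_command(), check=True)
```

Calling `install_mglex()` from within the activated virtual environment installs the package there rather than system-wide.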
This package was created by Johannes Dröge, using NumPy, at the Computational Biology of Infection Research Group at the Helmholtz Centre for Infection Research, Braunschweig, Germany.