Adjusting alignment parameters? #24

Open
ericmjl opened this issue Nov 17, 2021 · 1 comment
Labels
enhancement New feature or request low-priority Low priority issues

Comments

ericmjl commented Nov 17, 2021

I need to adjust alignment parameters. For example, I have run into something akin to this issue, where the fix proposed by the author of MAFFT is to adjust one of the MAFFT parameters.

Adjusting alignment parameters via the .seq.align() API might be helpful. A few designs for the user-facing API that I can think of include:

# default aligner is MAFFT, so we can pass through the command line options via kwargs.
sequences.seq.align(ep=1.59, op=0.0)
# want to use MUSCLE instead of MAFFT
from seqlike.AlignCommandLine import MuscleCommandLine as muscle
sequences.seq.align(aligner=muscle, muscle_arg1=something, muscle_arg2=something)
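To make the first design concrete, here is a minimal sketch of how keyword arguments could be turned into command-line flags. The helpers `kwargs_to_flags` and `build_mafft_command` are hypothetical names used only for illustration; they are not part of seqlike or Biopython, and the real pass-through may work differently. `--ep` and `--op` are existing MAFFT options.

```python
def kwargs_to_flags(**kwargs):
    """Turn keyword arguments into a flat list of '--name value' tokens.

    Hypothetical helper for illustration; not part of seqlike or Biopython.
    """
    flags = []
    for name, value in kwargs.items():
        if value is True:
            flags.append(f"--{name}")  # bare switch with no value
        elif value is False or value is None:
            continue  # switch turned off: emit nothing
        else:
            flags.extend([f"--{name}", str(value)])
    return flags


def build_mafft_command(input_path, **kwargs):
    """Assemble an argv list for MAFFT; the input file goes last."""
    return ["mafft", *kwargs_to_flags(**kwargs), input_path]


cmd = build_mafft_command("input.fasta", ep=1.59, op=0.0)
print(" ".join(cmd))  # mafft --ep 1.59 --op 0.0 input.fasta
```

Note that `op=0.0` is kept as a real value rather than dropped, because the check uses identity (`is False`) instead of truthiness.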
@ericmjl ericmjl added enhancement New feature or request low-priority Low priority issues labels Jun 1, 2022

ndousis commented Jun 2, 2022

The first example (MAFFT with kwargs) already works as written.

The second example (using MUSCLE) requires some extra wrapper code to align letter annotations:

from seqlike.alignment_commands import _generic_aligner_commandline_stdout, _generic_alignment
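# The imported name must match the class called below (Biopython-style spelling: MuscleCommandline)
from seqlike.AlignCommandLine import MuscleCommandline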

def muscle_alignment(seqrecs, preserve_order=True, **kwargs):
    """Align sequences using Muscle 3.8.

    :param seqrecs: a list or dict of SeqRecord that will be aligned to ref
    :param preserve_order: if True, reorder aligned seqrecs to match input order.
    :param **kwargs: additional arguments for alignment command
    :returns: a MultipleSeqAlignment object with aligned sequences
    """

    def commandline(file_obj, **kwargs):
        cline = MuscleCommandline(input=file_obj.name, **kwargs)
        return _generic_aligner_commandline_stdout(cline)

    # Muscle reorders alignment by default, but don't overwrite 'group' if already set
    if "group" not in kwargs:
        kwargs["group"] = not preserve_order
    return _generic_alignment(commandline, seqrecs, preserve_order=preserve_order, **kwargs)

which then allows:

sequences.seq.align(aligner=muscle_alignment, muscle_arg1=something, muscle_arg2=something)

Note that this only works for Muscle 3.8; the latest version of Muscle (5.1) has a new interface that is incompatible with MuscleCommandline :(
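For readers unfamiliar with the pattern, the `aligner=` hook can be pictured roughly as follows. This is a toy sketch, not seqlike's actual implementation: `align`, `toy_aligner`, and `recording_aligner` are hypothetical stand-ins that work on plain strings, and no real alignment is performed.

```python
def toy_aligner(seqs, gap="-", **kwargs):
    """Stand-in aligner: right-pad every sequence to the same length."""
    width = max(len(s) for s in seqs)
    return [s.ljust(width, gap) for s in seqs]


def align(seqs, aligner=toy_aligner, **kwargs):
    """Call the chosen aligner, passing extra keyword arguments through untouched."""
    return aligner(seqs, **kwargs)


# Default aligner, with one of its options overridden via kwargs.
aligned = align(["ACGT", "AC"], gap=".")

# A user-supplied aligner receives exactly the kwargs given to align().
seen = {}

def recording_aligner(seqs, **kwargs):
    seen.update(kwargs)  # record what was forwarded
    return list(seqs)

align(["ACGT", "AC"], aligner=recording_aligner, maxiters=2)
```

The point of the design is that `align` never needs to know which options a given aligner accepts; each wrapper (such as `muscle_alignment` above) is responsible for interpreting its own keyword arguments.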

ericmjl added a commit that referenced this issue Aug 11, 2022
* add commandline wrapper function for Muscle 3.8

* Add mkdocstrings-python

* Modify docstrings to include Nasos' notes on the MuscleAlignment tool.

* Fix mkdocstrings configuration.

* Update changelog.

Co-authored-by: Nasos Dousis <ndousis@tesseratx.com>
Co-authored-by: Eric Ma <ericmjl@users.noreply.github.com>