feat: Mashmap #485

Merged
merged 21 commits into from May 13, 2022
tdayris
Contributor

@tdayris tdayris commented May 6, 2022

Description

This PR adds MashMap, a long read aligner.

QC

For all wrappers added by this PR, I made sure that

  • there is a test case which covers any introduced changes,
  • input: and output: file paths in the resulting rule can be changed arbitrarily,
  • either the wrapper can only use a single core, or the example rule contains a threads: x statement with x being a reasonable default,
  • rule names in the test case are in snake_case and either describe what the rule does or match the tool's purpose or name (e.g., map_reads for a step that maps reads),
  • all environment.yaml specifications follow the respective best practices,
  • wherever possible, command line arguments are inferred and set automatically (e.g. based on file extensions in input: or output:),
  • all fields of the example rules in the Snakefiles and their entries are explained via comments (input:/output:/params: etc.),
  • stderr and/or stdout are logged correctly (log:), depending on the wrapped tool,
  • temporary files are either written to a unique hidden folder in the working directory, or (better) stored where the Python function tempfile.gettempdir() points to (this also means that using any Python tempfile default behavior works),
  • the meta.yaml contains a link to the documentation of the respective tool or command,
  • Snakefiles pass the linting (snakemake --lint),
  • Snakefiles are formatted with snakefmt,
  • Python wrapper scripts are formatted with black.
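
As an illustration of the temporary-file item in the checklist above, here is a minimal sketch of a wrapper-style script that keeps its scratch files under the directory returned by tempfile.gettempdir(). The helper name make_scratch_dir, the prefix, and the file name are hypothetical and not taken from this PR; the MashMap wrapper itself does not necessarily need any of this.

```python
import tempfile
from pathlib import Path


def make_scratch_dir(prefix: str = "wrapper_") -> tempfile.TemporaryDirectory:
    """Create a uniquely named scratch directory under tempfile.gettempdir().

    The prefix is a hypothetical choice. Passing no explicit `dir` argument
    means Python's default temporary location is used.
    """
    return tempfile.TemporaryDirectory(prefix=prefix)


if __name__ == "__main__":
    # The directory and everything in it are removed when the block exits.
    with make_scratch_dir() as tmpdir:
        scratch_file = Path(tmpdir) / "intermediate.txt"  # hypothetical name
        scratch_file.write_text("placeholder\n")
        print(f"scratch files live in: {tmpdir}")
```

Relying on the default location means the scratch directory follows whatever the execution environment configures as its temporary directory, without the wrapper hard-coding a path.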

@fgvieira
Collaborator

fgvieira commented May 6, 2022

It seems that MashMap has its own output format, "space-delimited with each line consisting of query name, length, 0-based start, end, strand, target name, length, start, end and mapping nucleotide identity".
Just for the sake of consistency with other read mappers, is there a way to convert this output to SAM/BAM/CRAM?

Also, does MashMap use any temp folder?
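
For reference, the output format quoted above can be read with a few lines of Python. This is only a sketch based on the field order given in that quote; the names MashMapHit and parse_mashmap_line, as well as the sample line, are made up for illustration and are not part of MashMap or of this wrapper.

```python
from typing import NamedTuple


class MashMapHit(NamedTuple):
    """One mapping line, with fields in the order quoted above."""

    query_name: str
    query_length: int
    query_start: int  # 0-based, per the quoted description
    query_end: int
    strand: str
    target_name: str
    target_length: int
    target_start: int
    target_end: int
    identity: float  # mapping nucleotide identity


def parse_mashmap_line(line: str) -> MashMapHit:
    """Parse one space-delimited line into a MashMapHit.

    Only the first ten fields are used; any extra columns are ignored.
    """
    fields = line.split()
    if len(fields) < 10:
        raise ValueError(f"expected at least 10 fields, got {len(fields)}")
    return MashMapHit(
        query_name=fields[0],
        query_length=int(fields[1]),
        query_start=int(fields[2]),
        query_end=int(fields[3]),
        strand=fields[4],
        target_name=fields[5],
        target_length=int(fields[6]),
        target_start=int(fields[7]),
        target_end=int(fields[8]),
        identity=float(fields[9]),
    )


if __name__ == "__main__":
    # Invented example line, not real MashMap output.
    hit = parse_mashmap_line("read1 5000 0 4999 + chr1 100000 200 5199 98.5")
    print(hit.query_name, hit.target_name, hit.identity)
```

A parser like this does not produce SAM/BAM/CRAM, since the format carries approximate mapping coordinates and an identity estimate rather than base-level alignments.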

@tdayris
Contributor Author

tdayris commented May 6, 2022

> Just for the sake of consistency with other read mappers, is there a way to convert this output to SAM/BAM/CRAM?

MashMap does not support converting its output to SAM/BAM/CRAM. There is an open issue about that in their repository. I'm wrapping this tool in order to follow the Salmon guidelines for creating a gentrome file with decoy sequences.

> Also, does MashMap use any temp folder?

There is no official variable or command-line argument to override the default temporary directory.

@fgvieira fgvieira merged commit c05006d into snakemake:master May 13, 2022