Solutions to the bioinformatic coding challenges at rosalind.info

danhalligan/rosalind.info

http://rosalind.info

This repository contains solutions to bioinformatics coding challenges from rosalind.info. Problems are organised by location: Python Village, Bioinformatics Stronghold, Bioinformatics Armory, Bioinformatics Textbook Track and Algorithmic Heights.

Running the solutions

This repository is written as a Python module and uses Poetry and Typer.

Solutions for each problem are located in individual files inside the directory for each location.

You can install the versions of dependencies used here with:

poetry install

To run a solution within this environment, pass the problem ID and the input file, e.g.:

poetry run rosalind ini2 rosalind_ini2.txt

To run the solution on the provided "Sample Dataset" from rosalind.info (which should reproduce the "Sample Output"), run the solution in "test" mode:

poetry run rosalind --test ini2

Testing

pytest-snapshot is used to test the solutions. In many cases the generated output exactly matches the "Sample Output" given at rosalind.info. Where several answers are equally valid (e.g. when ordering is not important), the expected output in tests/expected has been updated to match the code used here.

To run the tests use:

poetry run pytest

To update the tests (adding or modifying snapshots / expected output) use:

poetry run pytest --snapshot-update

Note that some solutions (that use Entrez) require an email address. This should be set as an environment variable, e.g.:

export ENTREZ_EMAIL=rosalind.franklin@cam.ac.uk
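
As a minimal sketch of how a solution might pick this variable up, the address can be read from the environment and rejected early if it is missing. The helper name `entrez_email` is hypothetical and is not taken from this repository:

```python
import os


def entrez_email(env=None):
    """Return the address stored in ENTREZ_EMAIL, or raise if it is unset.

    Hypothetical helper for illustration only.
    """
    # `env` defaults to the real process environment; passing a dict makes
    # the helper easy to exercise without touching os.environ.
    env = os.environ if env is None else env
    email = env.get("ENTREZ_EMAIL", "").strip()
    if not email:
        raise RuntimeError("Set ENTREZ_EMAIL before running Entrez-based solutions")
    return email
```

With Biopython, the returned value would typically be assigned to `Bio.Entrez.email` before any request is made.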

About

Solutions

Python Village

Bioinformatics Stronghold

Notes

  • For "QRTD" I have cheated by using tqDist. For the solution to run you will need to install quartet_dist and have it available on your PATH. Well done to anyone else who solved this properly!
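
One way to check that requirement before running the "QRTD" solution is to look the executable up on the search path. This is only a sketch; the function name `find_quartet_dist` is an assumption for illustration and is not part of this repository:

```python
import shutil


def find_quartet_dist(path=None):
    """Return the full path to the quartet_dist executable, or None if absent.

    Hypothetical helper: `path` overrides the directories searched and
    defaults to the PATH environment variable.
    """
    return shutil.which("quartet_dist", path=path)
```

A result of `None` means tqDist still needs to be installed or its directory added to PATH.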

Bioinformatics Armory

Notes

  • For "MEME" and "CLUS" I have not written a solution. Use the web interface as instructed.
  • For "SUBO", you need to run the online interface and identify the 32-40 bp sequence; you can then use the solution here to count its occurrences in the sequences.

Bioinformatics Textbook Track

Algorithmic Heights