Biology IT Sapienza

Modules for Computational Biology & Micro Algae Platform

MicroAlgae DB

Docker

Credits

federico-rosatelli Mat Loriv3 Samsey Calli

Sources:

Project structure:

The project consists of the following four modules:

Main goals

Description

This is a microalgae platform with cross-referencing functionality that links data from the most important biological databases. It is meant for consulting taxonomic, nucleotide and protein data on more than 900 species of microalgae. One of the major focuses of the project was to identify species with carbon capture properties, in order to reduce the CO2 level in the atmosphere.

Solved Issues

Consulting data is now easier: you can search for a species and see which genes are related to it and which BioProjects and Experiments have been carried out on it. The platform provides references to raw data that can be passed through some major analysis tools. The output of each of these programs is intended to be passed to the next one; the tools we used are listed here, in their exact order of execution:

  • BUSCO for checking genome quality;
  • Trimmomatic for trimming the genomic data and splitting the paired-end data into files 1 and 2;
  • HISAT2 for creating, from fasta 1 and 2, a .sam file that contains the processed data;
  • Samtools sort for ordering the data contained in the .sam files and converting them to .bam format;
  • StringTie for creating .gtf files from .srr, using the annotation and the .bam file obtained in the previous step.

If data are available after these steps, our platform can also show them.
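
The chain of tools listed above can be sketched as a list of command lines, each consuming the output of the previous one. This is only a minimal illustration, not the project's actual scripts: the function name, file names, extensions, trimming settings and the `trimmomatic` wrapper command are assumptions, and a HISAT2 index is assumed to exist already.

```python
import shlex
from pathlib import Path


def build_pipeline(run_id, genome_fasta, index_prefix, annotation_gtf,
                   lineage, workdir="work"):
    """Return the pipeline commands, in execution order, as argument lists.

    All file names below are hypothetical; adapt them to your own layout.
    """
    out = Path(workdir)
    raw1, raw2 = out / f"{run_id}_1.fastq", out / f"{run_id}_2.fastq"
    paired1, unpaired1 = out / f"{run_id}_1.paired.fastq", out / f"{run_id}_1.unpaired.fastq"
    paired2, unpaired2 = out / f"{run_id}_2.paired.fastq", out / f"{run_id}_2.unpaired.fastq"
    sam = out / f"{run_id}.sam"
    bam = out / f"{run_id}.sorted.bam"
    gtf = out / f"{run_id}.gtf"

    steps = [
        # 1. Genome quality check.
        ["busco", "-i", genome_fasta, "-l", lineage, "-m", "genome",
         "-o", f"{run_id}_busco"],
        # 2. Trim paired-end reads (example trimming settings, an assumption).
        ["trimmomatic", "PE", raw1, raw2, paired1, unpaired1, paired2, unpaired2,
         "SLIDINGWINDOW:4:20", "MINLEN:36"],
        # 3. Align the trimmed pairs and write a SAM file.
        ["hisat2", "-x", index_prefix, "-1", paired1, "-2", paired2, "-S", sam],
        # 4. Sort the alignments and write them as BAM.
        ["samtools", "sort", "-o", bam, sam],
        # 5. Assemble transcripts into a GTF, guided by the annotation.
        ["stringtie", bam, "-G", annotation_gtf, "-o", gtf],
    ]
    return [[str(arg) for arg in step] for step in steps]


if __name__ == "__main__":
    # Placeholder inputs; nothing is executed, the commands are only printed.
    for cmd in build_pipeline("SRR0000000", "genome.fna", "index/genome",
                              "annotation.gtf", "chlorophyta_odb10"):
        print(shlex.join(cmd))
```

To actually run the steps, each list could be passed to `subprocess.run(cmd, check=True)`, so that a failing tool stops the chain before the next one starts.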

Development choices

For modularity and portability purposes we created two dockerized modules, one for the Frontend and one for the Backend, plus a third module for retrieving data from NCBI databases. We chose a NoSQL database because it is well suited to storing and managing heterogeneous data, and we preferred MongoDB for its reliability and documentation.
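
As a rough illustration of what "heterogeneous data" means here, the sketch below shows two species records with different shapes that could live side by side in one document collection. The field names and values are hypothetical and are not the platform's real schema.

```python
import json

# Hypothetical documents: one species has nucleotide data, the other only taxonomy.
species_with_sequences = {
    "ScientificName": "Example alga A",
    "Nucleotides": [{"Accession": "<accession>", "Length": 1200}],
    "Proteins": [],
}
species_taxonomy_only = {
    "ScientificName": "Example alga B",
    "Lineage": ["Eukaryota", "Viridiplantae"],
}


def shared_fields(documents):
    """Return the sorted field names that every document has in common."""
    return sorted(set.intersection(*(set(doc) for doc in documents)))


if __name__ == "__main__":
    docs = [species_with_sequences, species_taxonomy_only]
    print(json.dumps(docs, indent=2))
    print(shared_fields(docs))
```

A document store accepts both records as they are, whereas a relational table would need nullable columns or extra join tables for the fields that only some species have.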

We chose to use Python for the Biopyparse module because it offers useful libraries, such as Biopython, for fetching data.
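
To give an idea of the kind of call Biopython makes available, here is a minimal sketch of a taxonomy lookup through its `Bio.Entrez` module. It is not taken from Biopyparse: the function names and the shape of the returned summary are assumptions, and running `fetch_taxonomy` requires Biopython and network access.

```python
def summarize(record):
    """Reduce an NCBI Taxonomy record (a dict-like object) to a few fields."""
    return {
        "taxid": record["TaxId"],
        "name": record["ScientificName"],
        # NCBI returns the lineage as a single "; "-separated string.
        "lineage": record["Lineage"].split("; "),
    }


def fetch_taxonomy(species, email):
    """Search NCBI Taxonomy for a species name; return a summary or None."""
    # Imported here so this module still loads where Biopython is not installed.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks API users to identify themselves

    handle = Entrez.esearch(db="taxonomy", term=species)
    ids = Entrez.read(handle)["IdList"]
    handle.close()
    if not ids:
        return None

    handle = Entrez.efetch(db="taxonomy", id=ids[0], retmode="xml")
    records = Entrez.read(handle)
    handle.close()
    return summarize(records[0])
```

A call such as `fetch_taxonomy("Chlorella vulgaris", "you@example.org")` would return the summary for the first match, or `None` when the search finds nothing.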

We chose to use Go for the Backend because its great handling of JSON structures allowed us to develop the API more easily than in other languages.

We chose to use Vue for the Frontend because it is a modern, standard Framework and we had previous experience using it.

How to run the project

We offer two ways to use the project:

  • Follow the Installation paragraph shown below. Only the Backend and Frontend need to be downloaded if you want to restore a backup database with MongoDB.
  • Run the modules stand-alone. You can use the Dockerfiles to build locally, or you can follow the dev-mode commands listed in the README.md of the corresponding sub-repository.

Installation

  • Install Mongotools
    • Mongotools is needed only to manage DB data. It provides commands like mongorestore to restore data from a local backup.
  • Install Docker following docker-engine or docker-server guide
  • Make a new folder, move to that directory and run these commands:
     wget https://raw.githubusercontent.com/BITSapienza/.github/main/docker-compose.yml
     git clone https://github.com/BITSapienza/Bio-Server
     git clone https://github.com/BITSapienza/Vulgaris-Platform
  • Finally, run:
     docker compose up -d
  • Optionally, restore a DB backup using the mongorestore command:
     mongorestore --db=<DB-Name> <backup_path>

Pinned

  1. biopyparse (Python): Python module for handling NCBI data and subsequently creating a NoSQL database
  2. Bio-Server (Go): Go server backend for biology platforms
  3. Vulgaris-Platform (Vue): Front-end for views and tables in the web interface
