Biology IT Sapienza

MicroAlgae DB

Credits

federico-rosatelli Mat Loriv3 Samsey Calli

Sources:

NCBI
EBI

Project structure:

The project consists of the following four modules:

Database: MongoDB
Parser: BioPyParse (Python)
Backend: BioServer (GoLang)
Frontend: Vulgaris Platform (Vue.js)

Main goals

Description

MicroAlgae DB is a platform for microalgae data with cross-referencing functionality, linking records from the most important biological databases. It is meant for consulting taxonomic, nucleotide and protein data on more than 900 species of microalgae. One major focus of the project was to identify species with carbon capture properties that could help reduce CO2 levels in the atmosphere.

Solved Issues

Consulting data is now easier: you can search for a species and see which genes are related to it and which BioProjects and experiments have been carried out on it. The platform provides references to raw data that can be passed through several major analysis tools. The output of each tool is intended to be passed to the next one. The tools we used for our purposes are listed here, in their exact order of execution:

BUSCO for checking genome quality;
Trimmomatic for trimming genomic data and splitting paired-end data into strands 1 and 2;
HISAT2 for creating, from FASTA files 1 and 2, a SAM file that contains the processed data;
Samtools sort for sorting the data contained in the SAM files and converting them to .bam format;
StringTie for creating .gtf files from .srr, using the annotation and the .bam file obtained in the previous step.

If data are available after these steps, our platform can also show them.
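The tool order described above can be sketched as a list of command lines. This is only an illustration of how each output feeds the next step, not the project's actual scripts: the `Sample` class, every file name, the `SLIDINGWINDOW:4:20` trimming step and the lineage value are assumptions, and the `trimmomatic` executable name assumes a wrapper script is installed (otherwise Trimmomatic is launched through `java -jar`).

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """Hypothetical inputs for one run; every path is a placeholder."""
    genome: str      # assembled genome checked by BUSCO
    reads_1: str     # raw paired-end reads, strand 1
    reads_2: str     # raw paired-end reads, strand 2
    index: str       # HISAT2 index prefix, built beforehand with hisat2-build
    annotation: str  # reference annotation passed to StringTie
    lineage: str     # BUSCO lineage dataset name


def build_pipeline(s: Sample) -> list[list[str]]:
    """Return the five commands in execution order, without running them."""
    return [
        # 1. Genome quality check.
        ["busco", "-i", s.genome, "-l", s.lineage, "-m", "genome", "-o", "busco_out"],
        # 2. Trimming; paired (P) and unpaired (U) outputs for each strand.
        ["trimmomatic", "PE", s.reads_1, s.reads_2,
         "trimmed_1P.fq", "trimmed_1U.fq", "trimmed_2P.fq", "trimmed_2U.fq",
         "SLIDINGWINDOW:4:20"],
        # 3. Alignment of the trimmed pairs into a SAM file.
        ["hisat2", "-x", s.index, "-1", "trimmed_1P.fq", "-2", "trimmed_2P.fq",
         "-S", "aligned.sam"],
        # 4. Sorting; the .bam extension makes samtools write BAM output.
        ["samtools", "sort", "-o", "aligned.bam", "aligned.sam"],
        # 5. Transcript assembly guided by the annotation.
        ["stringtie", "aligned.bam", "-G", s.annotation, "-o", "transcripts.gtf"],
    ]


if __name__ == "__main__":
    sample = Sample("genome.fna", "reads_1.fq", "reads_2.fq",
                    "genome_index", "annotation.gtf", "chlorophyta_odb10")
    for command in build_pipeline(sample):
        print(shlex.join(command))
```

Keeping the commands as data makes the ordering explicit and lets each step be inspected before anything is executed; running them (for example with `subprocess.run`) is left out on purpose.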

Development choices

For modularity and portability purposes we created two dockerized modules, one for the Frontend and one for the Backend, plus another module for retrieving data from NCBI databases. We chose a NoSQL database because it is well suited to storing and managing heterogeneous data, and we preferred MongoDB for its reliability and documentation.

We chose Python for the BioPyParse module because it offers useful libraries, such as Biopython, for fetching data.
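Biopython's `Bio.Entrez` module wraps the NCBI E-utilities HTTP API. As a rough, dependency-free illustration of the kind of request involved (not the actual BioPyParse code), the sketch below only builds an `esearch` URL with the standard library and does not contact NCBI. The function name, the default `taxonomy` database and the example species are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term: str, db: str = "taxonomy", retmax: int = 20) -> str:
    """Build (but do not send) an esearch request URL for the given term."""
    params = {"db": db, "term": term, "retmode": "json", "retmax": retmax}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical query: look up one species in the taxonomy database.
    print(esearch_url("Chlorella vulgaris"))
```

With Biopython installed, the equivalent lookup would go through `Bio.Entrez.esearch`, which also requires setting `Entrez.email` as NCBI requests.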

We chose Go for the Backend because its good handling of JSON structures let us develop the API more easily than we could have in other languages.

We chose Vue for the Frontend because it is a modern, widely used framework and we had previous experience with it.

How to run the project

There are two ways to use the project:

Follow the Installation section below. If you only want to restore a database backup with MongoDB, only the Backend and Frontend need to be downloaded.
Run the modules stand-alone. You can use the Dockerfiles to build locally, or follow the dev mode commands listed in the README.md of each module's sub-repository.

Installation

Install Mongotools
- Mongotools is needed only to manage DB data. It provides commands such as mongorestore to retrieve data from a local backup.
Install Docker following the docker-engine or docker-server guide

Make a new folder, move to that directory and run these commands:

 wget https://raw.githubusercontent.com/BITSapienza/.github/main/docker-compose.yml

 git clone https://github.com/BITSapienza/Bio-Server

 git clone https://github.com/BITSapienza/Vulgaris-Platform

Finally, run:
```
 docker compose up -d
```
Optionally, restore a DB backup using the mongorestore command:
```
 mongorestore --db=<DB-Name> <backup_path>
```
