babelomics/MechACov

Project 34: Development of a tool for mechanistic meta-analyses using COVID-19 available data as proof of concept

Abstract

The development of high-throughput biomedical technologies has made omic analyses more affordable and therefore more accessible. This technological boom has quickly led to a vast accumulation of data in public repositories such as GEO, EGA, SRA and ArrayExpress. The data challenge has shifted accordingly: it now lies in integrating all the available data and drawing conclusions from it as a whole.

Given the COVID-19 pandemic, we expect a flood of omics data from cells or patients infected with SARS-CoV-2 in the coming years, as several initiatives are arising worldwide. These studies have great value on their own; however, the joint effort of research institutions around the globe will be more powerful when all the generated data are analysed as a whole.

In this project we aim to develop a tool that retrieves the transcriptomic information available at several public data repositories, and to optimize a workflow that performs gene expression and mechanistic meta-analyses with a simple but robust methodology. This workflow will address several issues in automating the meta-analysis process, such as sample heterogeneity, access to unified metadata, standardized variable coding, sample selection and cross-platform effects. We will work on all these aspects, trying to find the best solution for each issue, using already available open workflows and resources that meet ELIXIR criteria, or developing new ones if needed.

Topics

Covid-19, Data Platform, Tools Platform

Project Number: 34

EasyChair Number: 54

Team

Lead(s)

Maria Peña-Chilet (author) mariapch84@gmail.com

Nominated participant(s)

Marina Esteban marina.estebanm@gmail.com

Jose Luis Fernandez-Rueda josel.fernandez.rueda@juntadeandalucia.es

Expected outcomes

The final outcome of this project is a user-friendly web-based tool that will retrieve transcriptomic data for a given disease, or diseases, and perform a meta-analysis of both gene expression and pathway activation.

Scripting:

An optimized workflow using Python and R to retrieve transcriptomic information from several repositories, integrate the data and perform meta-analyses.

  • Choose data repositories.
  • Establish a data retrieval method.
  • Explore study information and relevant metadata.
  • Define standardization of data and metadata.
  • Search for methods to account for data structure, heterogeneity and cross-platform issues.
  • Develop a pipeline for meta-analysis.
  • Perform gene expression meta-analysis of available COVID-19 data.
  • Perform pathway activity meta-analysis of available COVID-19 data.
  • If there is not a sufficient number of COVID-19 studies, we will explore data from other respiratory syndromes such as SARS, MERS or influenza.
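
As an illustration of the gene expression meta-analysis step listed above, here is a minimal sketch in Python of a random-effects pooling of one gene's effect across studies. It uses the DerSimonian-Laird estimator, which is a common choice but is an assumption here: the project does not state which method the pipeline will use. The function name `random_effects_meta`, the `PooledEffect` type and the input layout (one effect size and one variance per study) are hypothetical.

```python
import math
from typing import NamedTuple, Sequence


class PooledEffect(NamedTuple):
    estimate: float   # pooled effect size across studies
    std_error: float  # standard error of the pooled estimate
    tau2: float       # estimated between-study variance


def random_effects_meta(effects: Sequence[float],
                        variances: Sequence[float]) -> PooledEffect:
    """Pool per-study effects for one gene (DerSimonian-Laird sketch).

    `effects` could be, for example, per-study log fold changes and
    `variances` their squared standard errors (assumed inputs).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need matching effects/variances from >= 2 studies")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # Fixed-effect (inverse-variance) weights and estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q measures heterogeneity between studies.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    k = len(effects)

    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau2 to each study's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    estimate = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    std_error = math.sqrt(1.0 / sum_w_star)
    return PooledEffect(estimate, std_error, tau2)
```

In a full workflow this function would be applied gene by gene (or pathway by pathway) after the per-study analyses; handling of cross-platform effects and metadata harmonization is not covered by this sketch.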

Coding:

A web tool structure that will host the tool functionalities as modules that run the developed scripts. The web development will focus on data visualization.

  • Decide type of data to visualize.
  • Establish output and input data.
  • Define data structure.
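
One possible shape for the data exchanged between the scripts and the web modules is sketched below in Python. Everything in it is hypothetical: the class names (`GeneResult`, `MetaAnalysisOutput`), the fields and the JSON layout are placeholders meant to show how input and output could be pinned down early, not a structure the project has agreed on.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class GeneResult:
    """Meta-analysis result for a single gene (hypothetical fields)."""
    gene_id: str
    pooled_effect: float
    std_error: float
    n_studies: int


@dataclass
class MetaAnalysisOutput:
    """Container a script could hand to a visualization module."""
    disease: str
    study_ids: List[str]
    genes: List[GeneResult] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses, so the whole
        # result serializes to plain JSON for the front end.
        return json.dumps(asdict(self), indent=2)


# Placeholder values only; these are not real identifiers or results.
example = MetaAnalysisOutput(
    disease="COVID-19",
    study_ids=["study_A", "study_B"],
    genes=[GeneResult("GENE_A", 0.0, 1.0, 2)],
)
```

Agreeing on such a contract early would let the scripting and coding workgroups progress independently, since the front end can be built against placeholder JSON while the pipeline is still in development.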

Both workgroups (scripters and coders) will work in parallel, but they have some dependencies that need to be addressed. We do not plan to finish the whole tool, but to develop its structure and outline the desired functionalities and the best strategy to achieve our goals.

Expected audience

Scripters and helpers

Data scientists, biostatisticians, biologists, biotechnologists and bioinformaticians.

Coders

Software engineers, Front-end developers, Web developers.

Some desired, but not mandatory, skills:

Statistics, biomedicine or bioinformatics, programming, scripting, Python, R, R/Shiny, React, JavaScript, data visualization libraries such as D3 or Cytoscape.js.

Everyone can join us!

Number of expected hacking days: 4 days