GitHub

Welcome to MEGA-V Repository

General Notes and How to cite MEGA-V =======================================

MEGA-V (Mutation Enrichment Gene set Analysis of Variants) is a development of the method used in Cereda et al. (2016) Nature Comm. 7; doi:10.1038/ncomms12072 and pubblished by Gambardella et al. in Bioinformatics (2016); doi:10.1093/bioinformatics/btw809.

MEGA-V is a R application with a Shiny web interface that allows its execution from a web-based environment (section 4). The stand-alone version is also supported (section 5). Examples on how to use MEGA-V are provided (section 6).

How to cite MEGA-V:

Gambardella et al (2016) Bioinformatics; doi:10.1093/bioinformatics/btw809
Cereda et al. (2016) Nature Comm. 7; doi:10.1038/ncomms12072

GNU General Public License =============================

MEGA-V is a free software that can be redistributed and/or modified under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should receive a copy of the GNU General Public License along with the program. If not, please refer to http://www.gnu.org/licenses/.

MEGA-V Description =====================

MEGA-V identifies gene sets with a significantly higher number of variants in a cohort of interest (cohort A) as compared to (1) a control cohort (cohort B) or (2) a random distribution generated using Monte Carlo.

Gene sets are predefined by the user and they can be group of genes involved in the same biological processes, Gene Ontology groups, or genes associated to the same disease.

The type of genetic variants to be compared are also predefined by the user and can be damaging alterations, rare variants, or other types of genetic modifications.

(1) If a control cohort B is used, the user can choose to compare the two distributions of mutation counts between the two cohorts using either the Wilcoxon-rank sum test or the Kolmogorov–Smirnov test according to the data distribution. If multiple gene sets are tested, the resulting p-values are corrected for multiple testing. When the sample size between the two cohorts differs substantially, MEGA-V can run a random sampling with replacement where the larger cohort is randomly down-sampled to the size of the smaller cohort.

(2) If a random distribution of variants in cohort A is used, Monte Carlo permutations are applied to estimate the expected mutation counts within each gene set across all individuals of cohort A. An empirical p-value is then measured as the fraction of observed mutation counts that is greater or equal than the estimated mutation counts.

How to run MEGA-V as a R shiny app in your browser =====================================================

Install the Shiny package (and the required dependencies) in R, and use the function runGithub(). See the example below,

install.packages("shiny")
install.packages("shinyjs")
install.packages("shinythemes")
install.packages("DT")
install.packages("parallel") # for Monte Carlo

library(shiny)
library(shinyjs)
library(shinythemes)
library(DT)

shiny::runGitHub('ciccalab/MEGA')

How to run MEGA-V as a stand-alone app =========================================

Clone the repository on your local machine and open R from the repository folder. Load the following files in the R Global Environment:

source("functions/MEGA.R")

Exemplar usage of MEGA-V ===========================

Two lists of 186 biological processes (ncomm.cereda.186.KEGG.gmt) and 346 disease-associated gene sets (ncomm.cereda.346.gwas.gmt) are provided in the folder ‘gene_sets’.

Two example datasets of gene variants for cohort A (ncomm.cereda.syCRCs.tsv) and cohort B (ncomm.cereda.1000.genomes.tsv) are provided in the folder “example_dataset”. These derive from Cereda et al. Nature Comm 2016 and can be used as exemplar files to run MEGA-V.

Example 1: Use MEGA-V to identify altered gene sets using a control cohort

1. Load the MEGA-V functions in the Global Environment
> source("./functions/MEGA.R")

2. Load the predefined list of gene sets. As example, here we use KEGG gene sets
> gene.sets.kegg = read.gmt.file("gene_sets/ncomm.cereda.186.KEGG.gmt")

3. Load the two sets of patients with synchronous CRCs and individuals of 1000 Genomes Project. A and B: Data frame object containing the mutations counts. Columns = samples; rows = mutations. The first column must always contain the name of the mutated gene.
> A <- read.delim("./example_dataset/ncomm.cereda.syCRCs.tsv.gz",stringsAsFactors = F)
> B <- read.delim("./example_dataset/ncomm.cereda.1000.genomes.tsv.gz",stringsAsFactors = F)

4. Run MEGA-V and identify which gene sets are significantly mutated in the
group of samples A as compared to B
> r = MEGA(A,B,gene.sets.kegg, bootstrapping=TRUE, nsim=1000)

Example 2: Use MEGA-V to identify altered gene sets without a control cohort

1. Load the MEGA-V functions in the Global Environment
> source("./functions/MEGA.R")

2. Load the predefined list of gene sets. As example, here we use KEGG gene sets
> gene.sets.kegg = read.gmt.file("gene_sets/ncomm.cereda.186.KEGG.gmt")

3. Load the set of patients with synchronous CRCs 
> A <- read.delim("./example_dataset/ncomm.cereda.syCRCs.tsv.gz",stringsAsFactors = F)

4. Run MEGA-V and identify which gene sets are significantly mutated in the group of samples A 
> r = MEGA(A,gene.sets = gene.sets.kegg, montecarlo=TRUE, nsim=1000)

Name		Name	Last commit message	Last commit date
Latest commit History 100 Commits
RData		RData
example_dataset		example_dataset
functions		functions
gene_sets		gene_sets
README.md		README.md
server.R		server.R
ui.R		ui.R

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RData

RData

example_dataset

example_dataset

functions

functions

gene_sets

gene_sets

README.md

README.md

server.R

server.R

ui.R

ui.R

Repository files navigation

Welcome to MEGA-V Repository

About

Releases

Packages

Contributors 4

Languages

ciccalab/MEGA

Folders and files

Latest commit

History

Repository files navigation

Welcome to MEGA-V Repository

About

Resources

Stars

Watchers

Forks

Languages