This repository has been archived by the owner on Jan 31, 2020. It is now read-only.

Clinical Sequencing

Malachi Griffith edited this page May 28, 2014 · 2 revisions

in progress


Overview

The Clinical Sequencing pipeline ("ClinSeq") is a post-processing and reporting layer that summarizes the analyses that have been performed on a dataset.

Guides

Inputs

All of the following inputs are optional. The pipeline will produce reports based on what inputs are available.

WGS Model

A Somatic Variation model with whole-genome data.

Exome Model

A Somatic Variation model with exome data.

Tumor RNASeq Model

An RNA-seq model for the "tumor" data.

Normal RNASeq Model

An RNA-seq model for the "normal" data.

Differential Expression Model

A Differential Expression model comparing the Tumor and Normal RNASeq data.

Cancer Annotation DB

Annotation and gene ID mapping files to help map to Entrez/Ensembl/UCSC IDs.

Miscellaneous Annotation DB

Contains BreakAnnot data and additional annotation files for Copy Number analysis.

COSMIC Annotation DB

A local copy of COSMIC, the "Catalogue of Somatic Mutations in Cancer".

Force

A boolean flag. If set, the pipeline will analyze data that appear not to match, that is, data that seem to come from different patients or otherwise discrepant sources.
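As a rough illustration of how a flag like Force might gate a consistency check, here is a minimal Python sketch. The `InputModel` class, its `patient_id` field, and the `check_inputs_match` function are hypothetical names invented for this example; they are not part of ClinSeq, and the actual check the pipeline performs may differ.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class InputModel:
    """Hypothetical stand-in for one input model; not a real ClinSeq class."""
    name: str
    patient_id: Optional[str]


def check_inputs_match(models: Sequence[InputModel], force: bool = False) -> List[str]:
    """Raise if the inputs disagree on patient, unless force is set.

    With force set, the mismatch is reported as a warning string instead.
    """
    patients = {m.patient_id for m in models}
    if len(patients) <= 1:
        return []  # all inputs agree (or there are no inputs)

    message = "inputs appear to come from different patients: " + ", ".join(
        sorted(str(p) for p in patients)
    )
    if not force:
        raise ValueError(message)
    return ["WARNING (force set): " + message]
```

Without the flag, mismatched inputs stop the run early; with it, the caller gets the warning back and can proceed.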

Data Products

Build Processes

The ClinSeq pipeline dynamically generates its Workflow based on the supplied inputs. The following processes are available:

Summarize Builds

This step produces a textual report summarizing the builds that are inputs to the ClinSeq build.

Import SNVs and Indels

This step takes the SNVs and indels from the Somatic Variation build inputs, then reformats and merges the files for subsequent processing.

Get Variant Sources

This step attempts to determine which variant caller generated each variant call in its Somatic Variation build inputs.

Create Mutation Diagrams

This step produces mutation-spectrum plots based on the Somatic Variation build inputs.

Tophat Junctions Absolute

This step takes the Tophat Junctions from the RNA-Seq build inputs and reformats the files.

Cufflinks Expression Absolute

This step takes the Cufflinks output from the RNA-Seq build inputs and creates several derivative files.

Chimerascan Intersect SV

This step intersects the fusion calls from the RNA-Seq build with the SV calls from the Whole-Genome Somatic Variation build.

Dump IGV XML

This step produces IGV session files for convenience in viewing the dataset.

Generate Clonality Plots

This step produces Clonality plots from the Whole-Genome Somatic Variation build.

Run CopyNumber View

This step uses the clonality analysis above to produce a Copy Number report using R.

Run Microarray CNV

This step attempts to find Genotype Microarray models for the dataset and queries their last complete builds to produce a Copy Number report.

Run Exome CNV

This step produces Copy Number plots for the exome data.

Summarize CNVs

This step produces a summary using the output from Run CopyNumber View.

Summarize SVs

This step produces a summary of SVs from the Whole-Genome Somatic Variation build.

Annotate Genes by Category

This step assigns the genes affected by variants to categories (e.g. kinase).

DGIdb Gene Annotation

This step produces a DGIdb gene annotation report containing known drug interactions.

Summarize Tier 1 SNV Support

This step runs bam-readcount to summarize read support for the Tier 1 SNVs.

Make Circos Plot

This step produces a Circos plot from the available variant data. (At a minimum, the Whole-Genome Somatic Variation build must be present to run this step.)

Converge SNV Indel Report

This step runs several R scripts and the annotator to produce reports.
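The idea of building the Workflow dynamically from whichever inputs are supplied can be sketched in Python as follows. This is only an illustration: the input keys (`wgs`, `exome`, `tumor_rnaseq`, `normal_rnaseq`), the `STEP_REQUIREMENTS` table, and the `select_steps` function are hypothetical, and the requirements listed are assumptions read off the step descriptions above for a subset of steps, not taken from the pipeline's own code.

```python
from typing import Dict, FrozenSet, List, Set

# Each step maps to one or more alternative sets of required inputs.
# A step is selected when at least one alternative is fully available.
# ASSUMPTION: this mapping is inferred from the step descriptions only.
STEP_REQUIREMENTS: Dict[str, List[FrozenSet[str]]] = {
    # An empty requirement set means no specific input is needed.
    "Summarize Builds": [frozenset()],
    "Import SNVs and Indels": [frozenset({"wgs"}), frozenset({"exome"})],
    "Tophat Junctions Absolute": [
        frozenset({"tumor_rnaseq"}),
        frozenset({"normal_rnaseq"}),
    ],
    # Assumed to need the tumor RNA-seq build plus the WGS build.
    "Chimerascan Intersect SV": [frozenset({"wgs", "tumor_rnaseq"})],
    "Generate Clonality Plots": [frozenset({"wgs"})],
    "Run Exome CNV": [frozenset({"exome"})],
    "Make Circos Plot": [frozenset({"wgs"})],
}


def select_steps(available: Set[str]) -> List[str]:
    """Return the steps whose requirements are met, in table order."""
    return [
        step
        for step, alternatives in STEP_REQUIREMENTS.items()
        if any(required <= available for required in alternatives)
    ]


if __name__ == "__main__":
    # Example: only an exome Somatic Variation model was supplied.
    for step in select_steps({"exome"}):
        print(step)
```

Keeping the requirements in a table rather than in nested conditionals makes it easy to see at a glance which steps drop out when an input is omitted.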

Tutorials

Under development

Known Issues

Under development
