Genometric/MSPC

MSPC

Quick Start | Documentation | Download | Publication

About

The analysis of ChIP-seq samples outputs a number of enriched regions, each indicating a protein-DNA interaction or a specific chromatin modification. Enriched regions (commonly known as "peaks") are called when the read distribution is significantly different from the background and its corresponding significance measure (p-value) is below a user-defined threshold.

When replicate samples are analysed, overlapping enriched regions are expected. This repeated evidence can therefore be used to locally lower the minimum significance required to accept a peak. Here, we propose a method for joint analysis of weak peaks.

Given a set of peaks from (biological or technical) replicates, the method combines the p-values of overlapping enriched regions: users can choose a threshold on the combined significance of overlapping peaks and set the minimum number of replicates in which the overlapping peaks must be present. The method allows the "rescue" of weak peaks occurring in more than one replicate and outputs a new set of enriched regions for each replicate.
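The combination step can be illustrated with a minimal Python sketch. It assumes Fisher's method as the way to combine p-values, which the text above does not specify; the function names, the `combined_threshold` and `min_replicates` parameters, and the example values are hypothetical and are not part of the MSPC API.

```python
import math


def fisher_combined_pvalue(p_values):
    """Combine independent p-values (each in (0, 1]) with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom. For even degrees of freedom the survival
    function has a closed form: exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    half_x = -sum(math.log(p) for p in p_values)  # this is X / 2
    term = 1.0   # (X/2)^i / i!, starting at i = 0
    total = 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


def is_confirmed(overlapping_p_values, combined_threshold, min_replicates):
    """Hypothetical helper: accept a group of overlapping peaks when enough
    replicates contribute and the combined p-value passes the threshold."""
    if len(overlapping_p_values) < min_replicates:
        return False
    return fisher_combined_pvalue(overlapping_p_values) <= combined_threshold


# Two weak peaks that overlap across replicates can pass together,
# while a single peak with no supporting replicate cannot.
print(is_confirmed([1e-5, 1e-6], combined_threshold=1e-8, min_replicates=2))
print(is_confirmed([1e-5], combined_threshold=1e-8, min_replicates=2))
```

The closed-form survival function keeps the sketch free of third-party dependencies; a statistics library could be used instead.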

In general, the method classifies each enriched region as background, weak, or stringent based on user-defined weak and stringency thresholds. It then confirms a weak or stringent region if the combined significance of its overlapping regions is at least as significant as a user-defined threshold, and discards it otherwise. Finally, it applies a multiple testing correction to the confirmed regions at a user-defined false-discovery rate, identifying true-positive and false-positive regions. The following figure shows an example; for more details, refer to the MSPC publications, the slides on SlideShare, or the documentation page.
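The classification and correction steps can be sketched in Python as follows. This is an illustration under stated assumptions, not MSPC's implementation: the threshold values, the inclusive comparisons, and the function names are hypothetical, and the Benjamini-Hochberg procedure is assumed as the multiple testing correction because the text above does not name one.

```python
def classify(p_value, weak_threshold=1e-4, stringency_threshold=1e-8):
    """Label a peak by its p-value (threshold defaults are illustrative only)."""
    if p_value <= stringency_threshold:
        return "stringent"
    if p_value <= weak_threshold:
        return "weak"
    return "background"


def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True if it passes the
    Benjamini-Hochberg step-up procedure at false-discovery rate alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank whose p-value is within its rank-scaled bound.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Every p-value ranked at or below max_rank passes.
    passed = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            passed[idx] = True
    return passed


labels = [classify(p) for p in (1e-10, 1e-6, 1e-2)]
print(labels)
print(benjamini_hochberg([0.001, 0.03, 0.035, 0.5], alpha=0.05))
```

In this sketch, regions flagged `True` would correspond to the true-positive set and the remaining confirmed regions to the false-positive set.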


Download and Run

MSPC is distributed as a cross-platform console application, a .NET library, and a Bioconductor R package.