SpectralTAD


Cresswell, Kellen G., John C. Stansfield, and Mikhail G. Dozmorov. “SpectralTAD: An R Package for Defining a Hierarchy of Topologically Associated Domains Using Spectral Clustering.” BMC Bioinformatics 21, no. 1 (December 2020): 319.

SpectralTAD is a TAD caller that uses a modified form of spectral clustering to quickly identify hierarchical topologically associating domains (TADs). Users input a contact matrix and receive a BED file containing the coordinates of TADs and their levels in a hierarchy. Level 1 TADs are generally large and well defined, while subsequent levels are less pronounced yet still distinct enough to be recognized as TADs.

The two main functions are SpectralTAD(), which calls TADs, and SpectralTAD_Par(), its parallelized version. The input can be an n x n matrix, an n x (n+3) matrix, or a sparse 3-column matrix (see the vignette: browseVignettes("SpectralTAD")).

Installation

If necessary, install the dependencies:

# CRAN dependencies ('parallel' ships with base R and needs no installation)
install.packages(c('dplyr', 'PRIMME', 'cluster',
    'Matrix',
    'magrittr'))

# Bioconductor dependencies
if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install(c('BiocParallel', 'HiCcompare'))

The latest version of SpectralTAD can be installed directly from GitHub (this requires the devtools package):

devtools::install_github('cresswellkg/SpectralTAD', build_vignettes = TRUE)
library(SpectralTAD)

Alternatively, the package can be installed from Bioconductor (to be submitted):

if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("SpectralTAD")
library(SpectralTAD)

Input

There are three types of input accepted:

  1. n x n contact matrices
  2. n x (n+3) contact matrices
  3. 3-column sparse contact matrices

These formats are explained in depth in the vignette.
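
As a rough illustration of how the three layouts relate to each other, the sketch below builds each one from the same toy data in base R. The 25 kb bin size, the counts, and the names `dense`, `n_plus_3`, and `sparse` are made up for this example. It also assumes that column names hold bin start coordinates, that the three leading columns are chromosome, start, and end, and that the sparse form stores the upper triangle; the vignette remains the authoritative description.

```r
# Toy example: 4 bins of 25 kb on one chromosome (all values are made up).
bin_size <- 25000
starts <- seq(0, by = bin_size, length.out = 4)

# 1. n x n: symmetric matrix of contact counts.
#    Assumption: column names hold the start coordinate of each bin.
dense <- matrix(c(10,  4,  2, 1,
                   4, 12,  5, 2,
                   2,  5, 11, 6,
                   1,  2,  6, 9),
                nrow = 4, byrow = TRUE)
colnames(dense) <- starts

# 2. n x (n+3): chromosome, start, and end columns followed by the n x n counts.
n_plus_3 <- data.frame(chr = "chr20",
                       start = starts,
                       end = starts + bin_size,
                       dense,
                       check.names = FALSE)

# 3. Sparse 3-column: one row per non-zero entry of the upper triangle,
#    given as (start of bin 1, start of bin 2, contact count).
idx <- which(upper.tri(dense, diag = TRUE) & dense != 0, arr.ind = TRUE)
sparse <- data.frame(bin1 = starts[idx[, "row"]],
                     bin2 = starts[idx[, "col"]],
                     count = dense[idx])
```

All three objects carry the same information; the sparse form is simply the most compact when most bin pairs have no contacts.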

Usage

Multi-Level TADs

#Load example contact matrix
data("rao_chr20_25_rep")
#Find TADs
tads = SpectralTAD(rao_chr20_25_rep, chr = "chr20", levels = 2, qual_filter = FALSE)

The output is a list where each entry corresponds to a level of the TAD hierarchy.

First level sample output:

chr     start     end  Level
chr20   50000 1200000     1
chr20 1200000 2450000     1
chr20 2450000 3525000     1
chr20 3525000 4075000     1

Second level sample output:

chr     start     end  Level
chr20   50000  550000     2
chr20  550000  675000     2
chr20  675000 1200000     2
chr20 1200000 1750000     2
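
Because the result is a list with one table per level, a common next step is to stack the levels into a single table and save it. The sketch below does this in base R on a mock `tads` object rebuilt from the sample rows above, so it runs without the package. The list element names `Level_1` and `Level_2` and the temporary output file are assumptions made for illustration.

```r
# Mock of a two-level result, rebuilt from the sample rows shown above.
# Assumption: the list elements are named Level_1 and Level_2.
tads <- list(
  Level_1 = data.frame(chr = "chr20",
                       start = c(50000L, 1200000L, 2450000L, 3525000L),
                       end = c(1200000L, 2450000L, 3525000L, 4075000L),
                       Level = 1L),
  Level_2 = data.frame(chr = "chr20",
                       start = c(50000L, 550000L, 675000L, 1200000L),
                       end = c(550000L, 675000L, 1200000L, 1750000L),
                       Level = 2L)
)

# Stack all levels into one table, ordered by position and then by level.
all_tads <- do.call(rbind, tads)
all_tads <- all_tads[order(all_tads$start, all_tads$Level), ]
rownames(all_tads) <- NULL

# TAD width in base pairs, handy for a quick sanity check.
all_tads$width <- all_tads$end - all_tads$start

# Write a tab-separated, header-less file with BED-style columns.
bed_file <- tempfile(fileext = ".bed")
write.table(all_tads[, c("chr", "start", "end", "Level")],
            file = bed_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

Storing the coordinates as integers keeps write.table() from emitting scientific notation for large positions.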

Citation

Cresswell, Kellen G., John C. Stansfield, and Mikhail G. Dozmorov. "SpectralTAD: an R package for defining a hierarchy of Topologically Associated Domains using spectral clustering." BMC bioinformatics 21, no. 1 (2020): 1-19. https://doi.org/10.1186/s12859-020-03652-w

@article{cresswell2020spectraltad,
  title={SpectralTAD: an R package for defining a hierarchy of Topologically Associated Domains using spectral clustering},
  author={Cresswell, Kellen G and Stansfield, John C and Dozmorov, Mikhail G},
  journal={BMC bioinformatics},
  volume={21},
  number={1},
  pages={1--19},
  year={2020},
  publisher={Springer}
}

Contributions & Support

Suggestions for new features and bug reports are welcome. Please create a new issue for any of these, or contact the author directly: @cresswellkg (cresswellkg[at]vcu[dot]edu).

Contributors

Authors: @cresswellkg (cresswellkg[at]vcu[dot]edu) & @mdozmorov (mikhail.dozmorov[at]vcuhealth[dot]org)