Overview

A Collection of Tools for Detecting Rhythmic Genes

Overview

Allows analysis of rhythmic genes to be easily carried out on transcriptomics data using R. Designed to be as flexible as possible such as by allowing an unequal number of replicates across all time points. Where possible, functions have parallel alternatives in order to increase performance on multicore machines. Mundane tasks such as removing genes which show no activity can also be handled by CircadianTools.

Install Guide

From R:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("rain", "seqinr")

install.packages("devtools")
devtools::install_github("nathansam/CircadianTools")
library(CircadianTools)

Documentation is available for CircadianTools and all of its functions via the usual help commands:

?CircadianTools

Full List Of Functions

Simple Plots

BasicPlot: Plots activity data as points and average activity as lines
CompPlot: Plots two genes from a gene activity dataset
DatasetPlot: Saves plots of all genes in a dataset. WARNING! Don't run on a large dataset! Intended for a filtered dataset
TurningPlot: Fits a spline to a given gene in a given dataset. Finds the turning points. Plots the turning points and spline.

Clustering

ClusterCenterGenerator: Finds the center of every cluster in a dataset
ClusterCorDatasetPlot: Uses ClusterCorPlot to plot all of the clusters generated by a clustering method when absolute Pearson's correlation was used as a distance measure.
ClusterCorPlot: Plots the activity level for a cluster generated by using absolute Pearson's correlation as a distance measure. Plots positively and negatively correlated genes as two different lines.
ClusterDatasetPlot: Plots the mean and error bars for all clusters across time
ClusterParamSelection: Calculates validation metrics for different clustering methods and different numbers of partitions. The validation metrics are plotted.
ClusterPlot: Plots the mean and error bars for the genes in a cluster across time
ClusterSpread: Shows how many genes are in each cluster after clustering has been applied.
ClusterText: Takes a dataframe of clusters and stores the name of all genes in a text file. The row number deontes the cluster number.
ClusterTimeProfile: Provides a dataframe of median values at each time point for each cluster.
CommonSingletonFinder: Finds the genes which belong to common singleton clusters in two different clustered datasets.
DendogramDatasetPlot: Plots the dendogram for every cluster in a clustered dataset.
DendogramPlot: Plots the dendogram for a cluster in a clustered dataset
DianaClustering: Applies Diana (DIvisive ANAlysis) clustering to a transcriptomics dataset and appends a cluster column to this dataset for all genes.
DianaParamSelection: Runs DIANA (DIvisive ANAlysis) clustering with differing numbers of partitions and returns validation metrics.
FindClusterDistanceQuantiles: Finds The distances between the center of each cluster and the centers of all the other clusters.
FindClusterMedian: Finds the center of a cluster
FindClusterQuantile: Finds The distances between the center of a cluster and the centers of all other clusters.
HClustering: Applies hierarchical clustering, clustering to a transcriptomics dataset and appends a cluster column to this dataset for all genes.
HclustParamSelection: Runs hierarchical clustering with differing numbers of partitions and returns validation metrics.
MDSPlot: Applies multidimensional scaling to a clustered transcriptomics dataset to reduce the clusters to two dimensions and then plots the clusters.
PamClustering: Applies PAM (Partitioning around Medoids) clustering to a transcriptomics dataset and appends a cluster column to this dataset for all genes.
PamParamSelection: Runs PAM with differing numbers of partitions and returns validation metrics.
QuantilePlots: Finds the quartiles for intercluster distances and plots these distances as a set of histograms
SingletonNameFinder: Finds the genes which belong to singleton clusters.

Correlation

CorAnalysis: Ranks correlation between a given gene and all other genes in a dataset. Plots both the given gene and highly correlated genes for a given correlation value
CoranalysisCluster: Correlates the average activity of a cluster with the average activity of every other cluster.
CoranalysisClusterDataset: Correlates the average activity of each cluster with every other cluster in a dataset.
CorAnalysisDataset: Correlates every gene in a dataset with every other gene in the same dataset. Allows a timelag between genes to be correlated.
CorAnalysisPar: Parallel Implementation of CorAnalysis
CorSignificantPlot: Prints or saves the genes found to be most significant by CorAnalysis or CorAnalysisPar

Cosinor

CosinorAnalysis: Fits cosinor models to transcriptomics data and plots the best-fitting models using ggplot2.
CosinorAnalysisPar: Parallel Implementation of CosinorAnalysis.
CosinorPlot: Fits a cosinor model to a given gene in a given dataset and plots the model.
CosinorResidualDatasetPlot: Fits a cosinor model and plot the residuals for multiple genes in a dataset
CosinorResidualPlot: Fits a cosinor model to a gene and plots the residuals
CosinorSignificantPlot: Prints or saves the genes found to be most significant by CosinorAnalysis.
MultiCosinorTest: Fits a cosinor model and carries out ANOVA using raw coefficients. Then fits a cosinor model with additonal sine and cosine terms with a different period. ANOVA tests are carried out on the more complex model as well as directly comparing the two models.

Cytoscape

CytoscapeFile: Converts a correlation dataframe object into a format suitable for cytoscape and saves as a csv file.
CytoscapeFilter: Reduces the size of a file intended for Cytoscape by filtering out the genes/clusters which are not correlated

Fasta Files

ContigGen: Finds all unique contig IDs in a transcriptomics dataset
FastaSub: Creates a fasta file from only certain sequences in another fasta file

Filtering

AnovaFilter: Filters a gene activity dataframe via ANOVA.
CombiFilter: Filters a transcriptomics dataset by using ZeroFilter, AnovaFilter and SizeFilter.
SizeFilter: Filters the genes with the smallest range from a transcriptomics dataset.
TFilter: Applies a filter where a t-test is carried out on gene activity levels between time points.

ZeroFilter: Filters a transcriptomics dataset such that there is a minimum number of non-zero activity readings for each gene in the resultant dataset.

RAIN

RainAnalysis: Carries out RAIN analysis on a transcriptomics dataset.
RainSignificantPlot Prints or saves the plots of genes found to be most significant by RainAnalysis.

Utility Functions

AbsCorDist: Calculates a distance matrix based on the distance measure of: 1 - |cor(x, y)|
ActivitySelect: Returns gene activity by either gene name or row number
FileConflict: Checks if a file which will be created already exists and, if necessary asks the user if this file should be overwritten.
ggplot.cosinor.lm: Adapted from the Cosinor package by Michael Sachs. Given a cosinor.lm model fit, generate a plot of the data with the fitted values.
GeneRange: Finds the range of gene activity for each gene in a dataframe. The median for the replicates is used for each time point.
GeneScale: Centers/scales every gene in a transcriptomics dataset.
GeneSub: Takes an object where the first column is genenames (IE a column of known Circadian genes) and subsets from a dataset containing activity for these genes.
MakeTimevector: Produces a vector of time values for the gene activity readings.
GeneClean: Removes columns and rows which show no gene activity over time.
MedList: Provides a dataframe of median values at each time point for each gene from a transcriptomics dataset.
TAnalysis: Experimental! A t.test is carried out on gene activity levels between time points and the number of significant increases & decreases is returned.

Examples

Basic Plotting

 BasicPlot("comp100001_c0_seq1",Laurasmappings)

Cosinor Plotting

CosinorPlot("comp102333_c0_seq21", Laurasmappings)

Turnpoint Plotting

TurningPlot("comp101252_c0_seq2", Laurasmappings)

Correlation Analysis

CorAnalysis("comp100002_c0_seq2",Laurasmappings, print=TRUE, threshold=0.97, save=TRUE)

Name		Name	Last commit message	Last commit date
Latest commit History 222 Commits
R		R
data		data
man		man
tests		tests
.Rbuildignore		.Rbuildignore
.gitignore		.gitignore
.travis.yml		.travis.yml
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CT_github.jpg		CT_github.jpg
CircadianTools.Rproj		CircadianTools.Rproj
DESCRIPTION		DESCRIPTION
LICENSE		LICENSE
NAMESPACE		NAMESPACE
README.md		README.md
_config.yml		_config.yml
appveyor.yml		appveyor.yml
codecov.yml		codecov.yml
corex.png		corex.png
cosinorex.png		cosinorex.png
saveplotex.png		saveplotex.png
turnpointex.png		turnpointex.png

License

nathansam/CircadianTools

Folders and files

Latest commit

History

Repository files navigation

Overview

Install Guide

Full List Of Functions

Simple Plots

Clustering

Correlation

Cosinor

Cytoscape

Fasta Files

Filtering

RAIN

Utility Functions

Examples

Basic Plotting

Cosinor Plotting

Turnpoint Plotting

Correlation Analysis

About

Resources

License

Code of conduct

Stars

Watchers

Forks

Languages