Tornielli GB, Sandri M, Fasoli M, Amato A, Pezzotti M, Zuccolotto P, Zenoni S (2023) A molecular phenology scale of grape berry development. Horticulture Research, Volume 10, Issue 5:uhad048. doi:10.1093/hr/uhad048
The molecular phenology scale (MPhS) represents a new tool to precisely
align time-series of fruit samples on the basis of molecular changes and
to quantify their transcriptomic distance.
The MPhS was built by
exploiting molecular-based information from several grape berry
transcriptomic datasets.
The proposed statistical pipeline consists
of an unsupervised learning procedure yielding an innovative combination
of semiparametric, smoothing, and dimensionality reduction tools.
The MPhS is a complementary method for mapping the progression of grape
berry development in greater detail than classic time- or
phenotype-based approaches, and could help in coping with challenges such
as those raised by climate change.
You can install the development version of MPhS from GitHub with:

    # install.packages("pak")
    pak::pak("sndmrc/MPhS")

This is a basic example that shows you how to map the `RPKMdata` dataset (included in the package) onto the MPhS. For more information about this dataset, please see the `RPKMdata` help documentation by using `?RPKMdata`.
Load libraries and data.

    library(MPhS)
    library(tidyr)
    library(dplyr)
    data("RPKMdata")

The `MPhStimepoints` command, which maps data to the molecular phenology scale, requires an input dataframe organized with samples as rows and genes as columns, with the following characteristics:

- expression values for each gene must be in separate columns;
- additional columns must be included for experimental conditions and maturation stages;
- each column representing gene expression levels must be named using its gene ID (either V1 or V3 annotation).

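
The requirements listed above can be checked before calling `MPhStimepoints`. The following is a minimal sketch and not part of the package: `check_mphs_input()` is a hypothetical helper, it assumes that every column other than the two metadata columns holds gene expression values, and the toy data frame uses placeholder column names rather than real gene IDs.

```r
# Hypothetical helper (not part of MPhS): checks that a data frame has the
# layout described above before it is passed to MPhStimepoints().
check_mphs_input <- function(data, strata_var, stage_var) {
  stopifnot(is.data.frame(data))

  # The experimental-condition and maturation-stage columns must be present
  meta_cols <- c(strata_var, stage_var)
  missing_cols <- setdiff(meta_cols, names(data))
  if (length(missing_cols) > 0) {
    stop("Missing metadata column(s): ", paste(missing_cols, collapse = ", "))
  }

  # Assumption: every remaining column is a gene expression column
  gene_cols <- setdiff(names(data), meta_cols)
  if (length(gene_cols) == 0) {
    stop("No gene expression columns found")
  }

  # Expression values must be numeric
  non_numeric <- gene_cols[!vapply(data[gene_cols], is.numeric, logical(1))]
  if (length(non_numeric) > 0) {
    stop("Non-numeric expression column(s): ",
         paste(non_numeric, collapse = ", "))
  }

  # Return the detected gene columns invisibly
  invisible(gene_cols)
}

# Toy input: two samples, two placeholder gene columns (not real gene IDs)
toy <- data.frame(
  Cultivar = c("A", "A"),
  Stage    = c("s1", "s2"),
  geneX    = c(1.5, 2.0),
  geneY    = c(0.0, 3.2)
)
checked_genes <- check_mphs_input(toy, strata_var = "Cultivar",
                                  stage_var = "Stage")
```

A data frame that still carries other non-numeric columns (for example a replicate label) would be rejected by this sketch, so such columns would need to be dropped or averaged out first.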
Create variables representing the experimental conditions and a variable that defines the maturation stage.

    exp_cond <- names(RPKMdata)[-1]
    genes <- RPKMdata$gene_id
    dts_vars <- data.frame(exp_cond) %>%
      separate(exp_cond, into = c("Cultivar", "Stage", "Replicate"), sep = "_")

Transpose the gene expression matrix and add the newly derived variables.

    dts <- t(RPKMdata[, -1])
    dts <- cbind(dts, dts_vars)
    names(dts) <- c(genes, names(dts_vars))

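To make the reshaping concrete, here is a small self-contained sketch on a made-up table with the layout assumed for `RPKMdata`: a `gene_id` column followed by one column per sample, named `Cultivar_Stage_Replicate`. The gene names, sample names, and values are invented for illustration, and base R `strsplit` stands in for `tidyr::separate` so that the block runs without extra packages.

```r
# Toy table: genes as rows, samples as columns (all values are invented)
toy <- data.frame(
  gene_id = c("geneX", "geneY"),
  CS_s1_1 = c(1, 10),
  CS_s1_2 = c(3, 30),
  PN_s1_1 = c(5, 50),
  stringsAsFactors = FALSE
)

sample_names <- names(toy)[-1]

# Split "Cultivar_Stage_Replicate" into three metadata columns
parts <- do.call(rbind, strsplit(sample_names, "_", fixed = TRUE))
toy_vars <- data.frame(
  Cultivar  = parts[, 1],
  Stage     = parts[, 2],
  Replicate = parts[, 3],
  stringsAsFactors = FALSE
)

# Transpose so that samples become rows and genes become columns
toy_t <- as.data.frame(t(toy[, -1]))
names(toy_t) <- toy$gene_id

# Attach the metadata columns to the transposed expression values
toy_wide <- cbind(toy_t, toy_vars)
```

After these steps `toy_wide` has one row per sample, one column per gene, and three trailing metadata columns, which mirrors the structure of `dts` above.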
For each stage and each cultivar, calculate the mean value of the three replicates (this can take several minutes).

    dts_means <- dts %>%
      group_by(Cultivar, Stage) %>%
      summarize(across(all_of(genes), mean))

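The averaging step can be illustrated on a made-up table with two replicates per group. This sketch uses base R `aggregate` as a stand-in for the `group_by`/`summarize` pipeline so that it runs without dplyr; the column names and values are invented for illustration.

```r
# Toy data: one row per sample, two replicates per cultivar (invented values)
toy_wide <- data.frame(
  Cultivar  = c("CS", "CS", "PN", "PN"),
  Stage     = c("s1", "s1", "s1", "s1"),
  Replicate = c("1", "2", "1", "2"),
  geneX     = c(1, 3, 5, 7),
  geneY     = c(10, 30, 50, 70)
)
toy_genes <- c("geneX", "geneY")

# One row per Cultivar/Stage combination, each gene averaged over replicates
toy_means <- aggregate(
  toy_wide[toy_genes],
  by = list(Cultivar = toy_wide$Cultivar, Stage = toy_wide$Stage),
  FUN = mean
)
```

The result has one row per cultivar and stage, with the replicate column dropped, which is the same shape as `dts_means`.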
Map data onto the transcriptomic scale using the `MPhStimepoints` command.

    MPhS_out <- MPhStimepoints(data = dts_means, strata_var = "Cultivar", stage_var = "Stage")

The `MPhS_out` object can be used to visualize the position of the samples on the transcriptomic scale.

    p <- plot(MPhS_out)
    print(p)