You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This is the code I used to remove the batch effect :
First I installed the package limma using mamba. For DESeq2 I load the module sequana.
library(limma)
library(DESeq2)
# load the raw counts
raw_counts <- read.csv("path_to_raw_counts", sep=",",header=T)
design <- read.csv("design.csv", sep=",", header=T)
# transform the counts to a matrix
cts <- as.matrix(raw_counts)
# transformation into factors
coldata <- DataFrame(label=factor(design$label), condition=factor(design$condition), batch=factor(design$batch))
# check that they. have the same order
all(rownames(coldata) == colnames(cts))
# load data with deseq2
dds <- DESeqDataSetFromMatrix(countData = cts,colData = coldata,design = ~ batch + condition)
# create the dds object
dds <- DESeq(dds)
# vst transformation for PCA
vsd <- vst(dds)
# remove the batch effect with limma
assay(vsd) <- limma::removeBatchEffect(assay(vsd), vsd$batch)
# plot the PCA
pdf("pca_batch.pdf")
plotPCA(vsd, intgroup = c("condition"))
dev.off()
In the case of two batch effects :
# the first lines of code are the same
# transformation of the design file into factors
coldata <- DataFrame(label=factor(design$label), condition=factor(design$condition), group=factor(group$batch))
# here we load the dataset with a simple design
dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ condition)
# we add the column ind.n in colData for individuals
coldata$ind.n <- factor(rep(rep(1:4,each=1),4))
# we modify the data already loaded with DESeq2 and add the col ind.n
colData(dds) = coldata
#the modified design will result in this matrix
model.matrix(~ group + group:ind.n + group:condition, coldata)
# PCA part
vsd <- vst(dds)
# we remove the two batch effects
assay(vsd) <- limma::removeBatchEffect(assay(vsd), vsd$group, vsd$ind.n)
pdf("pca_test2.pdf")
plotPCA(vsd, intgroup = c("cnd"))
dev.off()
No description provided.
The text was updated successfully, but these errors were encountered: