Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Following up: Inconsistent behavior of 'merge' for RNA and SCT assays #8687 #8775

Open
anaccsilva opened this issue Apr 15, 2024 · 1 comment

Comments

@anaccsilva
Copy link

Hello,
@viktorfeketa clearly discussed the issues I am having in issue #8687, but I was not able to completely understand the solution.
It does seem to be a recurrent issue for people that was used to Seurat V4 running SCT, but it seems a little bit more complex to use SCT and Seurat V5.

I have 12 individual samples that I previously run SCT normalization. I merged these objects - does it matter RNA, SCT assay? Should I split and run SCT again before integrating?

Here is the code I used (but I am getting strange clusters that I can't really identify):

OM.list <- c(OM39, OM44, OM46, OM19, OM22,OM28,OM32,OM36,OM37,OM45,OM48,OM51)
OM.merge <- merge( x = OM39, y= list(OM44, OM46, OM19, OM22,OM28,OM32,OM36,OM37,OM45,OM48,OM51))


OM.merge[["RNA"]] <- split(OM.merge[["RNA"]], f = OM.merge$disease)
OM.all <- SCTransform(OM.merge)
OM.all <- RunPCA(OM.all)
OM.all <- RunUMAP(OM.all, dims = 1:30)
DimPlot(OM.all, reduction = "umap", group.by = c("disease", "om"))

OM.all <- IntegrateLayers(object = OM.all, method = HarmonyIntegration, normalization.method = "SCT", verbose = F)
OM.all <- FindNeighbors(OM.all, reduction = "harmony", dims = 1:30)
OM.all <- FindClusters(OM.all, resolution = c(0, 0.1, 0.2, 0.4, 0.5, 0.6, 0.8, 1.0, 1.2, 1.5), verbose = FALSE)

I really appreciate if someone can show the workflow for integrating multiple samples using SCT and starting from a list of individual Seurat objects.
Thank you,
Ana

@mhkowalski
Copy link
Contributor

Hello,

Your code looks reasonable to me, except what is the disease variable? Running SCTransform on an object with layers split by disease will create a separate model for each value of disease, which might not be what you want, especially if you have samples from different sequencing runs that have the same value of "disease".

Instead, you could split by "sample" or whatever variable corresponds to sequencing runs, which will also be used for integration.
OM.merge[["RNA"]] <- split(OM.merge[["RNA"]], f = OM.merge$sample)

The updated workflow we show in the vignette (and in your code above) is equivalent to runningSCTransform and RunPCA separately on a list of SeuratObjects. It's easier to merge them initially, split them by layer, and then run preprocessing steps on 1 multi-layered object so that you don't have to loop over a list of seurat objects.

Hope that is helpful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants