Support to handle .parquet output from Vizgen #7190

Open
wants to merge 24 commits into base: develop
Changes from 11 commits

Commits
5fe5b89
updates for Vizgen
alikhuseynov Apr 13, 2023
aa32cc6
updates for Vizgen
alikhuseynov Apr 13, 2023
56d983c
updates for `ReadVizgen()`
alikhuseynov Apr 13, 2023
115e45d
updates for `LoadVizgen`
alikhuseynov Apr 13, 2023
eff89e9
fix for argument 'metadata'
alikhuseynov Apr 13, 2023
87d9303
param args for `LoadVizgen`
alikhuseynov Apr 13, 2023
089740a
fix args for `ReadVizgen`
alikhuseynov Apr 13, 2023
547000e
add support for `future.apply`
alikhuseynov Apr 19, 2023
f73dbdb
Vizgen support single `.parquet` file
alikhuseynov May 24, 2023
5ca1069
major fix for `.parquet` segmentations
alikhuseynov May 31, 2023
9389ce9
fix for `LoadVizgen`
alikhuseynov May 31, 2023
dc029ac
major fix for `ReadVizgen()`
alikhuseynov Jun 6, 2023
8f7153e
fix for `LoadVizgen()`
alikhuseynov Jun 6, 2023
799323d
update `ReadVizgen()`
alikhuseynov Jul 26, 2023
9ea4458
update `LoadVizgen()`
alikhuseynov Jul 26, 2023
38fad2a
resolving some conflicts in preprocessing.R
alikhuseynov Jul 26, 2023
bf15b6e
..update preprocessing.R from `develop`
alikhuseynov Jul 26, 2023
f8461ca
cleaning `ReadVizgen`
alikhuseynov Aug 18, 2023
a7be25a
small bug fix in `ReadVizgen`
alikhuseynov Aug 22, 2023
6352d56
added `sf` & filter polygons -> `ReadVizgen()`
alikhuseynov Aug 28, 2023
f43fed0
..updated `LoadVizgen()`
alikhuseynov Aug 28, 2023
69ba89d
adding `.filter_polygons()`
alikhuseynov Aug 28, 2023
038271f
optimized parallelization for `.filter_polygons()`
alikhuseynov Aug 28, 2023
9db188a
rm space-only changes in `convenience.R`
alikhuseynov Feb 15, 2024
213 changes: 137 additions & 76 deletions R/convenience.R
@@ -20,11 +20,11 @@ NULL
#' @rdname ReadAkoya
#'
LoadAkoya <- function(
filename,

I recommend removing all of the space-only changes so that reviewers can focus on changes that actually make a functional difference. Thank you so much for all of your work on this!

Author

OK, I think most of the space-only changes are now removed. Thanks.

type = c('inform', 'processor', 'qupath'),
fov,
assay = 'Akoya',
...
filename,
type = c('inform', 'processor', 'qupath'),
fov,
assay = 'Akoya',
...
) {
# read in matrix and centroids
data <- ReadAkoya(filename = filename, type = type)
@@ -115,7 +115,7 @@ LoadNanostring <- function(data.dir, fov, assay = 'Nanostring') {
assay = assay
)
obj <- CreateSeuratObject(counts = data$matrix, assay = assay)

# subset both object and coords based on the cells shared by both
cells <- intersect(
Cells(x = coords, boundary = "segmentation"),
@@ -129,45 +129,106 @@ LoadNanostring <- function(data.dir, fov, assay = 'Nanostring') {

#' @return \code{LoadVizgen}: A \code{\link[SeuratObject]{Seurat}} object
#'
#' @param add.zIndex If to add \code{z} slice index to a cell
#' @param update.object If to update final object, default to TRUE.
#' @param ... Arguments passed to \code{ReadVizgen}
#'
#' @importFrom SeuratObject Cells CreateCentroids CreateFOV
#' CreateSegmentation CreateSeuratObject
#' @import dplyr
#'
#' @export
#'
#' @rdname ReadVizgen
#'
LoadVizgen <- function(data.dir, fov, assay = 'Vizgen', z = 3L) {
data <- ReadVizgen(
data.dir = data.dir,
filter = "^Blank-",
type = c("centroids", "segmentations"),
z = z
)
segs <- CreateSegmentation(data$segmentations)
cents <- CreateCentroids(data$centroids)
segmentations.data <- list(
"centroids" = cents,
"segmentation" = segs
)
coords <- CreateFOV(
coords = segmentations.data,
type = c("segmentation", "centroids"),
molecules = data$microns,
assay = assay
)
obj <- CreateSeuratObject(counts = data$transcripts, assay = assay)
# only consider the cells we have counts and a segmentation for
# Cells which don't have a segmentation are probably found in other z slices.
coords <- subset(
x = coords,
cells = intersect(
x = Cells(x = coords[["segmentation"]]),
y = Cells(x = obj)
)
)
# add coords to seurat object
LoadVizgen <- function(data.dir, fov = "vz", assay = 'Vizgen',
add.zIndex = TRUE, update.object = TRUE,

It would be useful to include `filter = '^Blank-'` to filter out the blank genes.

Also, `mol.type = 'microns'` needs to be an input argument.

Author

Could optionally add `filter = '^Blank-'`; some users might want to keep the background genes for later checks.
The `mol.type` arg can be used in `LoadVizgen()`.

I agree some users may want to look at them, but they should be removed before downstream analysis (and this way it works with the current Seurat Vizgen tutorial).

mol.type is also called outside of ReadVizgen() and it's producing errors for me when not loaded into the function.

Author

Sounds good, I will add `filter = "^Blank-"` as the default; users who want to keep the blank genes should set `filter = NA_character_` before loading the data.

I will also make sure that `mol.type` works without errors.
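
The default discussed in this thread can be sketched in base R as follows. This is a minimal illustration, not the PR's implementation: `drop_features()` and the toy matrix are hypothetical, and the sketch assumes that features are stored as row names and that an `NA` pattern disables filtering.

```r
# Hypothetical helper (not part of Seurat): drop features whose names match
# `filter`; passing NA_character_ keeps every feature.
drop_features <- function(counts, filter = "^Blank-") {
  if (is.na(filter)) {
    return(counts)
  }
  keep <- !grepl(pattern = filter, x = rownames(counts))
  counts[keep, , drop = FALSE]
}

# Toy genes-by-cells matrix with one blank (background) probe
counts <- matrix(
  1:6,
  nrow = 3,
  dimnames = list(c("GeneA", "Blank-1", "GeneB"), c("cell1", "cell2"))
)

filtered <- drop_features(counts)                            # default: blanks removed
unfiltered <- drop_features(counts, filter = NA_character_)  # opt out of filtering
```

With this shape, the tutorial workflow gets blank-free counts by default, while users who want to inspect background probes can still opt out.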

Working off your latest commit, I added `mol.type = 'microns'` to the `LoadVizgen()` arguments, and it worked for all my test cases so far.

Author

Great! For the next commit, I will make sure the `mol.type` arg works without specifying it in `LoadVizgen()`.
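
One way the `mol.type` issue raised above might be handled is to resolve the value from `...` with a fallback, so the function body can refer to it even when the caller never passes it. The sketch below is only an assumption about a possible approach: `resolve_mol_type()` is a hypothetical helper, and the choices `"microns"` and `"pixels"` are assumed here rather than taken from the diff.

```r
# Hypothetical sketch: resolve `mol.type` from `...`, defaulting to "microns",
# so the caller does not have to pass it explicitly.
resolve_mol_type <- function(...) {
  dots <- list(...)
  mol.type <- dots[["mol.type"]]   # NULL when the argument was not supplied
  if (is.null(mol.type)) {
    mol.type <- "microns"
  }
  # Validate against the assumed set of supported coordinate types
  match.arg(arg = mol.type, choices = c("microns", "pixels"))
}

default_type <- resolve_mol_type(z = 3L)              # falls back to the default
pixel_type <- resolve_mol_type(mol.type = "pixels")   # explicit value is respected
```

Declaring `mol.type = 'microns'` directly in the function signature, as the reviewer did, is the simpler alternative; the sketch only shows how the value could be recovered when it travels through `...`.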

...) {
data <- ReadVizgen(data.dir = data.dir, ...)

#gc() %>% invisible()
message("Creating Seurat object..")
obj <- CreateSeuratObject(counts = data[["transcripts"]], assay = assay)

# in case no segmentation is present, use boxes
if (!"segmentations" %in% names(data)) {
if ("boxes" %in% names(data)) {
bound.boxes <- CreateSegmentation(data[["boxes"]])
cents <- CreateCentroids(data[["centroids"]])
bound.boxes.data <- list(centroids = cents,
boxes = bound.boxes)
message("Creating FOVs..", "\n",
">>> using box coordinates instead of segmentations")
coords <- CreateFOV(coords = bound.boxes.data,
type = c("boxes", "centroids"),
molecules = data[[mol.type]],
assay = assay)
} else { # in case no segmentation & no boxes are present, use centroids only
cents <- CreateCentroids(data[["centroids"]])
message("Creating FOVs..", "\n",
">>> using only centroids")
coords <- CreateFOV(coords = list(centroids = cents),
type = c("centroids"),
molecules = data[[mol.type]],
assay = assay)
}
# only consider the cells we have counts and a segmentation for
# Cells which don't have a segmentation are probably found in other z slices.
coords <- subset(x = coords,
cells = intersect(x = Cells(x = coords[["boxes"]]),
y = Cells(x = obj)))
} else {
segs <- CreateSegmentation(data[["segmentations"]])
cents <- CreateCentroids(data[["centroids"]])
segmentations.data <- list(centroids = cents, segmentation = segs)
message("Creating FOVs..", "\n",
">>> using segmentations")
coords <- CreateFOV(coords = segmentations.data,
type = c("segmentation", "centroids"),
molecules = data[[mol.type]],
assay = assay)
coords <- subset(x = coords,
cells = intersect(x = Cells(x = coords[["segmentation"]]),
y = Cells(x = obj)))
}

# add z-stack index for cells
if (add.zIndex) { obj$z <- data$zIndex %>% pull(z) }

# add metadata vars
message(">>> adding metadata infos")
if (c("metadata" %in% names(data))) {
metadata <- match.arg(arg = "metadata", choices = names(data), several.ok = TRUE)
meta.vars <- names(data[[metadata]])
for (i in meta.vars %>% seq) {
obj %<>% AddMetaData(metadata = data[[metadata]][[meta.vars[i]]],
col.name = meta.vars[i])
}
}

# sanity on fov name
fov %<>% gsub("_|-", ".", .)

message(">>> adding FOV")
obj[[fov]] <- coords

## filter - keep cells with counts > 0
# small helper function to return metadata
callmeta <- function (object = NULL) { return(object@meta.data) }
nCount <- grep("nCount", callmeta(obj) %>% names, value = TRUE)
if (any(obj[[nCount]] > 0)) {
message(">>> filtering object - keep cells with counts > 0")
obj %<>% subset(subset = !!base::as.symbol(nCount) > 0)
}

if (update.object) {
message("Updating object..")
obj %<>% UpdateSeuratObject() }

message("Object is ready!")
return(obj)

gc() %>% invisible()
}
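
The fallback order used by the new `LoadVizgen()` (segmentations first, then boxes, then centroids only) can be summarized in a few lines of base R. `pick_boundary()` is a hypothetical helper written for illustration; it only inspects the names of the list and does not call any Seurat function.

```r
# Hypothetical helper: decide which boundary type to build from the names
# present in the list returned by the reader.
pick_boundary <- function(data) {
  if ("segmentations" %in% names(data)) {
    "segmentation"
  } else if ("boxes" %in% names(data)) {
    "boxes"
  } else {
    "centroids"
  }
}

b1 <- pick_boundary(list(centroids = 1, segmentations = 2))
b2 <- pick_boundary(list(centroids = 1, boxes = 2))
b3 <- pick_boundary(list(centroids = 1))
```

Keying the later `subset()` call on the selected name would avoid indexing `coords[["boxes"]]` in the case where only centroids were created, which the centroid-only branch in the diff appears to do.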

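
The last filtering step in `LoadVizgen()` keeps only cells whose total count is above zero. On a plain matrix the same idea looks like the sketch below; the toy matrix is hypothetical and the code deliberately avoids Seurat's `subset()`.

```r
# Toy genes-by-cells count matrix; cell2 has no counts at all
counts <- matrix(
  c(1, 0, 0, 0, 2, 3),
  nrow = 2,
  dimnames = list(c("GeneA", "GeneB"), c("cell1", "cell2", "cell3"))
)

# Total counts per cell, analogous to the nCount metadata column
n_count <- colSums(counts)

# Keep only the cells with at least one count
kept <- counts[, n_count > 0, drop = FALSE]
```
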
#' @return \code{LoadXenium}: A \code{\link[SeuratObject]{Seurat}} object
@@ -188,7 +249,7 @@ LoadXenium <- function(data.dir, fov = 'fov', assay = 'Xenium') {
data.dir = data.dir,
type = c("centroids", "segmentations"),
)

segmentations.data <- list(
"centroids" = CreateCentroids(data$centroids),
"segmentation" = CreateSegmentation(data$segmentations)
@@ -199,15 +260,15 @@ LoadXenium <- function(data.dir, fov = 'fov', assay = 'Xenium') {
molecules = data$microns,
assay = assay
)

xenium.obj <- CreateSeuratObject(counts = data$matrix[["Gene Expression"]], assay = assay)
if("Blank Codeword" %in% names(data$matrix))
xenium.obj[["BlankCodeword"]] <- CreateAssayObject(counts = data$matrix[["Blank Codeword"]])
else
xenium.obj[["BlankCodeword"]] <- CreateAssayObject(counts = data$matrix[["Unassigned Codeword"]])
xenium.obj[["ControlCodeword"]] <- CreateAssayObject(counts = data$matrix[["Negative Control Codeword"]])
xenium.obj[["ControlProbe"]] <- CreateAssayObject(counts = data$matrix[["Negative Control Probe"]])

xenium.obj[[fov]] <- coords
return(xenium.obj)
}
@@ -241,27 +302,27 @@ PCAPlot <- function(object, ...) {
#' @export
#'
SpatialDimPlot <- function(
object,
group.by = NULL,
images = NULL,
cols = NULL,
crop = TRUE,
cells.highlight = NULL,
cols.highlight = c('#DE2D26', 'grey50'),
facet.highlight = FALSE,
label = FALSE,
label.size = 7,
label.color = 'white',
repel = FALSE,
ncol = NULL,
combine = TRUE,
pt.size.factor = 1.6,
alpha = c(1, 1),
image.alpha = 1,
stroke = 0.25,
label.box = TRUE,
interactive = FALSE,
information = NULL
object,
group.by = NULL,
images = NULL,
cols = NULL,
crop = TRUE,
cells.highlight = NULL,
cols.highlight = c('#DE2D26', 'grey50'),
facet.highlight = FALSE,
label = FALSE,
label.size = 7,
label.color = 'white',
repel = FALSE,
ncol = NULL,
combine = TRUE,
pt.size.factor = 1.6,
alpha = c(1, 1),
image.alpha = 1,
stroke = 0.25,
label.box = TRUE,
interactive = FALSE,
information = NULL
) {
return(SpatialPlot(
object = object,
@@ -294,22 +355,22 @@ SpatialDimPlot <- function(
#' @export
#'
SpatialFeaturePlot <- function(
object,
features,
images = NULL,
crop = TRUE,
slot = 'data',
keep.scale = "feature",
min.cutoff = NA,
max.cutoff = NA,
ncol = NULL,
combine = TRUE,
pt.size.factor = 1.6,
alpha = c(1, 1),
image.alpha = 1,
stroke = 0.25,
interactive = FALSE,
information = NULL
object,
features,
images = NULL,
crop = TRUE,
slot = 'data',
keep.scale = "feature",
min.cutoff = NA,
max.cutoff = NA,
ncol = NULL,
combine = TRUE,
pt.size.factor = 1.6,
alpha = c(1, 1),
image.alpha = 1,
stroke = 0.25,
interactive = FALSE,
information = NULL
) {
return(SpatialPlot(
object = object,