Downsample from each cluster #115

Open
hiraksarkar opened this issue Dec 23, 2021 · 3 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@hiraksarkar

Hi @evanbiederstedt ,

Is there an equivalent within conos to this Seurat function: satijalab/seurat#3116 (comment)?

@hiraksarkar hiraksarkar added the enhancement New feature or request label Dec 23, 2021
@evanbiederstedt
Collaborator

Hi @hiraksarkar

You'll have to write a method. There are more details here: satijalab/seurat#3033

Note that subset() is an S3 method, which you could modify for the package. For an example of this, see https://github.com/kharchenkolab/conos/blob/main/R/access_wrappers.R
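
As a rough illustration of the S3 approach, the sketch below defines a subset() method for a toy class. The class name `cellset`, the constructor `newCellset()`, and the `cells` argument are hypothetical and are not part of conos; a real method would have to deal with the Conos object's own structure.

```r
# Hypothetical toy class: a cells-by-genes matrix plus cluster labels.
newCellset <- function(counts, clusters) {
  structure(list(counts = counts, clusters = clusters), class = "cellset")
}

# S3 method: dispatched when subset() is called on a 'cellset' object.
subset.cellset <- function(x, cells, ...) {
  x$counts <- x$counts[cells, , drop = FALSE]
  x$clusters <- x$clusters[cells]
  x
}

counts <- matrix(1:12, nrow = 4,
                 dimnames = list(paste0("cell", 1:4), paste0("gene", 1:3)))
clusters <- setNames(factor(c("a", "a", "b", "b")), rownames(counts))

cs <- newCellset(counts, clusters)
small <- subset(cs, cells = c("cell1", "cell3"))
rownames(small$counts)
```

Because dispatch is on the class of the first argument, existing calls to subset() on other objects are unaffected.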

But yes, this will require a PR from you.

Thanks, Evan

@hiraksarkar hiraksarkar added the help wanted Extra attention is needed label Dec 24, 2021
@hiraksarkar
Author

Hi Evan,

Thanks for your input

Hirak

@evanbiederstedt
Collaborator

evanbiederstedt commented Dec 25, 2021

@hiraksarkar

I started writing up a function to do this, beginning with the normalized matrices (not cells in the cluster):

#' Downsamples every sample in a valid Conos object to the same number of cells.
#' Specify the number of cells that should remain in each sample after downsampling.
#'
#' @param con conos object
#' @param number.of.cells numeric Number of cells to keep in each sample after downsampling. (Note: this is the number of cells that remain, not the number of cells removed.)
#' @return conos object with each sample downsampled to 'number.of.cells' cells
#' @export
downsampleInputCells <- function(con, number.of.cells=NULL) {
  if (!inherits(con, 'Conos')) {
    stop("Input 'con' not a valid Conos object. ")
  }
  if (length(con$samples)==0) {
    stop("There are no samples in this Conos object to apply downsampling. ")
  }
  if (is.null(number.of.cells)) {
    message("Number of cells not specified, returning Conos object without downsampling. ")
    return(con)
  }
  ## 'number.of.cells' must be a single positive whole number
  if (!is.numeric(number.of.cells) || length(number.of.cells) != 1 ||
      number.of.cells < 1 || number.of.cells != as.integer(number.of.cells)) {
    stop("Parameter 'number.of.cells' must be a single positive integer ")
  }
  ## Check that a sufficient number of cells exist in each sample before removing
  ## Iterate through list of samples. Check if Pagoda2 or Seurat. 
  ## If Pagoda2, then access counts
  for (i in seq_along(con$samples)) {
    if ('Pagoda2' %in% class(con$samples[[i]])) {
      ## use a name other than 'sample' to avoid masking base::sample()
      p2 = con$samples[[i]]
      ## number of cells in sample (rows of the counts matrix)
      cells_in_sample = nrow(p2$counts)
      ## Check that 'number.of.cells' does not exceed the number of cells in the sample
      ## If 'number.of.cells' is greater, then throw error
      if (number.of.cells > cells_in_sample) {
        ## report the sample by name; pasting the object itself into the message would fail
        stop(paste0("The sample ", names(con$samples)[i], " has ", cells_in_sample, " cells. The parameter 'number.of.cells' specified is larger than the number of cells within the sample. Please correct this."))
      }
      subsample = sample.int(cells_in_sample, number.of.cells, replace=FALSE)
      ## note: only the counts matrix is subset here; other fields of the object are left as they were
      con$samples[[i]]$counts = p2$counts[subsample, , drop=FALSE]
    } else if ('Seurat' %in% class(con$samples[[i]])) {
      message("Note: this function creates a new Seurat object with downsampled cells ")
      ## use a name other than 'sample' to avoid masking base::sample()
      so = con$samples[[i]]
      message("First updating the object to the most recent Seurat object structure")
      so = UpdateSeuratObject(so)
      assay_data = GetAssayData(so)
      ## number of cells in sample (columns of the assay matrix)
      cells_in_sample = ncol(assay_data)
      ## Check that 'number.of.cells' does not exceed the number of cells in the sample
      ## If 'number.of.cells' is greater, then throw error
      if (number.of.cells > cells_in_sample) {
        ## report the sample by name; pasting the object itself into the message would fail
        stop(paste0("The sample ", names(con$samples)[i], " has ", cells_in_sample, " cells. The parameter 'number.of.cells' specified is larger than the number of cells within the sample. Please correct this."))
      }
      subsample = sample.int(cells_in_sample, number.of.cells, replace=FALSE)
      assay_data = assay_data[, subsample, drop=FALSE]
      ##new.seurat.object <- SetAssayData(object = so, slot = "counts", new.data = assay_data)
      ## investigate how to update Seurat object...
      con$samples[[i]] = CreateSeuratObject(assay_data)
    }
  }
  ## return the modified object, as documented in @return
  return(con)
}

But I've realized this is really not what you want. I think downsampling is always a bad idea. If you're removing cells for reasons other than QC, then I think it's a mistake.

On reflection, I think what you're trying to do is remove cells in the clusters for the heatmaps, correct? In that case, I think it's best to write a function modifying the heatmap for your purposes. Play around with this:

https://github.com/jokergoo/ComplexHeatmap
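
For the per-cluster case, a minimal sketch in base R might look like the following. It assumes the cluster labels are available as a factor named by cell ID; the function name `downsamplePerCluster()` and the `max.cells` argument are made up for illustration, and how the labels are pulled out of a Conos object is not shown.

```r
# 'clusters' is assumed to be a factor of cluster labels named by cell ID.
# Keeps at most 'max.cells' cells per cluster and returns their IDs.
downsamplePerCluster <- function(clusters, max.cells) {
  cells.by.cluster <- split(names(clusters), clusters, drop = TRUE)
  kept <- lapply(cells.by.cluster, function(cells) {
    if (length(cells) <= max.cells) {
      return(cells)  # small clusters are kept whole
    }
    # Sample indices rather than the vector itself, which avoids
    # sample()'s special handling of length-one inputs.
    cells[sample.int(length(cells), max.cells)]
  })
  unlist(kept, use.names = FALSE)
}

set.seed(1)
clusters <- setNames(factor(rep(c("a", "b", "c"), times = c(50, 10, 3))),
                     paste0("cell", 1:63))
kept.cells <- downsamplePerCluster(clusters, max.cells = 5)
table(clusters[kept.cells])
```

The returned cell IDs could then be used to select rows or columns of the matrix passed to the heatmap, leaving the underlying object untouched.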

(Also, the above function really shouldn't use a for-loop, which is discouraged in R. Try sccore::plapply() https://www.rdocumentation.org/packages/sccore/versions/0.1.1/topics/plapply )
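
To show what the loop-free version could look like, here is a small sketch using base lapply() on a plain list of matrices. The list `samples` and the helper `downsampleRows()` are hypothetical stand-ins for the per-sample objects; a parallel variant from sccore could presumably be swapped in for lapply(), but that is not shown here so the example stays self-contained.

```r
# Hypothetical stand-in for a list of samples: cells-by-genes matrices.
set.seed(1)
samples <- list(
  s1 = matrix(rpois(60, 2), nrow = 20),
  s2 = matrix(rpois(45, 2), nrow = 15)
)

# Keep 'number.of.cells' randomly chosen rows (cells) of one matrix.
downsampleRows <- function(m, number.of.cells) {
  if (number.of.cells > nrow(m)) {
    stop("'number.of.cells' is larger than the number of cells in the sample")
  }
  m[sample.int(nrow(m), number.of.cells), , drop = FALSE]
}

# Apply the helper to every sample; the list names are preserved.
downsampled <- lapply(samples, downsampleRows, number.of.cells = 10)
sapply(downsampled, nrow)
```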
