simslab/umap_projection

umap_projection

This code projects single-cell RNA-seq profiles from one dataset into the UMAP embedding coordinates of a different dataset, using Spearman's correlation as the similarity measure. Spearman's correlation is particularly useful in this context because 1) the cluster_diffex pipeline uses Spearman's correlation as its similarity metric and 2) Spearman's correlation is non-parametric, which allows projection of scRNA-seq data generated by a completely different method from the one used for the original embedding. For example, one could project SMART-seq data (e.g. TPM data) onto a UMAP embedding generated from 10x Genomics Chromium data (e.g. molecular counting data). This repository includes code for computing the transformation and generating simple figures.
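As a simplified illustration of the second point, the sketch below shows that Spearman's correlation depends only on ranks, so a strictly increasing rescaling of one profile leaves it unchanged while Pearson's correlation shifts. The simulated profiles and the log-based rescaling are assumptions made for the example. Real differences between TPM and molecular counts are not strictly monotone, so this demonstrates the rank-based property only and is not code from this repository.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Two simulated expression profiles over 200 genes (illustrative data only).
profile_a = rng.poisson(5.0, size=200).astype(float)
profile_b = profile_a + rng.poisson(2.0, size=200)

# A strictly increasing transform standing in for a different measurement scale.
rescaled_b = np.log1p(profile_b) * 1e3

# Spearman's correlation uses ranks, which a monotone transform preserves.
rho_counts, _ = spearmanr(profile_a, profile_b)
rho_rescaled, _ = spearmanr(profile_a, rescaled_b)

# Pearson's correlation uses the values themselves, so it is scale-dependent.
pearson_counts = np.corrcoef(profile_a, profile_b)[0, 1]
pearson_rescaled = np.corrcoef(profile_a, rescaled_b)[0, 1]

print("Spearman:", rho_counts, rho_rescaled)
print("Pearson: ", pearson_counts, pearson_rescaled)
```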

Requirements:

  1. Python 3.6 or higher
  2. NumPy
  3. scikit-learn
  4. UMAP (https://github.com/lmcinnes/umap)
  5. SciPy
  6. Numba
  7. Seaborn

Suggested usage:

  1. Install dependencies.

  2. Clone this repository.

  3. Run umap_transform.py. Example usage:

python umap_transform.py -rm REFDATA/REFDATA.matrix.txt -pm QUERY1/QUERY1.matrix.txt QUERY2/QUERY2.matrix.txt -p project_to_REFDATA/project_to_REFDATA --markers markers.txt -k 5

where REFDATA.matrix.txt is a tab-delimited matrix of molecular counts for the reference (the first two columns contain gene IDs (GIDs) and gene symbols; each subsequent column contains the counts for one cell), QUERYX.matrix.txt is a matrix of molecular counts for query sample X (same format as REFDATA.matrix.txt), and markers.txt is a one-column list of GIDs used to compute similarity (usually highly variable genes). None of the files should have a header.
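To make the file layout concrete, here is a minimal sketch that parses matrices in the format described above, restricts them to marker genes, and computes Spearman's correlation between query and reference cells. The final step, which places each query cell at the mean position of its k most similar reference cells, is a deliberately simple stand-in and is not taken from umap_transform.py. All function names, the toy data, and the reference coordinates are hypothetical.

```python
import io
import numpy as np
from scipy.stats import rankdata

def load_matrix(handle):
    """Parse a headerless, tab-delimited matrix: GID, gene symbol, then one column per cell."""
    gids, rows = [], []
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        gids.append(fields[0])
        rows.append([float(value) for value in fields[2:]])
    return gids, np.array(rows)

def select_markers(gids, matrix, markers):
    """Keep only the rows whose GID is in the marker list, in marker order."""
    row_of = {gid: i for i, gid in enumerate(gids)}
    return matrix[[row_of[gid] for gid in markers if gid in row_of]]

def spearman_to_reference(ref, query):
    """Spearman correlation of every query cell against every reference cell."""
    ref_ranks = rankdata(ref, axis=0)      # rank genes within each cell
    query_ranks = rankdata(query, axis=0)
    n_ref = ref.shape[1]
    corr = np.corrcoef(ref_ranks.T, query_ranks.T)
    return corr[n_ref:, :n_ref]            # rows: query cells, columns: reference cells

def project(ref_embedding, similarity, k):
    """Assumed projection rule: average the coordinates of the k most similar reference cells."""
    nearest = np.argsort(-similarity, axis=1)[:, :k]
    return ref_embedding[nearest].mean(axis=1)

# Toy inputs standing in for REFDATA.matrix.txt, QUERY1.matrix.txt and markers.txt.
ref_text = "G1\tA\t10\t0\t9\nG2\tB\t5\t1\t6\nG3\tC\t0\t8\t1\nG4\tD\t1\t12\t0\nG5\tE\t3\t3\t3\n"
query_text = "G1\tA\t200.0\nG2\tB\t50.0\nG3\tC\t0.5\nG4\tD\t3.0\nG5\tE\t7.0\n"
markers = ["G1", "G2", "G3", "G4"]

ref_gids, ref_matrix = load_matrix(io.StringIO(ref_text))
query_gids, query_matrix = load_matrix(io.StringIO(query_text))
ref_markers = select_markers(ref_gids, ref_matrix, markers)
query_markers = select_markers(query_gids, query_matrix, markers)

# Hypothetical 2-D coordinates of the three reference cells in an existing embedding.
ref_embedding = np.array([[0.0, 0.0], [10.0, 10.0], [2.0, 0.0]])

similarity = spearman_to_reference(ref_markers, query_markers)
projected = project(ref_embedding, similarity, k=2)
print(similarity)
print(projected)
```

Because the comparison uses ranks within each cell, the query values here can sit on a different scale from the reference counts without changing which reference cells are judged most similar.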
