a scalable python suite for tree inference and advanced pseudotime analysis from scRNAseq data.

LouisFaure/scFates

Description

This package provides a scalable Python suite for fast tree inference and advanced downstream pseudotime analysis, with a focus on fate biasing. It is compatible with the anndata object format used in scanpy and scvelo pipelines. Complete documentation for this package is available here.

The related work is now available in Bioinformatics:

Louis Faure, Ruslan Soldatov, Peter V. Kharchenko, Igor Adameyko
scFates: a scalable python package for advanced pseudotime and bifurcation analysis from single cell data
Bioinformatics, btac746; doi: https://doi.org/10.1093/bioinformatics/btac746

Tree inference algorithms

Users can choose between two algorithms for tree inference:

ElPiGraph

scFates uses the Python implementation of the ElPiGraph algorithm, which includes GPU-accelerated principal tree inference. A self-contained description of the algorithm is available here or in the related paper.

An R implementation of this algorithm, written by Luca Albergante, is also available.

A native MATLAB implementation of the algorithm, written by Andrei Zinovyev and Evgeny Mirkes, is also available.

Simple PPT

A simple PPT-inspired approach, translated from the crestree R package. The code has also been adapted to run on GPU for accelerated tree inference.
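As a rough illustration of how the choice between the two algorithms might look in a script, the sketch below assembles keyword arguments for `scf.tl.tree`. The helper names `tree_kwargs` and `infer_tree` are hypothetical, and the parameter names (`Nodes`, `use_rep`, `method`, `ppt_sigma`, `ppt_lambda`, `epg_lambda`, `epg_mu`, `seed`) and their values are assumptions that should be checked against the scFates documentation for your installed version.

```python
def tree_kwargs(method="ppt", n_nodes=200, use_rep="X_pca", seed=1):
    """Assemble keyword arguments for scf.tl.tree (assumed signature).

    method: "ppt" for the simple PPT approach, "epg" for ElPiGraph.
    All numeric values are illustrative, not tuned recommendations.
    """
    if method not in ("ppt", "epg"):
        raise ValueError("method must be 'ppt' or 'epg'")
    kwargs = dict(Nodes=n_nodes, use_rep=use_rep, method=method, seed=seed)
    if method == "ppt":
        # Assumed PPT-specific parameters.
        kwargs.update(ppt_sigma=0.1, ppt_lambda=1)
    else:
        # Assumed ElPiGraph-specific parameters.
        kwargs.update(epg_lambda=0.01, epg_mu=0.1)
    return kwargs


def infer_tree(adata, **options):
    """Run tree inference on an AnnData object (requires scFates)."""
    import scFates as scf  # imported lazily so the helper above works without it

    scf.tl.tree(adata, **tree_kwargs(**options))
    return adata
```

With scFates installed, `infer_tree(adata, method="epg")` would request the ElPiGraph route, while the default call uses the PPT-inspired one.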

Other Citations

Code for PPT inference and most of the downstream pseudotime analysis was initially written as an R package by Ruslan Soldatov for the following paper:

Soldatov, R., Kaucka, M., Kastriti, M. E., Petersen, J., Chontorotzea, T., Englmaier, L., … Adameyko, I. (2019).
Spatiotemporal structure of cell fate decisions in murine neural crest.
Science, 364(6444).

If you are using ElPiGraph, please cite:

Albergante, L., Mirkes, E. M., Chen, H., Martin, A., Faure, L., Barillot, E., … Zinovyev, A. (2020).
Robust and scalable learning of complex dataset topologies via ElPiGraph.
Entropy, 22(3), 296.

Code for preprocessing has been translated from the R package pagoda2. If you use any of these functions (scf.pp.batch_correct and scf.pp.find_overdispersed), please cite:

Nikolas Barkas, Viktor Petukhov, Peter Kharchenko and Evan Biederstedt (2021).
pagoda2: Single Cell Analysis and Differential Expression.
R package version 1.0.2.

The Palantir Python tool provides a dimensionality reduction method that usually leads to consistent trees with scFates. If you use scf.pp.diffusion, please cite:

Manu Setty and Vaidotas Kiseliovas and Jacob Levine and Adam Gayoso and Linas Mazutis and Dana Pe'er (2019)
Characterization of cell fate probabilities in single-cell data with Palantir.
Nature Biotechnology

Installation

scFates is available on PyPI. You can install it using:

pip install -U scFates

Alternatively, the latest development version can be installed from GitHub:

pip install git+https://github.com/LouisFaure/scFates

With all dependencies

- pp.find_overdispersed, tl.test_association, tl.fit, tl.test_fork, tl.activation, tl.test_association_covariate, tl.test_covariate: require the R package mgcv, interfaced via the Python package rpy2:

conda create -n scFates -c conda-forge -c r python=3.11 r-mgcv rpy2=3.4.2 -y
conda activate scFates
pip install scFates

To avoid possible crashes caused by rpy2 not finding the R installation in the conda environment, set R_HOME before importing scFates:

import os, sys
os.environ['R_HOME'] = sys.exec_prefix+"/lib/R/"
import scFates
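A slightly more defensive variant of the snippet above is sketched here: it sets `R_HOME` only when the variable is not already defined and the expected directory actually exists. The helper names `conda_r_home` and `ensure_r_home` are hypothetical, and the `lib/R` layout is assumed to match a Unix-like conda environment, as in the original snippet.

```python
import os
import sys


def conda_r_home(prefix=sys.exec_prefix):
    """Return the R home directory expected inside a conda environment."""
    return os.path.join(prefix, "lib", "R")


def ensure_r_home(prefix=sys.exec_prefix):
    """Set R_HOME if it is unset and the conda R directory exists.

    Returns the value of R_HOME afterwards, or None if it is still unset.
    """
    candidate = conda_r_home(prefix)
    if "R_HOME" not in os.environ and os.path.isdir(candidate):
        os.environ["R_HOME"] = candidate
    return os.environ.get("R_HOME")


ensure_r_home()
# import scFates  # import only after R_HOME has been settled
```

Leaving an existing `R_HOME` untouched avoids overriding a deliberately configured R installation.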

- tl.cellrank_to_tree: requires cellrank to be installed:

pip install cellrank

On Apple Silicon

Installing mgcv using conda/mamba on Apple Silicon can leave the package unable to find some dynamic libraries (BLAS). In that case, it is recommended to install mgcv separately:

mamba create -n scFates -c conda-forge -c bioconda -c defaults python numpy=1.24.4 "libblas=*=*accelerate" rpy2 -y
mamba activate scFates
Rscript -e 'install.packages("mgcv",repos = "http://cran.us.r-project.org")'

GPU dependencies (optional)

If you have an NVIDIA GPU, scFates can use CUDA to speed up the following functions:

pp.filter_cells, pp.batch_correct, pp.diffusion, tl.tree, tl.cluster

The RAPIDS framework is required. Create the following conda environment, which pins RAPIDS 23.12:

conda create --solver=libmamba -n scFates-gpu -c rapidsai -c conda-forge -c nvidia  \
    cuml=23.12 cugraph=23.12 python=3.10 cuda-version=11.2
conda activate scFates-gpu
pip install git+https://github.com/j-bac/elpigraph-python.git
pip install scFates
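Since the GPU dependencies are optional, a script may want to decide at runtime whether to request the GPU code path. The sketch below checks whether the RAPIDS packages from the environment above can be imported. The helper name `pick_device` is hypothetical, and passing the result as a `device` argument to scFates functions is an assumption to verify against the documentation.

```python
import importlib.util


def pick_device(required=("cuml", "cugraph")):
    """Return "gpu" if every required package is importable, else "cpu".

    This only checks that the packages are installed; it does not verify
    that a CUDA-capable GPU is actually present.
    """
    have_all = all(
        importlib.util.find_spec(name) is not None for name in required
    )
    return "gpu" if have_all else "cpu"


device = pick_device()
print(f"Selected device: {device}")
```

The result could then be passed on, for example as `scf.tl.tree(adata, ..., device=device)`, assuming the function accepts that keyword.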
