Pamona

Paper

Manifold alignment for heterogeneous single-cell multi-omics data integration using Pamona

The implementation is based on UnionCom and SCOT.

Enviroment

python >= 3.6

numpy >= 1.18.5
scikit-learn >= 0.23.2
umap-learn >= 0.3.10
Cython >= 0.29.21
scipy >= 1.4.1
matplotlib >= 3.3.1
POT >= 0.7.0

Install

Pamona software is available on the Python package index (PyPI), latest version 0.0.1. To install it using pip, simply type:

pip3 install pamona

Integrate data

Each row should contain the measured values for a single cell, and each column should contain the values of a feature across cells.

>>> from pamona import Pamona
>>> import numpy as np
>>> data1 = np.loadtxt("./scGEM/methylation_partial.txt")
>>> data2 = np.loadtxt("./scGEM/expression_partial.txt")
>>> type1 = np.loadtxt("./scGEM/methylation_type_partial.txt")
>>> type2 = np.loadtxt("./scGEM/expression_type_partial.txt")
>>> type1 = type1.astype(np.int)
>>> type2 = type2.astype(np.int)
>>> Pa = Pamona.Pamona(n_shared=[138], Lambda=10, output_dim=5) # shared cell number 138 is estimated by SPL
>>> integrated_data, T = Pa.run_Pamona([data1, data2])
>>> Pa.test_LabelTA(integrated_data[0],integrated_data[-1],type1,type2)
>>> Pa.alignment_score(integrated_data[0], integrated_data[-1][0:142], data2_specific=integrated_data[-1][142:177])
>>> Pa.Visualize([data1,data2], integrated_data, mode='UMAP')  # without datatype
>>> Pa.Visualize([data1,data2], integrated_data, [type1,type2], mode='UMAP')  # with datatype 
>>> # mode: ["PCA", "TSNE", "UMAP"], default as "PCA".

Example of disagreement matrix of prior information

If cell types are available, users can incorporate the information as follows

>>> gamma = 0.5 # gamma is a parameter, ranges from 0 to 1. 
>>> # A larger gamma gives more importance to the matching of prior information.
>>> DM = np.ones((len(data1), len(data2)))
>>> for i in range(len(data1)):
>>>    for j in range(len(data2)):
>>>        if type1[i] == type2[j]:
>>>            DM[i][j] = gamma
>>> ...
>>> Pa = Pamona.Pamona(..., M=DM, ...)

Examples on simualtions and real data sets (jupyter notebook)

Parameters of `class Pamona`

The list of parameters is given below:

data: list of numpy array, [dataset1, dataset2, ...] (n_datasets, n_samples, n_features). list of datasets to be integrated, in the form of a numpy array.
n_shared: int, default as the cell number of the smallest dataset. shared cell number between datasets.
epsilon: float, default as 0.001. the regularization parameter of the partial-GW framework.
n_neighbors: int, default as 30. the number of neighborhoods of the k-nn graph.
Lambda: float, default as 1.0. the parameter of manifold alignment to make a trade-off between aligning corresponding cells and preserving the local geometries
output_dim: int, default as 30. output dimension of the common embedding space after the manifold alignment
M: numpy array , default as None (optionally). disagreement matrix of prior information.

The other parameters include:

virtual_cells: int, number of virtual cells, default as 1.

max_iter: int, maximum iterations of the partial-GW framework, default as 1000.

tol: float, the precision condition under which the iteration of the partial-GW framework stops, default as 1e-9.

manual_seed: int, random seed, default as 666.

mode: {‘connectivity’, ‘distance’}, type of returned matrix: ‘connectivity’ will return the connectivity matrix with ones and zeros, and ‘distance’ will return the distances between neighbors according to the given metric. has to be either one of 'connectivity' or 'distance', default as 'distance'.

metric: str, the distance metric used to calculate the k-Neighbors for each sample point, default as ’minkowski’.

Name		Name	Last commit message	Last commit date
Latest commit History 72 Commits
MMD		MMD
PBMC		PBMC
SNARE		SNARE
Simulation1		Simulation1
Simulation2		Simulation2
__pycache__		__pycache__
example		example
scGEM		scGEM
scNMT		scNMT
LICENSE		LICENSE
Pamona.py		Pamona.py
README.md		README.md
eval.py		eval.py
utils.py		utils.py
visualization.py		visualization.py

License

caokai1073/Pamona

Folders and files

Latest commit

History

Repository files navigation

Pamona

Paper

Enviroment

Install

Integrate data

Example of disagreement matrix of prior information

Examples on simualtions and real data sets (jupyter notebook)

Parameters of class Pamona

Contact via caokai@amss.ac.cn

About

Topics

Resources

License

Stars

Watchers

Forks

Languages

Parameters of `class Pamona`