GilbertLabUCSF/CanDI

CanDI - A global cancer data integrator

Package Installation

CanDI is now available on PyPI and can be installed with pip:

pip install PyCanDI

To get the latest development version, install directly from GitHub:

pip install git+https://github.com/GilbertLabUCSF/CanDI.git
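
To confirm which release ended up in the active environment, a small check with the standard library's `importlib.metadata` may help. This is a minimal sketch that relies only on the distribution name `PyCanDI` from the pip command above; the helper name `installed_version` is illustrative and not part of CanDI.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="PyCanDI"):
    """Return the installed version string for dist_name, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("PyCanDI is not installed in this environment")
    else:
        print(f"PyCanDI version: {found}")
```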

Prepare Datasets

The following Python command runs CanDI's install script, which automatically downloads and formats the datasets:

python CanDI/CanDI/setup/install.py

The downloaded and formatted datasets are organized as follows:

.
├── config.ini # modified after installation
├── depmap
│   ├── CCLE_expression.csv
│   ├── CCLE_fusions.csv
│   ├── CCLE_gene_cn.csv
│   ├── CCLE_mutations.csv
│   ├── CCLE_RNAseq_reads.csv
│   ├── CRISPR_gene_dependency.csv
│   ├── CRISPR_gene_effect.csv
│   └── sample_info.csv
├── genes
│   └── gene_info.csv
└── locations
    └── merged_locations.csv
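
After the install script finishes, it can be useful to check that every file in the tree above is present. The following is a minimal sketch rather than part of CanDI: the function name `missing_datasets` is hypothetical, and the data directory is an assumption, since its location depends on the local installation.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
EXPECTED_FILES = [
    "config.ini",
    "depmap/CCLE_expression.csv",
    "depmap/CCLE_fusions.csv",
    "depmap/CCLE_gene_cn.csv",
    "depmap/CCLE_mutations.csv",
    "depmap/CCLE_RNAseq_reads.csv",
    "depmap/CRISPR_gene_dependency.csv",
    "depmap/CRISPR_gene_effect.csv",
    "depmap/sample_info.csv",
    "genes/gene_info.csv",
    "locations/merged_locations.csv",
]


def missing_datasets(root):
    """Return the expected files that are not present under root."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # "." is a placeholder: point this at the directory that holds config.ini.
    missing = missing_datasets(".")
    if missing:
        print("Missing files:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All expected dataset files are present.")
```

An empty list from `missing_datasets` means the layout matches the tree; any returned paths are files the install step did not produce in that directory.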

Package Usage

Import CanDI into Python:

from CanDI import candi

CanDI Objects

  • data : Container for all CanDI datasets. All access to datasets goes through the data object.
  • Gene : Provides cross-dataset indexing from the gene perspective.
  • CellLine : Provides cross-dataset indexing from the cell line perspective.
  • Cancer : Provides cross-dataset indexing for a group of cell lines that all come from the same tissue.
  • Organelle : Provides cross-dataset indexing for a group of genes whose proteins localize to the same organelle.
  • CellLineCluster : Provides cross-dataset indexing for a group of user-defined cell lines.
  • GeneCluster : Provides cross-dataset indexing for a group of user-defined genes.
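
To make the idea of cross-dataset indexing concrete, here is a toy sketch in plain Python. It does not use or reproduce CanDI's API: the classes `ToyGene` and `ToyCellLine`, the dataset names, the identifiers, and all values are invented for illustration. It only shows how the same tables can be read from a gene perspective or from a cell line perspective.

```python
# Toy stand-ins for two datasets: outer keys are cell lines, inner keys are genes.
# All identifiers and values are made up for illustration.
TOY_DATA = {
    "expression": {
        "LINE_A": {"GENE1": 5.2, "GENE2": 0.1},
        "LINE_B": {"GENE1": 3.8, "GENE2": 2.4},
    },
    "gene_effect": {
        "LINE_A": {"GENE1": -1.1, "GENE2": 0.0},
        "LINE_B": {"GENE1": -0.3, "GENE2": -0.9},
    },
}


class ToyGene:
    """Gene-centric view: one gene's values across all cell lines."""

    def __init__(self, name, datasets=TOY_DATA):
        self.name = name
        self._datasets = datasets

    def values(self, dataset):
        # Collect this gene's value from every cell line that reports it.
        table = self._datasets[dataset]
        return {line: row[self.name] for line, row in table.items() if self.name in row}


class ToyCellLine:
    """Cell-line-centric view: all gene values for one cell line."""

    def __init__(self, name, datasets=TOY_DATA):
        self.name = name
        self._datasets = datasets

    def values(self, dataset):
        # Return a copy of this cell line's row, or an empty dict if absent.
        return dict(self._datasets[dataset].get(self.name, {}))


if __name__ == "__main__":
    print(ToyGene("GENE1").values("expression"))
    print(ToyCellLine("LINE_B").values("gene_effect"))
```

Both views read from the same underlying tables, so a lookup by gene and a lookup by cell line stay consistent with each other without duplicating data.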