Python script for extracting dataset from CCLE (Cancer Cell Line Encyclopedia)

okadalabipr/ccle_extractor

ccle_extractor

Extract gene expression datasets from the Cancer Cell Line Encyclopedia (CCLE) database.

Manual installation of package requirements

Language        Packages
Python >= 3.7   pandas, matplotlib, seaborn
R               biomaRt, dplyr, edgeR

CCLE Data

  • CCLE_RNAseq_rsem_genes_tpm_20180929.txt.gz
  • Cell_lines_annotations_20181226.txt
  • CCLE_RNAseq_genes_counts_20180929.gct.gz
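
To check that a downloaded file is readable before running the extractor, a minimal standard-library sketch such as the following may help. It is not part of ccle_extractor: the helper name `read_tsv_columns` is hypothetical, and it assumes the TPM file is a gzipped, tab-delimited table with a header row containing a `gene_id` column and one column per CCLE name.

```python
import csv
import gzip


def read_tsv_columns(path, wanted_columns, id_column="gene_id"):
    """Return {row id: {column: value}} for selected columns of a gzipped TSV.

    Hypothetical helper; the "gene_id" column name is an assumption about
    the file layout, not something this README states.
    """
    selected = {}
    # "rt" opens the gzip stream in text mode so csv can parse it row by row.
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        for row in reader:
            # Values are kept as strings; convert with float() if needed.
            selected[row[id_column]] = {col: row[col] for col in wanted_columns}
    return selected
```

Under those assumptions, `read_tsv_columns("CCLE_RNAseq_rsem_genes_tpm_20180929.txt.gz", ["MCF7_BREAST"])` would return one entry per gene, each holding the value for the MCF7_BREAST column.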

Usage

from ccle.database import CancerCellLineEncyclopedia as CCLE

# Set gene_names
# Set either ccle_names or cell_lines

selected_CCLE_subset = CCLE(
    gene_names = ['EGFR', 'ERBB2', 'ERBB3', 'ERBB4'],
    ccle_names = ['MCF7_BREAST', 'MDAMB231_BREAST']
)

# Alternatively, select by cell line name instead of CCLE name:
# selected_CCLE_subset = CCLE(
#     gene_names = ['EGFR', 'ERBB2', 'ERBB3', 'ERBB4'],
#     cell_lines = ['MCF7', 'MDA-MB-231']
# )

# GeneCards link (https://www.genecards.org)
selected_CCLE_subset.to_gene_summary()
# TPM value
selected_CCLE_subset.to_gene_expression()
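
The two argument styles above suggest a relationship between cell line names and CCLE names ('MDA-MB-231' versus 'MDAMB231_BREAST'). The sketch below illustrates that pattern. It is an assumption inferred from the two examples in this README, not the package's own lookup, which may rely on the annotation file instead. The function name `to_ccle_name` is hypothetical.

```python
import re


def to_ccle_name(cell_line, tissue):
    """Build a CCLE-style identifier from a cell line name and a tissue label.

    Assumed convention: strip non-alphanumeric characters, uppercase,
    then append "_" and the uppercased tissue.
    """
    # Keep only letters and digits, e.g. "MDA-MB-231" -> "MDAMB231".
    stripped = re.sub(r"[^A-Za-z0-9]", "", cell_line).upper()
    return f"{stripped}_{tissue.upper()}"


print(to_ccle_name("MCF7", "breast"))
print(to_ccle_name("MDA-MB-231", "breast"))
```

For names that do not follow this pattern, the Cell_lines_annotations file is likely the more reliable source of the mapping.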

Installation

$ git clone https://github.com/okadalabipr/ccle_extractor.git

License

MIT
