Cancer Proteome: Analyzer and Visualizer

Abstract
Requirements
Development

Abstract

Proteomics is the study of proteins, including their function and structure. One of the main objectives of the field is to explore the 3D structure of proteins, and this application builds on that objective. Users can explore a specific set of genes involved in cancer, access information about the corresponding proteins, run basic analyses, and visualize their models.

Specific Objectives

Users can access pre-populated data, such as features, gene ontology, and cross-references, for certain genes that are involved in cancer
Users can select certain aspects of the protein to explore, such as the spatial distribution of the protein chains, a basic computation of molecular distances, and the 3D structure of the protein
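
As a rough illustration of the "molecular distances" objective, the sketch below computes the Euclidean distance between two atoms from their 3D coordinates. The function name `atom_distance` and the coordinates are made up for illustration and are not taken from the project's scripts.

```python
import math


def atom_distance(a, b):
    """Return the Euclidean distance between two (x, y, z) points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


# Hypothetical coordinates in angstroms, chosen so the result is easy to check.
atom_1 = (0.0, 0.0, 0.0)
atom_2 = (1.0, 2.0, 2.0)
print(atom_distance(atom_1, atom_2))  # 3.0
```

With a parsed Biopython structure, the same calculation would be applied to the coordinates of the atoms of interest.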

Requirements

Development

Installation

The user can access a limited amount of protein data via the web page Cancer Proteomics: Visualizer and Analyzer, which shows pre-populated data for a limited set of proteins. A drop-down menu lets the user choose the protein of interest.

Alternatively, if the user is interested in a protein that is not listed on the web page, they can set up the project locally by following these steps:

    Fork and clone the CancerProteome_Analyzer_Visualizer repository from GitHub

To import the database (mysql will prompt for your password)

    $ mysql -u <yourusername> -p -h localhost proteomics < proteomics.sql

Install dependencies using Homebrew (on Windows, use pip for all of these packages)

    $ brew update
    $ brew install python
    $ brew install mysql
    $ brew tap homebrew/science
    $ brew install matplotlib

Pandas, Requests and Biopython should be installed via pip:

    $ pip install pandas
    $ pip install biopython
    $ pip install requests
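
After installing, a quick check that the packages can be found may save some debugging later. The sketch below is one possible way to do this with the standard library; the helper name `missing_modules` is hypothetical, and the list of import names is an assumption based on the packages mentioned above.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names of the packages listed above ("Bio" is Biopython's import name).
required = ["pandas", "Bio", "requests", "matplotlib"]
missing = missing_modules(required)
print("Missing:", ", ".join(missing) if missing else "none")
```

Any name reported as missing can then be installed with pip as shown above.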

Server set-up to run CGI scripts (there are many ways to do this)

    $ brew install node
    $ npm install http-server -g

To start the server, type

    $ http-server

You should see the port on which the page is available; for me it was port 8080. Open the page at

    localhost:8080
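
If the page does not load, it can help to confirm that something is actually listening on the port. The following is a minimal sketch using only the standard library; the helper name `port_is_open` is hypothetical, and port 8080 is simply the value mentioned above.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 8080 is the port mentioned above; yours may differ.
print(port_is_open("localhost", 8080))
```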

Another alternative is to use a server such as Apache or WEBrick

Running-program

There are two files in the directory that can be used for analysis: uniprot.py and pdb.py.

Sample Analysis using UniProt Script(uniprot.py)

In the project directory

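    $ python
    >>> import uniprot
    >>> import requests
    >>> import pandas as pd
    >>> from io import StringIO
    >>> from Bio import ExPASy, SwissProt
    >>> # Query UniProt for reviewed human entries of the gene.
    >>> # The response must be tab-separated and include the 'Organism ID' and 'Entry Name' columns.
    >>> req = requests.get(uniprot.server, params={'query': 'gene:<protein name> AND organism:Human AND reviewed:yes'})
    >>> uniprot_list = pd.read_table(StringIO(req.text))
    >>> uniprot_list.rename(columns={'Organism ID': 'ID'}, inplace=True)
    >>> # 9606 is the NCBI taxonomy ID for human
    >>> entry_name = uniprot_list[uniprot_list.ID == 9606]['Entry Name'].tolist()[0]
    >>> handle = ExPASy.get_sprot_raw(entry_name)
    >>> sp_rec = SwissProt.read(handle)
    >>> uniprot.extract_features(sp_rec)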
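
The filtering step in this session (keep the row whose organism ID is 9606, the NCBI taxonomy ID for human) can be tried without a network connection. The sketch below uses the standard-library `csv` module in place of pandas, and the two-row table is a hand-written sample standing in for a real UniProt response. The helper name `human_entry_name` is hypothetical.

```python
import csv
from io import StringIO

HUMAN_TAXON_ID = "9606"  # NCBI taxonomy ID for Homo sapiens

# Hand-written sample in the tab-separated layout the session expects.
sample_response = (
    "Entry\tEntry Name\tOrganism ID\n"
    "P02340\tP53_MOUSE\t10090\n"
    "P04637\tP53_HUMAN\t9606\n"
)


def human_entry_name(tsv_text):
    """Return the 'Entry Name' of the first human row, or None if absent."""
    reader = csv.DictReader(StringIO(tsv_text), delimiter="\t")
    for row in reader:
        if row["Organism ID"] == HUMAN_TAXON_ID:
            return row["Entry Name"]
    return None


print(human_entry_name(sample_response))  # P53_HUMAN
```

Returning `None` when no human row exists avoids the `IndexError` that `.tolist()[0]` would raise on an empty selection.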

Sample Analysis of a model for above protein using BioPython PDB(pdb.py)

In the project directory

    $ python
    >>> import pdb  # the project's pdb.py, not the standard-library debugger
    >>> from Bio import PDB
    >>> from Bio.PDB import MMCIFParser
    >>> import matplotlib.pyplot as plt
    >>> repository = PDB.PDBList()
    >>> parser = MMCIFParser()
    >>> repository.retrieve_pdb_file('<model_of_choice>')
    >>> model = parser.get_structure('<name>', '<protein_model.cif>')
    >>> pdb.describe_model('<name>', model)
    >>> pdb.plot(model)

Note: A protein can have many models with many different configurations, and some of this program's methods are not generic. You might have to adapt the methods to your model of choice, e.g. when looking for residues in a chain.
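
To make the note concrete, "looking for residues in a chain" amounts to grouping residues by their chain identifier before analysing them. The sketch below does this on plain tuples rather than a parsed Biopython structure. The record layout, the helper name `residues_by_chain`, and the sample values are all illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical (chain ID, residue number, residue name) records.
residues = [
    ("A", 1, "MET"),
    ("A", 2, "GLU"),
    ("B", 1, "MET"),
    ("A", 3, "GLU"),
]


def residues_by_chain(records):
    """Group residue names by chain ID, preserving input order."""
    chains = defaultdict(list)
    for chain_id, _number, name in records:
        chains[chain_id].append(name)
    return dict(chains)


grouped = residues_by_chain(residues)
print(grouped)  # {'A': ['MET', 'GLU', 'GLU'], 'B': ['MET']}
```

Models differ in how many chains they contain and how those chains are labelled, which is one reason a method written for one model may need adjusting for another.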

An alternative is to configure these scripts for your protein model of choice. The added benefit is that you will be able to save this data to the proteomics database. To do that, add to the data and pdb_data lists.
