Homology modeled structure? #11

davidroberson · 2016-11-16T20:46:06Z

Is it possible to read in a homology modeled structure that is not in the RCSB PDB using the -pdb-file-dir argument?

Thanks


AdamDS · 2016-11-16T21:51:47Z

Dave,

I don't believe it will work with the current state of HotSpot3D. There may be some issues since HotSpot3D uses information from UniProt and other databases to help with structure mapping. If the model is not in UniProt for your gene/protein then there should be errors in the uppro/calpro step. HotSpot3D looks to the chain information contained in UniProt for DBREF/PDB structures.

However, if your structure file is in the same format as a .pdb file, then there may be a way that we can work with non-RCSB/non-UniProt listed structures.

-Adam


davidroberson · 2016-11-17T15:43:17Z

Hi @AdamDS

The model is in pdb format and homology modeled off of 2ZPA in Swiss-Model.
http://www.rcsb.org/pdb/explore.do?structureId=2ZPA

Thanks for your help!

@sabrodie

AdamDS · 2016-11-17T22:44:38Z

@sabrodie,

I think that there is a way to get this to work then. You'll need to be sure of a couple of details:

1. Name your model file 2ZPA.pdb and store it in the local pdb-dir that HotSpot3D will use.
2. Make sure that the protein chains match: your homology model must use the same chain labels as the original protein in 2ZPA.

There may be some other necessary details, but I think that these two are the most critical.
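The two checks above can be scripted before running HotSpot3D. What follows is only a minimal sketch, not part of HotSpot3D: it assumes both files follow the fixed-column PDB format (chain ID in column 22 of ATOM records), that you have downloaded the original 2ZPA file to compare against, and the function names and paths are hypothetical.

```python
import shutil
from pathlib import Path


def chain_ids(pdb_path):
    """Return the set of chain IDs found on ATOM records of a PDB-format file."""
    chains = set()
    with open(pdb_path) as handle:
        for line in handle:
            # In the fixed-column PDB format the chain ID is column 22 (index 21).
            if line.startswith("ATOM") and len(line) > 21:
                chains.add(line[21])
    return chains


def install_model(model_path, template_path, pdb_dir, pdb_id="2ZPA"):
    """Copy the model into pdb_dir as <pdb_id>.pdb if its chains match the template.

    Hypothetical helper: HotSpot3D itself does not provide this function.
    """
    model_chains = chain_ids(model_path)
    template_chains = chain_ids(template_path)
    if model_chains != template_chains:
        raise ValueError(
            f"chain mismatch: model has {sorted(model_chains)}, "
            f"template has {sorted(template_chains)}"
        )
    target = Path(pdb_dir) / f"{pdb_id}.pdb"
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(model_path, target)
    return target


# Example call (paths are placeholders):
# install_model("my_model.pdb", "original_2ZPA.pdb", "pdb_files")
```

This only compares chain labels; it does not check residue numbering, which may also need to line up with the original structure.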

-Adam


AdamDS · 2016-11-17T23:31:20Z

I just noticed that your protein is non-human. In the transcript annotation step there will be errors, because HotSpot3D expects transcripts from Ensembl. There is a line that will not know how to deal with EnsemblBacteria transcripts.
From what I can tell, the necessary files and lookups should largely be identical, so it could be possible to make some small tweaks to allow non-human proteins to be used. However, I am less familiar with processing them, so I cannot be sure how many other changes would be needed.

davidroberson · 2016-11-17T23:47:38Z

Thank you @AdamDS. I will talk to @sabrodie who is the functional scientist leading this project and get back to you. He did have one more question which I will paraphrase here:

...is it possible that the variants in our gene of interest are not in solved (crystallized) regions of the protein?
See http://www.uniprot.org/uniprot/O43683, section "3D structure databases":

Entry  Method  Resolution (Å)  Chain    Positions
2LAH   NMR     -               A        1-150
4A1G   X-ray   2.60            A/B/C/D  1-150
4QPM   X-ray   2.20            A/B      740-1085
4R8Q   X-ray   2.31            A        724-1085
5DMZ   X-ray   2.40            A/B      726-1085

It looks like the mutations fall around amino acid position ~500. Does that mean this region is not represented in the crystal structures in the RCSB database?
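The coverage question can be checked directly against the residue ranges listed above. This is a small sketch rather than anything HotSpot3D does: the ranges are copied from the UniProt entry as quoted here, and the function name is made up for illustration.

```python
# Residue ranges copied from the UniProt O43683 structure listing quoted above.
STRUCTURES = {
    "2LAH": (1, 150),
    "4A1G": (1, 150),
    "4QPM": (740, 1085),
    "4R8Q": (724, 1085),
    "5DMZ": (726, 1085),
}


def covering_structures(position, structures=STRUCTURES):
    """Return the PDB IDs whose listed residue range contains the position."""
    return sorted(
        pdb_id
        for pdb_id, (start, end) in structures.items()
        if start <= position <= end
    )


print(covering_structures(500))  # prints []
print(covering_structures(100))  # prints ['2LAH', '4A1G']
print(covering_structures(730))  # prints ['4R8Q', '5DMZ']
```

An empty list for position 500 means none of the listed entries includes that residue in its stated range.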

Finally, is there an ideal number of genes to have present in the MAF file? We have many whole exomes' worth of data but are focusing on just a few genes. Should we change our approach?

sabrodie · 2016-11-17T23:57:42Z

@dave, @AdamDS
That was in reference to another protein in the same project, which is a very different problem.

Seth Brodie PhD
Senior Scientist Functional Group
Cancer Genomics Research Laboratory (CGR)
Division of Cancer Epidemiology and Genetics, NCI
Leidos Biomedical Research, Inc.
8717 Grovemont Circle
ATC Room 225B(office) Room 109(lab)
Gaithersburg, MD 20877


AdamDS · 2016-11-18T00:22:35Z

Some variants do end up in non-solved regions of the models. HotSpot3D cannot do anything with these at this time.
If you know that you will only need to look at a handful of genes, I strongly recommend using a subset of your original .maf that contains only the mutations in your genes of interest. This will drastically reduce run time and storage usage. For perspective, preprocessing the ~5k human protein PDB structures takes ~1 week on an LSF server, and the data take up ~2 TB of space. We are in the process of optimizing HotSpot3D preprocessing to improve both run time and storage usage, but these updates are not yet in place. For the analysis steps, with ~1M mutations in several thousand genes, run times can reach ~1 day (without the sigclus step), so reducing the .maf to the genes of interest helps there too.
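Subsetting the .maf as suggested can be done with a few lines of Python. This is a sketch under assumptions: it expects a tab-delimited MAF whose header row contains a Hugo_Symbol column, and the file names and gene list in the example call are placeholders.

```python
def subset_maf(in_path, out_path, genes):
    """Write only the rows of a MAF whose Hugo_Symbol is in genes.

    Comment lines (starting with '#') and the header row are copied through.
    Returns the number of mutation rows kept.
    """
    genes = set(genes)
    kept = 0
    gene_col = None
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith("#"):
                dst.write(line)
                continue
            fields = line.rstrip("\n").split("\t")
            if gene_col is None:
                # First non-comment line is the header row.
                gene_col = fields.index("Hugo_Symbol")
                dst.write(line)
                continue
            if fields[gene_col] in genes:
                dst.write(line)
                kept += 1
    return kept


# Example call (file names and genes are placeholders):
# subset_maf("all_exomes.maf", "genes_of_interest.maf", {"GENE1", "GENE2"})
```

The file is streamed line by line, so memory use stays small even for a MAF covering many whole exomes.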

AdamDS · 2017-02-22T01:56:36Z

@sabrodie
With the latest updates, we can now provide a way to support alternative Ensembl releases and reference genomes. I think that there are a couple of other things that could be done in the Trans.pm & Uppro.pm modules to support bacteria and other species. If you are still interested, perhaps we can work out a solution to help support other species data.

davidroberson changed the title ~~Horology modeled structure?~~ Homology modeled structure? Nov 16, 2016
