REVDAT entry converted to PDB format #20

Open
jsoerensen opened this issue Mar 17, 2020 · 10 comments

Comments

@jsoerensen

jsoerensen commented Mar 17, 2020

In the write_remarks function in to_pdb.hpp, could you add the REVDAT entry? I've pasted some code that should work below.

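// Split the revision ordinals on ';'.
std::string token;
std::istringstream revNum(st.get_info("_pdbx_audit_revision_history.ordinal"));
std::vector<std::string> revNums;
while (std::getline(revNum, token, ';'))
{
  revNums.push_back(token);
}

// Split the revision dates the same way, stripping whitespace from each one.
std::istringstream revDate(st.get_info("_pdbx_audit_revision_history.revision_date"));
std::vector<std::string> revDates;
while (std::getline(revDate, token, ';'))
{
  // Take unsigned char: passing a negative char to isspace is undefined behavior.
  token.erase(std::remove_if(token.begin(), token.end(),
                             [](unsigned char c) { return std::isspace(c) != 0; }),
              token.end());
  revDates.push_back(token);
}

// Write the newest revision first. Stop at the shorter list so that
// revDates[i] is never read out of range.
for (int i = (int) std::min(revNums.size(), revDates.size()) - 1; i >= 0; --i)
{
  WRITEU("REVDAT %3s   %-9s %-51s",
         revNums[i].c_str(), revDates[i].c_str(), st.get_info("_entry.id").c_str());
}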

And in mmcif.hpp:

  std::string old_revnum_tag = "_database_PDB_rev.num";
  std::string new_revnum_tag = "_pdbx_audit_revision_history.ordinal";
  add_info(old_revnum_tag);
  add_info(new_revnum_tag);
  if (st.info.count(old_revnum_tag) == 1 && st.info.count(new_revnum_tag) == 0)
    st.info[new_revnum_tag] = st.info[old_revnum_tag];

  std::string old_revdate_tag = "_database_PDB_rev.date";
  std::string new_revdate_tag = "_pdbx_audit_revision_history.revision_date";
  add_info(old_revdate_tag);
  add_info(new_revdate_tag);
  if (st.info.count(old_revdate_tag) == 1 && st.info.count(new_revdate_tag) == 0)
    st.info[new_revdate_tag] = st.info[old_revdate_tag];
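
The two blocks above repeat one pattern: if only the old tag is present, copy its value to the new tag. A minimal sketch of that pattern as a helper, assuming the info store behaves like a plain `std::map<std::string, std::string>`; the name `fall_back_to_old_tag` is hypothetical and not part of the library:

```cpp
#include <map>
#include <string>

// Hypothetical helper: copy the value stored under old_tag to new_tag,
// but only when old_tag is present and new_tag is absent.
void fall_back_to_old_tag(std::map<std::string, std::string>& info,
                          const std::string& old_tag,
                          const std::string& new_tag) {
  auto old_it = info.find(old_tag);
  if (old_it == info.end())
    return;  // nothing to copy
  // emplace() leaves the map untouched if new_tag already has a value.
  info.emplace(new_tag, old_it->second);
}
```

With such a helper, each of the two tag pairs would need a single call after the tags have been read.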
@wojdyr
Member

wojdyr commented Mar 18, 2020

Looking at it, translating the revision records would be problematic.

For example, in 6LU7:

loop_
_pdbx_audit_revision_history.ordinal 
_pdbx_audit_revision_history.data_content_type 
_pdbx_audit_revision_history.major_revision 
_pdbx_audit_revision_history.minor_revision 
_pdbx_audit_revision_history.revision_date 
1 'Structure model' 1 0 2020-02-05 
2 'Structure model' 2 0 2020-02-12 
3 'Structure model' 2 1 2020-02-19 
4 'Structure model' 2 2 2020-02-26 
5 'Structure model' 2 3 2020-03-11 
# 
loop_
_pdbx_audit_revision_details.ordinal 
_pdbx_audit_revision_details.revision_ordinal 
_pdbx_audit_revision_details.data_content_type 
_pdbx_audit_revision_details.provider 
_pdbx_audit_revision_details.type 
_pdbx_audit_revision_details.description 
_pdbx_audit_revision_details.details 
1 1 'Structure model' repository 'Initial release'        ?                 ? 
2 2 'Structure model' author     'Coordinate replacement' 'Ligand geometry' ? 
# 
loop_
_pdbx_audit_revision_group.ordinal 
_pdbx_audit_revision_group.revision_ordinal 
_pdbx_audit_revision_group.data_content_type 
_pdbx_audit_revision_group.group 
1  2 'Structure model' Advisory                 
2  2 'Structure model' 'Atomic model'           
3  2 'Structure model' 'Data collection'        
4  2 'Structure model' 'Database references'    
5  2 'Structure model' 'Derived calculations'   
6  2 'Structure model' 'Refinement description' 
7  2 'Structure model' 'Structure summary'      
8  3 'Structure model' 'Database references'    
9  3 'Structure model' 'Structure summary'      
10 4 'Structure model' 'Data collection'        
11 5 'Structure model' 'Source and taxonomy'    
12 5 'Structure model' 'Structure summary'      
# 
loop_
_pdbx_audit_revision_category.ordinal 
_pdbx_audit_revision_category.revision_ordinal 
_pdbx_audit_revision_category.data_content_type 
_pdbx_audit_revision_category.category 
1  2 'Structure model' atom_site                    
2  2 'Structure model' citation                     
3  2 'Structure model' entity                       
4  2 'Structure model' pdbx_nonpoly_scheme          
5  2 'Structure model' pdbx_struct_assembly_prop    
6  2 'Structure model' pdbx_struct_sheet_hbond      
7  2 'Structure model' pdbx_struct_special_symmetry 
8  2 'Structure model' pdbx_validate_rmsd_bond      
9  2 'Structure model' pdbx_validate_symm_contact   
10 2 'Structure model' pdbx_validate_torsion        
11 2 'Structure model' refine                       
12 2 'Structure model' refine_hist                  
13 2 'Structure model' refine_ls_shell              
14 2 'Structure model' software                     
15 2 'Structure model' struct                       
16 2 'Structure model' struct_conn                  
17 2 'Structure model' struct_site                  
18 2 'Structure model' struct_site_gen              
19 3 'Structure model' citation                     
20 3 'Structure model' struct                       
21 4 'Structure model' diffrn_detector              
22 5 'Structure model' entity                       
23 5 'Structure model' entity_src_gen               
24 5 'Structure model' struct                       
# 
loop_
_pdbx_audit_revision_item.ordinal 
_pdbx_audit_revision_item.revision_ordinal 
_pdbx_audit_revision_item.data_content_type 
_pdbx_audit_revision_item.item 
1  2 'Structure model' '_citation.title'                                
2  2 'Structure model' '_entity.pdbx_number_of_molecules'               
3  2 'Structure model' '_pdbx_struct_assembly_prop.value'               
4  2 'Structure model' '_pdbx_struct_sheet_hbond.range_1_auth_comp_id'  
...

would need to be translated to:

REVDAT   5   11-MAR-20 6LU7    1       COMPND SOURCE                            
REVDAT   4   26-FEB-20 6LU7    1       REMARK                                   
REVDAT   3   19-FEB-20 6LU7    1       TITLE  JRNL                              
REVDAT   2   12-FEB-20 6LU7    1       TITLE  COMPND JRNL   REMARK              
REVDAT   2 2                   1       SHEET  LINK   SITE   ATOM                
REVDAT   1   05-FEB-20 6LU7    0                                                
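
The fixed part of each REVDAT line above (modification number, date, ID code, modification type) can be produced mechanically; only the trailing record names are hard to derive. The following is a minimal sketch, not library code. It assumes the mmCIF date is in `YYYY-MM-DD` form and uses column positions read off the lines above (number in columns 8-10, date in 14-22, ID in 24-27, type in 32). The function names `iso_to_pdb_date` and `format_revdat` are hypothetical, and the modification type is passed in by the caller rather than derived.

```cpp
#include <cctype>
#include <cstdio>
#include <initializer_list>
#include <string>

// Convert an ISO date "YYYY-MM-DD" to the PDB style "DD-MMM-YY".
// Returns an empty string if the input does not look like an ISO date.
std::string iso_to_pdb_date(const std::string& iso) {
  static const char* const months[12] = {"JAN", "FEB", "MAR", "APR", "MAY", "JUN",
                                         "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"};
  if (iso.size() != 10 || iso[4] != '-' || iso[7] != '-')
    return "";
  for (int i : {0, 1, 2, 3, 5, 6, 8, 9})
    if (!std::isdigit(static_cast<unsigned char>(iso[i])))
      return "";
  int month = (iso[5] - '0') * 10 + (iso[6] - '0');
  if (month < 1 || month > 12)
    return "";
  // day, month abbreviation, last two digits of the year
  return iso.substr(8, 2) + "-" + months[month - 1] + "-" + iso.substr(2, 2);
}

// Format the fixed part of a REVDAT line, padded with spaces to 80 columns.
// The record-name columns (40 onwards) are left blank.
std::string format_revdat(int mod_num, const std::string& iso_date,
                          const std::string& id_code, int mod_type) {
  char buf[96];
  std::snprintf(buf, sizeof(buf), "REVDAT %3d   %-9.9s %-4.4s    %1d",
                mod_num, iso_to_pdb_date(iso_date).c_str(),
                id_code.c_str(), mod_type);
  std::string line(buf);
  line.resize(80, ' ');
  return line;
}
```

For example, `format_revdat(5, "2020-03-11", "6LU7", 1)` reproduces the first 32 columns of the `REVDAT   5` line above. Note that the two-digit year is lossy, so the conversion cannot be reversed without extra context.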

Out of curiosity, what do you need it for?

@jsoerensen
Author

I think the code I posted does something close to that. I've left out the last column with revision reasons, but I could add that in.
Mainly, we store the original date and the last revision number and date in metadata when we deposit these in a database. Since structures can be revised, it's important for us to know which revision we currently have.

@jsoerensen
Author

I don't mind posting the above as a PR with the extra column for the revision reason added, if that helps.

@wojdyr
Member

wojdyr commented Mar 18, 2020

I meant that the last columns of REVDAT would be difficult to generate. How would you do it? From category?

@jsoerensen
Author

Ah, that is a fair point. I'm not sure if the RCSB has a conversion table. I'll look.

@wojdyr
Member

wojdyr commented Mar 18, 2020

If I may ask: why do you switch between mmCIF and PDB?

@jsoerensen
Author

jsoerensen commented Mar 18, 2020

It's a fair question. We convert the mmCIF header to a PDB-style header for historical reasons. The work involved in switching our current codebase to parse each format natively would be significant. And we only need to do this for those structures where there is only an mmCIF file and no corresponding PDB form.

@jsoerensen
Author

Sadly, there doesn't seem to be a proper mapping between the PDB and mmCIF notations.
http://mmcif.wwpdb.org/docs/pdb_to_pdbx_correspondences.html#REVDAT

@wojdyr
Member

wojdyr commented Mar 18, 2020

I'm inclined to leave REVDAT out, at least for now. Maybe you can find a workaround to store the last revision number and date in the database.
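
One possible shape for such a workaround is sketched below: take the (ordinal, date) rows of `_pdbx_audit_revision_history` and keep only the original date plus the last revision number and date, with no PDB conversion at all. This is an illustration under assumptions, not library code. The `Revision` and `RevisionSummary` types and the `summarize_revisions` function are hypothetical, and reading the rows out of the mmCIF file is left to the caller.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// One row of _pdbx_audit_revision_history (only the two fields needed here).
struct Revision {
  int ordinal;
  std::string date;  // as stored in mmCIF, e.g. "2020-02-05"
};

// The metadata to store: original release date and the latest revision.
struct RevisionSummary {
  std::string original_date;
  int last_ordinal;
  std::string last_date;
};

// Fill `out` from the revision history. Returns false if the history is empty.
// Rows are compared by ordinal, so they do not need to be sorted.
bool summarize_revisions(const std::vector<Revision>& history,
                         RevisionSummary& out) {
  if (history.empty())
    return false;
  auto by_ordinal = [](const Revision& a, const Revision& b) {
    return a.ordinal < b.ordinal;
  };
  auto range = std::minmax_element(history.begin(), history.end(), by_ordinal);
  out.original_date = range.first->date;    // lowest ordinal
  out.last_ordinal = range.second->ordinal; // highest ordinal
  out.last_date = range.second->date;
  return true;
}
```

Fed the five 6LU7 rows quoted earlier in the thread, this would yield an original date of 2020-02-05 and a last revision of 5 dated 2020-03-11.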

@jsoerensen
Author

Given the limited mappings from the wwPDB, I agree with you. The code above does what I need, so I have it on a fork. I’d be much happier prioritizing the DBREF data instead.
