CE-Symm assumes models come from BA #808

Open
sbliven opened this issue Oct 25, 2018 · 4 comments
Labels
bug Bugs and bugfixes

Comments

@sbliven
Member

sbliven commented Oct 25, 2018

Running CE-Symm on multi-model structures seems to concatenate all models. This is desirable for the biological assembly PDB files, but undesirable for ensembles.

For instance, running on 4tz5.B produces heap overflows as it tries to concatenate 25 models.

@sbliven
Member Author

sbliven commented Oct 25, 2018

This can cause inflation of the repeat number. Example: 2q4g.W has 8 models with 16 repeats each. Running with 8G of memory I can finish this structure, but get 128 repeats!
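A minimal sketch of the arithmetic behind that number, assuming each model contributes its repeats independently once the models are concatenated. The class and method names are hypothetical stand-ins for illustration and are not BioJava or CE-Symm code.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only: these types are hypothetical, not BioJava classes.
class RepeatInflationSketch {

    // Build an ensemble in which every model holds the same number of repeats.
    static List<List<String>> buildEnsemble(int models, int repeatsPerModel) {
        List<List<String>> ensemble = new ArrayList<>();
        for (int m = 0; m < models; m++) {
            List<String> repeats = new ArrayList<>();
            for (int r = 0; r < repeatsPerModel; r++) {
                repeats.add("model" + (m + 1) + ":repeat" + (r + 1));
            }
            ensemble.add(repeats);
        }
        return ensemble;
    }

    // Concatenate every model, which is the behaviour reported in this issue.
    static int repeatsWhenConcatenated(List<List<String>> ensemble) {
        List<String> all = new ArrayList<>();
        for (List<String> model : ensemble) {
            all.addAll(model);
        }
        return all.size();
    }

    // Use only the first model, which is what one would expect for an ensemble.
    static int repeatsFromFirstModel(List<List<String>> ensemble) {
        return ensemble.isEmpty() ? 0 : ensemble.get(0).size();
    }

    public static void main(String[] args) {
        List<List<String>> ensemble = buildEnsemble(8, 16);
        System.out.println(repeatsWhenConcatenated(ensemble)); // 128
        System.out.println(repeatsFromFirstModel(ensemble));   // 16
    }
}
```

With 8 models of 16 repeats each, concatenation gives 8 × 16 = 128, while restricting the input to the first model gives the expected 16.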

@lafita lafita added the bug Bugs and bugfixes label Oct 25, 2018
@lafita
Member

lafita commented Oct 25, 2018

If I remember correctly the input to the CESymm method is actually the Atom array extracted from the Structure object.
I wonder if this could be a bigger problem related to how the representative Atoms of multi model structures are obtained.
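To make the concern concrete, here is a rough sketch of what restricting the representative atoms to a single model could look like. `ToyAtom` and both selection methods are hypothetical and only mimic the idea; they are not the BioJava `Atom` or `Structure` API, and the sketch assumes one CA atom per residue.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; not the BioJava Atom/Structure API.
class FirstModelAtomsSketch {

    static final class ToyAtom {
        final String name;  // atom name, e.g. "CA"
        final int model;    // 0-based model index
        final int residue;  // residue number

        ToyAtom(String name, int model, int residue) {
            this.name = name;
            this.model = model;
            this.residue = residue;
        }
    }

    // Keep the CA atoms of one model only.
    static List<ToyAtom> representativeAtoms(List<ToyAtom> atoms, int model) {
        List<ToyAtom> out = new ArrayList<>();
        for (ToyAtom a : atoms) {
            if (a.model == model && "CA".equals(a.name)) {
                out.add(a);
            }
        }
        return out;
    }

    // The same selection without the model filter: all models end up mixed together.
    static List<ToyAtom> representativeAtomsAllModels(List<ToyAtom> atoms) {
        List<ToyAtom> out = new ArrayList<>();
        for (ToyAtom a : atoms) {
            if ("CA".equals(a.name)) {
                out.add(a);
            }
        }
        return out;
    }

    // Build a toy ensemble with an N and a CA atom for every residue of every model.
    static List<ToyAtom> toyEnsemble(int models, int residues) {
        List<ToyAtom> atoms = new ArrayList<>();
        for (int m = 0; m < models; m++) {
            for (int r = 1; r <= residues; r++) {
                atoms.add(new ToyAtom("N", m, r));
                atoms.add(new ToyAtom("CA", m, r));
            }
        }
        return atoms;
    }

    public static void main(String[] args) {
        List<ToyAtom> atoms = toyEnsemble(3, 10);
        System.out.println(representativeAtoms(atoms, 0).size());       // 10
        System.out.println(representativeAtomsAllModels(atoms).size()); // 30
    }
}
```

If the Atom array is built without a model filter, its length scales with the number of models, which would be consistent with both the memory problems and the inflated repeat counts reported above.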

@josemduarte
Contributor

@sbliven since 5.0.0, BioJava will by default produce bioassemblies by expanding into new chains (instead of new models). That is the default now if you do:

StructureIO.getBiologicalAssembly(String pdbId);

Otherwise you can set the flag explicitly (multiModel=true expands into models instead of chains):

StructureIO.getBiologicalAssembly(String pdbId, int bioAssemblyId, boolean multiModel);

I think the feature is also available in 4.2.x, but there it is not the default.

So it should be totally safe to remove the concatenation of models in CESymm.

@sbliven
Copy link
Member Author

sbliven commented Oct 26, 2018

I need to look into this more, since I found a few cases where CE-Symm seems to be duplicating structures. We used to report the correct number of repeats for these, but now the number is doubled. None of them have multiple models.

Name     Symm  Predicted Repeats  Predicted Symm
d1xmpa_  C2    2                  4
d2hiyc_  C2    2                  4
d1kkta_  C7    7                  14


3 participants