CE-Symm assumes models come from BA #808
Comments
This can cause inflation of the repeat number. Example: 2q4g.W has 8 models with 16 repeats each. Running with 8G of memory I can finish this structure, but get 128 repeats!
If I remember correctly the input to the
@sbliven Since 5.0.0, BioJava by default produces bioassemblies by expanding into new chains (instead of new models). That is what you get if you call StructureIO.getBiologicalAssembly(String pdbId); otherwise you can set the flag explicitly with StructureIO.getBiologicalAssembly(String pdbId, int bioAssemblyId, boolean multiModel). I think the feature is also available in 4.2.x, but there it is not the default. So it should be totally safe to remove the concatenation of models in CESymm.
I need to look into this more, since I found a few cases where CE-Symm seems to be duplicating structures. We used to report the correct number of repeats for these, but now it's doubled. None of them have multiple models.
Running CE-Symm on multi-model structures seems to concatenate all models. This is desirable for the biological assembly PDB files, but undesirable for ensembles.
For instance, running on 4tz5.B produces heap overflows as it tries to concatenate 25 models.
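To make the inflation concrete, here is a minimal sketch of the arithmetic behind the 2q4g.W report (8 models with 16 repeats each). It is not CE-Symm or BioJava code: the class `ModelSelectionSketch` and both method names are hypothetical, and it assumes each model contributes its repeats independently once the models are concatenated.

```java
// Hypothetical illustration only; none of these names exist in CE-Symm or BioJava.
final class ModelSelectionSketch {

    /** Repeats reported when every model is concatenated into one structure. */
    static int repeatsWhenConcatenated(int[] repeatsPerModel) {
        int total = 0;
        for (int repeats : repeatsPerModel) {
            total += repeats; // each model adds its own copy of the repeats
        }
        return total;
    }

    /** Repeats reported when only the first model is analyzed. */
    static int repeatsFirstModelOnly(int[] repeatsPerModel) {
        if (repeatsPerModel.length == 0) {
            throw new IllegalArgumentException("structure has no models");
        }
        return repeatsPerModel[0];
    }

    public static void main(String[] args) {
        // 8 models with 16 repeats each, as in the 2q4g.W example.
        int[] ensemble = new int[8];
        for (int i = 0; i < ensemble.length; i++) {
            ensemble[i] = 16;
        }
        System.out.println("concatenated: " + repeatsWhenConcatenated(ensemble));
        System.out.println("first model only: " + repeatsFirstModelOnly(ensemble));
    }
}
```

With these inputs the concatenated count is 128 and the first-model count is 16, which matches the numbers in the report. Analyzing a single model would also avoid holding all 25 models of 4tz5.B in memory at once.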