New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integration of ErtlFunctionalGroupsFinder and Sugar Removal Utility into CDK #854
Comments
hi @JonasSchaub, I'm sorry I dropped the ball. In my defense, I still need to write up a blog post about the CDK meeting. @johnmay, I think both utility are useful and should be accepted as features in the CDK. Do you agree? If so, let's set up that integration pull request to work out the details. |
Hi @egonw no worries, I am also looking forward to that blog post about our meeting! On the subject at hand: I guess it would be best if I simply create one or two separate new modules for the two functionalities to freely manage the dependencies. Would you agree? If yes, would you have name suggestions/preferences? And what should be the target branch of the pull request? But I also have to note that I probably will not have much time to work on this in the coming weeks, unfortunately. Kind regards, |
Sounds good, hard to decide where they would go. FuncGroups is sort a descriptor and SugarRemoval is sort of a standardisation (which we don't have a module for). |
Hi @johnmay I would propose to create a "standardisation" module, then. We are planning to implement further standardisation routines with/for CDK anyway in our group. |
Hi @egonw and @johnmay , For your consideration, here are the (updated) repositories:
The CDK integration would in both cases consist of one principal class implementing the functionality and one corresponding test class, respectively. |
I would try and avoid adding a new module, if anything I would like to try and simplify/collapse the number modules.
Quite busy these days but I will try and find time within the next week to take a close a look. At first glance the ErtlFunctionalGroupFinder looks more complicated/verbose than I imagined, there are lots of public helper functions ( |
Thanks @johnmay for the initial feedback!
These are utility functions for the user to pre-check whether their input molecule is a valid argument (in a specific case, strict input restrictions can be turned on but are turned off by default). If you think these are not necessary I can remove them.
Yes, our implementation is "longer" but it is optimised in terms of speed and has more functionality than the RDKit implementation. |
Dear CDK-developers,
I have been involved in two projects that might be interesting for integration into the CDK. I have spoken with Egon about them last week and I will give all necessary details for everyone below:
The first one is the ErtlFunctionalGroupsFinder, an open, CDK-based implementation of Peter Ertl's algorithm for functional group detection. Our implementation is described in this paper and the source code can be found here.
Getting it into CDK would require integrating one principal class implementing the functionality, a corresponding test class (these two can be found here), and (if you agree) a utility class with some convenient routines for pre-and postprocessing in a workflow using ErtlFunctionalGroupsFinder. The utility class along with its test class can be found here.
The second project is named Sugar Removal Utility, an open, CDK-based implementation of an algorithm for in silico deglycosylation. We have described the algorithm and its implementation in this paper and used it for the project described here that illustrates its possibilities and configurations in more detail. The source code can be found here.
As far as we know, there is no deglycosylation functionality in CDK, only a CDK-based KNIME node for this purpose.
The Sugar Removal Utility consists of one principal class implementing the functionality and a corresponding test class. In the latter, a few test methods would have to be removed that illustrate the deglycosylation results on a small test data set and test the surrounding command-line application.
Do you think these two projects should be integrated into the CDK? Should I create two pull request where we can work on the details?
One important question I have would be into which module to put my classes. Ideally, there would be one that fits thematically and supports the modules I need for my code. Otherwise, we could maybe also add one or two new modules.
The Sugar Removal Utility needs the following modules:
ErtlFunctionalGroupsFinder requires these modules:
I would be happy to give more details on specific aspects of the two projects or discuss any required code changes.
Kind regards,
Jonas Schaub
The text was updated successfully, but these errors were encountered: