Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add transform/SMIRKS support to CDK. #916

Merged
merged 62 commits into from Mar 6, 2024
Merged

Add transform/SMIRKS support to CDK. #916

merged 62 commits into from Mar 6, 2024

Conversation

johnmay
Copy link
Member

@johnmay johnmay commented Sep 25, 2022

This patch provides the generic/low-level data structures and representations to describe a molecular transform. A conveient way of describing transforms is the SMIRKS syntax which is supported via functionality in the 'cdk-smarts' module.

Transforms are useful for standardisation, library generation, and retro synthesis.

Quick start

  1. Include cdk-smarts module
  2. Parse and apply in-place to all non-overlapping matches
if (Smirks.apply(mol, "[*:1][H]>>[*:1]Cl")) {
    System.err.println("Success!");
}
  1. A transform can be compiled and reused:
Transform transform = Smirks.compile("[*:1][H]>>[*:1]Cl");
for (IAtomContainer mol : mols) {
    if (transform.apply(mol))
       System.err.println("Success!");
}
  1. If you expect to handle invalid inputs, please use the parse method which returns true/false. You can provide a Transform or SmirksTransform depending on if you want it to automatically ensure correct aromaticity/ring flags.
Transform transform = new SmirksTransform();
if (!Smirks.parse(transform, "[*:1][H]>>[*:1]Cl")) {
  System.err.println("BAD SMIRKS - " + transform.message());
  return;
}

for (IAtomContainer mol : mols) {
    if (transform.apply(mol))
       System.err.println("Success!");
}

Features

The atomic number, hydrogen count, charge, and mass can be modified on mapped atoms

[SH2:1]>>[SH4:1] set hcnt to 4
[O:1]>>[O-:1] set the charge to -1
[C:1]>>[13C:1] set the isotope 
[Pb:1]>>[Au:1] set the element to gold

There is inconsistency in other implementations about how/when these properties get removed. For example [OH1:1]>>[O:1] may remove the hydrogen count. CDK's implementation follows a simpler system where by you must explicitly specify you want something removed (zero'd).

[SH2:1]>>[S:1] leave hnct unset (no-op)
[SH2:1]>>[SHO:1] set hcnt to 0
[O-:1]>>[O+0:1] set the charge to 0
[13C:1]>>[0C:1] remove the isotope 
[Pb:1]>>[#0:1] set atomic num 0, note '*' is treated as no-op

If the right-hand side has an expression it is examined to see if the property being modified is consistent. The safer portable SMIRKS it is recommend you stick to SMILES [<mass>?<symbol><hcnt>?<chg>?:<map>] on the right-hand side.

[S,O;H2:1]>>[S,O;HO:1] set hcnt to 0
[S,O;H2:1]>>[SH0,OHO:1] set hcnt to 0
[SH1,OH2:1]>>[SH0,OH1:1] inconsistent hcnt (so leave unset)

Hydrogens are not automatically adjusted, so if you change a property like charge or add/remove bonds you should adjust the hydrogen count accordingly. Subset atoms (B, C, N, O, P, S, F, Cl, Br, I) that are added (unmapped on right-hand side) not in square brackets will automatically set the implicit hydrogen count.

Although it is possible to explicitly set the number of hydrogens with H<cnt> it is more portable/powerful to use explicit hydrogens in the SMIRKS to adjust the hydrogen count.

[OH1:1]>>[O-:1] wrong
[OH1:1]>>[OH0-:1] acceptable
[H][O:1]>>[O-:1] preferred 

[C:1]>>[C:1]Cl wrong
[CH1:1]>>[CH0:1]Cl acceptable (but only matches [CH]!)
[H][C:1]>>[C:1]Cl preferred

Note SMIRKS transform does not need the hydrogens to be explicit in the molecule being modified, the transform is compiled and works on either hydrogen counts or explicit hydrogens.

Stereochemistry on atoms with a single point change will persist their stereochemistry.

[C:1]Br>>[C:1]Cl (smirks):

  C/C=C/Br (input)     C/C=C/Cl (output)
  Br[C@H](N)O (input)  Cl[C@H](N)O (output)

Multiple changes will result in the stereochemistry being lost

[C:1](Br)N>>[C:1](Cl)I (smirks):

  Br[C@H](N)O (input)  ClC(N)O (output)

API Overview

The low-level API in the 'cdk-isomorphism' module works with a substructure pattern and ops which change the atoms matched by the pattern. We therefore run a transform by matching the pattern to the molecule, then running the ops over the atoms matched by the pattern. The atoms in the molecule being transformed are permuted/ordering to be the same as the query (actually idx+1) such that the op-code parameters are consistent.

// [CD1:1][C:2][CD1:3]>>[C:1]1[C:2][C:3]1
// 
// in this example we create a query pattern (the maps are included for clarity)
// the ops then select atoms at index 1 (atoms[0+1]) and 3 (atoms[2+1]) and create a
// new bond with order=1 (single).
Transform tform = new Transform();
tform.init(SmartsPattern.create("[CD1:1]C[CD1:3]"), // [CD1]C[CD1] will work the same
           Arrays.asList(new TransformOp(Type.NewBond, 1, 3, 1)));

Future additions

Options

Testing other implementations there are some options that may be useful and flags on how things could be interpreted. I've put in the most sensible defaults in (IMO) and the flags would therefore change the behaviour away from that. A nice feature is we can then say "parse this as RDKIT compatibility mode". Although I am planning an ACS talk next spring which will highlight some of the existing behaviours of other implementations are inconsistent/ambiguous.

I've mainly not decided whether to do this via integer bitmaps or enums or another mechanism. Here is the enum option I was toying with:

public enum Option {
        /** The transform will be run right-to-left instead of left-to-right. */
        Reverse,
        /** Ignore attempts to set the hydrogen count with properties. */
        IgnoreHCnt,
        /**
         * Unless specified, zero the hydrogen count on a mapped atom if it's
         * counterpart had a hydrogen count specified.
         */
        ZeroHCntIfChanged,
        /** Interpret [CH0] the same as [C] */
        ZeroHIsUnset,
        /** Ignore attempts to set the isotopic mass of an atom. */
        IgnoreIso,
        /** Ignores attempts to change the element of an atom. */
        IgnoreTransmutation,
        /**
         * Unless specified, zero the charge on a mapped atom if it's
         * counterpart had a charge specified.
         */
        ZeroChargeIfChanged,
        /** Unless a charge is specified, default to zero. */
        ZeroCharge,

        UnpairedMaps,

        // where does this option go
        RECALCULATE_H,

        // options of the plan

        /**
         * Automatically add explicit hydrogens to a pattern before matching/
         * running the transform.
         */
        AutoExplH,
        /**
         * Remove stereo chemistry even when a single neighbour changes.
         */
        RemoveStereoOnSinglePointChange,
        /** If a bond already exists between two atoms and a new one  */
        OverwriteExistingBond,
        /** LillyMol - Not supported yet. */
        DeleteUnmapped
    }

    public final Set<Option> Daylight = EnumSet.of(Option.IgnoreHCnt,
                                                   Option.IgnoreIso,
                                                   Option.IgnoreTransmutation,
                                                   Option.ZeroCharge,
                                                   Option.AutoExplH,
                                                   Option.RemoveStereoOnSinglePointChange);

    public final Set<Option> OEChem = EnumSet.of(Option.IgnoreIso,
                                                 Option.ZeroHCntIfChanged);

@egonw egonw self-requested a review September 26, 2022 03:50
Copy link
Member

@egonw egonw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the documentation and testing. Looks good to me. I'll suggest we keep it open for a bit longer (few days max) so that others can comment too.

@johnmay
Copy link
Member Author

johnmay commented Sep 26, 2022

Sounds good, maybe a week or so for comment on the API - until release though it’s still changeable. I will fix the code smells and try and get some more test coverage

@johnmay
Copy link
Member Author

johnmay commented Sep 26, 2022

@uli-f incase you didn’t get pinged on it

@uli-f
Copy link
Member

uli-f commented Sep 26, 2022

@johnmay Thanks for pinging me on this.

Thank you so much for the great work! Looks good to me at first glance 😃

I like the Enum option and EnumSets better than integers bitmaps. I do have a C background and can read and understand integer bitmaps but would only use them in Java when there is a very good reason for it. Enums seem much more like a OO/Java solution to me. Love the grouping of different low-level options in an EnumSet for higher-level config 👍🏼

Running a bit behind everything, but will def take a closer look this week 👀

@egonw
Copy link
Member

egonw commented Sep 26, 2022

maybe a week or so for comment on the API

Shall I send out a tweet?

@johnmay
Copy link
Member Author

johnmay commented Sep 26, 2022

maybe a week or so for comment on the API

Shall I send out a tweet?

No I will add some more doc first then do the mailing list (and maybe tweet)

@uli-f
Copy link
Member

uli-f commented Oct 6, 2022

Had a closer look.
The API looks easy to use and clear to me, the documentation is very helpful, especially the code snippets 👍🏼

Few comments:

  • TransformPlan // private final int maxNewAtomIdx is never read
  • TransformPlan // private final int fstNewAtomIdx is never read
  • SmirksTransform // private static final Aromaticity arom = new Aromaticity(ElectronDonation.daylight(), Cycles.or(Cycles.all(), Cycles.relevant())) is never read
  • SmirksTransform // 3rd argument missing in javadoc example of apply(IAtomContainer,String,Transform.Mode)
  • SmirksTransform // parse(Transform,String) // I'd seen IllegalArgumentExceptions thrown quite frequently in libs when an argument is tested for null, but haven't come across throwing an actual NPE

@uli-f
Copy link
Member

uli-f commented Oct 6, 2022

image

Given the reaction above and the SMILES
[Li+:30].[Al+3:32].[O:28]=[C:27]1[O:31][C:2](=[O:29])[CH:3]2[CH2:26][C:17](=[C:8]([C:9]3=[CH:10][CH:11]=[C:12]([O:13][CH3:14])[CH:15]=[CH:16]3)[CH2:7][CH:4]12)[C:18]=4[CH:19]=[CH:20][C:21]([O:22][CH3:23])=[CH:24][CH:25]4
as an input the SMIRKS
[Al+3;h0X0:32].[Ch0X3v4:27](=[Oh0:28])[Oh0X2v2:31][Ch0X3v4:2]=[Oh0:29].[Li+;h0X0:30]>>[Ch2X4v4:2]([Oh1:29]).[Ch2X4v4:27][Oh1:28]
gives me the the expected (or desired?) outcome.
However, the SMIRKS
[Al+3;h0X0:32].[C+0;h1:3]1[C+0;h1:4][C+0;h0X3v4:27](=[O+0;h0:28])[O+0;h0X2v2:31][C+0;h0X3v4:2]1=[O+0;h0:29].[Li+;h0X0:30]>>[C+0;h2X4v4:2]([O+0;h1:29])[C+0;h1:3][C+0;h1:4][C+0;h2X4v4:27][O+0;h1:28]
leads to no match.

I added the following method to SmirksTest to test this:

@Test
void testReaction_1() throws Exception {
    assertTransform("[Li+:30].[Al+3:32].[O:28]=[C:27]1[O:31][C:2](=[O:29])[CH:3]2[CH2:26][C:17]" +
            "(=[C:8]([C:9]3=[CH:10][CH:11]=[C:12]([O:13][CH3:14])[CH:15]=[CH:16]3)[CH2:7][CH:4]12)[C:18]=4" +
            "[CH:19]=[CH:20][C:21]([O:22][CH3:23])=[CH:24][CH:25]4",
//                "[Al+3;h0X0:32].[Ch0X3v4:27](=[Oh0:28])[Oh0X2v2:31][Ch0X3v4:2]=[Oh0:29].[Li+;h0X0:30]>>[Ch2X4v4:2]([Oh1:29]).[Ch2X4v4:27][Oh1:28]", // expected outcome
            "[Al+3;h0X0:32].[C+0;h1:3]1[C+0;h1:4][C+0;h0X3v4:27](=[O+0;h0:28])[O+0;h0X2v2:31][C+0;h0X3v4:2]1=[O+0;h0:29].[Li+;h0X0:30]>>[C+0;h2X4v4:2]([O+0;h1:29])[C+0;h1:3][C+0;h1:4][C+0;h2X4v4:27][O+0;h1:28]", // no match
            "OCC1CC(=C(CC1CO)C=2C=CC(OC)=CC2)C3=CC=C(OC)C=C3");
}

If I understand correctly the features h, X, and v that are used in this SMIRKS are currently not covered...?

@johnmay
Copy link
Member Author

johnmay commented Oct 6, 2022

I'll look into it, but doing RDKit UGM talk this week and next (presenting).

You can't really have X and v on the right hand side... SMIRKS doesn't work like this :-)

The query matches so possibly something else going on.

Here is a better way to write the same thing.

[Al+3;h0X0:32].[C+0;h1:3]1[C+0;h1:4][C+0;h0X3v4:27](=[O+0;h0:28])[O+0;h0X2v2:31][C+0;h0X3v4:2]1=[O+0;h0:29].[Li+;h0X0:30]>>[CH2+0:2]([OH1+0:29])[CH1+0:3][CH1+0:4][C+0:27][O+0;h1:28]

The error i get is:

[Al+3h0X0:1].[C+0h1:2]1[C+0h0X3v4:7]([O+0h0X2v2:6][C+0h0X3v4:4]([C+0h1:3]1)=[O+0h0:5])=[O+0h0:8].[Li+h0X0:9] [DeleteAtom{1}, ImplH{2@4}, ImplH{1@5}, DeleteAtom{6}, ImplH{2@7}, ImplH{1@8}, DeleteAtom{9}, DeleteBond{2-7}, BondOrder{4-5}, DeleteBond{4-6}, DeleteBond{6-7}, BondOrder{7-8}, NewBond{7-2}] Did not apply!: NewBond{7-2}

Which means atoms :7 and :2 are already bonded - which we can see is the case. This looks to be a bug in the perception of the transform.

@uli-f
Copy link
Member

uli-f commented Oct 6, 2022

You can't really have X and v on the right hand side... SMIRKS doesn't work like this :-)

Thanks for pointing that out. I definitely need to take a closer look at SMIRKS and how we use them.

Here is a better way to write the same thing.

[Al+3;h0X0:32].[C+0;h1:3]1[C+0;h1:4][C+0;h0X3v4:27](=[O+0;h0:28])[O+0;h0X2v2:31][C+0;h0X3v4:2]1=[O+0;h0:29].[Li+;h0X0:30]>>[CH2+0:2]([OH1+0:29])[CH1+0:3][CH1+0:4][C+0:27][O+0;h1:28]

Thanks again. Seeing that already makes me understand a lot better how SMIRKS are supposed to be used.

I am happy to put in a few more tests and see how that goes.

@johnmay
Copy link
Member Author

johnmay commented Oct 6, 2022

Fixed... the bond ordering swapping was not handled correctly. Can you add this as a test case.

Something simpler like:

[C:1]-[O:2]>>[O:2]=[C:1]

would also have shown the issue.

@uli-f
Copy link
Member

uli-f commented Oct 6, 2022

Fixed... the bond ordering swapping was not handled correctly. Can you add this as a test case.

Thank you for fixing this 😃

PR #917

Will add more test cases tomorrow.

@uli-f
Copy link
Member

uli-f commented Oct 7, 2022

I discovered that the Enum Smirks.Option has a line

RECALCULATE_H

Given the transformation patterns I am working with I would definitely appreciate seeing this one coming alive if it adds implicit hydrogens after bond and/or charge changes to an atom. Not quite sure what the expectation would be if the atomic number changes though...?

@johnmay
Copy link
Member Author

johnmay commented Oct 7, 2022

Doh I didn't mean to check in the options yet as they don't do anything. I really am hesitant about recalculate H, Evan Bolton at pubchem likes the term RoboChemistry :-)

@uli-f
Copy link
Member

uli-f commented Oct 7, 2022

Doh I didn't mean to check in the options yet as they don't do anything. I really am hesitant about recalculate H, Evan Bolton at pubchem likes the term RoboChemistry :-)

Okay, totally understand the hesitancy of recalculating the hydrogens.

I am wondering if it would be possible to keep track of any changes per atom (bond changes, bond order changes, charge changes) to update the hydrogen count in a more sophisticated way than just recalculating...? Probably a project for another day!

@johnmay
Copy link
Member Author

johnmay commented Oct 7, 2022

Wait for my ACS talk and I will show the Wild West of SMIRKS bespoke variations. I think having another variant, does not help the situation, however some middle ground might be the ability to take a smirks and improve it by inferring where the hydrogens must be.

@uli-f
Copy link
Member

uli-f commented Oct 7, 2022

Just pushed another PR #918 with a few more test cases.

@uli-f
Copy link
Member

uli-f commented Oct 7, 2022

Wait for my ACS talk and I will show the Wild West of SMIRKS bespoke variations. I think having another variant, does not help the situation, however some middle ground might be the ability to take a smirks and improve it by inferring where the hydrogens must be.

Agreed. I am probably just dreaming the dream of meaningful/sensible RoboChemistry :)

@johnmay
Copy link
Member Author

johnmay commented Oct 7, 2022

Are you manually writing these SMIRKS or generating them, if the later then can make the generate write them better.

@uli-f
Copy link
Member

uli-f commented Oct 10, 2022

Are you manually writing these SMIRKS or generating them, if the later then can make the generate write them better.

The truth is somewhere in the middle here: The rules are generated automatically, but I edit the RHS to make them valid smirks as you pointed out above that there is no x and v on the RHS in smirks, but only changes to # hydrogens, isotope, charge, atom label (and bond order) for a particular atom.

My intention was to add more complex examples instead of testing individual properties / transformation ops.

I am very happy to come up with more relevant / correctly written examples if you point me in the right direction.

@uli-f
Copy link
Member

uli-f commented Oct 10, 2022

Had a closer look at the smirks.

Do you want me to

  • use consecutive numbers starting from 1 for the atom mappings?
  • remove any hydrogen count expression on the RHS if there is no change?
  • anything on the LHS you would like to see changed?

Irrespective of the above: I would consider it worthwhile having quirky but valid expressions in the test set.

@uli-f
Copy link
Member

uli-f commented Oct 10, 2022

I played a bit with the different Transform.Modes.

final String smiles = "CCOCCF";
final String smirks = "[O:1][Ch2:2][C:3]>>[OH1:1].[CH3:2][C:3]";
final String expected = "";

IAtomContainer atomContainer = SMIPAR.parseSmiles(smiles);
SmirksTransform transform = new SmirksTransform();
assertTrue(Smirks.parse(transform, smirks), transform.message());
Iterable<IAtomContainer> matches = transform.apply(atomContainer, Transform.Mode.All);
Iterator<IAtomContainer> iterator = matches.iterator();

assertTrue(iterator.hasNext());
String actual = SMIGEN.create(iterator.next());
System.out.println(actual);

assertTrue(iterator.hasNext());
actual = SMIGEN.create(iterator.next());
System.out.println(actual);

assertFalse(iterator.hasNext());

With Mode.Exclusive the result is what I would expect. However, this is not the case with Mode.Unique and Mode.All:

image

But maybe this is more about my expectation being off 🤷🏼

@johnmay
Copy link
Member Author

johnmay commented Oct 10, 2022

@uli-f
Copy link
Member

uli-f commented Oct 10, 2022

xtransform is the only one you apply them all at once, the others have overlaps so you need to rematch.

Sorry about that. It has been a while since I've last used smirks. I will do some reading first so that I can chip in with more reasonable/meaningful contributions here.

@johnmay
Copy link
Member Author

johnmay commented Oct 11, 2022

What would expect the unique/all options. As described in the doc these are the "iterate possible matches options" and control which matches you apply them to. You can only do a single pass since otherwise (and on non overlapping matches) you could get into an infinite loop of changing things.

Consider

[H:1][N:2]-[C:3]=[N:4]>>[N:2]=[C:3]-[N:4][H:1]

We could constantly apply this and never end. Such a functionality is possibly useful the the nice thing is you can always build that on top of the "match all" and "match unique" primitives. :-)

@johnmay
Copy link
Member Author

johnmay commented Oct 11, 2022

Mostly finished my RDKit talk prep now so will try and catch up on this PR/others

@uli-f
Copy link
Member

uli-f commented Oct 12, 2022

What would expect the unique/all options. As described in the doc these are the "iterate possible matches options" and control which matches you apply them to. You can only do a single pass since otherwise (and on non overlapping matches) you could get into an infinite loop of changing things.

Thank you, I understand this part of the matching and transforming and the associated issue of infinite loops.

I am sorry that I did not make it clear enough what surprised me about the result.

SMILES: CCOCCF
SMIRKS: [O:1][C:2][C:3]>>[OH1:1].[CH3:2][C:3]
Expected outcome with Transform.Mode.Unique: CC.OCCF and CCO.CCF
Actual outcome with Transform.Mode.Unique: CC.O[CH3]CF and C[CH3]O.CCF

SMILES: CCOCCF
SMIRKS: [C:4][O:1][C:2][C:3]>>[C:4][OH1:1].[CH3:2][C:3]
Expected outcome with Transform.Mode.All: CCO.CCF and CC.OCCF
Actual outcome with Transform.Mode.All: C[CH3]O.CCF and CC.O[CH3]CF

Now, if I run the following smirks / smiles combo I get the result I expect:

SMILES: CCOCCF
SMIRKS: [O:1][C:2][C:3]>>[OH1:1].[CH3:2][C:3]
Expected outcome with Transform.Mode.Exclusive: CC.OCCF
Actual outcome with Transform.Mode.Exclusive: CC.OCCF

Is it correct that the result of the last example could be either CC.OCCF or CCO.CCF depending on how CDK perceives the order of the atoms in the smiles?

I expected to remove the bond between O:1 and C:2 using the dot, but that is not what is happening when using Transform.Mode.Unique and Transform.Mode.All. So I end up with transform results that have invalid valences as the CH3 ends up not being terminal. However, if works as expected with Transform.Mode.Exclusive. So it seems like I still don't quite understand something fundamental here.

xtransform is the only one you apply them all at once, the others have overlaps so you need to rematch.

Okay, I think I understand now what you mean here.

I used the Daylight Toolkit years ago and I think in my mind I was looking for that limit argument they have which allows you to limit the number of times the transform is applied with dt_xtransform and to limit the number of reactions to be returned for dt_utransform and dt_transform.

For my application getting the transforms of non-overlapping matches individually might be very helpful. A bit similar to this depiction by Daylight at https://daylight.com/daycgi_tutorials/smirks_examples.html:

two product

@uli-f
Copy link
Member

uli-f commented Oct 12, 2022

I tried another smirks that carries out a nitro group conversion:

smiles = N(=O)(=O)CCN(=O)=O
smirks = [*:1][N:2](=[O:3])=[O:4]>>[*:1][N+:2](=[O:3])[O-:4]

With both Transform.Mode.Unique and Transform.Mode.All I get exactly two results:

[N+](=O)([O-])CC[N+](=O)[O-]
[N+](=O)([O-])CC[N+](=O)[O-]

Is that expected because it makes sense given how the transforms are carried out? So it would be up to the caller of the transform method to check for identical transform results?

@uli-f
Copy link
Member

uli-f commented Oct 12, 2022

I added a few more test cases and a convenient method to test the different Transform.Mode modes to PR #918.

@johnmay Please feel free to pick and choose what you find useful in there.

@johnmay johnmay marked this pull request as ready for review March 6, 2024 10:03
@johnmay
Copy link
Member Author

johnmay commented Mar 6, 2024

@mrwns

I fixed the lazy transform a while ago:

String smirks = "[#8H1:1].[#8H1:2].[#8H1:3].[#8H1:4].[#8H1:5].[#8H1:6].[#8H1:7].[#8H1:8]>>[#6H3]-[#6](-[#8H0:5])=O.[#6H3]-[#6](-[#8H0:1])=O.[#6H3]-[#6](-[#8H0:2])=O.[#6H3]-[#6](-[#8H0:6])=O.[#6H3]-[#6](-[#8H0:3])=O.[#6H3]-[#6](-[#8H0:7])=O.[#6H3]-[#6](-[#8H0:4])=O.[#6H3]-[#6](-[#8H0:8])=O";
String sucrose = "OC[C@H]1O[C@@](CO)(O[C@H]2O[C@H](CO)[C@@H](O)[C@H](O)[C@H]2O)[C@@H](O)[C@@H]1O";

Timings are now:

1 results in 0.011s
1000 results in 0.121s
40320 results in 1.455s

I also confirmed this one is OK:

expected reaction: CCCC\C=C\B(O)O.CCOC(=O)C1(C\C=C(\Br)[Si](C)(C)C)CCCC1=O>>CCCC\C=C\C(=C/CC1(CCCC1=O)C(=O)OCC)\[Si](C)(C)C
        String smirks = "[CH1X3v4+0:1][CH0X3v4+0:2]>>[CH1+0:1]B(O)O.[CH0+0:2]Br";
        String reactants = "CCCC\\C=C\\C(=C/CC1(CCCC1=O)C(=O)OCC)\\[Si](C)(C)C";

Copy link

sonarcloud bot commented Mar 6, 2024

@johnmay
Copy link
Member Author

johnmay commented Mar 6, 2024

@egonw / @uli-f Good to merge subject to review.

@uli-f
Copy link
Member

uli-f commented Mar 6, 2024

@egonw / @uli-f Good to merge subject to review.

I looked more at the javadoc documentation and the API than what happens in the methods. API lgtm, documentation is very helpful.

Is there a good place to put the excellent overview of your original comment so that it's not lost once this PR is closed? Would it make sense to put it somewhere in the javadoc?

@egonw egonw merged commit 2af6236 into main Mar 6, 2024
6 checks passed
@egonw
Copy link
Member

egonw commented Mar 6, 2024

Oh, if only we still had CDK News :)

For eternity we have http://web.archive.org/web/20240306143834/https://github.com/cdk/cdk/pull/916#issue-1385176707 (well, at least until Nov 6)

@johnmay
Copy link
Member Author

johnmay commented Mar 6, 2024

@uli-f Possibly JavaDoc but also we can add a wiki page (https://github.com/cdk/cdk/wiki) - Egon also has/had the CDK book which has some nice example. Something Roger has emphasised to me over the years is API documentation is separate to the user manual (how to use it). Much like the CDK book and the RDKit book.

@egonw what happens on Nov 6?

@egonw
Copy link
Member

egonw commented Mar 6, 2024

@egonw what happens on Nov 6?

It's the "day after".

I will try to include the examples in the book, good idea!

@uli-f
Copy link
Member

uli-f commented Mar 7, 2024

Something Roger has emphasised to me over the years is API documentation is separate to the user manual (how to use it). Much like the CDK book and the RDKit book.

@johnmay Never really thought about this too much, sounds like a good idea.

I will try to include the examples in the book, good idea!

@egonw Thanks egon, let me know if you need any help with this, I'd be happy to contribute.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants