Egon Willighagen edited this page Jan 1, 2016 · 11 revisions

These pages describe the migration from one CDK version to another.

CDK 1.4 to CDK 1.5/6

Constructors that now require a builder

The advantage of the builders in the CDK is that code can be independent of the data class implementations (CDK 1.6 currently has three of them). Over the past years more and more code has adopted this approach, which means that more and more class constructors take an IChemObjectBuilder. In CDK 1.6 the following classes now also require a builder:

  • DescriptorEngine
  • SMARTSQueryTool
  • ModelBuilder3D

Replaced classes

The following new classes replace their older counterparts:

  • TestMoleculeFactory replaces MoleculeFactory
  • GeometryUtil replaces GeometryTools

Removed classes / interfaces

The following classes and interfaces have been removed without a direct replacement; the suggested alternative is given in parentheses.

  • IMolecule and its implementations (use IAtomContainer)
  • NoNotificationChemObjectBuilder (use one of the other builders, like SilentChemObjectBuilder)

Class and method renames

A few classes and methods have been renamed:

  • IteratingMDLReader is now called IteratingSDFReader
  • CDKAtomTypeMatcher.findMatchingAtomType() is now called .findMatchingAtomTypes()

Isotopes

A major CDK API change happened around the IsotopeFactory. Previously, this class was used to get isotope information, which it read from a configurable XML file. That functionality is now available from the XMLIsotopeFactory class. However, to speed up access to basic isotope information and to reduce the size of the core modules, CDK 1.6 introduces an Isotopes class, which contains information extracted from the XML file but is available as a pure Java class. The API for getting isotope information is mostly the same, but the instantiation is much simpler and no longer requires an IChemObjectBuilder (in Groovy):

import org.openscience.cdk.config.*;

isofac = Isotopes.getInstance();
uranium = 92;
for (atomicNumber in 1..uranium) {
  element = isofac.getElement(atomicNumber)
}

SmilesGenerator

The SMILES stack is replaced in this CDK version. This introduces a few API changes, outlined here. The new code base is much faster and more functional than what the CDK had before. Below are typical uses of the new SmilesGenerator API.

Generating unique SMILES is done slightly differently, but elegantly:

generator = SmilesGenerator.unique()
smiles = generator.createSMILES(mol)
println "$smiles"

SMILES that uses lower case element symbols to reflect aromaticity carries less explicit information, so I do not recommend it. Still, I know that some of you are keen on using it, for various and sometimes logical reasons, so here goes. Previously, you would use the setUseAromaticityFlag(true) method for this; you can now use instead:

generator = SmilesGenerator.generic().aromatic()
smiles = generator.createSMILES(mol)
println "$smiles"

IFingerprinter

The IFingerprinter API was changed to accommodate two types of fingerprints: the bit fingerprint, defined by the IBitFingerprint interface, and the count fingerprint, defined by the ICountFingerprint interface. The IFingerprinter interface now defines getRawFingerprint(IAtomContainer), getCountFingerprint(IAtomContainer), and getBitFingerprint(IAtomContainer), each of which returns a different kind of fingerprint. getRawFingerprint(IAtomContainer) returns a Map with Strings representing the various parts of the fingerprint together with the matching counts. This map is the input for getCountFingerprint(IAtomContainer), which returns the same information as an ICountFingerprint implementation. If the count for each bit is not important, getBitFingerprint(IAtomContainer) can be used, which returns an IBitFingerprint implementation.

Because the previous IFingerprinter interface did not count how often a bit was set, implementing the new getRawFingerprint(IAtomContainer) method will likely take some effort. The other two methods can in many cases just wrap other methods in the class, as shown in this example code:

public ICountFingerprint getCountFingerprint(
    IAtomContainer molecule
) throws CDKException {
    return new IntArrayCountFingerprint(
        getRawFingerprint(molecule)
    );
}
public IBitFingerprint getBitFingerprint(
    IAtomContainer molecule
) throws CDKException {
    return new BitSetFingerprint(
        getFingerprint(molecule)
    );
}
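
To make the relationship between the raw, count, and bit forms concrete, here is a minimal, self-contained sketch that uses only java.util types. It does not use the CDK classes: FingerprintSketch, rawFingerprint, toBits, and the folding of feature strings into bit positions by hash code are all assumptions made for illustration, not how any particular CDK fingerprinter works.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustration only: hypothetical stand-in, not part of the CDK. */
class FingerprintSketch {

    /** Raw form: each feature string mapped to how often it occurs. */
    static Map<String, Integer> rawFingerprint(String[] features) {
        Map<String, Integer> raw = new LinkedHashMap<>();
        for (String feature : features) {
            raw.merge(feature, 1, Integer::sum);
        }
        return raw;
    }

    /** Bit form: drop the counts and only record that a feature is present. */
    static BitSet toBits(Map<String, Integer> raw, int size) {
        BitSet bits = new BitSet(size);
        for (String feature : raw.keySet()) {
            // Assumed folding scheme: hash code modulo the fingerprint size.
            bits.set(Math.floorMod(feature.hashCode(), size));
        }
        return bits;
    }

    public static void main(String[] args) {
        Map<String, Integer> raw =
            rawFingerprint(new String[] {"C-C", "C-O", "C-C"});
        System.out.println(raw);                // feature -> count
        System.out.println(toBits(raw, 1024));  // presence only
    }
}
```

The point of the sketch is the direction of information loss: the count form can always be reduced to the bit form, but not the other way around, which is why the raw method is the one that needs new code.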

CDK 1.2 to CDK 1.4

IChemObjectBuilder

CDK 1.4 has a different way of instantiating IChemObjects:

CDK 1.2 code

IChemObjectBuilder builder =
  DefaultChemObjectBuilder.getInstance();
IMolecule molecule = builder.newMolecule();
molecule.addAtom(builder.newAtom("C"));

CDK 1.4 code

IChemObjectBuilder builder =
  DefaultChemObjectBuilder.getInstance();
IMolecule molecule = builder.newInstance(
  IMolecule.class
);
molecule.addAtom(
  builder.newInstance(IAtom.class, "C")
);

Iterating readers

Up to CDK 1.4.7, the IIteratingChemObjectReader implementations had a next() method that returned an IChemObject. Depending on the file format, this could be an IChemModel or an IAtomContainer. The interface now uses generics, so next() returns an IAtomContainer or IChemModel directly, making casts in user code unnecessary.
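
The effect of the generics change can be sketched without the CDK on the classpath. The types below (ChemObject, Molecule, IteratingReader) are hypothetical stand-ins for IChemObject, IAtomContainer, and an IIteratingChemObjectReader implementation; only the pattern is the point, not the names.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Illustration only: stand-in types, not the CDK interfaces. */
class IteratingReaderSketch {

    /** Stand-in for a generic chemical object. */
    interface ChemObject { }

    /** Stand-in for a molecule that carries a title. */
    static class Molecule implements ChemObject {
        final String title;
        Molecule(String title) { this.title = title; }
    }

    /** A reader typed on what it returns, so next() needs no cast. */
    static class IteratingReader<T extends ChemObject> implements Iterator<T> {
        private final Iterator<T> source;
        IteratingReader(List<T> entries) { this.source = entries.iterator(); }
        @Override public boolean hasNext() { return source.hasNext(); }
        @Override public T next() { return source.next(); }
    }

    /** Joins the titles of all entries; note the absence of a (Molecule) cast. */
    static String titles(IteratingReader<Molecule> reader) {
        StringBuilder sb = new StringBuilder();
        while (reader.hasNext()) {
            Molecule mol = reader.next();
            if (sb.length() > 0) sb.append(',');
            sb.append(mol.title);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        IteratingReader<Molecule> reader = new IteratingReader<Molecule>(
            Arrays.asList(new Molecule("methane"), new Molecule("ethanol")));
        System.out.println(titles(reader));
    }
}
```

In user code this means a loop over a reader of molecules can assign the result of next() straight to an IAtomContainer variable, where the older API required a cast from IChemObject.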

Implicit hydrogens

A second API change lies deep in the IAtom interface. To reflect the meaning of the method more accurately, IAtomType.getHydrogenCount() has been renamed to IAtomType.getImplicitHydrogenCount(), and likewise the setter method. The 1.2 code:

carbon.setHydrogenCount(4);

has to be updated to:

carbon.setImplicitHydrogenCount(4);

In both versions the count reflects the number of implicit hydrogens. The old name getHydrogenCount(), however, suggested that the method returned the number of all hydrogens attached to that atom, that is, the sum of implicit and explicit hydrogens.
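
The distinction between the two counts can be shown with a small stand-alone sketch. The Atom class and totalHydrogenCount helper below are hypothetical and are not CDK classes; they only illustrate that the implicit count is a stored number, while explicit hydrogens are real neighboring atoms that have to be counted separately.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: hypothetical stand-in types, not the CDK data classes. */
class HydrogenCountSketch {

    static class Atom {
        final String symbol;
        /** Hydrogens that are implied but not stored as atoms. */
        int implicitHydrogenCount;
        /** Atoms explicitly bonded to this one. */
        final List<Atom> neighbors = new ArrayList<>();

        Atom(String symbol, int implicitHydrogenCount) {
            this.symbol = symbol;
            this.implicitHydrogenCount = implicitHydrogenCount;
        }
    }

    /** Sum of implicit hydrogens and explicit hydrogen neighbors. */
    static int totalHydrogenCount(Atom atom) {
        int explicit = 0;
        for (Atom neighbor : atom.neighbors) {
            if ("H".equals(neighbor.symbol)) {
                explicit++;
            }
        }
        return atom.implicitHydrogenCount + explicit;
    }

    public static void main(String[] args) {
        // Methane carbon with three implicit hydrogens and one explicit one.
        Atom carbon = new Atom("C", 3);
        carbon.neighbors.add(new Atom("H", 0));
        System.out.println("implicit: " + carbon.implicitHydrogenCount);
        System.out.println("total:    " + totalHydrogenCount(carbon));
    }
}
```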

CDK 1.0 to CDK 1.2

MFAnalyser

Version 1.2 removed the MFAnalyser class in favor of a more elaborate framework to handle molecular formulas.