Substructure search with wildcard #70

Open
c-ruttkies opened this issue May 9, 2022 · 1 comment

@c-ruttkies

Hi,

I have a problem using the SSSearcher with query features, and I cannot explain the behaviour I see.

Consider the following code snippet:

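// Imports assume JUnit 4 and OpenChemLib on the classpath.
import static org.junit.Assert.assertTrue;

import java.util.stream.IntStream;

import org.junit.Test;

import com.actelion.research.chem.MolfileParser;
import com.actelion.research.chem.Molecule;
import com.actelion.research.chem.SSSearcher;
import com.actelion.research.chem.StereoMolecule;

public class SubstructureSearcherTest {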

final private String query = "\n"
		+ "  MJ210900                      \n"
		+ "\n"
		+ "  8  8  0  0  0  0  0  0  0  0999 V2000\n"
		+ "   -0.4035    0.4148    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n"
		+ "   -1.1180    0.0023    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n"
		+ "   -1.1180   -0.8228    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n"
		+ "   -0.4035   -1.2354    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n"
		+ "    0.3109   -0.8228    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n"
		+ "    0.3109    0.0023    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n"
		+ "    1.1180    0.1738    0.0000 S   0  0  0  0  0  0  0  0  0  0  0  0\n"
		+ "   -0.3173    1.2354    0.0000 A   0  0  0  0  0  0  0  0  0  0  0  0\n"
		+ "  1  2  2  0  0  0  0\n"
		+ "  2  3  1  0  0  0  0\n"
		+ "  3  4  2  0  0  0  0\n"
		+ "  4  5  1  0  0  0  0\n"
		+ "  5  6  2  0  0  0  0\n"
		+ "  6  1  1  0  0  0  0\n"
		+ "  6  7  1  0  0  0  0\n"
		+ "  1  8  1  0  0  0  0\n"
		+ "M  END\n"
		+ "";

final private String target = "\n"
		+ "  MJ210900                      \n"
		+ "\n"
		+ "  9  9  0  0  0  0  0  0  0  0999 V2000\n"
		+ "   -0.4035    0.4148    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n"
		+ "   -1.1180    0.0023    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n"
		+ "   -1.1180   -0.8228    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n"
		+ "   -0.4035   -1.2354    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n"
		+ "    0.3109   -0.8228    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n"
		+ "    0.3109    0.0023    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n"
		+ "    1.1180    0.1738    0.0000 S   0  0  0  0  0  0  0  0  0  0  0  0\n"
		+ "   -0.3173    1.2354    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n"
		+ "   -0.4888    2.0423    0.0000 A   0  0  0  0  0  0  0  0  0  0  0  0\n"
		+ "  1  2  2  0  0  0  0\n"
		+ "  2  3  1  0  0  0  0\n"
		+ "  3  4  2  0  0  0  0\n"
		+ "  4  5  1  0  0  0  0\n"
		+ "  6  1  1  0  0  0  0\n"
		+ "  5  6  2  0  0  0  0\n"
		+ "  1  8  1  0  0  0  0\n"
		+ "  8  9  1  0  0  0  0\n"
		+ "  6  7  1  0  0  0  0\n"
		+ "M  END\n"
		+ "";

@Test
public void checkSimpleSubstructure() {
	System.out.println(this.query);
	System.out.println(this.target);
	final MolfileParser parser = new MolfileParser();
	final StereoMolecule target = new StereoMolecule();
	parser.parse(target, this.target);
	final StereoMolecule query = new StereoMolecule();
	parser.parse(query, this.query);
	query.setFragment(true);
	this.addQueryFeatures(query);
	final SSSearcher matcher = new SSSearcher();
	matcher.setMolecule(target);
	matcher.setFragment(query);
	assertTrue(matcher.isFragmentInMolecule());
}

private void addQueryFeatures(StereoMolecule molecule) {
	IntStream.range(0, molecule.getAtoms()).boxed()
		.filter(idx -> !molecule.getAtomLabel(idx).equals("A"))
		.forEach(idx -> molecule.setAtomQueryFeature(idx, Molecule.cAtomQFNoMoreNeighbours, true));
}

}

I have two mol strings, one query and one target. Both contain an 'A' (any atom) wildcard. I add the query feature Molecule.cAtomQFNoMoreNeighbours to every atom of the query except the 'A' atom. In my opinion, matcher.isFragmentInMolecule() should return true.
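
As a sanity check on this expectation, the neighbour counts can be read straight from the bond blocks of the two mol strings. The following is a minimal sketch in plain Java and is not part of OpenChemLib; the class name NeighbourCountSketch and the method neighbourCounts are hypothetical, and the code assumes a well-formed V2000 molfile with a three-line header. The three-atom chain in main is only a toy input.

```java
import java.util.Arrays;

final class NeighbourCountSketch {

    /** Counts bonded neighbours per atom (array index 0 = atom 1) in a V2000 molfile. */
    static int[] neighbourCounts(String molfile) {
        String[] lines = molfile.split("\n", -1);
        // Lines 0-2 form the header block; line 3 is the counts line.
        String counts = lines[3];
        int atoms = Integer.parseInt(counts.substring(0, 3).trim());
        int bonds = Integer.parseInt(counts.substring(3, 6).trim());
        int[] degree = new int[atoms];
        for (int i = 0; i < bonds; i++) {
            // The bond block follows directly after the atom block.
            String bond = lines[4 + atoms + i];
            int first = Integer.parseInt(bond.substring(0, 3).trim());
            int second = Integer.parseInt(bond.substring(3, 6).trim());
            degree[first - 1]++;
            degree[second - 1]++;
        }
        return degree;
    }

    public static void main(String[] args) {
        // Toy input (assumption, not from the issue): a three-atom chain 1-2-3.
        String atom = "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0";
        String chain = String.join("\n",
                "", "  sketch", "",
                "  3  2  0  0  0  0  0  0  0  0999 V2000",
                atom, atom, atom,
                "  1  2  1  0  0  0  0",
                "  2  3  1  0  0  0  0",
                "M  END", "");
        System.out.println(Arrays.toString(neighbourCounts(chain))); // prints [1, 2, 1]
    }
}
```

Applied to the two mol strings above, this gives 3, 2, 2, 2, 2, 3, 1, 1 for the query and 3, 2, 2, 2, 2, 3, 1, 2, 1 for the target. Atoms 1 to 7 have identical neighbour counts in both, and only target atom 8 (the carbon matched by the query's 'A') has an additional neighbour. Assuming cAtomQFNoMoreNeighbours only forbids extra neighbours on the atoms that carry it, the counts alone do not rule out a match.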

I made some strange observations while investigating this. When I replace the 'A' atom in the target with a 'C' atom, matcher.isFragmentInMolecule() returns true. When I keep the 'A' atom in the target and do not add the query features (that is, comment out this.addQueryFeatures(query);), matcher.isFragmentInMolecule() also returns true. Can you explain what is happening here and whether this is expected?
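
To reproduce the first variant (the target's 'A' replaced by 'C') without editing the mol string by hand, a small helper along these lines could be used. This is only a sketch in plain Java: the class MolfileVariantSketch and the method withAtomSymbol are hypothetical names, and the code assumes the fixed-column V2000 layout in which the atom symbol occupies columns 32 to 34 of each atom line. The one-atom molfile in main is a toy input.

```java
final class MolfileVariantSketch {

    /** Returns a copy of a V2000 molfile with the symbol of one atom (1-based number) replaced. */
    static String withAtomSymbol(String molfile, int atomNumber, String symbol) {
        if (symbol.isEmpty() || symbol.length() > 3) {
            throw new IllegalArgumentException("symbol must have 1 to 3 characters");
        }
        String[] lines = molfile.split("\n", -1);
        // Three header lines and the counts line precede the atom block.
        int index = 3 + atomNumber;
        String line = lines[index];
        // Columns 32-34 (string indices 31-33) hold the left-aligned atom symbol.
        String padded = String.format("%-3s", symbol);
        lines[index] = line.substring(0, 31) + padded + line.substring(34);
        return String.join("\n", lines);
    }

    public static void main(String[] args) {
        // Toy input (assumption, not the full target from the issue): a single 'A' atom.
        String molfile = String.join("\n",
                "", "  sketch", "",
                "  1  0  0  0  0  0  0  0  0  0999 V2000",
                "   -0.4888    2.0423    0.0000 A   0  0  0  0  0  0  0  0  0  0  0  0",
                "M  END", "");
        // Print only the modified atom line.
        System.out.println(withAtomSymbol(molfile, 1, "C").split("\n", -1)[4]);
    }
}
```

For the target above, the 'A' is atom 9, so the variant would be built with withAtomSymbol(target, 9, "C") before passing the string to the parser.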

Thanks,
Christoph

Might also be interesting for @lutzweber

@thsa
Contributor

thsa commented May 12, 2022 via email
