chopralab/Cliquify

Cliquify

Cliquify, a robust representation of molecular graphs as tree structures, is an extension of the Junction Tree Variational Autoencoder (JTVAE).

This work aims to improve the tree representation of molecular graphs by introducing a variation of Hugin's Algorithm based on the formation of chordal graphs.
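To illustrate what forming a chordal graph means for a single ring, here is a minimal sketch in plain Python. It uses a fan triangulation from atom 0, which is an assumption chosen for simplicity; the chord-selection rule Cliquify actually applies may differ. The helper names (`ring_edges`, `fan_triangulate`, `clique_tree`) are hypothetical and not part of the repository.

```python
from itertools import combinations


def ring_edges(n):
    """Bonds of a simple n-membered ring with atoms labelled 0..n-1."""
    return {frozenset((i, (i + 1) % n)) for i in range(n)}


def fan_triangulate(n):
    """Make an n-ring chordal by adding chords from atom 0.

    Returns (chords, triangles). This is only one possible triangulation,
    used here for illustration.
    """
    chords = {frozenset((0, k)) for k in range(2, n - 1)}
    triangles = [(0, k, k + 1) for k in range(1, n - 1)]
    return chords, triangles


def clique_tree(triangles):
    """Connect triangular cliques that share a bond or chord (two atoms)."""
    return [
        (a, b)
        for a, b in combinations(triangles, 2)
        if len(set(a) & set(b)) == 2
    ]


chords, triangles = fan_triangulate(6)
tree = clique_tree(triangles)

print("chords:", sorted(tuple(sorted(c)) for c in chords))
print("triangles:", triangles)
print("tree edges:", tree)
```

In this sketch a 6-membered ring becomes four triangular cliques joined in a path, so the tree nodes are always triangles regardless of the original ring size.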

image

Vocab Generalization

  • We measure the value of a molecular tree vocabulary by its ability to represent a larger and more diverse set of molecules.
  • The JTVAE ring vocabulary generalizes poorly, which constrains the number of molecules that can be generated.
  • Cliquify addresses this by using more generalizable triangular cliques as the vocabulary.
  • A generalizable vocabulary helps generative models, e.g. VAEs or GANs, generate more diverse molecules without redefining the vocabulary for each new dataset.
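The difference in generalizability can be sketched with a toy comparison. The example below is an illustration under strong simplifying assumptions: rings are given only as sequences of element symbols, bond orders and aromaticity are ignored, and triangles come from a fan triangulation rooted at the first atom. The functions `ring_key` and `triangle_keys` are hypothetical names, not the repository's API.

```python
def ring_key(atoms):
    """Whole-ring vocabulary entry: the atom sequence, canonicalised
    over all rotations and reflections of the ring."""
    n = len(atoms)
    rotations = [tuple(atoms[i:] + atoms[:i]) for i in range(n)]
    rotations += [tuple(reversed(r)) for r in rotations]
    return min(rotations)


def triangle_keys(atoms):
    """Triangle vocabulary entries from a fan triangulation rooted at
    atom 0; each entry is the sorted element symbols of one triangle."""
    n = len(atoms)
    return {
        tuple(sorted((atoms[0], atoms[k], atoms[k + 1])))
        for k in range(1, n - 1)
    }


# Toy dataset: carbon rings of several sizes plus one ring with a nitrogen.
rings = {
    "C5": list("CCCCC"),
    "C6": list("CCCCCC"),
    "C7": list("CCCCCCC"),
    "C8": list("CCCCCCCC"),
    "C5N": list("NCCCCC"),
}

ring_vocab = {ring_key(a) for a in rings.values()}
triangle_vocab = set().union(*(triangle_keys(a) for a in rings.values()))

print("ring vocabulary size:", len(ring_vocab))
print("triangle vocabulary size:", len(triangle_vocab))
```

Under these assumptions every new ring size adds a ring vocabulary entry, while the triangle vocabulary only grows when a new combination of elements appears.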

JTVAE (2 randomly sampled vocabulary entries)

  • Vocabulary used

image

  • Samples generated (Pruned)

image

Cliquify (2 randomly sampled vocabulary entries)

  • Vocabulary used

image

  • Samples generated (Pruned)

image

As you can see from the comparison above, random sampling from the more generalizable Cliquify vocabulary produces molecules with more diverse components.

Candidate Shrinkage

Cliquify uses triangular clique decomposition, which helps to

  • reduce the number of candidates generated per node (large rings cause the candidate explosion mentioned in hgraph2graph),
  • control the number and characteristics of the candidates generated for each fragment.
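A back-of-the-envelope calculation shows why smaller clusters keep candidate counts down. The sketch below is a deliberately crude model and not how either JTVAE or Cliquify counts candidates: it assumes every bond of a parent cluster can be fused with every bond of a child cluster in two orientations, and it applies no symmetry, valence, or overlap pruning. Both function names are hypothetical.

```python
def raw_pairings(parent_bonds, child_bonds):
    """Bond-to-bond fusions between two clusters, counting both
    orientations (assumed model, no pruning)."""
    return 2 * parent_bonds * child_bonds


def raw_candidates(parent_bonds, children_bonds):
    """Product of pairings over all children of one parent node."""
    total = 1
    for child in children_bonds:
        total *= raw_pairings(parent_bonds, child)
    return total


# A 6-membered ring parent with two 6-membered ring children.
ring_case = raw_candidates(6, [6, 6])
# A triangular clique parent with two triangular clique children.
triangle_case = raw_candidates(3, [3, 3])

print("6-ring clusters:", ring_case)
print("triangle clusters:", triangle_case)
```

Real candidate counts are much smaller after pruning, but the trend is the same: the unpruned search space grows with the square of the cluster size per child, and triangles cap that size at three bonds.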

Given sample molecule

image

JTVAE | Cliquify


image

  • The diagram above shows how Cliquify reduces the number of possible candidates generated per node.

image

  • The diagram above shows the average number of candidates generated per tree node for molecules that have rings with 6 or more members.
  • Cliquify shows lower fluctuation in the average number of candidates generated than JTVAE.

Tree Similarity

  • The JTVAE junction tree (ring vocabulary) is not deterministic, since potentially many molecules correspond to the same junction tree.
  • The triangulation clique method used by Cliquify
    • makes the junction tree more deterministic
    • reduces one-to-many relationships between a junction tree and its corresponding molecules

    image

  • We quantify the tree similarity between molecules using Graph Edit Distance (GED) from the NetworkX library

    • GED based on tree nodes

    image

    • GED based on tree nodes and edges

    image

  • Based on the two diagrams above, we can infer that Cliquify produces more unique trees than JTVAE. This makes the tree structure more deterministic for decoding and encourages a one-to-one relationship between molecules and their tree representations.
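As a rough illustration of the node-based GED comparison described above, the sketch below compares small labelled trees with `networkx.graph_edit_distance`. The tree contents, the `label` attribute name, and the helpers `make_tree` and `same_label` are assumptions made for this example; the trees the repository actually compares are built from real molecules.

```python
import networkx as nx


def make_tree(labels, edges):
    """Build a small tree whose nodes carry a vocabulary label."""
    g = nx.Graph()
    for i, lab in enumerate(labels):
        g.add_node(i, label=lab)
    g.add_edges_from(edges)
    return g


def same_label(a, b):
    """Two tree nodes match when their vocabulary labels are equal."""
    return a["label"] == b["label"]


# Three toy trees with a path shape; labels stand in for clique vocabulary.
t1 = make_tree(["CCC", "CCC", "CCN"], [(0, 1), (1, 2)])
t2 = make_tree(["CCC", "CCC", "CCN"], [(0, 1), (1, 2)])
t3 = make_tree(["CCC", "CCN", "CCO"], [(0, 1), (1, 2)])

d_same = nx.graph_edit_distance(t1, t2, node_match=same_label)
d_diff = nx.graph_edit_distance(t1, t3, node_match=same_label)

print("identical trees:", d_same)
print("different trees:", d_diff)
```

A distance of zero means two molecules map to indistinguishable trees, so a decomposition that yields fewer zero-distance pairs across distinct molecules is closer to a one-to-one mapping. Exact GED is expensive, so this style of comparison is only practical for small trees.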

Descendant Orientation Awareness

  • JTVAE decodes neighborhood by neighborhood, so it does not consider the orientation of the part of the molecule that has already been decoded.
  • Cliquify reduces or eliminates this orientation identification error by restricting where attachments can occur.
    • It does so by prioritizing non-ring bond attachments during graph-to-tree decomposition, which reduces the number of triangular cliques that can attach to a non-ring bond.

Original molecule (JTVAE)

image

Decoded molecule (JTVAE)

image

Honeycomb Problem

  • Honeycomb structures are prevalent in large organic molecules, and JTVAE fails to capture such formations.
  • A honeycomb formation has to be built recursively, so the more complicated the neighboring structures, the larger the candidate count.
  • This is likely to result in candidate explosion.

image

JTVAE

  • Due to its inherent tree-structured decoding, JTVAE fails to capture how multiple children of the same parent are connected to one another.

    image

Cliquify

  • Cliquify is able to decompose the honeycomb structure, reducing the possibility of candidate explosion through pruning.

    image
