Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

check multiplicity of reverse products if the family template reactants have multiplicity constraints (vdW groups) #2131

Merged
merged 5 commits into from
May 17, 2021

Conversation

davidfarinajr
Copy link
Contributor

This PR attempt to fix this issue #2122

Motivation or Problem

If we are applying a family in reverse and the template reactants restrict multiplicity, we need to make sure that
a reverse product matches the multiplicity-constrained template reactant because the forward template products
do not enforce the multiplicity restriction. Without this check, we get an UndeterminableKineticsError if the reverse products have a multiplicity that does not match the multiplicity constraint of the template reactants.

Description of Changes

Added a multiplicity check when applying the template in reverse.

Testing

Need to add unit test. Tested a few reactions which had UndeterminableKineticsError but now are no longer generated since they don't satisfy the multiplicity constraints in the template.

Copy link
Member

@rwest rwest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As well as the minor changes to the code that I suggested, please improve your commit message. It should describe what you're trying to achieve, and why, and how.
One of the first things I often do when debugging something is check commit messages of when the code was last updated, and try to get into the head of whoever implemented it at the time (even when it's my past self, I usually need the commit messages to help me remember). Understanding why something is done the way it is is immensely helpful, and this is usually best conveyed by commit messages. (Discussion on github pull requests and linked issues are better than nothing, but may get lost and aren't instantly visible in an IDE or git client. The commit message is there forever, and easily accessible)

# if applying the family in reverse and the template reactants restrict multiplicity, we need to make sure that
# a reverse product matches the multiplicity-constrained template reactant because the forward template products
# do not enforce the multiplicity restriction
if not forward and reactant_type == 'mol':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you do

Suggested change
if not forward and reactant_type == 'mol':
if not forward and isinstance(reactant_structure, Molecule):

could you do away with the reactant_type variable and remove the changes above?

match = self._match_reactant_to_template(struct, template)
if match:
break
if not match:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Complicated nested for loops with if and break clauses etc. usually benefit greatly from some comments. I think this is correct, but I had to think very hard to check it. Please make life easier for reviewers future programmers by adding some comments here.

if isinstance(node.item, Group):
if node.item.multiplicity != []:
multiplicity_restricted_templates.append(node)
if multiplicity_restricted_templates:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this if is redundant. If the list is empty the following for will just run zero times.

@davidfarinajr
Copy link
Contributor Author

Thanks for the review @rwest, I'll work on those changes

@rwest
Copy link
Member

rwest commented May 14, 2021

Thanks for making this fix, @davidfarinajr. It doesn't go un-noticed when you put effort into solving other people's problems. 🥇

@davidfarinajr davidfarinajr force-pushed the template_multiplicity branch 2 times, most recently from dfea639 to 7d7b34b Compare May 14, 2021 19:14
@davidfarinajr
Copy link
Contributor Author

Also added debug logging which looks like this for the unit test

DEBUG:root:No product structures matched AdsorbateVdW which has a multiplicity of [1]

Template:

multiplicity [1]
1 *1 Xv u0 p0 c0
2 *2 R  ux {3,S}
3 *3 R  ux {2,S}


Product structures:

DEBUG:root:<Molecule "[CH3].[Pt]">
multiplicity 2
1 *2 C u1 p0 c0 {2,S} {3,S} {4,S}
2    H u0 p0 c0 {1,S}
3    H u0 p0 c0 {1,S}
4 *3 H u0 p0 c0 {1,S}
5 *1 X u0 p0 c0


DEBUG:root:<Molecule "[CH]=[Pt]">
multiplicity 2
1 *4 C u1 p0 c0 {2,S} {3,D}
2    H u0 p0 c0 {1,S}
3 *5 X u0 p0 c0 {1,D}

@davidfarinajr
Copy link
Contributor Author

This multiplicity check will not check logic nodes though. I don't think we have many (or any) of these with surface families, but perhaps we should figure out how to check those as well. You could have a logic node at the top which points to groups which have multiplicity restrictions (don't think we currently have any of these)

@codecov
Copy link

codecov bot commented May 14, 2021

Codecov Report

Merging #2131 (0632ab3) into master (304fabe) will increase coverage by 0.05%.
The diff coverage is 50.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2131      +/-   ##
==========================================
+ Coverage   47.40%   47.46%   +0.05%     
==========================================
  Files          89       89              
  Lines       23953    23967      +14     
  Branches     6237     6246       +9     
==========================================
+ Hits        11356    11376      +20     
+ Misses      11390    11385       -5     
+ Partials     1207     1206       -1     
Impacted Files Coverage Δ
rmgpy/data/kinetics/family.py 48.87% <50.00%> (+0.16%) ⬆️
arkane/kinetics.py 12.24% <0.00%> (ø)
rmgpy/molecule/draw.py 52.74% <0.00%> (+0.77%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 304fabe...0632ab3. Read the comment docs.

@rwest
Copy link
Member

rwest commented May 14, 2021

I guess we could check that the products of a reverse reaction always match the forward reaction, but perhaps that's a lot of checking and will slow things down too much ?

@rwest
Copy link
Member

rwest commented May 14, 2021

How often do we think this will get hit, as currently implemented?
There's an optimization possible for debug logging (avoids doing the string formatting if you're not going to actually log it - because usually we're not running with the verbose option) but if we very rarely hit this then it's maybe not worth it.

@davidfarinajr
Copy link
Contributor Author

I guess we could check that the products of a reverse reaction always match the forward reaction, but perhaps that's a lot of checking and will slow things down too much ?

We could check, but this is "checked" when we descend the tree to get the kinetics. If it doesn't at least match the top node, then we get UndeterminableKineticsError. Perhaps, we could check this when we apply the reaction recipe (as we are doing with the multiplicity check) so that we don't generate the reaction in the first place. This will slow things down, although I'm not sure by how much

@davidfarinajr
Copy link
Contributor Author

How often do we think this will get hit, as currently implemented?
There's an optimization possible for debug logging (avoids doing the string formatting if you're not going to actually log it - because usually we're not running with the verbose option) but if we very rarely hit this then it's maybe not worth it.

🤔, not sure how prevalent this issue is. We would be hitting this check quite often for catalysis, but the majority of the time we should get a match because we don't usually have radicals on the surface and the reactant groups are restricted to singlet.

davidfarinajr and others added 5 commits May 14, 2021 22:33
…ict multiplicity

If applying a family recipe in reverse and the template reactant groups restrict multiplicity,
we need to make sure that a reverse product matches the multiplicity-constrained template reactant
because the forward template products do not enforce the multiplicity restriction. This commit
adds a check for this in the apply_recipe method and returns `None` (no reaction) if
none of the products generated match the multiplicity-constrained template reactant.
Should speed it up a bit. 
This might not be called often enough for this optimization to matter,
but I'm not sure. At first I just wanted to switch to %s formatting
because it uses lazy evaluation of the sting formatting, but that still
leaves the expensive to_adjacency_list() calls and the only way to avoid
those is wrap the whole thing an if block.
This method is sometimes applied to Group objects, to 
generate the product templates, not just Molecule objects.
The logic used to be that if you start with a wildcard number of 
radicals (or pairs) then incrementing it will mean you have a wildcard
that is at least 1 or more. But you could in fact increment by more
than one, using a parameter to the method, so your new wildcard list
shouldn't always start at 1 but should start at however many you incremented by.

Could have done list(range(radical,5)) instead of [1,2,3,4][radical-1:]
but the latter is faster.
@rwest
Copy link
Member

rwest commented May 15, 2021

This looks OK to me, but I made the last 3 commits, so now need someone else to review it.

Copy link
Contributor

@Tingchenlee Tingchenlee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good to me. Thanks, Richard and David! I used this branch to run one of the problem input files and the previous issue is gone now.

@davidfarinajr
Copy link
Contributor Author

Looks good to me as well! Thanks Richard and Ting-Chen!

@rwest rwest merged commit ad0ee9f into master May 17, 2021
@rwest rwest deleted the template_multiplicity branch May 17, 2021 15:09
@rwest
Copy link
Member

rwest commented May 17, 2021

I've merged this because it fixes a known issue and has been approved by review. But it occurred to me there may be another approach, that we should perhaps discuss. I tried looking at it for a while, but it was taking too long to figure out.

We generate the reverse template by applying the recipe to the forward template. This is done mostly using atom type actions, etc. so for example if you have a Cs atom type (carbon with all single bonds) and apply a increment_bond then it becomes a Cd atom type (carbon with one double bond) in the reverse template for example. Could we do something similar with the overall group attributes like multiplicity? i.e. if you have multiplicity = [0,2] enforced for the forward template, then apply an increment_radical action you end up with multiplicity = [1,3] in the product template.

Thoughts?

@mjohnson541
Copy link
Contributor

While I think the solution in this PR is fine for now, I think tracking multiplicity when applying recipe actions is probably cleaner and more consistent long term.

@davidfarinajr
Copy link
Contributor Author

I've merged this because it fixes a known issue and has been approved by review. But it occurred to me there may be another approach, that we should perhaps discuss. I tried looking at it for a while, but it was taking too long to figure out.

We generate the reverse template by applying the recipe to the forward template. This is done mostly using atom type actions, etc. so for example if you have a Cs atom type (carbon with all single bonds) and apply a increment_bond then it becomes a Cd atom type (carbon with one double bond) in the reverse template for example. Could we do something similar with the overall group attributes like multiplicity? i.e. if you have multiplicity = [0,2] enforced for the forward template, then apply an increment_radical action you end up with multiplicity = [1,3] in the product template.

Thoughts?

Yes, this was my original thought: If we enforce multiplicity with the template reactants, we could determine the multiplicity of the template products from the actions. I like this approach, but I think it's more difficult to implement than it may seem.

Consider this family for example:

name = "Surface_Adsorption_Abstraction_vdW/groups"
shortDesc = u""
longDesc = u"""
Adsorbtion of a vdw species to the surface with a surface species.

*2=*3    *4-*5        *2-*3-*5    *4
  :   +   |    ---->   |       +  ||
~*1~    ~*6~         ~*1~~       ~*6~

The rate, which should be in mol/m2/s,
will be given by k * (mol/m2) * (mol/m2)
so k should be in (m2/mol/s). We will use sticking coefficients.
"""

template(reactants=["AdsorbateVdW", "Adsorbate1"], products=["Adsorbate2","Adsorbate3"], ownReverse=False)

reverse = "Surface_Desorption_Abstraction_vdW"

reactantNum=2
productNum=2

recipe(actions=[
    ['CHANGE_BOND', '*2', -1, '*3'],
    ['CHANGE_BOND', '*1', 1, '*2'],
    ['BREAK_BOND', '*4', 1, '*5'],
    ['FORM_BOND', '*3', 1, '*5'],
    ['CHANGE_BOND', '*4', 1, '*6'],
])

AdsorbateVdW is restricted to multiplicity 1, and there is no multiplicity restriction on Adsorbate1. None of the actions changes the radical count, so we don't have to worry about that. In order to ensure AdsorbateVdW is mult 1,
*2 and *3 should have no radicals, and any atom they are bonded to in AdsorbateVdW should have no radicals, but *5 can have radicals since Adsorbate1 has no multiplicity restriction. This means that the multiplicity of the Adsorbate2 product should be the number of radicals on *5 + 1 (which would be hard to implement in the group definition).

Since the multiplicity is also determined by the atoms in the molecule that are not labeled in the group (an not affected by actions), it it difficult to restrict multiplicity in the template products.

It also gets tricky if only one of the template reactants has a multiplicity constraint. If, for example Adsorbate1 was limited to mult 1, and the actions don't change radical count, then we would know both products should be mult 1.

This could also be affected by how the groups are written in the database. For this family, the AdsorbateVdW group looks like:

entry(
    index = 1,
    label = "AdsorbateVdW",
    group =
"""
multiplicity [1]
1 *1 Xv  u0 p0 c0
2 *2 R!H ux px cx {3,[D,T]}
3 *3 R!H ux px cx {2,[D,T]}
""",
    kinetics = None,
)

*2 and *3 have ux but they should be written as u0 since its mult 1, so this could get messy if we are using the group atom radicals to determine multiplicity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants