Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Implement writing of ChemSage DAT files #422

Open
wants to merge 97 commits into
base: develop
Choose a base branch
from

Conversation

maxposchmann
Copy link

Adds support for writing of ChemSage DAT files with IDMX, QKTO, RKMP, SUBL, SUBG, SUBQ phase models, plus magnetic variants of RKMP and SUBL.

Notwithstanding open issues on DAT parsing, DAT files can be read and an equivalent DAT produced for cases I've tested. These include the databases Kaye_NobleMetals.dat, FeCuCbase.dat, and FeTiVO.dat available on the Thermochimica repo.

Reasonable DAT files have been produced by converting TDBs Al-Mg_Zhong.tdb, Al-Cu-Y.tdb, and alfe_sei.TDB from the examples directory.

I've tried to be verbose in comments, but the code could probably be factored better. Input on this is desired.

This is intended to close #413, no doubt with many enhancements requiring new issues in the future.

@richardotis richardotis self-requested a review June 19, 2022 16:19
Copy link
Collaborator

@richardotis richardotis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Max, this is looking great! Thanks so much for working on this. Besides the specific comments, in general this PR is going to need several tests. Any DAT files you're able to contribute to the test suite would be highly appreciated. In particular I think we'll need tests for:

  1. DAT read/write/read (round-trip) Database equality comparison, for a few cases
  2. TDB->DAT conversion, where supported (can we achieve Database object equality on round-trip conversion?)

@@ -12,6 +12,9 @@
from symengine import S, log, Piecewise, And
from pycalphad import Database, variables as v
from .grammar import parse_chemical_formula
import datetime
import getpass
from sympy import simplify, piecewise_fold, expand
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pycalphad doesn't depend on sympy anymore, only on symengine. I'm still reading so I'll see if I can advise on how to eliminate the dependency.

pycalphad/io/cs_dat.py Show resolved Hide resolved
pycalphad/io/cs_dat.py Outdated Show resolved Hide resolved
solution_phase_species = []

# DAT *always* includes ideal gas phase (gas_ideal) in header (in first solution phase position), even if not used
solution_phases.insert(0,'GAS_IDEAL')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this logic still work if we actually have an ideal gas phase in the Database? What if it's a GAS phase with interaction parameters, pressure dependence, etc.? This is one of the cases that needs a test.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly works with an ideal gas in. Frankly I don't yet understand other gas models well enough yet, I should probably read up on the options in both the TDB and DAT formats before deciding how to handle.

In my experience with other TDB -> DAT converters, they often break on pressure-dependent terms. This might be a tricky spot.

Comment on lines 1322 to 1329
species = []
for i in range(len(constituents[0])):
for j in range(i,len(constituents[0])):
for k in range(len(constituents[1])):
for l in range(k,len(constituents[1])):
species.append(f'{constituents[0][i]},{constituents[0][j]}/{constituents[1][k]},{constituents[1][l]}')
solution_phase_species.append(species)
continue
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bocklund Please check this during your review

pycalphad/io/cs_dat.py Show resolved Hide resolved
pycalphad/io/cs_dat.py Outdated Show resolved Hide resolved
pycalphad/io/cs_dat.py Show resolved Hide resolved
Comment on lines +2129 to +2143
def iterative_substitution(param,symbols):
# Iteratively subsitutes a list of symbols into an equation
# Doesn't check for circular dependence
subs_param = param
requires_substitution = True
while requires_substitution:
requires_substitution = False
param_symbols = [s.name for s in subs_param.free_symbols]
# Check if any of the symbols in symbols are still in the parameter
common_symbols = [s for s in symbols if s in param_symbols]
if common_symbols:
# Do another round of substitutions if so
subs_param = piecewise_fold(simplify(subs_param.subs(symbols)))
requires_substitution = True
return subs_param
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can replace this function with Model.symbol_replace and eliminate the sympy dependency:

# Maximum number of levels deep we check for symbols that are functions of
# other symbols
_MAX_PARAM_NESTING = 32
def symbol_replace(obj, symbols):
    """
    Substitute values of symbols into 'obj'.
    Parameters
    ----------
    obj : SymEngine object
    symbols : dict mapping symengine.Symbol to SymEngine object
    Returns
    -------
    SymEngine object
    """
    try:
        # Need to do more substitutions to catch symbols that are functions
        # of other symbols
        for iteration in range(_MAX_PARAM_NESTING):
            obj = obj.xreplace(symbols)
            undefs = [x for x in obj.free_symbols if not isinstance(x, v.StateVariable)]
            if len(undefs) == 0:
                break
    except AttributeError:
        # Can't use xreplace on a float
        pass
    return obj

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The calls to piecewise_fold and simplify here are crucial to getting the equations formatted into simple intervals that can be written to a DAT. I tried switching to symbol_replace as you highlighted, but it returns a nested piecewise equation that can't be used.

For more details of what needs to happen, you might look at my recent question on StackExchange.

Comment on lines +2145 to +2156
def simplify_reference_gibbs(equation,symbols):
# Determine equation type and number of intervals
# Is this monstrosity necessary? Yes, it would seem so.
simplified_parameter = iterative_substitution(equation,symbols)
gibbs_equation = expand(simplify(simplify(simplified_parameter))).args
if not gibbs_equation:
# If all ranges have 0 value, simplify returns empty set, so revert
simplified_parameter = equation.args
gibbs_equation = []
for i in range(int(len(simplified_parameter)/2)):
gibbs_equation.append((simplified_parameter[i*2],simplified_parameter[i*2+1]))
return gibbs_equation
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's probably another way to accomplish the goal here, but I don't fully understand the purpose of this function yet. I'd like to avoid reintroducing sympy as a dependency as much as possible

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't a great function. Basically I ran all these lines together in a couple of places, so I just grouped them into a function. All it is attempting to do is get the Gibbs reference energy equation into the same format whether it originated from a TDB (with nested piecewise) or a DAT (with just simple coefficients).

Piecewise equations have to be simplified as far as possible, but then I needed expand to get coefficients of simple powers of T.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After playing around with it, I'll add that simplified_parameter.simplify() doesn't yield the same results as simplify(simplified_parameter) (and the difference results in an error).

@bocklund
Copy link
Collaborator

bocklund commented Aug 3, 2022

@maxposchmann sorry for my delay, this is still high on my list, but it's been a busy month or two for me. I'm hoping to get back to properly reviewing and trying to help avoid re-introducing SymPy soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ChemSage DAT support: DAT writing
3 participants