Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Numerical instability in lattice matrix operations : get_slabs / StructureMatcher / _cart_dists #3747

Open
misko opened this issue Apr 11, 2024 · 0 comments
Labels

Comments

@misko
Copy link

misko commented Apr 11, 2024

Python version

Python 3.9.12

Pymatgen version

2023.5.10

Operating system version

Ubuntu

Current behavior

I have a very limited understanding of the materials side of our project. We have some code to generate slabs using pymatgen, however I noticed that on two different machines with identical software environments we get back a different integer number of slabs from SlabGenerator.get_slabs() . I think the root of the problem is when a Lattice is constructed with specific gamma/beta parameters that result in some of the entries in the lattice matrix being dominated by numerical error.

from pymatgen.core.lattice import Lattice

a = 4.49919228
b = 7.361330199082434
c = 30.60553418277412
alpha = 100.89339464913091
beta = 90.0
# gamma = 90.0
gamma = 0.0
Lattice.from_parameters(a, b, c, alpha, beta, gamma)

Using gamma = 0 , puts a value ~ 1e-16 into lattice matrix position [1,0], that is seems to be completely error or due to floating point error on the machine / underlying library.

These types of lattice matrices are initialized inside our call to get_slabs() and then passed through StructureMatcher, which eventually find their way into the call (inside of .fit/match/strict_match),

self._cart_dists(s1fc, t_s2fc, avg_l, mask, normalization, lll_frac_tol)

Here the small differences (1e-15) in the avg_l matrix lead to large differences in the returned value (0.2->0.6). The same python code run on two different machines, with identical environments), and change if a structure matches or not.

https://app.circleci.com/pipelines/github/Open-Catalyst-Project/Open-Catalyst-Dataset/140/workflows/d34fe055-e7a3-4e87-87b2-fd51dbe99837/jobs/286 (local machine vs circleci with identical enviornments)

I am attaching an example where 6 slabs are generated, as they would be inside of a get_slabs() call, and then run through StructureMatcher. This code produces different results on two different machines with identical software setup. If the lattice matrix is rounded, it produces different but consistent results across the two machines.

Expected Behavior

I would expect SlabGenerator.get_slabs() to behave more robustly to numerical error introduced inside of the get_slabs() call that I cannot control as a caller. Maybe rounding the generated lattice matrices to 12 digits? or something one or two orders of magnitude above machine epsilon.

Minimal example

from ase.atom import Atom
from ase.atoms import Atoms

from pymatgen.io.ase import AseAtomsAdaptor
from pymatgen.symmetry.analyzer import SpacegroupAnalyzer
from pymatgen.core.surface import SlabGenerator, Slab
from pymatgen.core.lattice import Lattice

bulk_atoms = Atoms(
    [
        Atom("Re", position=position)
        for position in [
            [-1.0578273257344506e-06, 1.6063739546985365, 1.12479807],
            [1.3911606605782731, 0.8031869653014636, 3.37439421],
        ]
    ],
    pbc=True,
    cell=[
        [2.78232129, 0.0, 0.0],
        [-1.39116064, 2.40956092, 0.0],
        [0.0, 0.0, 4.49919228],
    ],
)

struct = AseAtomsAdaptor.get_structure(bulk_atoms)
sga = SpacegroupAnalyzer(struct, symprec=0.1)
standardized_struct = sga.get_conventional_standard_structure()

slab_gen = SlabGenerator(
    initial_structure=standardized_struct,
    miller_index=(2, 1, 0),
    min_slab_size=7.0,
    min_vacuum_size=20.0,
    lll_reduce=False,
    center_slab=True,
    primitive=True,
    max_normal_search=1,
)

# generate the slabs from inside of SlabGenerator.get_slabs before reducing by StructureMatcher
slabs = []
for shift in slab_gen._calculate_possible_shifts(tol=0.1):
    bonds_broken = 0
    slab = slab_gen.get_slab(shift, tol=0.3, energy=bonds_broken)
    slabs.append(slab)

# Round lattic matrix to 10 sigdig
from math import log10

round_to = 1e-10
round_to_sigdig = -int(log10(round_to))
rounded_slabs = [
    Slab(
        Lattice(
            matrix=s.lattice._matrix.copy().round(round_to_sigdig),
            pbc=s.lattice._pbc,
        ),
        s.species_and_occu,
        s.frac_coords,
        s.miller_index,
        s.oriented_unit_cell,
        s.shift,
        s.scale_factor,
        reorient_lattice=False,
    )
    for s in slabs
]

# Run structurematcher

from pymatgen.analysis.structure_matcher import StructureMatcher

m = StructureMatcher(ltol=0.3, stol=0.3, primitive_cell=False, scale=False)

print("Non rounded slabs similar to slab idx=0:")
for s_idx, s in enumerate(slabs):
    print(f"\t0vs{s_idx}", m.fit(slabs[0], s, skip_structure_reduction=True))

print(f"Rounded (to 1e-{round_to_sigdig}) slabs similar to slab idx=0:")
for s_idx, s in enumerate(rounded_slabs):
    print(f"\t0vs{s_idx}", m.fit(rounded_slabs[0], s, skip_structure_reduction=True))

Relevant files to reproduce this bug

No response

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant