This is the README for the Python-based crystIT script, which calculates information-theoretical complexity parameters as proposed by S. Krivovichev (2014) and extended by W. Hornfeck (2020). Modifications for partially occupied crystallographic orbits, based on the work of S. Krivovichev (2022), are included as well. crystIT provides an accessible user interface that requires no programming experience.
crystIT is written in Python and uses standardized crystallographic information files (CIFs) as input. In the following, the script's package dependencies, the operation of the script and the output modes are explained.
Besides common packages such as NumPy, crystIT requires the packages listed below. It was developed and tested in the following Python environment:
- Python 3.8.3 (available at http://www.python.org)
- ASE 3.19.1 (Atomic Simulation Environment, more information at https://wiki.fysik.dtu.dk/ase/)
- Spglib 1.15.0 (more information at https://spglib.github.io/spglib/)
- PyXtal 0.0.7 (more information at https://github.com/qzhu2017/PyXtal)
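To compare a local installation with the versions listed above, the installed versions can be queried with the standard library. The following is a minimal sketch and not part of crystIT; the helper name installed_versions is hypothetical, and the distribution names numpy, ase, spglib and pyxtal are assumptions about how the packages were installed.

```python
from importlib import metadata  # part of the standard library since Python 3.8


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them if your installer uses others.
    wanted = ["numpy", "ase", "spglib", "pyxtal"]
    for name, version in installed_versions(wanted).items():
        print(f"{name}: {version or 'not installed'}")
```

Versions newer than those listed may behave differently, so any deviation is worth noting when reporting results.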
Open a command window and navigate to the directory containing crystIT.py. Then enter the following on the command line:
$ python crystIT.py
Successful startup is confirmed by crystIT's welcome message:
Welcome to crystIT -- A Crystal Structure Complexity Analyzer Based on Information Theory
Version 0.1, release date: 2020-09-22
Written by Clemens Kaußler and Gregor Kieslich (Technical University of Munich)
Please cite the following paper if crystIT is utilized in your work:
Kaußler, Kieslich (2021). J. Appl. Cryst. 54, DOI: 10.1107/S1600576720016386
Input path of .cif file or directory for complexity analysis. 's' for settings. 'e' to exit.
There are two modes of operation: CIFs can be processed one by one in single file mode, or directories, possibly containing multiple CIFs, can be passed to the script in batch mode.
In single file mode, the path to a CIF is typed into the terminal and confirmed with Enter. After the calculation, all results are printed to the terminal using the complexity nomenclature introduced by Hornfeck (2020). A sample output for K3C60 is presented here:
------------ C:\K3C60.cif ------------
assumed formula C20K
assumed SG Fm-3m (225)
SG from CIF F m -3 m (225)
lattice [A] a: 14.24, b: 14.24, c: 14.24
angles [°] b,c: 90.00, a,c: 90.00, a,b: 90.00
---
252.000000 atoms / unit cell
63.000000 atoms / reduced unit cell
123.000000 positions / reduced unit cell
5.000000 crystallographic orbits
8.000000 unique species
8.000000 coordinational degrees of freedom (arities)
--- combinatorial (extended Krivovichev) ---
0.697023 I_comb [bit / position]
0.975610 I_comb_mix [bit / position]
3.000000 I_comb_max [bit / position]
0.232341 I_comb_norm [-]
85.733784 I_comb_tot [bit / reduced unit cell]
0.118763 I_comb_dens [bit / A^3]
--- coordinational (Hornfeck) ---
1.561278 I_coor [bit / freedom]
2.321928 I_coor_max [bit / freedom]
0.672406 I_coor_norm [-]
12.490225 I_coor_tot [bit / reduced unit cell]
0.017302 I_coor_dens [bit / A^3]
--- configurational (extended Hornfeck) ---
1.081474 I_conf [bit / (position + freedom)]
3.700440 I_conf_max [bit / (position + freedom)]
0.292256 I_conf_norm [-]
141.673138 I_conf_tot [bit / reduced unit cell]
0.196254 I_conf_dens [bit / A^3]
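The normalized and total values in the sample output are consistent with two simple relations: each I_*_norm equals the per-unit value divided by its maximum, and each I_*_tot equals the per-unit value multiplied by the number of units per reduced unit cell (123 positions, 8 degrees of freedom, or their sum). The sketch below only re-checks the printed numbers; the relations are read off the sample output and are an assumption about crystIT's implementation, not a copy of it. The names sample and check are hypothetical.

```python
import math

# Values copied from the K3C60 sample output above.
positions = 123.0  # positions / reduced unit cell
freedoms = 8.0     # coordinational degrees of freedom (arities)

sample = {
    # name: (I, I_max, I_norm, I_tot, units per reduced unit cell)
    "comb": (0.697023, 3.000000, 0.232341, 85.733784, positions),
    "coor": (1.561278, 2.321928, 0.672406, 12.490225, freedoms),
    "conf": (1.081474, 3.700440, 0.292256, 141.673138, positions + freedoms),
}


def check(values, rel_tol=1e-5):
    """Return {name: (norm_ok, tot_ok)} for the two assumed relations.

    The tolerance absorbs the rounding of the printed six-decimal values.
    """
    report = {}
    for name, (info, info_max, info_norm, info_tot, units) in values.items():
        norm_ok = math.isclose(info / info_max, info_norm, rel_tol=rel_tol)
        tot_ok = math.isclose(info * units, info_tot, rel_tol=rel_tol)
        report[name] = (norm_ok, tot_ok)
    return report


if __name__ == "__main__":
    for name, flags in check(sample).items():
        print(name, flags)
```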
In batch mode, the path of a directory containing CIFs is typed into the terminal and confirmed with Enter. The results, together with warnings and error messages, are compiled into a character-separated values (.csv) file, which is saved as batch_TIMESTAMP.csv in the processed directory. Attention! With default settings, only CIFs located directly in the folder passed to crystIT are considered; subfolders are ignored.
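The difference between the default scan and a scan that includes subfolders (the 'r' setting) can be illustrated as follows. This is a minimal sketch using pathlib, not crystIT's own file handling; the function name collect_cifs and the case-insensitive matching of the .cif extension are assumptions made for illustration.

```python
from pathlib import Path


def collect_cifs(directory, recursive=False):
    """Return the sorted paths of .cif files found in directory.

    With recursive=False only files directly inside the directory are
    returned; with recursive=True all subfolders are searched as well.
    """
    root = Path(directory)
    candidates = root.rglob("*") if recursive else root.iterdir()
    return sorted(
        p for p in candidates
        if p.is_file() and p.suffix.lower() == ".cif"
    )
```

Calling collect_cifs(path) mirrors the default behaviour described above, while collect_cifs(path, recursive=True) corresponds to an activated recursive subdirectory scan.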
The settings menu is accessed by typing 's' and pressing Enter.
Input float as symmetry tolerance 0 < x < 1 (currently 0.005).
Input int as maximum number of threads (currently 12)
'd' to toggle between decimal separators (currently '.').
'o' to toggle occupancy editing options (currently False).
'r' to toggle recursive subdir scan (currently False).
's' to toggle entropy calculation (currently False).
'e' exit to main menu:
- Entering a decimal number between zero and one changes symprec. This value defines the tolerance in Cartesian coordinates that Spglib uses to find symmetry and, at the same time, the Cartesian coordinate threshold for identifying duplicate atom entries in the CIF: |x′ − x| < symprec. Always use '.' as the decimal separator when changing symprec! Adjusting this value can help in some cases of wrong space-group assignment; however, an error message is returned if the discrepancy in the space-group assignment persists.
- The maximum number of threads for multiprocessing in batch mode is automatically set to the number of available threads but can be adjusted by entering an integer.
- 'd' toggles the decimal separator between dot and comma, which is especially useful for German Excel users.
- The occupancy options, accessible by typing 'o', allow for on-the-fly occupancy editing in single file processing.
- Activating the recursive subdirectory scan with 'r' causes subfolders to be scanned in batch mode.
- 's' toggles the calculation of entropy values from information content values, according to Krivovichev (2016, 2022).
- Finally, the settings menu is exited with 'e'.
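The role of symprec as an input value and as a duplicate threshold can be sketched as follows. This is an illustration under stated assumptions, not crystIT's code: the function names parse_symprec and is_duplicate are hypothetical, the criterion |x′ − x| < symprec is read as a component-wise test on Cartesian coordinates, and periodic images of atoms are not considered.

```python
def parse_symprec(text):
    """Parse user input as a symmetry tolerance with 0 < x < 1.

    float() only accepts '.' as the decimal separator, so input such as
    "0,005" raises ValueError.
    """
    value = float(text)
    if not 0.0 < value < 1.0:
        raise ValueError("symprec must satisfy 0 < x < 1")
    return value


def is_duplicate(pos_a, pos_b, symprec=0.005):
    """Return True if every Cartesian component differs by less than symprec.

    Assumption: the threshold is applied to each coordinate separately.
    """
    return all(abs(a - b) < symprec for a, b in zip(pos_a, pos_b))


if __name__ == "__main__":
    tolerance = parse_symprec("0.005")
    print(is_duplicate((1.0, 2.0, 3.0), (1.001, 2.0, 2.999), tolerance))
```

A larger tolerance merges more nearly coincident entries, which is one reason why the value should be changed with care.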