QChem Adapter & Molpro Updates #712

calvinp0 · 2023-11-14T10:52:49Z

This PR includes QChem as an adapter and also provides updates to the Molpro adapter

github-advanced-security

CodeQL found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.

arc/scheduler.py

arc/settings/submit_test.py

codecov · 2023-11-14T23:32:06Z

Codecov Report

Attention: 224 lines in your changes are missing coverage. Please review.

Comparison is base (1f561ed) 73.81% compared to head (f486220) 73.52%.
Report is 3 commits behind head on main.

❗ Current head f486220 differs from pull request most recent head 5e4b433. Consider uploading reports for the commit 5e4b433 to get more accurate results

Files	Patch %	Lines
arc/job/trsh.py	37.79%	69 Missing and 10 partials ⚠️
arc/job/adapters/qchem.py	42.15%	48 Missing and 11 partials ⚠️
arc/parser.py	63.15%	43 Missing and 6 partials ⚠️
arc/scheduler.py	0.00%	13 Missing ⚠️
arc/species/converter.py	13.33%	12 Missing and 1 partial ⚠️
arc/species/vectors.py	14.28%	5 Missing and 1 partial ⚠️
arc/job/adapters/molpro.py	80.00%	0 Missing and 5 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #712      +/-   ##
==========================================
- Coverage   73.81%   73.52%   -0.29%     
==========================================
  Files          99      100       +1     
  Lines       27346    27650     +304     
  Branches     5717     5790      +73     
==========================================
+ Hits        20185    20331     +146     
- Misses       5734     5895     +161     
+ Partials     1427     1424       -3

Flag	Coverage Δ
unittests	`73.52% <60.77%> (-0.29%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

alongd

Began reviewing, but cannot finish today, left a few comments

alongd · 2023-11-27T18:48:19Z

data/basis_sets.csv

@@ -0,0 +1,152 @@
+software,basis_set,atoms_supported,


Possibly missing a b in the commit message

alongd · 2023-11-27T18:49:56Z

arc/job/adapter_test.py

@@ -326,7 +326,7 @@ def test_set_cpu_and_mem(self):
        self.job_4.cpu_cores = None
        self.job_4.set_cpu_and_mem()
        self.assertEqual(self.job_4.cpu_cores, 8)
-        expected_memory = math.ceil((14 * 1024 * 1.1) / self.job_4.cpu_cores)


The commit message here should be something like "Minor: Style modification in job adapter test"
(I understand that it is probably a result of a rebase)

alongd · 2023-11-27T18:51:32Z

arc/job/adapters/molpro_test.py

@@ -145,7 +145,7 @@ def test_set_files(self):
                                  'source': 'path',
                                  'make_x': False},
                                 ]
-        job_1_files_to_download = [{'file_name': 'input.out',
+        job_1_files_to_download = [{'file_name':'output.out',


Should also change this line: https://github.com/ReactionMechanismGenerator/ARC/blob/main/arc/settings/settings.py#L165

I believe it was changed in this PR

alongd · 2023-11-27T19:07:05Z

arc/job/adapters/molpro.py

+
+                if memory_values:
+                    min_memory_value = min(memory_values)
+                    required_cores = math.floor(max_memory / (min_memory_value * 7.45e-3))


Can you explain 7.45e-3? If it's the product of some factors, can we explicitly write them for maintenance?

mega word; 1000 MW = 7.45 GB on a 64-bit machine

It is in the doc string of the function

alongd · 2023-11-27T19:09:11Z

arc/job/adapters/molpro.py

+                    if self.core_change is None:
+                        self.core_change = required_cores
+                    elif self.core_change == required_cores:
+                        # We have already done this


Does self.core_change pass on between a failed job to a new job that fixes it?

I am not 100% certain, but as long as the adapter class is not re-instantiated

alongd · 2023-11-27T19:10:58Z

arc/job/adapters/molpro.py

+            self.input_file_memory = math.ceil(self.job_memory_gb / (7.45e-3 * self.cpu_cores)) if 'zeus' not in socket.gethostname() else math.ceil(self.job_memory_gb / (7.45e-3))
+
+
+


minor: please remove the extra line breaks

alongd · 2023-11-27T19:12:16Z

arc/job/adapters/qchem.py

+            """
+            Generates the angles for a Q-Chem scan. The scan is split into two parts, one from start_angle to 180, and one from -180 to end_angle.
+
+            Parameters


Please change to the Google docsting style used throughout ARC (can also probably set your IDE to automatically do adopt that style from now on)

alongd · 2023-11-27T19:14:30Z

arc/job/adapters/qchem.py

@@ -362,6 +562,63 @@ def execute_queue(self):
        Execute a job to the server's queue.
        """
        self.legacy_queue_execution()
+
+    def software_input_matching(self, basis):


Seems important. Although only implemented for Q-Chem currently, it feels like this function would be more at home in job/adapters/common.py

I disagree. This function is only for QChem and if we are to commit to the idea that anyone can create an adapter for ARC, then any function that is only relevant for that adapter should stay with that adapter. I also believe this should be true in the future for troubleshooting

alongd · 2023-11-27T19:15:17Z

arc/job/adapters/qchem.py

+        # If method ends with D3, then we need to remove it and add the D3 as a keyword. Need to account for -D3
+        if input_dict['method'].endswith('d3') or input_dict['method'].endswith('-d3'):
+            input_dict['method'] = input_dict['method'][:-2]
+            # Remove the - if it exists


Generally, let's use as few comments as possible, especially if the code is self-explanatory

alongd · 2023-11-27T19:19:55Z

arc/job/adapters/qchem.py

+            if input_dict['method'].endswith('-'):
+                input_dict['method'] = input_dict['method'][:-1]
+            # DFT_D - FALSE, EMPIRICAL_GRIMME, EMPIRICAL_CHG, D3_ZERO, D3_BJ, D3_CSO, D3_ZEROM, D3_BJM, D3_OP,D3 [Default: None]
+            # TODO: Add support for other D3 options. Check if the user has specified a D3 option in the level of theory


I have this weakness as well, so it's an unfair request: Let's not check-in TODO comments into the production code, because then they practically become to-not-do's. Would we like to ask developers in our ecosystem to create issues instead of TODOs? (comment inspired by "Uncle Bob")

Depends what you mean but creating issues. As in the Issues tab? I don't think that is a good idea for ARC since the Issues section is not monitored and only you, me and @kfir4444 deal with issues that we come across.

alongd · 2023-11-27T19:22:27Z

arc/settings/submit_test.py

@@ -18,7 +18,7 @@ class TestSubmit(unittest.TestCase):
    def test_servers(self):
        """Test server keys in submit_scripts"""
        for server in submit_scripts.keys():
-            self.assertTrue(server in ['local', 'atlas', 'txe1', 'pbs_sample', 'server1', 'server2'])
+            self.assertTrue(server in ['local', 'atlas', 'txe1', 'pbs_sample', 'server1', 'server2', 'azure'])


As CodeQL suggests above, we should modify assertTrue here to assertIn to get a more meaningful error message if this doesn't pass

arc/common.py

arc/job/local.py

@@ -274,6 +275,30 @@
        job_status = 'errored'
    else:
        job_id = _determine_job_id(stdout=stdout, cluster_soft=cluster_soft)
+    # Check if job was ACTUALLY submitted
+    temp = check_running_jobs_ids()


arc/common.py

+        # if isinstance(object_, Species):
+        #     # Check if object_.molecule is a list and then iterate over it
+        #     for i in range(len(object_.molecule)):
+        #         if isinstance(object_.molecule[i], Molecule):
+        #             # Analyse the molecule to determine whether it is an aryl radical and aromatic
+        #             features_molecule = analyze_molecule(mol=object_.molecule[i])
+        #             # If the molecule is an aryl radical and aromatic, remove the aryl radical flag
+        #             if features_molecule['is_aryl_radical'] and features_molecule['is_aromatic']:
+        #                 features_molecule['is_aryl_radical'] = False
+        #                 # Generate optimal aromatic resonance structures using the adjusted features
+        #                 mol_list = generate_optimal_aromatic_resonance_structures(mol=object_.molecule[i],    features = features_molecule, save_order=True)
+        #                 # If the list of resonance structures is not empty, update the object
+        #                 object_.molecule[i]= mol_list[0]


the adapter now can recognise through additional information if the job was killed due to exceeding the wall time it also now has a troubleshoot queue function when writing the submit file, the function will now attempt to see if the user provided a dictionary of queue(s) with wall times and then place them in the submit file template Test: Written a test for the recognition of ServerTimeLimit keyword

…t attempts to consider all the OPT job but does not determine whether the OPT job has a status of done

Built a dataframe of possible combinations of basis sets that QChem requires and the correct format for the input file. Additionally, ensured that it is uploaded in the git push. This is a WIP as there may be more basis sets to add or even fix!

Molpro Adapter: Molpro needs a different touch to troubleshooting its memory. Here in setting the input file memory we determine if the MWords was enough per process but the total memory was too high. If that's the case, we reduce the processes req. while maintaining the memory per process Update Adjusted the file name to download from input.out to output.out

QChem Adapter 0. QChem Adapter: Import - Pandas, ARC_PATH, rapidfuzz 1. QChem Adapter: Input Template now supports IRC and {trsh} args and ensures IQMOL_FCHK is false (This can be set to true BUT be aware this .fchk file can be rather large) 2. QChem Adapter: write_input_file - basis set is now matched via the software_input_matching function 3. QChem Adapter: write_input_file - QChem now supports D3 method. We should look at working with other DFT_D methods in the future. More specifically there are other types of D3 methods 4. QChem Adapter: write_input_file - Correctly pass in troubleshooting arguments into the input file 5. QChem Adapter: write_input_file - Capitalised TRUE/FALSE in UNRESTRICTED parameter 6. QChem Adapter: write_input_file - Removed the scan job type and moved it to another section of the input file writing 7. QChem Adapter: write_input_file - If scan is set, the job_type is PES_SCAN. We also set the fine to be XC_GRID 3. However, we may need to look into changing the tolerances later 8. QChem Adapter: write_input_file - We now write correctly the torsional scans for the input file for a scan 9. QChem Adapter: write_input_file - IRC is now supported, however this input file means we run two jobs from the one input file - A FREQ job and then IRC. This currently works but will need improvements when used more by users 10. QChem Adapter: write_input_file - Ensuring that the SCF CONVERGENCE is 10^-8 for certain job types 11. QChem Adapter: [NEWFUNCTION] generate_scan_angles - to support PES SCAN jobs, we have a function that will look at what the required angle we want to scan, and the step size, and then return a start and end angle between -180, 180 that will ensure we scan the require angle during the stepping 12. QChem Adapter: [NEWFUNCTION] software_input_matching - Since QCHEM has different formatting for basis sets, this function will try take the users format of the basis set and match it against a dataframe (which should always be updated if its missing a format). This uses the new package in the ARC environment called rapidfuzz Additionally - Updated QChem Tests

TrshQChem: Added in new trsh keywrods TrshQChem: 'Error within run_minimization with minimization method' - Not certain what this error requires, and also if we should troubleshoot it if the job type is a 'conformer'. For now, we will re-run the job under the same conditions and if it fails again, we will declare it not possible to troubleshoot remove 'break' TrshMolpro: Parse new array length memory Updated Trsh Tests

For future reference

Added QChem for the Normal Mode Displacement parsing and also change the default bool for raise_error

1. Scheduler: Now imports JobError 2. Scheduler: Fixed adding trsh to the args 3. Scheduler: Added JobError exception for determining job status 4. Scheduler: Now removing remote jobs at the end of the scheduler - !!!MAY NEED TO SEE IF THIS HAS AN EFFECT ON NON SSH JOBS!!! 5. Scheduler: Getting the recent opt job name via properly checking if the opt job was considered done (This was not done before) 6. Scheduler: TODO - We attempt to trouble shoot a frequency we deem not okay. Yet, there is no specific troubleshoot method, so why do we do this? 7. Scheduler: Properly troubleshoot an job 8. Scheduler: Fix conformer troubleshoot if it was a TS conformer

For future reference and good to know for SLURM servers

…onal Needs further development and understanding

The original code did not functional correctly - nor was never used hence why it was passed into production. It has now been changed to properly return the actual number of heavy atoms

Added in rapidfuxx

This function will be required for later use when SSH is updated. For the time being, better to remove. It also causes an error in xTBAdapater (has no attribute remove_remote_files)

Molpro Memory Set Update Will now limit the memory of Molpro to what is req. by Molpro if the CPU is set 1 core.

If user has both software available for IRC, it will use go with preferred order level tests update

Fixed local var min_mem_value Update core reduction logic test for MP

arc/job/adapter_test.py

@@ -431,6 +446,13 @@
                                       'remote': os.path.join(self.job_1.remote_path, 'm.x'),
                                       'source': 'input_files',
                                       'make_x': True})
+
+    def test_determine_job_status(self):


arc/level.py

@@ -12,10 +12,9 @@
 from arkane.encorr.corr import assign_frequency_scale_factor
 from arkane.modelchem import METHODS_THAT_REQUIRE_SOFTWARE, LevelOfTheory, standardize_name

-from arc.common import ARC_PATH, get_logger, get_ordered_intersection_of_two_lists, read_yaml_file
+from arc.common import ARC_PATH, get_logger, get_ordered_intersection_of_two_lists, read_yaml_file, normalize_method_name


calvinp0 requested a review from alongd November 14, 2023 10:53

github-actions bot added Module: Converter Module: Parser Module: Scheduler Module: Species Module: trsh Troubleshooting labels Nov 14, 2023

github-advanced-security bot found potential problems Nov 14, 2023

View reviewed changes

arc/scheduler.py Fixed Show fixed Hide fixed

github-advanced-security bot found potential problems Nov 14, 2023

View reviewed changes

arc/settings/submit_test.py Fixed Show fixed Hide fixed

calvinp0 force-pushed the qchem branch from 3c55680 to 3e638a7 Compare November 14, 2023 22:52

calvinp0 force-pushed the qchem branch 7 times, most recently from d89426a to 0dd37b4 Compare November 27, 2023 16:55

alongd reviewed Nov 27, 2023

View reviewed changes

calvinp0 force-pushed the qchem branch from 2dfb81b to a8883bf Compare February 3, 2024 17:30

calvinp0 force-pushed the qchem branch 2 times, most recently from f486220 to a506a2b Compare February 11, 2024 15:56

github-actions bot added Module: Local Module: Mapping labels Feb 11, 2024

github-advanced-security bot found potential problems Feb 11, 2024

View reviewed changes

calvinp0 added 4 commits February 14, 2024 10:59

Due to Zeus issues, checking if submission is true

21d0e0e

Adjustments for non-submissions

8054f43

It appears that when the function is using an OPT job to get an SP, i…

690a831

…t attempts to consider all the OPT job but does not determine whether the OPT job has a status of done

calvinp0 added 11 commits February 14, 2024 11:02

Updating trh server queue

c10e4f3

asis Set Dataframe

60dd90b

Built a dataframe of possible combinations of basis sets that QChem requires and the correct format for the input file. Additionally, ensured that it is uploaded in the git push. This is a WIP as there may be more basis sets to add or even fix!

Updated Adapter Mem Test - SLURM

1011847

Added QChem to be able to do IRC

6ca490e

Added azure in ess settings

ff7583c

For future reference

QChem Normal Mode Displacement

0c23475

Added QChem for the Normal Mode Displacement parsing and also change the default bool for raise_error

Various Settings Changes

4852981

calvinp0 force-pushed the qchem branch from a506a2b to bc3dc82 Compare February 14, 2024 09:02

calvinp0 added 12 commits February 14, 2024 11:02

Updating submit.py to support AZURE Slurm

497628d

For future reference and good to know for SLURM servers

[WIP] Getting the Internal Coordinates for QChem - Still not operati…

e304459

…onal Needs further development and understanding

Species: Get number of heavy atoms for TSGuess

6143254

The original code did not functional correctly - nor was never used hence why it was passed into production. It has now been changed to properly return the actual number of heavy atoms

XYZ to Smiles: Warning Update to Possible Valence

2b9cd67

Update ARC Environment

d68720f

Added in rapidfuxx

Updated CI wont use arc cache if env file updated

abb4d7f

Removed: remove_remote_files

9e49d7f

This function will be required for later use when SSH is updated. For the time being, better to remove. It also causes an error in xTBAdapater (has no attribute remove_remote_files)

Update spelling

a4a3189

Molpro Mem/CPU Change Tests

fd6bee1

Molpro Memory Set Update Will now limit the memory of Molpro to what is req. by Molpro if the CPU is set 1 core.

deduce software - irc

dc366c6

If user has both software available for IRC, it will use go with preferred order level tests update

molpro test updates

2f581fd

Fixed local var min_mem_value Update core reduction logic test for MP

Updates

bc3dc82

github-advanced-security bot found potential problems Feb 14, 2024

View reviewed changes

Updates

f29012c

github-actions bot added Module: Main Module: Plotter labels Feb 22, 2024

Updates

5e4b433

github-advanced-security bot found potential problems Feb 22, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

QChem Adapter & Molpro Updates #712

QChem Adapter & Molpro Updates #712

calvinp0 commented Nov 14, 2023

github-advanced-security bot left a comment

codecov bot commented Nov 14, 2023 •

edited

alongd left a comment

alongd Nov 27, 2023

alongd Nov 27, 2023

alongd Nov 27, 2023

calvinp0 Nov 28, 2023 •

edited

alongd Nov 27, 2023

calvinp0 Nov 28, 2023

alongd Nov 27, 2023

calvinp0 Nov 28, 2023

alongd Nov 27, 2023

alongd Nov 27, 2023

alongd Nov 27, 2023

calvinp0 Nov 28, 2023

alongd Nov 27, 2023

alongd Nov 27, 2023

calvinp0 Nov 28, 2023

alongd Nov 27, 2023

		self.input_file_memory = math.ceil(self.job_memory_gb / (7.45e-3 * self.cpu_cores)) if 'zeus' not in socket.gethostname() else math.ceil(self.job_memory_gb / (7.45e-3))

QChem Adapter & Molpro Updates #712

Are you sure you want to change the base?

QChem Adapter & Molpro Updates #712

Conversation

calvinp0 commented Nov 14, 2023

github-advanced-security bot left a comment

Choose a reason for hiding this comment

codecov bot commented Nov 14, 2023 • edited

Codecov Report

alongd left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

calvinp0 Nov 28, 2023 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov bot commented Nov 14, 2023 •

edited

calvinp0 Nov 28, 2023 •

edited