
Azure server #689

Open

calvinp0 wants to merge 31 commits into main
Conversation

calvinp0 (Member) commented:

Since we have acquired an Azure subscription, ARC needs to be updated to support Azure as a server, as well as additional ESS such as QChem.
Please see the list of all the changes below. NOTE: Tests still need to be fixed or written for these new changes.

  1. Added 'rapidfuzz' to the ARC environment
  2. Added a basis set dataframe for QChem
  3. check_normal_mode_displacement: allow for a try/except case
  4. Adapter: Submit script format updated to use the user-provided username {un}, convert memory to an 'int', and optionally include {server_nodes} if required
  5. Adapter: Total submit memory adjusted to ensure that when troubleshooting a job, it never attempts to go OVER the maximum submission memory allowed by the node/server
  6. Adapter: SLURM submit memory - now using #SBATCH --mem as the parameter, since it defines the TOTAL memory of the submission
  7. Adapter: SSH file download - depending on the scheduler, we do not always expect to upload or download certain files via SSH. This change recognises whether certain files should be uploaded or downloaded based on the user's scheduler choice
  8. Adapter: Null bytes can appear in files - more specifically, if QChem hits an error it can produce null bytes in the out.txt/err.txt files, which requires reading the file contents differently. This is not 100% foolproof and may need extra work
  9. Adapters: The SSH improvement phase2 #390 branch contained SSH improvements that were never merged. I have brought forward one of them, where remote files are removed once they have been downloaded to the local computer
  10. ARC can now recognise that IRC is also supported by QChem
  11. common.py: Slight reworking of the scan resolution troubleshooting so that once obj.scan_res has been defined with the new rotor scan resolution, the scan resolution is properly removed from obj.args['trsh']['scan_res']
  12. SSH: Updated the decorator to use the correct connect function
  13. SSH: If the user provides a server that is not in servers.keys() and the server is also not None, an error is raised informing the user that they need to fix the server keys
  14. SSH: If a submission to a scheduler includes an incorrect memory specification, the user is now warned that the requested memory is not supported and needs to be checked. This may need to become a ValueError instead of a logger warning
  15. SSH: Slight adjustment to checking whether there is stdout after a submission attempt
  16. SSH: Some servers require private keys. The code originally added the private key to the paramiko SSH function incorrectly. It now loads the system keys and, if the user provides a private key, includes it in the connect function
  17. SSH: Updated default arguments to get_last_modified_time
  18. SSH: Changed the lowercasing of the cluster soft
  19. SSH: Added a function to remove a directory on the remote server
  20. SSH: Azure SLURM has an extra status, 'CF', which means the node is configuring. This can take 10-15 minutes before the node is online, and we now make sure to capture it. HOWEVER, a node can get stuck in the 'CF' status, so we also check how long the node has been active (splitting the time up correctly, since different time formats are possible) and, if it exceeds 15 minutes, run the command scontrol show node {node_id}. If the stdout includes the phrase 'NOT_RESPONDING', we return 'errored'
  21. XYZ to SMILES: Updated the warning about possible valence
  22. Vectors: Coordinates given in string format are now read with regex and put into a tuple
  23. Species: The original code did not function correctly, nor was it ever used, which is why it slipped into production. It has now been changed to properly return the actual number of heavy atoms
  24. [WIP] Getting the internal coordinates for QChem - still not operational
  25. submit.py & settings.py: Updated for SLURM and Azure
  26. Scheduler: Now imports JobError
  27. Scheduler: Fixed adding trsh to the args
  28. Scheduler: Added a JobError exception when determining job status
  29. Scheduler: Now removing remote jobs at the end of the scheduler - !!!MAY NEED TO SEE IF THIS HAS AN EFFECT ON NON-SSH JOBS!!!
  30. Scheduler: Getting the most recent opt job name by properly checking whether the opt job was considered done (this was not done before)
  31. Scheduler: TODO - We attempt to troubleshoot a frequency we deem not okay, yet there is no specific troubleshooting method, so why do we do this?
  32. Scheduler: Properly troubleshoot a job
  33. Scheduler: Fix conformer troubleshooting when it is a TS conformer
  34. parser: TODO - Why do we set raise_error to True for normal mode displacement parsing? It makes the function raise a NotImplementedError even though it is implemented
  35. parser: Can now parse the normal mode displacement of QChem
  36. parser: Can now parse the 1D scan coordinates of QChem
  37. parser: Can now parse a QChem trajectory
  38. parser: Can now parse the arguments in a QChem scan input file
  39. parser: NEED to fix parse_ic_info for QChem
  40. QChem Adapter: Imports - pandas, ARC_PATH, rapidfuzz
  41. QChem Adapter: The input template now supports IRC and {trsh} args and ensures IQMOL_FCHK is FALSE (this can be set to TRUE, but be aware that the .fchk file can be rather large)
  42. QChem Adapter: write_input_file - the basis set is now matched via the software_input_matching function
  43. QChem Adapter: write_input_file - QChem now supports the D3 method. We should look at supporting other DFT_D methods in the future; more specifically, there are other types of D3 methods
  44. QChem Adapter: write_input_file - Correctly pass troubleshooting arguments into the input file
  45. QChem Adapter: write_input_file - Capitalised TRUE/FALSE in the UNRESTRICTED parameter
  46. QChem Adapter: write_input_file - Removed the scan job type and moved it to another section of the input file writing
  47. QChem Adapter: write_input_file - If a scan is set, the job_type is PES_SCAN. We also set the fine option to XC_GRID 3; however, we may need to look into changing the tolerances later
  48. QChem Adapter: write_input_file - The torsional scans are now written correctly in the input file for a scan
  49. QChem Adapter: write_input_file - IRC is now supported; however, this input file means we run two jobs from the one input file - a FREQ job and then the IRC. This currently works but will need improvements once users exercise it more
  50. QChem Adapter: write_input_file - Ensuring that the SCF CONVERGENCE is 10^-8 for certain job types
  51. QChem Adapter: [NEW FUNCTION] generate_scan_angles - to support PES_SCAN jobs, this function takes the angle we need to scan and the step size, and returns start and end angles between -180 and 180 that ensure the required angle is hit during the stepping (see the sketch after this list)
  52. QChem Adapter: [NEW FUNCTION] software_input_matching - since QChem has its own formatting for basis sets, this function tries to take the user's basis set format and match it against a dataframe (which should be kept up to date whenever a format is missing). It uses rapidfuzz, the new package in the ARC environment
  53. TrshQChem: Fixed error checking in QChem output files. It would previously mistakenly identify 'SCF failed' as the error because of which error strings it looked for in the lines
  54. TrshQChem: FlexNet licensing error - if the license server is not working, this will cause ARC to stop
  55. TrshQChem: Max optimisation cycles is now properly checked for in the output file
  56. TrshQChem: Max iteration cycles is now identified when there is a failure during SCF convergence
  57. TrshMolpro: Molpro reports memory errors that need to be troubleshot differently from what we did originally. We now look, via regex pattern matching, for how much memory needs to be increased for Molpro to run successfully. We also check for a triples memory increase if required
  58. Trsh: determine_job_log_memory_issue - Sometimes the job log contains null bytes (usually a QChem issue), which means the file needs to be opened and read differently
  59. TrshQChem: trsh_ess_job - QChem's trsh has been reworked so that it combines a troubleshooting attempt with previous ones. For example, if we troubleshot the max opt cycles and now need to turn on SYM IGNORE, both statements are included in the troubleshooting input file
  60. TrshMolpro: trsh_ess_job - Molpro required a change in how we troubleshoot its memory. A memory error means either that the MWords per process is not enough, even though we provided an adequate amount of memory to the submit script, OR that the MWords per process is enough but the TOTAL memory (MWords * CPUs) exceeds the max node memory, so the number of cores has to be reduced
  61. TrshSSH: trsh_job_on_server - Rewritten as a 'with' statement so the client is closed when exiting the block
  62. Molpro Adapter: Molpro needs a different approach to troubleshooting its memory. When setting the input file memory, we determine whether the MWords per process was enough but the total memory was too high; if so, we reduce the requested processes while maintaining the memory per process
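For reference, the angle bracketing described in item 51 can be sketched as follows. This is a minimal illustration, not the adapter's actual implementation; the function name, signature, and normalization details are assumptions.

```python
def generate_scan_angles(required_angle: float, step: float) -> tuple:
    """Pick start/end angles in [-180, 180] so that stepping by ``step`` degrees
    passes exactly through ``required_angle`` (illustrative sketch only)."""
    # Normalize the requested dihedral into the [-180, 180) range.
    angle = ((required_angle + 180.0) % 360.0) - 180.0
    # Largest whole number of steps below the angle without going under -180.
    n_down = int((angle + 180.0) // step)
    start = angle - n_down * step
    # Largest whole number of steps above the angle without going over 180.
    n_up = int((180.0 - angle) // step)
    end = angle + n_up * step
    return start, end

# Example: a 30 degree scan that must pass through 65 degrees
# yields start=-175.0, end=155.0, and -175 + 8 * 30 == 65.
print(generate_scan_angles(65.0, 30.0))
```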

@github-advanced-security (bot) left a comment:

CodeQL found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.

arc/checks/ts.py Outdated
check_normal_mode_displacement(reaction, job=job)
except ValueError as e:
logger.error(f'Could not check normal mode displacement, got: \n{e}')
reaction.ts_species.ts_checks['normal_mode_displacement'] = True
@Lilachn91 (Contributor) commented on Jul 30, 2023:

Can you explain why you added that line (74)?
Thanks!

@calvinp0 (Member, Author) replied on Jul 30, 2023:

I believe I pulled this from the ts2 branch

reaction.ts_species.ts_checks['normal_mode_displacement'] = True

@alongd Should I remove this change as it should only be for that specific branch?

Member:

Yes, it's a practical hack, we need to fix the NDM check, let's not merge the hack into main

@codecov (bot) commented on Jul 31, 2023:

Codecov Report

Attention: Patch coverage is 39.50233%, with 389 lines in your changes missing coverage. Please review.

Project coverage is 72.55%. Comparing base (be1f6c8) to head (a3ff0fb).

❗ Current head a3ff0fb differs from pull request most recent head 123279e. Consider uploading reports for the commit 123279e to get more accurate results

Files Patch % Lines
arc/parser.py 28.46% 89 Missing and 4 partials ⚠️
arc/job/trsh.py 38.93% 69 Missing and 11 partials ⚠️
arc/job/adapters/qchem.py 39.04% 53 Missing and 11 partials ⚠️
arc/job/ssh.py 15.68% 39 Missing and 4 partials ⚠️
arc/scheduler.py 0.00% 32 Missing ⚠️
arc/job/adapter.py 15.62% 27 Missing ⚠️
arc/job/adapters/molpro.py 13.63% 17 Missing and 2 partials ⚠️
arc/species/converter.py 13.33% 12 Missing and 1 partial ⚠️
arc/species/vectors.py 14.28% 5 Missing and 1 partial ⚠️
arc/level.py 42.85% 3 Missing and 1 partial ⚠️
... and 3 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #689      +/-   ##
==========================================
- Coverage   73.82%   72.55%   -1.27%     
==========================================
  Files          99      100       +1     
  Lines       27346    26999     -347     
  Branches     5717     5668      -49     
==========================================
- Hits        20187    19590     -597     
- Misses       5733     6042     +309     
+ Partials     1426     1367      -59     
Flag Coverage Δ
unittests 72.55% <39.50%> (-1.27%) ⬇️


@alongd (Member) left a comment:

Thanks, @calvinp0, for investing the time and effort into this much-needed extension!
Please see my comments below.

@@ -36,6 +36,7 @@ scratch/

# csv database
*.csv
!basis_sets.csv
Member:

Can you explain the `!`?

Member Author:

So that whenever someone runs `git add .`, basis_sets.csv is included. Basically, it is excluded from the *.csv ignore rule.

qchem,RIJK-def2-TZVPPd,"H, He, Li → Ne, Na → Ar",
qchem,RIJK-def2-QZVP,"H, He, Li → Ne, Na → Ar, K → Br, Rb → Xe, Cs → La, Hf → Rn",
qchem,RIJK-def2-QZVPP,"H, He, Li → Ne, Na → Ar, K → Br, Rb → Xe, Cs → La, Hf → Rn",
qchem,RIJK-def2-QZVPPd,"H, He, Li -> Ne, Na -> Ar, K -> Br, Rb -> Xe, Cs -> La, Hf -> Rn",
Member:

Sometimes I see the '→' character and sometimes the two consecutive characters '->'; shall we be consistent, and probably take the latter?

@@ -753,10 +754,10 @@ def set_cpu_and_mem(self):
f'exceeds {100 * job_max_server_node_memory_allocation}% of the the maximum node memory on '
f'{self.server}. Setting it to {job_max_server_node_memory_allocation * max_mem:.2f} GB.')
self.job_memory_gb = job_max_server_node_memory_allocation * max_mem
total_submit_script_memory = self.job_memory_gb * 1024 * 1.05 # MB
total_submit_script_memory = self.job_memory_gb * 1024 * 1.05 if (self.job_memory_gb * 1024 * 1.05) <= max_mem else max_mem * 1000 # MB
Member:

Please look again, there might be a unit mismatch. What units do we expect for max_mem? If GB, then the `if (self.job_memory_gb * 1024 * 1.05) <= max_mem` comparison is wrong; if MB, then there is no need to do `max_mem * 1000`.

The same applies to the line below.

Contributor:

According to the set_cpu_and_mem(self) function, max_mem is in GB:

max_mem = servers[self.server].get('memory', None) if self.server is not None else 32.0 # Max memory per node in GB.
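To make the unit discussion concrete, here is a minimal sketch of the intended calculation, assuming max_mem is the per-node memory in GB as noted above. The helper name is hypothetical and this is not the adapter's actual code.

```python
import math

def total_submit_script_memory_mb(job_memory_gb: float, max_mem_gb: float = None) -> int:
    """Request the job memory plus a 5% buffer (in MB), capped at the node's memory."""
    requested_mb = job_memory_gb * 1024 * 1.05    # job memory + 5% buffer, in MB
    if max_mem_gb is None:                        # no node limit known for this server
        return math.ceil(requested_mb)
    node_mb = max_mem_gb * 1024                   # convert the node limit to MB as well
    return math.ceil(min(requested_mb, node_mb))  # never exceed the node memory
```

Comparing MB against MB (or GB against GB) keeps the cap meaningful; mixing the two silently disables it for most realistic values.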

@@ -766,8 +767,8 @@ def set_cpu_and_mem(self):
# In PBS, "#PBS -l select=1:ncpus=8:mem=12000000" specifies the memory for all cores to be 12 MB.
self.submit_script_memory = math.ceil(total_submit_script_memory) * 1E6 # in Bytes
elif cluster_software in ['slurm']:
# In Slurm, "#SBATCH --mem-per-cpu=2000" specifies the memory **per cpu/thread** to be 2000 MB.
self.submit_script_memory = math.ceil(total_submit_script_memory / self.cpu_cores) # in MB
# In Slurm, "#SBATCH --mem=2000" specifies the memory to be 2000 MB.
Member:

Please make the comment more explicit by specifying something like the memory for all cores, like for PBS above

If you change this behavior, do we need to make any changes to our submit scripts?

@@ -165,7 +165,7 @@ def test_write_input_file(self):
METHOD b3lyp
UNRESTRICTED TRUE
BASIS def2-TZVP
IQMOL_FCHK TRUE
IQMOL_FCHK FALSE
Member:

please squash this commit into the original QChem test commit you made

@@ -35,7 +35,7 @@ def setUpClass(cls):
def test_has_autotst(self):
"""Test that AutoTST was successfully imported"""
self.assertTrue(HAS_AUTOTST)

@unittest.skip("This test is deprecated and will be removed in future versions.")
Member:

Thanks!

@@ -391,7 +391,7 @@ def determine_ess_status(output_path: str,
error = f"Memory required: {memory_increase} MW"
break
break
elif 'insufficient memory available - require' in line:
elif 'insufficient memory available - require' in line:
Member:

can you remove the extra white spaces?

xyz_str, skip_traj = '', False

while len(lines) and lines[i] != "\n" and 'Z-matrix Print:\n' not in lines[i+1]:
elif 'Standard Nuclear Orientation (Angstroms)' in lines[i]:
Member:

Can you add a test?
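A rough sketch of what such a test might look like, assuming parse_trajectory() is the entry point for this code path and returns a list of ARC xyz dicts; the test-data path below is a placeholder, not an existing file.

```python
import os
import unittest

from arc.common import ARC_PATH
from arc.parser import parse_trajectory


class TestQChemTrajectory(unittest.TestCase):
    """Sketch of a test for parsing 'Standard Nuclear Orientation' blocks from QChem output."""

    def test_parse_trajectory_qchem(self):
        # Placeholder path: a real QChem output file with several
        # 'Standard Nuclear Orientation (Angstroms)' blocks would go here.
        path = os.path.join(ARC_PATH, 'arc', 'testing', 'qchem_traj.out')
        traj = parse_trajectory(path)
        self.assertIsInstance(traj, list)
        for frame in traj:
            # Each frame should be an xyz dict with matching symbols and coordinates.
            self.assertEqual(len(frame['symbols']), len(frame['coords']))
```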

@@ -754,13 +754,13 @@ def set_cpu_and_mem(self):
f'exceeds {100 * job_max_server_node_memory_allocation}% of the the maximum node memory on '
f'{self.server}. Setting it to {job_max_server_node_memory_allocation * max_mem:.2f} GB.')
self.job_memory_gb = job_max_server_node_memory_allocation * max_mem
total_submit_script_memory = self.job_memory_gb * 1024 * 1.05 if (self.job_memory_gb * 1024 * 1.05) <= max_mem else max_mem * 1000 # MB
total_submit_script_memory = self.job_memory_gb * 1024 * 1.05 if (self.job_memory_gb * 1024 * 1.05) <= (max_mem * 1024) else max_mem * 1024 # MB
Member:

Can you squash? I commented about this on the original commit

Added the python package rapidfuzz to the environment for the QChem adapter to allow for basis set matching
Built a dataframe of possible combinations of basis sets that QChem requires and the correct format for the input file, and ensured that it is included in the git push. This is a WIP, as there may be more basis sets to add or even fix!
…y change, SSH File Download Correction, Null Byte Read, Remote Remove Files

ARC can now recognise that IRC is also supported by QChem
Inspired by branch #390

ARC may sometimes pass coords in a string format. To deal with this, a regex function is used to properly format it into a tuple. Will return an error if it cannot achieve a formatted tuple
The original code did not function correctly, nor was it ever used, which is why it slipped into production. It has now been changed to properly return the actual number of heavy atoms
…ng Remote Files, Checking Opt Jobs, Troubleshooting Conformers, Question Regarding not okay freq, Rerunning fine opt jobs

… parse_1d_scan_coords QChem, parse_trajectory QChem, parse_args QChem

QChem Test Module - Needs tests and fix ups
Adjusted the file name to download from input.out to output.out
Need to create a test for the mem per process change
Fixed up main_test.py due to the addition of azure
Will require some additional tests
Changed the cache number so that rapidfuzz is installed in the environment and the cached ARC production environment is not reused
Fixed the total submit memory calculation when max_mem is None
Updated the tests for Trsh
Updated the tests for main.py
Updated the server assertion
'Error within run_minimization with minimization method' - Not certain what this error requires, nor whether we should troubleshoot it when the job type is a 'conformer'. For now, we will re-run the job under the same conditions and, if it fails again, declare it not possible to troubleshoot
remove 'break'
@@ -369,6 +569,52 @@
Execute a job to the server's queue.
"""
self.legacy_queue_execution()

def software_input_matching(self, basis):

Check notice (Code scanning / CodeQL): Explicit returns mixed with implicit (fall through) returns. Mixing implicit and explicit returns may indicate an error, as implicit returns always return None.
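For context, the note refers to the general pattern illustrated below; the function and names here are illustrative, not the adapter's actual code.

```python
def match_basis(basis: str, known: list):
    # Flagged pattern: the loop can fall off the end, implicitly returning None.
    for candidate in known:
        if candidate.lower() == basis.lower():
            return candidate


def match_basis_explicit(basis: str, known: list):
    # Preferred pattern: spell out the 'no match' case with an explicit return.
    for candidate in known:
        if candidate.lower() == basis.lower():
            return candidate
    return None
```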
@@ -9,6 +9,7 @@
import os
from typing import TYPE_CHECKING, List, Optional, Tuple, Union
import socket
import re

Check notice (Code scanning / CodeQL): Unused import. Import of 're' is not used.
This module contains unit tests of the arc.job.adapters.qchem module
"""

import math

Check notice (Code scanning / CodeQL): Unused import. Import of 'math' is not used.
'FIXED',
'ENDFIXED', regex=False)
atoms = len(parse_str_blocks(file_path,
'$molecule',

Check failure (Code scanning / CodeQL): Unmatchable dollar in regular expression. This regular expression includes an unmatchable dollar at offset 0.
'ENDFIXED', regex=False)
atoms = len(parse_str_blocks(file_path,
'$molecule',
'$end', regex=False)[0])-3

Check failure (Code scanning / CodeQL): Unmatchable dollar in regular expression. This regular expression includes an unmatchable dollar at offset 0.
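The error arises because '$' anchors the end of a line in a regular expression, so a pattern beginning with a literal '$' (such as '$molecule') can never match when treated as regex. A small illustration, assuming the pattern would otherwise be fed to Python's re module; passing regex=False (as done for the other delimiters above) or escaping the dollar avoids it.

```python
import re

text = '$molecule\nO 0.0 0.0 0.0\n$end'

print(re.search('$molecule', text))             # None: '$' anchors a line end, 'molecule' can never follow
print(re.search(re.escape('$molecule'), text))  # matches the literal '$molecule'
print('$molecule' in text)                      # True: a plain substring check needs no regex at all
```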
# 00:00:00
# 00:00
# We need to handle all these cases
days = status_time.split('-')[0] if len(status_time.split('-')) == 2 else 0

Check notice (Code scanning / CodeQL): Unused local variable. Variable 'days' is not used.
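For reference, the elapsed-time handling this snippet belongs to (item 20 in the description) can be sketched as below. The helper name and the exact set of formats are assumptions, not the PR's actual implementation.

```python
def slurm_elapsed_minutes(status_time: str) -> float:
    """Convert a SLURM elapsed-time string ('MM:SS', 'HH:MM:SS', 'D-HH:MM', 'D-HH:MM:SS') to minutes."""
    days = 0
    if '-' in status_time:
        day_part, status_time = status_time.split('-', 1)
        days = int(day_part)
    parts = [int(p) for p in status_time.split(':')]
    if len(parts) == 3:                               # HH:MM:SS
        hours, minutes, seconds = parts
    elif days:                                        # D-HH:MM
        hours, minutes, seconds = parts[0], parts[1], 0
    else:                                             # MM:SS
        hours, minutes, seconds = 0, parts[0], parts[1]
    return days * 24 * 60 + hours * 60 + minutes + seconds / 60.0

# A node stuck in 'CF' for more than 15 minutes would then trigger 'scontrol show node'.
print(slurm_elapsed_minutes('1-02:30:00'))  # 1590.0
```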
try:
if os.path.isfile(job_log):
with open(job_log, 'r') as f:
lines = f.readlines()

Check warning (Code scanning / CodeQL): Variable defined multiple times. This assignment to 'lines' is unnecessary, as it is redefined before this value is used (reported for three assignments).
while len(lines) and '--------------------------------------------' not in lines[i]:
splits = lines[i].split()
xyz_str += f'{qcel.periodictable.to_E(int(splits[1]))} {splits[3]} {splits[4]} {splits[5]}\n'
if isinstance(log, GaussianLog):

Check failure (Code scanning / CodeQL): Potentially uninitialized local variable. Local variable 'log' may be used before it is initialized.
f"\n JOBTYPE rpath" \
f"\n BASIS {input_dict['basis']}" \
f"\n METHOD {input_dict['method']}" \
f"\n RPATH_DIRECTION {irc_direction_value}" \

Check failure (Code scanning / CodeQL): Potentially uninitialized local variable. Local variable 'irc_direction_value' may be used before it is initialized.
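Both of these findings ('log' above and 'irc_direction_value' here) are the same pattern: a variable assigned only inside some branches and then used unconditionally. An illustrative fix, not the adapter's actual code:

```python
def rpath_direction(irc_direction: str) -> int:
    # Flagged pattern: no branch covers unexpected input, so the
    # variable may be used before it is ever assigned.
    if irc_direction == 'forward':
        irc_direction_value = 1
    elif irc_direction == 'reverse':
        irc_direction_value = -1
    return irc_direction_value


def rpath_direction_safe(irc_direction: str) -> int:
    # Safer pattern: initialize a default (or raise) before branching.
    irc_direction_value = 1
    if irc_direction == 'reverse':
        irc_direction_value = -1
    return irc_direction_value
```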
@@ -18,7 +18,7 @@
def test_servers(self):
"""Test server keys in submit_scripts"""
for server in submit_scripts.keys():
self.assertTrue(server in ['local', 'atlas', 'txe1', 'pbs_sample', 'server1', 'server2', 'server3'])
self.assertTrue(server in ['local', 'atlas', 'txe1', 'pbs_sample', 'server1', 'server2', 'azure', 'server3'])

Check notice (Code scanning / CodeQL): Imprecise assert. assertTrue(a in b) cannot provide an informative message; using assertIn(a, b) instead will give more informative messages.
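Applying that suggestion to the assertion shown above would look like this (same check, more informative failure output):

```python
def test_servers(self):
    """Test server keys in submit_scripts (using assertIn for clearer failures)."""
    for server in submit_scripts.keys():
        self.assertIn(server, ['local', 'atlas', 'txe1', 'pbs_sample',
                               'server1', 'server2', 'azure', 'server3'])
```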