Multiple Processing #20

Open
arosale4 opened this issue May 26, 2016 · 5 comments

Comments

@arosale4
Contributor

Obviously we are limited by the number of tokens available, so we can't take advantage of our own form of additional processing power, but is it possible to use MacroModel's queuing system? I could be wrong, but it seems that the MacroModel computations are the rate-determining step. Instead of having a separate *q2mm.mae file for each structure we are using as data, could we use one *q2mm.mae file that contains all of the structures? From what I have gathered so far, it seems like we can use multiple processors for a single *.mae file while still checking out only 2 tokens.

@ericchansen
Owner

I'm also unsure about that feature, but it sounds like a really good idea. Starting each MacroModel job is definitely the bottleneck.

It would work as long as every structure in the file needs the same datatypes extracted. We'd have to come up with some way to assign relative energies properly, but other than that, it'd be fairly straightforward.

Any ideas on how we do this? Notice anything in their API?

@arosale4
Contributor Author

I have been looking into what MacroModel is capable of, and it seems we might be able to implement something like this. There are only a few problems:

  1. Vibrational analysis cannot be done on multiple structures in the same *.mae file, so these will still have to be run independently.
  2. It seems that a *.mae file with 3 structures will output another *.mae file with the three optimized structures in the same order, and if the *.mmo option is given, it will produce a single file with all three structures in order. There does appear to be an option to write multiple *.mmo files, but they would follow the naming scheme that MacroModel defines.

I'll keep looking into this to see whether we can use it, because running a minimization followed by an energy calculation on 30 structures in a single file takes less than a second to finish.

@ericchansen
Owner

Just in case you haven't already, you may want to look into filetypes.Mae.write_com (writes the .q2mm.com file according to the datatypes requested) and its relation to filetypes.Mae._index_output_mae and filetypes.Mae._index_output_mmo. The latter two are lists that keep track of where output structures are. write_com recognizes that certain commands (like MINI, ELST, etc.) produce certain output, and updates these two lists to ensure we know the ordering of the structures.

These lists are then used by filetypes.select_structures to cherry pick the right structures needed for the given datatype.

For example, using options like -me and -meo simultaneously takes a 3-structure input .mae and gives you a 6-structure output .q2mm.mae. This leads to

filetypes.Mae._index_output_mae = ['pre', 'opt']

and the ordering in .q2mm.mae goes

  1. Structure 1 SP
  2. Structure 1 geometry optimization
  3. Structure 2 SP
  4. Structure 2 geo.
  5. Structure 3 SP
  6. Structure 3 geo.
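
The interleaved ordering above can be indexed with simple modular arithmetic. The following is only a rough sketch of that idea and is not the actual filetypes.select_structures code; the function name pick_structures, its arguments, and the placeholder strings standing in for structures are all hypothetical.

```python
def pick_structures(output_structures, index_labels, wanted):
    """Return the output structures whose label matches `wanted`.

    Assumption (for illustration only): the output is interleaved per
    input structure, so every input structure contributes
    len(index_labels) consecutive output structures.
    """
    n = len(index_labels)
    if n == 0 or len(output_structures) % n != 0:
        raise ValueError("output length is not a multiple of the label count")
    return [
        structure
        for i, structure in enumerate(output_structures)
        if index_labels[i % n] == wanted
    ]


# Placeholder strings stand in for the 6 structures in the output file.
outputs = ["s1_sp", "s1_opt", "s2_sp", "s2_opt", "s3_sp", "s3_opt"]
labels = ["pre", "opt"]  # same shape as _index_output_mae in the example
optimized = pick_structures(outputs, labels, "opt")
single_points = pick_structures(outputs, labels, "pre")
```

Here optimized holds every second structure starting from the second one, and single_points holds the rest, which matches the 1-6 listing above.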

@arosale4
Contributor Author

arosale4 commented Jun 3, 2016

TL;DR: Unless we have multiple licenses to use with MacroModel, we really can't take advantage of serial calculations in any meaningful way.

So I'm not sure why I didn't realize this before, and I'm not sure whether you (Eric) already knew this. If we want to compare geometric data or charges, we can just use one file, and the code already handles that well in both loop.py and compare.py.

With file1.mae containing struct 1, file2.mae containing struct 2, file3.mae containing struct 3, and file4.mae containing structs 1-3, you get the following. Submitting this will be slow:

python compare.py -c ' -mt file1.mae file2.mae file3.mae' -r ' -jt file1.mae file2.mae file3.mae'

This command gives the same result, but 3 times faster:

python compare.py -c ' -mt file4.mae' -r ' -jt file4.mae'

I think we can do the same thing with energies. Regardless of whether you use a single energy command (-me) or multiple energy commands (-me and -meo), the code already picks up the correct energies for file4.mae. I think a simple way to use one file would be to use a reference data file (file4.txt with some command X) that contains all of those energies. So it would look like this:

python compare.py -c ' -me file4.mae -meo file4.mae' -r ' -Xe file4.txt -Xeo file4.txt'

But since there are not many structures to compare with energies, this probably won't save much time.

As for eigenvalues and Hessian elements, we can't do anything to speed them up from what I gather. The RRHO MacroModel command, which does the vibrational analysis, cannot be run in a serial manner, or at least not using the BGIN/END or AUTO commands.

@ericchansen
Owner

Reference energies don't need to load MacroModel, so that won't help with time savings. That's simply reading a .mae file.

You could speed up calculating MacroModel energies by putting them all in one file, but you would need some other file that says how the energies are grouped. Let me know if you need more explanation.
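
One possible shape for that grouping, as a rough sketch only: keep a list of group labels alongside the energies read from the single file, and zero each energy against the lowest energy in its own group. The function name relative_energies, the parallel-list format for the groups, and the choice to reference the group minimum are all assumptions made for illustration, not something Q2MM currently does.

```python
def relative_energies(energies, groups):
    """Zero each energy against the lowest energy in its own group.

    `energies` and `groups` are parallel lists (hypothetical format):
    groups[i] names the set of structures that energies[i] is compared
    against.
    """
    if len(energies) != len(groups):
        raise ValueError("need exactly one group label per energy")
    # Find the lowest energy seen in each group.
    lowest = {}
    for energy, group in zip(energies, groups):
        lowest[group] = min(energy, lowest.get(group, energy))
    # Subtract each group's minimum from its members.
    return [energy - lowest[group] for energy, group in zip(energies, groups)]


# Five structures from one file, split into two independent groups.
energies = [10.0, 12.5, 11.0, 3.0, 7.0]
groups = ["a", "a", "a", "b", "b"]
rel = relative_energies(energies, groups)
```

With this layout the structures can sit in one .mae file in any order, since the group labels rather than the file boundaries decide which energies are compared.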

ericchansen pushed a commit that referenced this issue Jun 23, 2016
Datum.type for Gaussian energies were all the same
ericchansen pushed a commit that referenced this issue Jan 14, 2019
Updates from Q2MM/q2mm