Multiprocessing

ajocher edited this page Oct 11, 2019 · 7 revisions

Introduction

This page documents the use of multiprocessing in RMG. RMG's multiprocessing relies on Python's multiprocessing module and is supported on macOS and Linux.

Implementation

Currently, multiprocessing is implemented for reaction generation and for the generation of QM files when the QMTP option is used to compute thermodynamic properties of species. The processes are started and closed within each function. The user can set the maximum number of allowed processes from the command line. Separately, a RAM-based limit is derived from the ratio of currently available RAM to currently used RAM. For each reaction generation or QMTP call, the number of processes is the smaller of these two values. The RAM limit is applied because multiprocessing forks the base process: when the base process is large in memory, starting too many processes can exceed the memory limit (swap + RAM).
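The rule described above can be sketched as follows. This is a minimal illustration, not RMG's actual code: the function name determine_procnum, its parameters, and the choice to round the RAM ratio down to a whole number are assumptions made for the example.

```python
def determine_procnum(max_procs, available_ram, used_ram):
    """Return the number of processes to start for one parallel call.

    max_procs     -- upper limit given by the user (the -n value)
    available_ram -- RAM currently available, in bytes
    used_ram      -- RAM currently used, in bytes (assumed here to be the
                     footprint of the base process that gets forked)

    Hypothetical helper: the names and the rounding are assumptions.
    In practice the two RAM values could be read with a library such as
    psutil, for example psutil.virtual_memory().available and
    psutil.Process().memory_info().rss.
    """
    if max_procs < 1:
        raise ValueError("max_procs must be at least 1")
    if used_ram <= 0:
        raise ValueError("used_ram must be positive")

    # How many copies of the base process fit into the available RAM.
    ram_limit = int(available_ram // used_ram)

    # Take the smaller of the user limit and the RAM limit,
    # but always keep at least one process.
    return max(1, min(max_procs, ram_limit))
```

With a user limit of 4, 6 GB available and a 2 GB base process, this sketch returns 3, because only three copies fit into the available RAM.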

Python 3.4 introduced the additional start methods 'spawn' and 'forkserver'. These methods create new processes that share nothing or only limited state with the parent, and all memory passing is explicit. Selecting either of them with set_start_method, for example set_start_method('forkserver'), raises a DatabaseError: rmgpy.exceptions.DatabaseError: Could not get database with name: kinetics. We suspect that database resources needed by the child process are identified as unnecessary and are therefore not inherited.
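Because 'spawn' and 'forkserver' fail as described above, one option is to request the 'fork' start method explicitly through a context object instead of changing the global setting. The sketch below is not taken from RMG; the helper name pick_context and its fallback behaviour are assumptions for illustration. It uses only multiprocessing.get_all_start_methods() and multiprocessing.get_context() from the standard library.

```python
import multiprocessing as mp


def pick_context(preferred="fork"):
    """Return a multiprocessing context for the preferred start method.

    Hypothetical helper: if the preferred method is not available on
    this platform, fall back to the platform default, which is the
    first entry returned by get_all_start_methods().
    """
    available = mp.get_all_start_methods()
    method = preferred if preferred in available else available[0]
    # A context object exposes the same API as the multiprocessing
    # module (Pool, Process, ...) but is bound to one start method.
    return mp.get_context(method)


ctx = pick_context("fork")
print("Available start methods:", mp.get_all_start_methods())
print("Selected start method:", ctx.get_start_method())
```

A pool created with ctx.Pool(...) then uses the selected start method without affecting other code that relies on the global default.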

Use

Multiprocessing is enabled from the command line with the -n option followed by the maximum number of processes to use (4 in the example below). The default is 1 process. Only an integer value smaller than the number of available CPUs is allowed.

python rmg -n 4 input.py
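The constraints on -n stated above can be illustrated with a small argument parser. This is a sketch under stated assumptions, not RMG's own parser: the function name parse_args, the attribute names maxproc and file, and the cpu_count parameter are hypothetical. It takes the rule "smaller than the number of available CPUs" literally and additionally assumes that the default value 1 is always accepted.

```python
import argparse
import os


def parse_args(argv, cpu_count=None):
    """Parse a command line of the form: -n <maxproc> <input file>.

    Hypothetical sketch; names and validation details are assumptions.
    """
    parser = argparse.ArgumentParser(prog="rmg")
    parser.add_argument("file", help="RMG input file")
    parser.add_argument("-n", dest="maxproc", type=int, default=1,
                        help="maximum number of processes (default: 1)")
    args = parser.parse_args(argv)

    if cpu_count is None:
        # os.cpu_count() can return None, so fall back to 1.
        cpu_count = os.cpu_count() or 1

    if args.maxproc < 1:
        raise ValueError("-n must be a positive integer")
    # Assumption: 1 (the default) is always valid; larger values must be
    # smaller than the number of available CPUs, as stated in the text.
    if args.maxproc != 1 and args.maxproc >= cpu_count:
        raise ValueError(
            f"-n must be smaller than the number of CPUs ({cpu_count})"
        )
    return args
```

For example, parse_args(["-n", "4", "input.py"], cpu_count=8) yields maxproc 4, while the same call with cpu_count=4 raises a ValueError.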