Parallel Analysis Using PBS Job Scheduler

This tutorial is specific to the PBS job scheduler, but it can serve as a template to be adapted to other job scheduling systems.

First, dynamically generate from Python an example PBS script named parallel_analysis_using_PBS_example.pbs:

# Number of nodes in the network (one job per target)
network_size = 10

# Define PBS script
bash_lines = '\n'.join([
    '#!/bin/bash',
    # set project name
    '#PBS -P ProjectName',
    # set job name
    '#PBS -N JobName',
    # choose number of cores and memory
    '#PBS -l select=1:ncpus=1:mem=1GB',
    # set walltime hh:mm:ss
    '#PBS -l walltime=01:00:00',
    # set job array indices 0..network_size-1, one per target
    '#PBS -J 0-{}'.format(network_size - 1),
    # load Python
    'module load python/3.7.3',
    # if necessary, activate local environment where IDTxl is installed
    'source /ProjectName/idtxl_env/bin/activate',
    # run analysis on a single target, passing the array index as target ID
    'python analyse_single_target.py $PBS_ARRAY_INDEX'
    ])

# Generate and save PBS script file
bash_script_name = 'parallel_analysis_using_PBS_example.pbs'
with open(bash_script_name, 'w', newline='\n') as bash_file:
    bash_file.write(bash_lines)
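For network_size = 10, the generated parallel_analysis_using_PBS_example.pbs contains:

#!/bin/bash
#PBS -P ProjectName
#PBS -N JobName
#PBS -l select=1:ncpus=1:mem=1GB
#PBS -l walltime=01:00:00
#PBS -J 0-9
module load python/3.7.3
source /ProjectName/idtxl_env/bin/activate
python analyse_single_target.py $PBS_ARRAY_INDEX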

The job array can be submitted directly on the cluster from the command line using the command qsub parallel_analysis_using_PBS_example.pbs. It is also possible to submit jobs dynamically from Python:

from subprocess import call
call('qsub {}'.format(bash_script_name), shell=True)
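If you want Python to raise an error when submission fails, subprocess.run (Python 3.5+) with check=True is an alternative:

from subprocess import run

# Submit the job array; raises CalledProcessError if qsub exits non-zero
run('qsub {}'.format(bash_script_name), shell=True, check=True)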

The PBS script calls the Python script analyse_single_target.py once per target, passing the target number as a command-line argument on each call. This is a template for analyse_single_target.py:

# analyse_single_target.py

import sys
from idtxl.multivariate_te import MultivariateTE
from idtxl.data import Data
import pickle

# Read parameters from shell call
target_id = int(sys.argv[1])

# Load time series
time_series = ...

# Initialise Data object and set dim_order to reflect your data
dat = Data(time_series, dim_order='psr')

# Initialise analysis object and define settings
network_analysis = MultivariateTE()
settings = ...

# Run analysis
res = network_analysis.analyse_single_target(settings, dat, target_id)

# Save results dictionary using pickle
path = 'my_directory/res.{}.pkl'.format(target_id)
with open(path, 'wb') as res_file:
    pickle.dump(res, res_file)

The single-target results can then be combined as shown in the Combine Single Target tutorial.
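For example, once all jobs have finished, the pickled results can be loaded and merged. This is a minimal sketch, assuming the files were saved as in the template above and that the results objects provide the combine_results method described in the Combine Single Target tutorial:

import glob
import pickle

# Load all pickled single-target results saved by analyse_single_target.py
res_list = []
for path in sorted(glob.glob('my_directory/res.*.pkl')):
    with open(path, 'rb') as res_file:
        res_list.append(pickle.load(res_file))

# Merge the remaining results into the first one
res_network = res_list[0]
res_network.combine_results(*res_list[1:])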