ut-parla/Parla.py

______          _                 ┌─────────┐┌──────┐
| ___ \        | |                │ Task A  ││Task B│
| |_/ /_ _ _ __| | __ _           └┬───────┬┘└────┬─┘
|  __/ _` | '__| |/ _` |          ┌▽─────┐┌▽─────┐│  
| | | (_| | |  | | (_| |          │Task D││Task C││  
\_|  \__,_|_|  |_|\__,_|          └┬─────┘└┬─────┘│  
                                  ┌▽─────┐┌▽──────▽┐ 
                                  └──────┘└────────┘ 

Introduction

Parla is a task-parallel programming library for Python. Parla targets the orchestration of heterogeneous (CPU+GPU) workloads on a single shared-memory machine. We provide features for resource management, task variants, and automated scheduling of data movement between devices.

Parla is designed for gradual adoption, allowing users to easily port sequential code for parallel execution.

The Parla runtime is multi-threaded but single-process, so all tasks share one address space. In practice, this means that the main compute workload within each task must release the CPython Global Interpreter Lock (GIL) to achieve parallel speedup.
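To make the GIL constraint concrete, here is a plain-Python sketch (not Parla code) of the kind of workload that does scale under threads: calls into C code that drop the GIL, such as NumPy kernels or, in this stdlib-only example, `hashlib` digests over large buffers.

```python
# Illustration only (not Parla API): threads speed up work whose inner
# loop runs in C with the GIL released. hashlib releases the GIL while
# digesting large buffers, so these calls can overlap; an equivalent
# pure-Python loop would serialize on the GIL.
import hashlib
from concurrent.futures import ThreadPoolExecutor

def hash_chunk(data: bytes) -> str:
    # The sha256 update over a large buffer runs without holding the GIL.
    return hashlib.sha256(data).hexdigest()

chunks = [bytes([i]) * 1_000_000 for i in range(4)]

with ThreadPoolExecutor(max_workers=4) as pool:
    digests = list(pool.map(hash_chunk, chunks))

assert len(digests) == 4
```

The same principle applies inside Parla tasks: a task whose body is a NumPy or CuPy kernel can run in parallel with other tasks, while a task spinning in pure-Python bytecode cannot.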

Note: Parla is not designed with workflow management in mind and does not currently support features for fault-tolerance or checkpointing.

Installation

Parla is currently distributed from this repository as a Python module.

Parla 0.2 requires Python >= 3.7, numpy, cupy, and psutil, and can be installed as follows:

conda install -c conda-forge numpy cupy psutil   # (or install the same packages with pip)
git clone https://github.com/ut-parla/Parla.py.git
cd Parla.py
pip install .

To test your installation, try running

python tutorial/0_hello_world/hello.py

This should print

Hello, World!

We recommend working through the tutorial as a starting point for learning Parla!

Example Usage

Parla tasks are launched in an indexed namespace (the 'TaskSpace') and capture variables from the local scope through the task body's closure.

Basic usage can be seen below:

from parla import Parla
from parla.cpu import cpu
from parla.cuda import gpu
from parla.tasks import spawn, TaskSpace

with Parla():
    T = TaskSpace("Example Space")

    for i in range(4):
        @spawn(T[i], placement=cpu)
        def tasks_A():
            print(f"We run first on the CPU. I am task {i}", flush=True)

    @spawn(T[4], dependencies=[T[0:4]], placement=gpu)
    def task_B():
        print("I run second on any GPU", flush=True)
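To clarify the ordering this graph enforces, the same dependency pattern can be sketched in plain Python with `concurrent.futures` (again, an analogy only, not Parla's implementation): the four A tasks may run in any order relative to each other, but B runs only after all of them complete.

```python
# Plain-Python analogy of the task graph above (not Parla API):
# four independent tasks, then one task that depends on all of them.
from concurrent.futures import ThreadPoolExecutor, wait

order = []

def task_A(i):
    order.append(f"A{i}")

def task_B():
    order.append("B")

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(task_A, i) for i in range(4)]
    wait(futures)                # dependencies=[T[0:4]]: B waits on every A
    pool.submit(task_B).result()

assert order[-1] == "B" and len(order) == 5
```

In Parla, this ordering falls out of the `dependencies=[T[0:4]]` slice rather than an explicit `wait`; the scheduler also chooses the device for each task from its `placement` constraint.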

Example Mini-Apps

The example mini-apps have a wider set of dependencies.

Running all of them requires: scipy, numba, pexpect, mkl, mkl-service, and Cython.

To get the full set of examples (BLR, N-Body, and synthetic graphs), initialize the submodules:

git submodule update --init --recursive --remote

Specific installation and run instructions for each of these submodules can be found in their directories.

The test suite covering them (reproducing the results in the SC'22 paper) can be launched as:

python examples/launcher.py --figures <list of figures to reproduce>

Acknowledgements

This software is based upon work supported by the Department of Energy, National Nuclear Security Administration under Award Number DE-NA0003969.

How to Cite Parla.py


Please cite the following reference.

@inproceedings{lee2022parla,
    author = {H. Lee and W. Ruys and Y. Yan and S. Stephens and B. You and H. Fingler and I. Henriksen and A. Peters and M. Burtscher and M. Gligoric and K. Schulz and K. Pingali and C. J. Rossbach and M. Erez and G. Biros},
    title = {Parla: A Python Orchestration System for Heterogeneous Architectures},
    year = {2022},
    booktitle = {Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis},
    series = {SC '22}
}