
Toolkit Release Test 0.2

Vivekanandan (Vivek) Balasubramanian edited this page Nov 27, 2018 · 2 revisions

Introduction

The aim of the all-hands testing is to verify that (a) EnsembleMD works correctly within the confines of the currently existing examples, and (b) framework overhead and workload execution performance lie within reasonable bounds. A secondary but no less important goal is to verify that the documentation has reached a level of maturity at which it can be used and understood by people who are not directly contributing to the project. The test procedure below lays out a set of steps to test these assertions.

Testing Procedure

The rules of the game:

GENERAL TESTING

Preparation

Open the spreadsheet in Google Docs (https://docs.google.com/spreadsheets/d/1N9UUo_t5rbrgyo7mGDtQg1pFajUS2_62zAUsACEqP-Y), right-click on the tab called "TEMPLATE" at the bottom and select "Duplicate". Rename the tab to your name.

1. Installation

Point your browser to http://radicalensemblemd.readthedocs.org/en/latest and follow the instructions in section two to install EnsembleMD on your 'local' machine, laptop, VM, etc. Report whether you were successful and, if so, report the version number that you see at the end of step 2.1. If you encounter problems, report them at https://github.com/radical-cybertools/radical.ensemblemd/issues and add the link(s) to the issue(s) to the spreadsheet.

Use this installation / virtualenv for all other steps and tasks!

2. Orientation

(Please read the whole paragraph before you start)

Point your browser to http://radicalensemblemd.readthedocs.org/en/latest and try to figure out how to run an ensemble (or "bag") of misc.chksum kernels with 128 members (or "tasks") on 1 core on your local machine. Each task should use an input file of size 10 kBytes [1] (all input files can be identical). The resulting 128 checksums should be downloaded to the 'local' machine.

Once you have succeeded with that, run the same ensemble on 128 cores on stampede.

Before you start the two runs, figure out how to generate an execution profile and enable it for both runs. Report whether you were successful and, if so, report the runtime of the two ensembles, upload the two profiling logs as Gists (https://gist.github.com/) and report the two links. If you encounter problems, report them at https://github.com/radical-cybertools/radical.ensemblemd/issues and add the link(s) to the issue(s) to the spreadsheet.

[1] You can create a 10 kB file with the following command: dd if=/dev/zero of=input.dat bs=10000 count=1
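If dd is not available on your machine, the same file can be produced from Python. This is only a minimal sketch: the helper name make_input_file is hypothetical and not part of EnsembleMD, and it simply writes 10000 zero bytes, which matches what the dd command above produces.

```python
import os


def make_input_file(path, size_bytes=10000):
    """Write `size_bytes` zero bytes to `path` and return the size on disk.

    Hypothetical helper (not part of EnsembleMD); it mirrors
    `dd if=/dev/zero of=input.dat bs=10000 count=1`.
    """
    with open(path, "wb") as handle:
        handle.write(b"\0" * size_bytes)
    return os.path.getsize(path)


if __name__ == "__main__":
    # Creates input.dat in the current working directory.
    print(make_input_file("input.dat"))
```

Since all input files can be identical, a single file created this way can be reused for every task.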

3. Scale-Up Profiling

(Please read the whole paragraph before you start)

Use the same example from above and run it with 128, 256, 512 and 1024 ensemble members on 128 cores on stampede, again with profiling enabled. Upload the four resulting profiling logs as Gists (https://gist.github.com/) and report the four links in the spreadsheet. If you encounter problems, report them at https://github.com/radical-cybertools/radical.ensemblemd and add the link(s) to the issue(s) to the spreadsheet.
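When reading the four profiles, it can help to know roughly how many "waves" of tasks each run needs. The sketch below is a back-of-the-envelope estimate only. It assumes that every task occupies exactly one core and that all cores are filled before the next wave starts; neither assumption is stated in this procedure, and the function name expected_waves is hypothetical.

```python
import math


def expected_waves(num_tasks, num_cores, cores_per_task=1):
    """Estimate how many sequential waves are needed to run all tasks.

    Assumption (not from the test procedure): each task uses
    `cores_per_task` cores and waves are filled completely.
    """
    slots = num_cores // cores_per_task
    return int(math.ceil(num_tasks / float(slots)))


if __name__ == "__main__":
    for tasks in (128, 256, 512, 1024):
        print("%4d tasks on 128 cores: %d wave(s)"
              % (tasks, expected_waves(tasks, 128)))
```

Under these assumptions the four runs need 1, 2, 4 and 8 waves, so a runtime that grows much faster than that may point to framework overhead rather than workload execution.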

PATTERN-SPECIFIC TESTING BELOW

4. Simulation-Analysis (Giannis, Mark)

Locate the 'generic' Simulation-Analysis example in the documentation. Run it with the following configurations (profiling enabled):

locally (1 core):

  • maxiterations=4, simulation_instances=1, analysis_instances=1
  • maxiterations=4, simulation_instances=2, analysis_instances=2
  • maxiterations=4, simulation_instances=4, analysis_instances=4
  • maxiterations=4, simulation_instances=8, analysis_instances=8
  • maxiterations=4, simulation_instances=16, analysis_instances=16

stampede (128 cores):

  • maxiterations=4, simulation_instances=64, analysis_instances=64
  • maxiterations=4, simulation_instances=128, analysis_instances=128
  • maxiterations=4, simulation_instances=256, analysis_instances=256
  • maxiterations=4, simulation_instances=512, analysis_instances=512

Upload the resulting profiling logs as Gists (https://gist.github.com/) and report the links in the spreadsheet. If you encounter problems, report them at https://github.com/radical-cybertools/radical.ensemblemd and add the link(s) to the issue(s) to the spreadsheet.

5. Replica Exchange (Vivek, Matteo)

Locate the 'generic' Replica Exchange examples in the documentation. Run both of them with the following configurations (profiling enabled):

locally (1 core):

  • cycles=5, replicas=16
  • cycles=5, replicas=32
  • cycles=5, replicas=64

stampede (16-128 cores):

  • cores=16, cycles=5, replicas=32
  • cores=32, cycles=5, replicas=64
  • cores=64, cycles=5, replicas=128
  • cores=128, cycles=5, replicas=256

Upload the resulting profiling logs as Gists (https://gist.github.com/) and report the links in the spreadsheet. If you encounter problems, report them at https://github.com/radical-cybertools/radical.ensemblemd and add the link(s) to the issue(s) to the spreadsheet.

6. AllPairs (Antons, Andre)

Locate the 'generic' AllPairs example in the documentation. Run it with the following configurations (profiling enabled):

locally (1 core):

  • Num of Elements = 8
  • Num of Elements = 11
  • Num of Elements = 16
  • Num of Elements = 22
  • Num of Elements = 32

stampede (128 cores):

  • Num of Elements = 16
  • Num of Elements = 22
  • Num of Elements = 32
  • Num of Elements = 45
  • Num of Elements = 55

Upload the resulting profiling logs as Gists (https://gist.github.com/) and report the links in the spreadsheet. If you encounter problems, report them at https://github.com/radical-cybertools/radical.ensemblemd and add the link(s) to the issue(s) to the spreadsheet.
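To put the AllPairs runtimes in context, it may be useful to know how many comparisons each configuration implies. The sketch below assumes that the pattern compares every unordered pair of distinct elements exactly once, which gives N(N-1)/2 comparisons. That is an assumption about the pattern, not something this procedure states, so check the AllPairs documentation before relying on the numbers. The function name unordered_pairs is hypothetical.

```python
from itertools import combinations


def unordered_pairs(num_elements):
    """Number of distinct unordered pairs among `num_elements` items.

    Assumption (not from the test procedure): no self-comparisons and
    (a, b) is counted the same as (b, a).
    """
    return num_elements * (num_elements - 1) // 2


if __name__ == "__main__":
    for n in (8, 11, 16, 22, 32, 45, 55):
        # Cross-check the closed form against explicit enumeration.
        assert unordered_pairs(n) == len(list(combinations(range(n), 2)))
        print("%2d elements: %4d pairs" % (n, unordered_pairs(n)))
```

Under this assumption the element counts listed above correspond to 28, 55, 120, 231, 496, 990 and 1485 comparisons.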

OTHER TASKS

7. Documentation Review (Shantenu)