Scale PySM on parallel run #39

Open
zonca opened this issue Feb 11, 2020 · 3 comments
zonca commented Feb 11, 2020

MSS-001 production run

  • 700 nodes
  • 10 nodes/group
  • 70 groups
  • 8 MPI processes/node
  • ~7 observations per group
  • 2580 channels
  • 1/20th of a year
  • TOD channels per node: 258
  • PySM channels per group: 2580/70 ≈ 37; per node: ~4
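
The per-node and per-group figures above can be reproduced with a few lines of arithmetic. This is only a sketch: the variable names are made up for illustration, and it assumes that the 258 TOD channels per node come from spreading all 2580 channels over the 10 nodes of a group, and that the PySM counts are rounded up.

```python
import math

# Values taken from the run description above.
n_nodes = 700
nodes_per_group = 10
n_channels = 2580

n_groups = n_nodes // nodes_per_group

# Assumption: every group sees all channels, split evenly over its nodes.
tod_channels_per_node = n_channels // nodes_per_group

# Assumption: PySM channels are split evenly over groups, rounding up.
pysm_channels_per_group = math.ceil(n_channels / n_groups)
pysm_channels_per_node = math.ceil(pysm_channels_per_group / nodes_per_group)

print(n_groups, tod_channels_per_node,
      pysm_channels_per_group, pysm_channels_per_node)
```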

zonca self-assigned this Feb 11, 2020

zonca commented Feb 11, 2020

IMG_20200211_142011

The PySM operator distributes the channels equally across groups and then runs within each group. It uses shared memory, so there is only 1 copy of the inputs per node and no redundant work. On each node, PySM should pick up a subset of that node's local TOD channels.

In the example in the image, we have 5 PySM channels per node, which are the first 5 of the node's 500 local TOD channels. The second group will have the second 5 channels.
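
One way to picture this assignment is the sketch below. It is not the operator's actual code: `pysm_channels` is a hypothetical helper, and it assumes that each node holds a contiguous block of 500 of the 5000 channels and that group `g` (counted from 0) takes the `g`-th block of 5 from each node's local channels.

```python
def pysm_channels(group, node, n_groups=100, nodes_per_group=10, n_channels=5000):
    """Return the global channel indices PySM handles on one node of one group.

    Hypothetical layout: node `node` holds a contiguous block of local TOD
    channels, and each group takes a different slice of that block.
    """
    tod_per_node = n_channels // nodes_per_group   # 500 local TOD channels
    pysm_per_node = tod_per_node // n_groups       # 5 PySM channels per node
    first_local = node * tod_per_node              # first channel held by this node
    start = first_local + group * pysm_per_node
    return range(start, start + pysm_per_node)


# First group, first node: the first 5 of the local channels.
print(list(pysm_channels(group=0, node=0)))
# Second group, same node: the second 5.
print(list(pysm_channels(group=1, node=0)))
```

Under these assumptions every channel is handled by exactly one (group, node) pair, which is what "no redundant work" requires.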

Once PySM has done bandpass integration for all the local channels, it broadcasts full maps across the group communicator to each node for its own PySM channels.

Then the maps of those channels, either 1 at a time or in chunks (configurable by the user), are broadcast across the rank communicator and put in shared memory, then rescanned locally by each process into the timelines.
This is done in parallel on all the nodes of the first group, so we parallelize by a factor of 10. If we then do this broadcast for all 5 local PySM channels, we gain another factor of 5. So the loop over 5000 detectors becomes a loop over the 100 groups.
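
The counting argument and the "1 at a time or in chunks" option can be sketched as follows. The `chunked` helper and the variable names are illustrative only and are not part of PySM or TOAST; the numbers are the ones from the example.

```python
def chunked(channels, chunk_size):
    """Yield successive chunks of at most `chunk_size` channels."""
    for start in range(0, len(channels), chunk_size):
        yield channels[start:start + chunk_size]


n_detectors = 5000
nodes_per_group = 10        # nodes of a group work in parallel
pysm_channels_per_node = 5  # channels each node handles per group step

# Channels covered while one group does its work.
channels_per_group_step = nodes_per_group * pysm_channels_per_node
# Remaining serial loop: one step per group.
n_group_steps = n_detectors // channels_per_group_step

local_pysm_channels = [0, 1, 2, 3, 4]
one_at_a_time = list(chunked(local_pysm_channels, 1))
in_chunks_of_two = list(chunked(local_pysm_channels, 2))

print(n_group_steps, one_at_a_time, in_chunks_of_two)
```

A larger chunk size means fewer broadcasts but more maps held in shared memory at once, which is presumably why it is left to the user.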

In fact, once group 0 is done, group 1 does the same with its 5 PySM channels, and so on until all the work is done.


zonca commented Feb 11, 2020

@keskitalo: please review the write-up above.

@keskitalo

Exactly as I remember. This will be a huge improvement.
