ROSS performance tuning documentation #82

Open
JohnPJenkins opened this issue Feb 3, 2016 · 0 comments

Hi,

I was unable to find performance tuning information in the ROSS wiki or the PDF doc. I recently sent this message (with some recommendations based on the models we're most familiar with) to some collaborators. Perhaps it could be edited to be more general and put in the wiki?

Simulations with unbalanced loads will have difficulty scaling no matter what parameters are used. For example, if one LP receives the majority of events (e.g. an LP representing some centralized hub of activity), then simulation performance is more or less bounded by how fast that LP can process events.
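To put a rough number on that bound, here is a minimal sketch under a deliberately simplified model: every event costs the same, and a single LP processes its events one at a time. The function name `max_speedup` and the per-LP event counts are hypothetical and not part of ROSS.

```c
#include <stddef.h>

/* Rough upper bound on parallel speedup under a simplified model
 * (assumption: all events cost the same and one LP processes its
 * events sequentially). The busiest LP sets the floor on run time,
 * so speedup <= total events / events on the busiest LP. */
double max_speedup(const unsigned long *events_per_lp, size_t n_lps)
{
    unsigned long total = 0;
    unsigned long busiest = 0;

    for (size_t i = 0; i < n_lps; i++) {
        total += events_per_lp[i];
        if (events_per_lp[i] > busiest)
            busiest = events_per_lp[i];
    }
    if (busiest == 0)
        return 1.0; /* no events at all: nothing to speed up */

    return (double)total / (double)busiest;
}
```

With per-LP counts of 700, 100, 100 and 100, the bound is 1000 / 700, or about 1.43, no matter how many MPI ranks are used.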

In conservative mode, the most important parameter is the "lookahead", which determines the amount of simulated time the PDES algorithm may jump forward at once. In other words, lookahead is the minimum simulated-time delay between an event and any new event it issues - after that much simulated time has passed, ROSS has to synchronize. To maximize performance in conservative mode, make the lookahead as large as possible without causing events with a delay smaller than the lookahead to be issued. Lookahead can be set in the simulation code through the ROSS global g_tw_lookahead. This should be the same on every MPI rank!
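One way to guard against issuing events below the lookahead is to check, or clamp, every event delay before it is used. This is a minimal sketch, not ROSS code: `offset_respects_lookahead` and `clamp_offset` are hypothetical helpers, and the `lookahead` argument stands in for the value a model would assign to g_tw_lookahead.

```c
#include <stdbool.h>

/* True when an event delay is safe to issue in conservative mode,
 * i.e. it is not smaller than the lookahead. */
bool offset_respects_lookahead(double offset, double lookahead)
{
    return offset >= lookahead;
}

/* Raise a too-small delay up to the lookahead. Note that this changes
 * model timing, so it is only appropriate if the model can tolerate it. */
double clamp_offset(double offset, double lookahead)
{
    return offset < lookahead ? lookahead : offset;
}
```

Whether to clamp or to treat a violation as a model bug is a design choice. Clamping keeps the run going, but a large lookahead then silently stretches short delays.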

In optimistic mode, there is an extra performance measure that can be used to determine whether you are executing efficiently - the number of rollbacks processed. This is printed at the end of each ROSS simulation - look for "Events Rolled Back". A high rollback count relative to the number of events processed indicates low efficiency.
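To compare runs, it can help to turn those two counts into a single ratio. This is a minimal sketch with a hypothetical helper, `rollback_fraction`. It assumes you copy the processed and rolled-back event counts from the end-of-run output by hand, and it is not necessarily the same quantity as any efficiency figure ROSS itself reports.

```c
/* Fraction of processed events that were later rolled back.
 * 0.0 means no wasted work; values near 1.0 mean most work was undone. */
double rollback_fraction(unsigned long long events_processed,
                         unsigned long long events_rolled_back)
{
    if (events_processed == 0)
        return 0.0; /* avoid dividing by zero on an empty run */

    return (double)events_rolled_back / (double)events_processed;
}
```

For instance, 250 rolled-back events out of 1000 processed gives 0.25, meaning a quarter of the event processing was undone.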

There are a number of parameters that affect optimistic mode performance, each of which must be the same on every MPI rank. These are:

  • "batch" - the number of events processed before doing point-to-point communication. The default (16) is fairly high for the types of models we've been developing - we typically set this lower (2-8). This can be set in the code through the ROSS global g_tw_batch, or through the command line parameter --batch=N.
  • "gvt interval" - the number of batches that run until a global synchronization step is performed. The default (16) is a little more reasonable, but we've had some luck making it larger (up to even 1024!). This can be set through the ROSS global variable g_tw_gvt_interval or the command line parameter --gvt-interval=N.
  • "KP count" - KPs are an internal ROSS mechanism. Without going too much into the details, a high KP count results in more bookkeeping overhead and fewer rollbacks, while a low KP count has less bookkeeping but more rollbacks. Valid values are between 1 and the number of LPs on each MPI rank. It can be set through the ROSS global variable g_tw_nkp or the command line parameter --nkp=N.
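The three parameters above interact, so it can be useful to sanity-check a candidate setting before a long run. This is a minimal sketch, not ROSS code: the struct `opt_params` and both helpers are hypothetical. It takes the descriptions above at face value, so batch times gvt interval is the number of events processed per rank between global synchronizations.

```c
#include <stdbool.h>

/* Hypothetical container for the three optimistic-mode knobs. */
struct opt_params {
    unsigned int  batch;        /* events per batch */
    unsigned int  gvt_interval; /* batches per global synchronization */
    unsigned long nkp;          /* KPs per MPI rank */
};

/* Events processed per rank between global synchronizations,
 * following the descriptions above. */
unsigned long events_between_gvt(const struct opt_params *p)
{
    return (unsigned long)p->batch * p->gvt_interval;
}

/* Range check: batch and gvt interval must be positive, and the KP
 * count must lie between 1 and the number of LPs on the rank. */
bool opt_params_valid(const struct opt_params *p, unsigned long lps_per_rank)
{
    return p->batch > 0
        && p->gvt_interval > 0
        && p->nkp >= 1
        && p->nkp <= lps_per_rank;
}
```

With the defaults (16 and 16) this gives 256 events between global synchronizations. Lowering batch to 4 and raising the gvt interval to 1024 gives 4096, so point-to-point communication happens more often while global synchronization happens less often.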