ROSS performance tuning documentation #82

JohnPJenkins · 2016-02-03T16:36:49Z

Hi,

I was unable to find performance tuning information in the ROSS wiki or the PDF doc. I recently sent the message below (with some recommendations based on the models we're most familiar with) to some collaborators. Perhaps it could be edited to be more general and put in the wiki?

Simulations with unbalanced loads will have difficulty scaling no matter what parameters are used. For example, if a single LP receives the majority of events (e.g. an LP representing a centralized hub of activity), then simulation performance is more or less bounded by how fast that one LP can process its events.
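
To make the hub example concrete, here is a rough sketch (not part of ROSS) of the best-case speedup implied by a per-LP event count. The function name `speedup_bound` and the input array are hypothetical. It assumes every event costs about the same and ignores communication and rollback overhead, so real runs will do worse.

```c
#include <assert.h>
#include <stddef.h>

/* Best-case parallel speedup for a given spread of events over LPs.
 * events_per_lp[i] is the number of events LP i processes (hypothetical
 * input, e.g. counted during a sequential run). Because a single LP's
 * events are processed one after another, the run can never be shorter
 * than the busiest LP's share of the work. */
double speedup_bound(const unsigned long *events_per_lp, size_t nlp)
{
    assert(events_per_lp != NULL && nlp > 0);

    unsigned long total = 0;
    unsigned long busiest = 0;
    for (size_t i = 0; i < nlp; i++) {
        total += events_per_lp[i];
        if (events_per_lp[i] > busiest)
            busiest = events_per_lp[i];
    }

    /* No events at all: nothing to speed up. */
    if (busiest == 0)
        return 1.0;

    return (double)total / (double)busiest;
}
```

With counts of 700, 100, 100 and 100 events, the bound is 1000 / 700, or roughly 1.43x, regardless of how many ranks are used.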

In conservative mode, the most important parameter is the "lookahead", which determines how much simulated time the PDES algorithm may advance at once. In other words, lookahead is the minimum simulated time between an event and any new event it schedules; once that much time has passed, ROSS has to synchronize. To maximize performance in conservative mode, make the lookahead as large as possible without the model ever issuing an event with a delay smaller than the lookahead. Lookahead can be set in the simulation code through the ROSS global g_tw_lookahead. It must be the same on every MPI rank!
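
One way to pick a lookahead value is sketched below: take the smallest delay the model can ever attach to a new event. Both helper names are hypothetical and nothing here calls into ROSS. Treating a delay exactly equal to the lookahead as acceptable is an assumption that should be checked against the ROSS version in use.

```c
#include <assert.h>
#include <stddef.h>

/* Largest lookahead that no event would violate: the minimum over the
 * smallest delays each event type in the model can issue.
 * min_delays is a hypothetical, model-supplied list. */
double max_safe_lookahead(const double *min_delays, size_t n)
{
    assert(min_delays != NULL && n > 0);

    double lookahead = min_delays[0];
    for (size_t i = 1; i < n; i++) {
        if (min_delays[i] < lookahead)
            lookahead = min_delays[i];
    }
    return lookahead;
}

/* Returns 1 if an event scheduled `delay` into the future is compatible
 * with the given lookahead, 0 otherwise (assumes delay == lookahead is
 * allowed). */
int delay_respects_lookahead(double delay, double lookahead)
{
    return delay >= lookahead;
}
```

The result of `max_safe_lookahead` is the value a model would then assign to g_tw_lookahead, identically on every rank.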

In optimistic mode, there is an additional measure of whether you are executing efficiently: the number of events rolled back. It is printed at the end of each ROSS simulation (look for "Events Rolled Back"). A high rollback count relative to the number of events processed indicates low efficiency.
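
As a rough illustration of "relative to the events processed", the ratio can be computed from the two counts in the end-of-run report. The function below is a hypothetical helper, not a ROSS API, and the ratio is only one simple way to express the comparison.

```c
#include <assert.h>

/* Fraction of processed events that were later rolled back.
 * Both counts are assumed to be copied by hand from the end-of-run
 * statistics; `processed` is assumed to include the rolled-back events. */
double rollback_fraction(unsigned long long processed,
                         unsigned long long rolled_back)
{
    assert(rolled_back <= processed);

    /* An empty run has no rollbacks by definition. */
    if (processed == 0)
        return 0.0;

    return (double)rolled_back / (double)processed;
}
```

For example, 250 rolled-back events out of 1000 processed gives 0.25, meaning a quarter of the work was thrown away.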

There are a number of parameters that affect optimistic mode performance, each of which must be the same on every MPI rank. These are:

"batch" - the number of events processed before doing point-to-point communication. The default (16) is fairly high for the types of models we've been developing; we typically set it lower (2-8). It can be set in the code through the ROSS global g_tw_batch or through the command line parameter --batch=N.

"gvt interval" - the number of batches that run before a global synchronization step is performed. The default (16) is a little more reasonable, but we've had some luck making it larger (up to even 1024!). It can be set through the ROSS global variable g_tw_gvt_interval or the command line parameter --gvt-interval=N.

"KP count" - KPs (kernel processes) are an internal ROSS mechanism. Without going too much into the details, a high KP count results in more bookkeeping overhead and fewer rollbacks, while a low KP count has less bookkeeping but more rollbacks. Valid values are between 1 and the number of LPs on each MPI rank. It can be set through the ROSS global variable g_tw_nkp or the command line parameter --nkp=N.

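Since "batch" and "gvt interval" multiply, it can help to look at their product when trying combinations. The sketch below uses two hypothetical helpers that do not touch ROSS itself: one computes how many events a rank may process between global synchronizations under the descriptions above, and the other checks a candidate KP count against the valid range.

```c
#include <assert.h>

/* Up to this many events are processed on one rank between two global
 * synchronization steps: `batch` events per batch, `gvt_interval` batches. */
unsigned long events_per_gvt(unsigned int batch, unsigned int gvt_interval)
{
    assert(batch > 0 && gvt_interval > 0);
    return (unsigned long)batch * (unsigned long)gvt_interval;
}

/* Returns 1 if nkp lies in the valid range [1, lps_per_rank], else 0. */
int nkp_is_valid(unsigned long nkp, unsigned long lps_per_rank)
{
    return nkp >= 1 && nkp <= lps_per_rank;
}
```

With the defaults (16 and 16) that is up to 256 events between synchronizations; a batch of 4 with a gvt interval of 1024 gives up to 4096.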