
Comparing speeds of (1) GAMA GUI, (2) GAMA headless and (3) R/rama #6

Open
choisy opened this issue Dec 15, 2018 · 8 comments

@choisy (Member) commented Dec 15, 2018

I hear many of you complaining about the fact that R/rama is incredibly slow compared to GAMA GUI. Can somebody do some benchmarking here so that we have some numbers to compare: GAMA GUI, GAMA headless, and R/rama? Is it possible to measure the time it takes just to launch GAMA in headless mode? I guess it should roughly be the headless time of a simple model with few agents and just 1 time step, right? Anyway, having numbers to compare here would be useful to see where the problem might be.

@benoitgaudou commented Dec 18, 2018

First insights from 1 simple simulation of the SIR.gaml model: the results for rama and headless seem quite similar.

In R, I load an experiment:

gaml_file <- system.file("examples", "sir.gaml", package = "rama")
exp1 <- load_experiment("sir", gaml_file, "sir")

and evaluate only the time of the experiment run:

system.time(output <- run_experiment(exp1))

The times for rama and GAMA headless are quite similar:

  • in R: around 12.5 s
  • in GAMA headless: around 12 s (measured with a precision of 1 s)
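To firm up numbers like these, one option is to repeat the timing and look at the spread rather than rely on a single run. A minimal sketch, assuming the rama package and a working GAMA headless installation (the repetition count of 5 is arbitrary):

```r
library(rama)

gaml_file <- system.file("examples", "sir.gaml", package = "rama")
exp1 <- load_experiment("sir", gaml_file, "sir")

# Repeat the timing to average out the ~1 s measurement noise
elapsed <- replicate(5, system.time(run_experiment(exp1))[["elapsed"]])
summary(elapsed)
```

`summary()` then shows the min, median, and max elapsed times, which makes it easier to tell a real overhead from run-to-run variation.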

@choisy (Member, Author) commented Dec 18, 2018

Super nice. Probably still very similar if we did repetitions.

@benoitgaudou
With repetitions, the results differ much more.
I ran the following experiment (see the attached file: [sir9.xml.zip](https://github.com/r-and-gama/rama/files/2689590/sir9.xml.zip)).

gaml_file <- system.file("examples", "sir.gaml", package = "rama")

df <- expand.grid(S0 = c(900, 950, 999),

                  I0 = c(100, 50, 1),
                  R0 = 0,
                  beta = 1.5,
                  gamma = .15,
                  S = 1,
                  I = 1,
                  R = 1,
                  tmax = 1000,
                  seed = 1)
df
exp4 <- experiment(df, parameters = c(1:5),
                   obsrates = c(6:8), tmax = "tmax", seed = "seed",
                   experiment = "sir", model = gaml_file)
exp4

system.time(output <- run_experiment(exp4, 8))

Results I get:

  • for GAMA: around 22 s
  • for R: around 31 s

Notice that when I run the experiment with only 1 core, with:

system.time(output <- run_experiment(exp4))

it takes around 60 s.

@meta00 (Member) commented Dec 18, 2018

In run_experiment(exp), we do:

  • check if exp is an experiment
  • create output folder
  • generate xml parameter file from exp
  • run gama (this step uses GAMA headless)
  • retrieve results from GAMA results (one xml for each simulation is parsed to get the stats)
  • correct NAs

It may not be very surprising that run_experiment takes more time. We can try to improve this.
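To see which of these steps dominates, one could profile a call with base R's sampling profiler. A sketch, assuming exp4 is the experiment object from the earlier comment (the output filename is arbitrary):

```r
# Sample the call stack every 0.1 s while the experiment runs
Rprof("rama_profile.out", interval = 0.1)
output <- run_experiment(exp4, 8)
Rprof(NULL)

# Functions sorted by time spent in their own code
head(summaryRprof("rama_profile.out")$by.self)
```

If most of the self time lands in the system call that launches GAMA, the overhead is on the headless side; if it lands in the XML parsing or NA-correction functions, the R-side steps are the ones worth optimizing.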

@choisy (Member, Author) commented Dec 18, 2018

Yes, makes sense and it would be great if we could improve this.

@jdzucker (Member) commented Dec 21, 2018 via email

@choisy (Member, Author) commented Dec 24, 2018

What do you mean by "plot some lines"?
For information, there are a number of packages in R that allow good benchmarking and visualization of the results, see here for example. There is also the newly released bench package. I haven't tried it yet, and I'm not even quite sure it's the tool we need here. To be explored...
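As a sketch of what such a benchmark could look like with the bench package (assuming rama is loaded and exp1 is an experiment object as in the earlier comments; check = FALSE because each run writes its own output and the results need not be identical):

```r
library(bench)

# Time run_experiment() a handful of times and collect statistics
timings <- bench::mark(run_experiment(exp1),
                       iterations = 5, check = FALSE)
timings[, c("min", "median", "mem_alloc")]
```

bench::mark also records memory allocations and garbage collections, which system.time does not, so it could help separate R-side overhead from time spent waiting on GAMA.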
Here you are also running an experiment with an increasing number of simulations, right? I guess what you're aiming at is seeing how the total simulation time scales with the number of simulations (and also estimating the rama overhead)? If so, I would recommend using exactly the same simulation each time. Indeed, since all the simulations of the exp4 object are different, it's impossible for now to tell whether the observed time differences are due only to the number of simulations or also to the nature of those simulations. See what I mean? Such an experiment, with the same simulation repeated a large number of times, could be generated with the repl() function, for example:

exp5 <- repl(exp4[1, ], 10)

Here you are also running the simulations on 8 CPUs in parallel. It would be interesting to assess the overhead of the parallelization too.
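The parallelization overhead could be assessed by timing the same set of identical simulations across core counts. A sketch, assuming run_experiment() takes the core count as its second argument (as in the call above) and exp4 from the earlier comment:

```r
# 10 identical copies of the first simulation of exp4
exp5 <- repl(exp4[1, ], 10)

# Elapsed time for 1, 2, 4, and 8 cores
cores <- c(1, 2, 4, 8)
times <- sapply(cores, function(k)
  system.time(run_experiment(exp5, k))[["elapsed"]])

data.frame(cores, elapsed = times)
```

With identical simulations, perfect scaling would halve the elapsed time with each doubling of cores; the shortfall from that is the parallelization overhead.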
As a more general comment, I see that we are doing bits of testing here and there. A better approach might be to design a formal benchmark that we all agree on, specifying each time what we are interested in timing (rama overhead, parallelization overhead, scaling with the number of experiments (linear vs non-linear), etc.). Finally, such a benchmark should ideally be run on an "isolated" machine (i.e. not too many services running at the same time; at a minimum, cut wifi and bluetooth, I guess).
An Rmd vignette / website article on this benchmarking question would be really nice, and absolutely key in the perspective of a publication. A benchmark comparing rama with RNetLogo and rrepast on the same model would be great too.

@choisy (Member, Author) commented Feb 20, 2019

Would be interesting to compare the speeds of GAMA 1.7 and 1.8 too.

@LucieContamin LucieContamin transferred this issue from r-and-gama/rama Jul 10, 2019
@LucieContamin LucieContamin added the enhancement New feature or request label Jul 23, 2019