OptPath logger #142

Open
jakobbossek opened this issue Jan 20, 2016 · 27 comments

@jakobbossek
Owner

The opt.path is optional from now on (see #141).

@jakobbossek
Owner Author

OK, I can add a use.opt.path variable. However, it would be much more flexible if the opt.path were not hard-coded. Instead we could introduce custom logger objects. A logger object would be an S3 object with methods initLogger, updateLogger and possibly closeLogger. We would pass the opt.state to each of the methods. The logger could modify the opt.state, e.g., initialize an opt.path, save the population to file (see #108), and much more.
@surmann, @berndbischl: what do you think?
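
To make the logger idea concrete, here is a minimal sketch in base R. Only the generic names initLogger, updateLogger and closeLogger come from the proposal; the BestFitnessLogger class, the fields opt.state$log and opt.state$fitness, and the use of an environment for opt.state are assumptions made for illustration, not ecr code.

```r
# Generic functions for a pluggable logger (bodies are assumptions).
initLogger = function(logger, opt.state, ...) UseMethod("initLogger")
updateLogger = function(logger, opt.state, ...) UseMethod("updateLogger")
closeLogger = function(logger, opt.state, ...) UseMethod("closeLogger")

# Hypothetical toy logger: records the best fitness value of each generation.
makeBestFitnessLogger = function() {
  structure(list(), class = "BestFitnessLogger")
}

# opt.state is assumed to be an environment, so that changes made by the
# logger are visible to the caller without returning a copy.
initLogger.BestFitnessLogger = function(logger, opt.state, ...) {
  opt.state$log = numeric(0L)
  invisible(opt.state)
}

updateLogger.BestFitnessLogger = function(logger, opt.state, ...) {
  opt.state$log = c(opt.state$log, min(opt.state$fitness))
  invisible(opt.state)
}

closeLogger.BestFitnessLogger = function(logger, opt.state, ...) {
  # nothing to clean up in this toy example
  invisible(opt.state)
}

# Usage with a fake optimization state
opt.state = new.env()
logger = makeBestFitnessLogger()
initLogger(logger, opt.state)
for (i in 1:3) {
  opt.state$fitness = c(10, 5, 7) / i
  updateLogger(logger, opt.state)
}
closeLogger(logger, opt.state)
print(opt.state$log)
```

A file-based logger would follow the same pattern and open a connection in initLogger and close it in closeLogger.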

@jakobbossek
Owner Author

An even more flexible solution: no hard-coded opt.path stuff, no dedicated logger object.
Instead we introduce a maximally flexible event dispatching mechanism. It will work as follows.
The control object internally generates an event dispatcher object. This object has an optional name, a named list of functions, and two functions, namely registerAction and fireEvent. In the named list, the names are the event names and the functions are the actions associated with each event, i.e., the functions that are called when the corresponding event is triggered. We would predefine a large set of events, for instance "onOffspringGenerated", "onEAFinished" and "onEAInitialized", and users could register their own actions/functions via an extended control object interface. E.g.,

fn = makeSphereFunction(5L)
control = setupECRControl(
  n.population = 50L,
  n.offspring = 10L,
  survival.strategy = "plus",
  monitor = makeNullMonitor(),
  stopping.conditions = list(makeMaxIterStoppingCondition(max.iter = 1000L)),
  representation = "float"
)
control = registerAction(control, "onEAInitialized", function(opt.state, ...) {
  # do monitoring
  catf("EA initialized successfully!")
  # init opt path for instance
  opt.state$opt.path = ...
})
control = registerAction(control, "onOffspringGenerated", function(opt.state, ...) {
  # do some local search every 50 iterations (even stuff like that is possible)
  if (opt.state$iter %% 50L == 0L) {
    opt.state$offspring = lapply(opt.state$offspring, doLocalSearch)
  }
  # update opt.path
  addOptPathEl(opt.state$opt.path, ...)
  catf("Awesome EA magic ...")
})
res = doTheEvolution(fn, control = control)

In the ecr code we place calls to the dispatcher where appropriate, e.g.,

...
offspring = generateOffspring(opt.state, matingPool, control)
control$event.dispatcher$fireEvent("onOffspringGenerated", opt.state)
...
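
A dispatcher of this kind can be sketched in a few lines of base R. This is only an illustration under assumptions: makeEventDispatcher is a hypothetical constructor, the dispatcher is closure-based rather than attached to the control object, and opt.state is an environment so that actions can modify it in place.

```r
# Hypothetical constructor for an event dispatcher (not the ecr implementation).
makeEventDispatcher = function(name = NULL) {
  actions = list()  # maps event name -> list of registered functions

  registerAction = function(event.name, fun) {
    stopifnot(is.character(event.name), length(event.name) == 1L, is.function(fun))
    # append the function to the actions already registered for this event
    actions[[event.name]] <<- c(actions[[event.name]], list(fun))
    invisible(TRUE)
  }

  fireEvent = function(event.name, opt.state, ...) {
    # events without registered actions are silently ignored
    for (fun in actions[[event.name]]) {
      fun(opt.state, ...)
    }
    invisible(opt.state)
  }

  list(name = name, registerAction = registerAction, fireEvent = fireEvent)
}

# Usage: opt.state is an environment, so actions can change it in place
dispatcher = makeEventDispatcher("ecr")
opt.state = new.env()
opt.state$iter = 0L
opt.state$messages = character(0L)

dispatcher$registerAction("onOffspringGenerated", function(opt.state, ...) {
  opt.state$messages = c(opt.state$messages,
    sprintf("offspring in iter %i", opt.state$iter))
})

for (i in 1:2) {
  opt.state$iter = i
  dispatcher$fireEvent("onOffspringGenerated", opt.state)
}
dispatcher$fireEvent("onEAFinished", opt.state)  # nothing registered: no effect
print(opt.state$messages)
```

Note that with a plain list as opt.state, R's copy-on-modify semantics would discard any changes made inside an action, which is why the sketch uses an environment.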

This is maximally flexible in my eyes and solves a lot of problems:

  • Monitoring could be realized via dispatching as well, with helper functions that register the actions internally.
  • opt.path logging actions via plugin
  • Hybrid approaches (see #115): register some kind of local search for the event "onOffspringGenerated"
  • Storage of the population or of the entire opt.state (see #108)
  • ...

What do you think?

@berndbischl

My 2cents:
You are possibly highly over-complicating things here?
You are designing an OS :) And the most general version might not be the best one.

IMHO:
Either really think of a use case where your design is of interest, or have the flag for the opt.path.

@jakobbossek
Owner Author

Thanks for your comment Bernd.

> You are possibly highly over-complicating things here?
> You are designing an OS :) And the most general version might not be the best one.

😄 No, but IMHO this is an elegant and flexible design.

> IMHO:
> Either really think of a use case where your design is of interest, or have the flag for the opt.path.

Ok, so I already mentioned use cases above.

  • You are doing some EA magic on some kind of VRP/TSP with permutation representation and a standard mutation/recombination. Now you want to apply local search to the offspring every 100 generations. So far there is no way to do this besides writing a mutator/recombinator yourself.
    I could of course add parameters local.search.fun, local.search.step and so on, but adding the local search via the following code fragment does the job without adding more parameters and bloating up the control object.
control = registerAction(control, "onOffspringGenerated", function(opt.state, ...) myLocalSearch(opt.state$offspring))
  • Repairing/rescaling of individuals in case of constraint violations (see above)
  • The user wants a custom logger. At the moment they need to misuse the monitoring mechanism to do that, and it is restricted to the init, step and finished states of the EA. With the dispatching mechanism the user can log almost anything.
  • ...
  • If you have no use for self-defined actions, simply do not use them. Nobody compels you to do it 😉

Additionally, I think this mechanism benefits code quality. Instead of a dozen internal hard-coded calls to different methods, I can just write simple, loosely coupled plugins such as a logger or a monitor.

By the way: I already wrote the dispatcher on the way home today. It is just about 30 lines of very readable code and works great.

@berndbischl

I don't dislike your approach that much. It really makes your design cleaner and not overly complex, so go for it.

But I still think that all of the examples you mention above are very "specific" to EAs. What I mean is that direct functionality / hooks should be available in the package for just that:

  • local improvements
  • logging
  • repairing when out of bounds

@jakobbossek
Owner Author

I already included the dispatching mechanism. Monitoring and logging are now done via dispatcher plugins, which makes everything much cleaner. However, a logger and a monitor can still be passed to the control object, which internally registers the corresponding actions (I think this is what you mean by hooks).
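
As a rough illustration of a monitor being turned into registered actions, consider the sketch below. Everything in it is an assumption for the sake of the example: control is a plain list with an actions field, and makeConsoleMonitor and registerMonitor are hypothetical helpers. Only the event names and the registerAction(control, event, fun) calling convention appear earlier in this thread.

```r
# Hypothetical: control is a plain list with an 'actions' field that maps
# event names to lists of functions. registerAction returns the updated copy.
registerAction = function(control, event.name, fun) {
  control$actions[[event.name]] = c(control$actions[[event.name]], list(fun))
  control
}

# Hypothetical monitor: a bundle of three callbacks.
makeConsoleMonitor = function() {
  list(
    before = function(opt.state, ...) cat("EA initialized\n"),
    step = function(opt.state, ...) cat("iteration", opt.state$iter, "\n"),
    after = function(opt.state, ...) cat("EA finished\n")
  )
}

# Hypothetical helper: maps the monitor callbacks to dispatcher events, so the
# user passes one object and never calls registerAction directly.
registerMonitor = function(control, monitor) {
  control = registerAction(control, "onEAInitialized", monitor$before)
  control = registerAction(control, "onOffspringGenerated", monitor$step)
  registerAction(control, "onEAFinished", monitor$after)
}

# Usage
control = list(actions = list())
control = registerMonitor(control, makeConsoleMonitor())
print(names(control$actions))
```

A logger plugin could be wired up the same way, which keeps the user-facing interface (pass a monitor or logger) separate from the generic event mechanism.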

@jakobbossek
Owner Author

At the moment I am experimenting with the best way to log. Using the opt.path directly for logging leads to a heavy performance drop, even when logging just the best individual of each generation in the single-objective case. I will therefore implement a simple logging mechanism and maybe provide an opt.path converter function.

@berndbischl

Well, how do you currently log? And why is that faster?
Like I already said, IMHO we should simply specialize the OptPath in PH.
It really makes no sense to develop a special-purpose solution here that others cannot use.
I mean, you WILL add points to a matrix, right?

@jakobbossek
Owner Author

Well, I currently have two types of loggers: my default logger and the ParamHelpers opt path logger. The default logger simply logs the entire population and some stats on the fitness in lists. The optPathLogger stores only the best individual of each generation in the opt.path (not the entire population).
My simple benchmarking setup is a (100 + 10) EA with crossover and uniform mutation running for 10000 iterations on 100d sphere.
Without any logging: ~50 seconds.
With default logger: ~70 seconds
With optPathLogger: > 300 seconds

@jakobbossek
Owner Author

This benchmark reveals the slowness of the current opt path. This is exactly what is done within the (100+10) EA without the EA computations:

library(ParamHelpers)

par.set = makeParamSet(
  makeNumericVectorParam("x", len = 100L, lower = 0, upper = 10)
)

n.iter = 10000L
res1 = system.time({
  opt.path = makeOptPathDF(par.set = par.set, y.names = "y", minimize = TRUE)
  for (i in seq(n.iter)) {
    val = sampleValue(par.set)
    y.val = runif(1L)
    addOptPathEl(opt.path, x = val, y = y.val, dob = i, check.feasible = FALSE)
  }
})
print(res1)
>       user      system     elapsed
>     251.70       20.81      274.37

@jakobbossek
Owner Author

I guess the problem is the last rbind operation in addOptPathEl.OptPathDF. Maybe we could simply expose the hard-coded nrow argument in makeOptPathDF to the user and make a few adaptations to addOptPathEl. This way I could preallocate a large opt.path and would not suffer from the repeated reallocation of memory. I think these few adaptations could significantly speed up the opt.path update.
What do you think?
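
The effect of growing by rbind versus preallocating can be reproduced in base R without ParamHelpers. The sketch below uses a plain numeric matrix instead of an OptPathDF, so it only illustrates the reallocation cost, not the internals of addOptPathEl; the function names growByRbind and fillPreallocated are made up for this example.

```r
# Variant 1: append one row per iteration. Each rbind allocates a new matrix
# and copies all previous rows, so the total work grows quadratically.
growByRbind = function(n.iter, n.col) {
  path = matrix(numeric(0L), nrow = 0L, ncol = n.col)
  for (i in seq_len(n.iter)) {
    path = rbind(path, runif(n.col))
  }
  path
}

# Variant 2: allocate all rows once and fill them in place.
fillPreallocated = function(n.iter, n.col) {
  path = matrix(NA_real_, nrow = n.iter, ncol = n.col)
  for (i in seq_len(n.iter)) {
    path[i, ] = runif(n.col)
  }
  path
}

n.iter = 2000L
n.col = 100L
# '<-' is needed here: inside a call, '=' would be read as an argument name
time.rbind = system.time(res.rbind <- growByRbind(n.iter, n.col))
time.prealloc = system.time(res.prealloc <- fillPreallocated(n.iter, n.col))
print(time.rbind)
print(time.prealloc)
```

The absolute timings depend on the machine, but the gap between the two variants should widen as n.iter grows.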

@jakobbossek
Owner Author

We now have a logger which stores stuff to the ParamHelpers OptPath. However, we need to tackle the performance issue. I think preallocating a large matrix with some heuristic for expansion is sufficient for the moment.

@jakobbossek changed the title from "Introduce control parameter use.opt.path" to "OptPath logger" on Jan 23, 2016

@berndbischl

There are 3 obvious, complementary improvements:

  1. Preallocate the OptPath to a larger size, e.g., 500. That parameter is also exposed to the user, because they will often know how large the path will grow.

  2. If not enough memory is allocated, double the size: "exponential back-off".

  3. Code some part of the "adding" in C.

1) and 2) are already done and help a lot. I am working on 3).

Will try to post more finished stuff soon.
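
Points 1 and 2 can be sketched in base R as follows. This is not the ParamHelpers code: makeGrowingPath, its init.size parameter and the closure-based interface are assumptions for illustration, and the storage is a plain numeric matrix.

```r
# Hypothetical growable row store: preallocates init.size rows and doubles
# the capacity whenever it runs out of space.
makeGrowingPath = function(n.col, init.size = 500L) {
  stopifnot(init.size >= 1L)
  storage = matrix(NA_real_, nrow = init.size, ncol = n.col)
  n.used = 0L

  addRow = function(x) {
    if (n.used == nrow(storage)) {
      # out of space: append as many empty rows as there are already
      storage <<- rbind(storage, matrix(NA_real_, nrow = nrow(storage), ncol = n.col))
    }
    n.used <<- n.used + 1L
    storage[n.used, ] <<- x
    invisible(n.used)
  }

  # only the rows that were actually written
  getRows = function() storage[seq_len(n.used), , drop = FALSE]
  getCapacity = function() nrow(storage)

  list(addRow = addRow, getRows = getRows, getCapacity = getCapacity)
}

# Usage: a tiny initial size forces two doublings (2 -> 4 -> 8)
path = makeGrowingPath(n.col = 3L, init.size = 2L)
for (i in 1:5) {
  path$addRow(rep(i, 3L))
}
print(path$getCapacity())
print(path$getRows())
```

With doubling, the number of reallocations grows only logarithmically with the number of rows, so the cost per added row stays roughly constant on average even if the user underestimates the final size.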

@jakobbossek
Owner Author

Sounds great. Thanks!

@jakobbossek
Owner Author

Did you already commit 1) and 2)?

@berndbischl

Nope, I am working on it locally, as the three somehow belong together. I will try to push a working branch this weekend.

@jakobbossek
Owner Author

Awesome!

@jakobbossek
Owner Author

What is the status here? When are you going to commit the OptPath improvements?

@jakobbossek
Owner Author

ping

@jakobbossek
Owner Author

ping^2

@jakobbossek
Owner Author

ping

@jakobbossek
Owner Author

ping

@jakobbossek
Owner Author

ping

@berndbischl

as this is simply a performance improvement i cannot finish this now, as there is too much to do.

but i am happy to support you if you want to finish this, i made quite a lot of progress

@jakobbossek
Owner Author

Ok. Thanks for the response.
Going to have a look at this someday.

@berndbischl

really sorry, just tried to be honest here. if you look at this too, i might also be able to invest some cycles.

@jakobbossek
Owner Author

No problem. This is of minor importance at the moment.
I will spend some time on it for sure.

@jakobbossek modified the milestones: v1.1, v1.0 on Aug 22, 2016