parallel nnetar fitting #346

Open
dashaub opened this issue Jul 29, 2016 · 5 comments
@dashaub
Contributor

dashaub commented Jul 29, 2016

What do you think about adding parallelization to nnetar()? The code in avnnet() looks very easy to parallelize. For long time series with a large repeats value and many CPU cores, I imagine it would give a good speedup.

@robjhyndman
Owner

Yes, good idea.

@dashaub
Contributor Author

dashaub commented Jul 31, 2016

I've put together PR #349 implementing this. After experimenting with parallel::parLapply() for a while and running into argument name collision errors, I settled on foreach::foreach(). The performance improvements look good on the long taylor series but bad for short series, so I set the default to parallel = FALSE. An adaptive rule based on the series length, like the one used in tbats(), could make sense here.

Pros: great performance on long series
Cons: imports foreach::foreach() and doParallel::registerDoParallel(). These are common packages, however, and are probably installed on most systems already.
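
The adaptive rule mentioned above could be as simple as a length threshold. The following is a minimal sketch, not code from the PR: should_parallelise() is a hypothetical helper, and the cut-off of 1000 observations is an assumption (it mirrors what tbats() appears to use for its use.parallel default, but check the forecast documentation before relying on it).

```r
# Hypothetical helper: go parallel only when the series is long enough
# for the fitting time to outweigh the cost of starting workers.
# The threshold of 1000 is an assumed value, not taken from the PR.
should_parallelise <- function(y, threshold = 1000) {
  length(y) > threshold
}

should_parallelise(AirPassengers)   # short monthly series (144 observations)
should_parallelise(numeric(4032))   # stand-in for a long series such as taylor
```

With these inputs the short series stays serial and the long one goes parallel, which matches the direction of the benchmark results in this comment.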

# current implementation
library(microbenchmark)
library(devtools)
install_github("robjhyndman/forecast")
library(forecast)
microbenchmark(nnetar(taylor),
               nnetar(AirPassengers),
               nnetar(AirPassengers, repeats = 500), times = 5)
Unit: seconds
                                 expr        min         lq       mean
                       nnetar(taylor) 837.956053 841.290937 841.430421
                nnetar(AirPassengers)   1.009523   1.030911   1.027371
 nnetar(AirPassengers, repeats = 500)  24.779071  24.818174  24.841770
     median         uq        max neval
 842.189486 842.570680 843.144946     5
   1.031746   1.032079   1.032597     5
  24.845551  24.863318  24.902737     5

# parallel implementation with parallel on by default
install_github("dashaub/forecast")
library(forecast)
microbenchmark(nnetar(taylor, parallel = TRUE),
               nnetar(AirPassengers, parallel = FALSE),
               nnetar(AirPassengers, parallel = TRUE, num.cores = 1),
               nnetar(AirPassengers, parallel = TRUE, num.cores = 2),
               nnetar(AirPassengers, parallel = TRUE, num.cores = 4),
               nnetar(AirPassengers, repeats = 500, parallel = TRUE, num.cores = 4),
               times = 5)
Unit: seconds
                                                                 expr
                                      nnetar(taylor, parallel = TRUE)
                              nnetar(AirPassengers, parallel = FALSE)
                nnetar(AirPassengers, parallel = TRUE, num.cores = 1)
                nnetar(AirPassengers, parallel = TRUE, num.cores = 2)
                nnetar(AirPassengers, parallel = TRUE, num.cores = 4)
 nnetar(AirPassengers, repeats = 500, parallel = TRUE, num.cores = 4)
        min         lq       mean     median         uq        max neval
 424.717306 426.561738 427.136085 427.486460 428.064305 428.850618     5
   1.018690   1.018691   1.022518   1.022630   1.025139   1.027438     5
   6.375443   6.402015   6.401181   6.404755   6.405817   6.417876     5
   7.482467   7.518907   7.545821   7.519844   7.529500   7.678389     5
  10.714595  10.798919  10.816837  10.811451  10.834412  10.924808     5
  18.562486  18.579964  18.624954  18.595774  18.641525  18.745022     5


# some more tests on the taylor series
microbenchmark(nnetar(taylor, parallel = FALSE),
               nnetar(taylor, parallel = TRUE, num.cores = 1),
               nnetar(taylor, parallel = TRUE, num.cores = 2),
               nnetar(taylor, parallel = TRUE, num.cores = 4),
               nnetar(taylor, repeats = 100, parallel = TRUE, num.cores = 1),
               nnetar(taylor, repeats = 100, parallel = TRUE, num.cores = 2),
               nnetar(taylor, repeats = 100, parallel = TRUE, num.cores = 4),
               times = 5)

Unit: seconds
                                                          expr       min
                              nnetar(taylor, parallel = FALSE)  839.0974
                nnetar(taylor, parallel = TRUE, num.cores = 1)  839.0866
                nnetar(taylor, parallel = TRUE, num.cores = 2)  426.8741
                nnetar(taylor, parallel = TRUE, num.cores = 4)  222.6650
 nnetar(taylor, repeats = 100, parallel = TRUE, num.cores = 1) 4179.6068
 nnetar(taylor, repeats = 100, parallel = TRUE, num.cores = 2) 2106.8621
 nnetar(taylor, repeats = 100, parallel = TRUE, num.cores = 4) 1067.8230
        lq      mean    median        uq       max neval
  843.1898  844.2343  844.1039  846.0222  848.7580     5
  839.7962  842.2116  842.8739  843.8619  845.4396     5
  426.9788  427.9270  427.8327  428.3260  429.6235     5
  222.7813  223.3000  223.0341  223.9712  224.0484     5
 4182.5776 4184.9711 4184.7301 4186.4020 4191.5392     5
 2109.5174 2109.8424 2110.0868 2111.0205 2111.7252     5
 1070.3660 1072.8482 1070.8182 1073.7582 1081.4759     5
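
To make the scaling easier to read, the median timings above can be turned into speedup and parallel efficiency figures. This is only arithmetic on the numbers already reported; the variable names are chosen here for illustration.

```r
# Median timings in seconds, copied from the benchmark output above
taylor_median <- c(serial = 844.1039, cores1 = 842.8739,
                   cores2 = 427.8327, cores4 = 223.0341)
cores <- c(1, 1, 2, 4)

# Speedup relative to parallel = FALSE, and speedup per core
speedup <- taylor_median[["serial"]] / taylor_median
efficiency <- speedup / cores
print(round(rbind(speedup, efficiency), 2))

# Short series: AirPassengers medians, serial vs. 4 cores
air_median <- c(serial = 1.022630, cores4 = 10.811451)
slowdown <- air_median[["cores4"]] / air_median[["serial"]]
print(round(slowdown, 2))
```

On taylor this works out to roughly 1.97x on 2 cores and 3.78x on 4 cores (about 95% efficiency), while AirPassengers with 4 cores is roughly ten times slower than the serial fit, presumably because worker start-up dominates.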

@robjhyndman
Owner

I would like to avoid adding additional package dependencies. What were the issues with using parallel?

@dashaub
Contributor Author

dashaub commented Aug 1, 2016

nnet() uses the x and y arguments for the data, and it seems that parLapply() also passes a different x argument somewhere down the call chain, which conflicts with the x forwarded through the ... arguments. There might be a way around this by setting up a wrapper function for avnnet().
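
A minimal sketch of the collision and of one possible wrapper, using only the parallel package. fit_one() is a hypothetical stand-in for nnet() (it just takes its data as x and y), and the two-worker cluster is an assumption; none of this is code from the PR.

```r
library(parallel)

# Hypothetical stand-in for one network fit; like nnet(), it takes `x` and `y`.
fit_one <- function(x, y, seed) {
  set.seed(seed)
  w <- rnorm(1)            # pretend random starting weight
  sum((y - w * x)^2)       # pretend fit criterion
}

x <- 1:10
y <- 2 * x

cl <- makeCluster(2)       # assumed: two local workers

# Forwarding x = ... through `...` clashes with an argument named `x`
# that parLapply() uses internally, so this call is expected to error.
collision <- tryCatch(
  parLapply(cl, 1:4, function(seed, x, y) fit_one(x, y, seed), x = x, y = y),
  error = function(e) conditionMessage(e)
)

# Workaround: bundle the data in a list under a name that cannot clash.
wrapper <- function(seed, dat, model_fun) model_fun(dat$x, dat$y, seed)
fits <- parLapply(cl, 1:4, wrapper,
                  dat = list(x = x, y = y), model_fun = fit_one)

stopCluster(cl)
```

Here collision should hold the error message rather than a result, while fits holds one value per repeat. The same precaution applies to any forwarded argument whose name matches one used inside parLapply(), such as fun.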

@robjhyndman
Owner

You could try using do.call().
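
One way to read this suggestion: keep every fitting argument, including x and y, in a single list and let the worker unpack it with do.call(), so nothing named x travels through parLapply()'s ... directly. The sketch below assumes that reading; fit_one(), worker() and fit_repeats() are hypothetical names, and fit_one() is a small optim()-based stand-in for nnet(), not the real fitting code.

```r
library(parallel)

# Hypothetical stand-in for a single fit from random starting values.
fit_one <- function(x, y, maxit = 200) {
  start <- runif(2, -1, 1)
  sse <- function(p) sum((y - p[1] - p[2] * x)^2)
  optim(start, sse, method = "BFGS", control = list(maxit = maxit))
}

# Top-level worker: unpacks the argument list on the worker via do.call().
worker <- function(i, model_fun, model_args) do.call(model_fun, model_args)

# Hypothetical helper: serial when no cluster is given, parallel otherwise.
fit_repeats <- function(model_args, repeats = 4, cl = NULL) {
  idx <- seq_len(repeats)
  if (is.null(cl)) {
    lapply(idx, worker, model_fun = fit_one, model_args = model_args)
  } else {
    parLapply(cl, idx, worker, model_fun = fit_one, model_args = model_args)
  }
}

x <- 1:20
args <- list(x = x, y = 3 + 2 * x, maxit = 100)

serial_fits <- fit_repeats(args)

cl <- makeCluster(2)       # assumed: two local workers
parallel_fits <- fit_repeats(args, cl = cl)
stopCluster(cl)
```

Because the serial and parallel paths share the same worker, this shape needs only the parallel package and keeps a parallel = FALSE style default cheap for short series.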
