Dim reduction on big dataset #13

Open
MislavSag opened this issue Jul 23, 2020 · 6 comments
Comments

@MislavSag

Great package.

Is the package suitable for very large datasets, for example datasets of dimension 1,000,000 x 300?

I have just tried this code:

mod1 <- constructModel(data_sample, p = 4, "Basic", gran = c(150, 10), RVAR = FALSE, h = 1, cv = "Rolling", MN = FALSE, verbose = FALSE, IC = TRUE)
results <- cv.BigVAR(mod1)

and it is pretty slow even with just a 1000 x 100 X matrix (about 10 minutes).

My goal is to do dimension reduction, but I am not sure whether your package is appropriate for this.

@wbnicholson
Owner

wbnicholson commented Jul 27, 2020

Time series with those dimensions (large T, small k) should be feasible in this framework, but rolling validation for penalty parameter selection is not advisable, since it will be very computationally intensive. I would instead suggest something like n-fold cross-validation, as described in section 3.2 of http://www.wbnicholson.com/BigVAR.html.
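As a rough illustration of the n-fold idea, here is a minimal base-R sketch of splitting the time index into contiguous folds. It is not the NFoldcv function from the tutorial; the sizes `n_obs` and `n_folds` and the `folds` structure are assumptions made up for this example.

```r
# Hypothetical sizes, chosen only for illustration
n_obs   <- 1000   # number of time points (T)
n_folds <- 5      # number of folds

# Assign each time index to one of n_folds contiguous blocks
fold_id <- cut(seq_len(n_obs), breaks = n_folds, labels = FALSE)

# For each fold: the held-out indices and the remaining training indices
folds <- lapply(seq_len(n_folds), function(f) {
  list(test = which(fold_id == f), train = which(fold_id != f))
})

# Size of each held-out block
fold_sizes <- sapply(folds, function(f) length(f$test))
print(fold_sizes)
```

With this kind of split, a model is refit once per fold for each candidate penalty value, rather than once per validation time point.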

One way to potentially improve performance is to ensure that BLAS and OpenMP are single-threaded. You can do so by adding the following code to your .Rprofile:

```
library(RhpcBLASctl)

blas_set_num_threads(1)
omp_set_num_threads(1)
```
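A guarded variant may be safer in an .Rprofile, since an unconditional `library()` call stops startup on machines where the package is missing. This is only a sketch: the variable name `threads_limited` is made up for the example, and the two getter calls are there just to inspect the settings, whose reported values depend on the BLAS build in use.

```r
# Only touch thread settings if RhpcBLASctl is installed
threads_limited <- requireNamespace("RhpcBLASctl", quietly = TRUE)

if (threads_limited) {
  RhpcBLASctl::blas_set_num_threads(1)
  RhpcBLASctl::omp_set_num_threads(1)

  # Inspect what the BLAS and OpenMP runtimes report
  print(RhpcBLASctl::blas_get_num_procs())
  print(RhpcBLASctl::omp_get_max_threads())
} else {
  message("RhpcBLASctl not installed; thread settings left unchanged")
}
```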

@MislavSag
Author

I have returned to your answer after some time :)

I have just tried to implement CV from this tutorial: http://www.wbnicholson.com/BigVAR.html#n-fold-cross-validation
CV part is in 3.2.

When I execute the NFoldcv function, it returns an error:
Error in 2:nrow(Z1) : argument of length 0
The problem is that Z1 is a list of two elements, Y and Z. So in the line trainZ <- Z1[2:nrow(Z1),], should Z1 be replaced with Z1$Z or Z1$Y?

@wbnicholson
Owner

Yes, it should be Z1$Z, I will make the correction.
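For anyone hitting the same error, the following base-R sketch reproduces it on a toy object. The `Z1` here is a made-up stand-in (a list with matrices `Y` and `Z` of arbitrary sizes), not the object built in the tutorial.

```r
# Hypothetical stand-in for Z1: a list with two matrices, Y and Z
Z1 <- list(
  Y = matrix(rnorm(20), nrow = 10, ncol = 2),
  Z = matrix(rnorm(50), nrow = 5, ncol = 10)
)

# nrow() of a list is NULL, so 2:nrow(Z1) has nothing to count up to
print(nrow(Z1))

# Capture the error message instead of stopping the script
err <- tryCatch(Z1[2:nrow(Z1), ], error = function(e) conditionMessage(e))
print(err)

# Indexing the Z element works: keep rows 2 through the last row of Z1$Z
trainZ <- Z1$Z[2:nrow(Z1$Z), ]
print(dim(trainZ))
```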

@MislavSag
Author

Thanks. I think this can be closed now.

@MislavSag
Author

It seems this is not solved?

@wbnicholson
Owner

This has been fixed now.
