Huge performance degradation for WRMF #72

Open
david-cortes opened this issue Jan 30, 2022 · 1 comment

david-cortes commented Jan 30, 2022

In version 0.5.0 from CRAN (installed with a modified Makevars.in to force OMP linkage), there is a huge slowdown in WRMF with implicit feedback compared to earlier versions.

For example, I ran it on the LastFM-360K dataset with the configuration below, for 15 iterations with no early stopping:

WRMF$new(feedback="implicit", rank=50, lambda=5,
         solver="conjugate_gradient",
         with_global_bias=FALSE, with_user_item_bias=FALSE)

Comparing different libraries with these same settings, I get the following times:

  • rsparse: 39.18s
  • cmfrec: 29.52s
  • implicit: 29.0s

In earlier versions, the rsparse time was somewhere between those of implicit and cmfrec. The Cholesky solver is also affected by this slowdown.
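For anyone who wants to reproduce this kind of timing without downloading LastFM-360K, a minimal sketch could look like the following. It is an illustration, not the exact benchmark script: the random sparse matrix and its dimensions are assumptions standing in for the real data, and `convergence_tol = -1` is assumed to be enough to disable early stopping in `fit_transform`.

```r
library(Matrix)
library(rsparse)

set.seed(1)

# Hypothetical stand-in for the LastFM-360K user-item matrix:
# 2000 "users" x 500 "items" with about 1% positive counts.
X <- rsparsematrix(2000, 500, density = 0.01,
                   rand.x = function(n) rpois(n, 3) + 1)

# Same model configuration as reported above.
model <- WRMF$new(feedback = "implicit", rank = 50, lambda = 5,
                  solver = "conjugate_gradient",
                  with_global_bias = FALSE, with_user_item_bias = FALSE)

# 15 iterations; the negative tolerance is assumed to prevent early stopping.
timing <- system.time(
  user_emb <- model$fit_transform(X, n_iter = 15L, convergence_tol = -1)
)

print(timing[["elapsed"]])
```

Running the same script against different package versions or build flags and comparing the elapsed times gives a rough way to check whether the slowdown reproduces, although absolute numbers on synthetic data will not match the ones above.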

I haven't been able to pinpoint what is causing the slowdown. I tried adding extra armadillo defines like -DARMA_DONT_USE_WRAPPER, -DARMA_USE_BLAS, -DARMA_USE_LAPACK, -DARMA_USE_OPENMP, but it didn't make a difference.


david-cortes commented Jan 30, 2022

Actually it's not related to the version. Tried downgrading to 0.4.0 and got the same timings. Perhaps something to do with newer armadillo versions?

EDIT: it actually isn't. Tried with versions of RcppArmadillo and OpenBLAS from 2020 and still experienced the problem. Perhaps something to do with newer GCC versions? This is BTW on an AMD Ryzen 7 2700 (3.2 GHz, 8c/16t), GCC 11.2.0 (flags -O3 -march=native -fno-math-errno -fno-trapping-math and using link-time optimization), and OpenBLAS 0.3.19 (OpenMP variant).

dselivanov added a commit that referenced this issue Apr 18, 2023
…ions (#75)

* re-arrange omp parallel region to make more efficient memory allocations. Related to #72

* optimize R code, avoid double work in transform

* ignore bench files

* update github actions

* fix accidentally introduced segfault

* run CI only for master

* - update readme
- update NEWS

* simplify r cmd check options