Resolve performance regression in WLS for large sample sizes #376

Open
mhunter1 opened this issue Aug 11, 2023 · 4 comments
Assignees
Labels
enhancement regression test fails inconsistently

Comments

@mhunter1
Contributor

Brad Verhulst observed that a large-sample WLS model that used to take about 1 second now takes about 20 seconds.

@mhunter1 mhunter1 added enhancement regression test fails inconsistently labels Aug 11, 2023
@mhunter1 mhunter1 self-assigned this Aug 11, 2023
@mhunter1
Contributor Author

git bisect suggested these commits as the problem

59bc1905a1c6fd40f1f878e1efe6d68cc75913ac # Almost certainly not it.
b588e2403ef3799462b0c256c4c2f072d5b9cfcb # Tried.  NOT IT.
618f6c4e89fdcc9a103a193b3c84c48bdd3db449 # NOT IT.
6e6f21cff324a1b7b7931bbe8ef5baa3ebdba11f # Tried.  NOT IT.
fc4f518e9ab2373bd400df4055edd46180f85255 # Probably it.
fb6c8bedb875898d6550f979a7745196bd7a0b59 # Prototype version of fc4f518

@mhunter1
Contributor Author

Here's the minimal working example (MWE) that shows the problem. It is excerpted from the file slowWLS.R on my machine.

#------------------------------------------------------------------------------
# Author: Michael D. Hunter
# Date: 2023-06-01
# Filename: slowWLS.R
# Purpose: Create a minimal working example that shows that WLS
#  has slowed down a lot since December 10, 2020
#------------------------------------------------------------------------------


#------------------------------------------------------------------------------
# Set working directory, load packages, load data
setwd('~/../Downloads/')

library(OpenMx)  # library() stops with an error if the package is missing

# devtools::install_github('https://github.com/jpritikin/gwsem', ref='aa26dd0')
# Ran tests on aa26dd0 version of gwsem
# devtools::install_github('https://github.com/jpritikin/gwsem')

# Run in msys2
# pacman -Sy mingw-w64-x86_64-zstd
# pacman -Sy mingw-w64-i686-zstd
# pacman -Sy mingw-w64-x86_64-sqlite3
# pacman -Sy mingw-w64-i686-sqlite3
  

library(gwsem)  # library() stops with an error if the package is missing


#load('slowWLS.RData')
#DATA <- mood2$data$observed
#save(DATA, file='slowWLSData.RData')
load('slowWLSData.RData')


#------------------------------------------------------------------------------
# Specify model

mood <- buildItem(phenoData = DATA, depVar  = "mood", 
                  covariates = c("PC1", "PC2", "PC3", "PC4", "PC5", "PC6","PC7", "PC8", "PC9","PC10", "Age", "Sex"),
                  fitfun = "WLS")

r <- mxRun(mood)
summary(r)$wallTime
# 21.94 sec is slow

coef(r)

# Try to tweak precision of computing summary stats
mood2 <- mxModel(mood, mxData(DATA, type='raw', fitTolerance=1e-10, gradientTolerance=1e-2))
# default fit tol = sqrt(as.numeric(mxOption(key="Optimality tolerance")))
# = 2.50998e-06
r2 <- mxRun(mood2)
summary(r2)$wallTime
coef(r2)

# Time to just compute the summary statistics
a <- Sys.time()
augd <- omxAugmentDataWithWLSSummary(mood$data, exogenous=setdiff(names(DATA), 'mood'))
b <- Sys.time()
b-a
# 20+ seconds

@mhunter1
Contributor Author

I ran a simulation study that varied the sample size and the number of covariates separately. Both have a much larger effect on run time in the current version of OpenMx than in the older version. See the attached plots.

plotSlowWlsCovariateSampleSizeOld.pdf
plotSlowWlsCovariateSampleSizeNew.pdf
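
For anyone who wants to reproduce this kind of study, here is a rough sketch of a timing grid that varies one factor while holding the other fixed. It is not the script behind the plots. `make_data()`, `summary_step()`, and `time_one()` are hypothetical helpers, the grid values are arbitrary, and `summary_step()` is only a base-R stand-in workload; in the real study the timed call would be the WLS summary computation from the MWE (for example `omxAugmentDataWithWLSSummary()`).

```r
# Hypothetical timing harness (base R only; not the original study script).

make_data <- function(n, p) {
  # Simulate one outcome plus p continuous covariates
  x <- matrix(rnorm(n * p), nrow = n, ncol = p)
  colnames(x) <- paste0("PC", seq_len(p))
  data.frame(mood = rnorm(n), x)
}

summary_step <- function(dat) {
  # Stand-in workload: means and covariance matrix of all columns.
  # Swap in the real WLS summary call when OpenMx is available.
  list(means = colMeans(dat), cov = cov(dat))
}

time_one <- function(n, p, fun = summary_step) {
  # Elapsed wall-clock seconds for one (n, p) combination
  dat <- make_data(n, p)
  unname(system.time(fun(dat))["elapsed"])
}

set.seed(376)
base_n <- 2000  # sample size held fixed while covariates vary (arbitrary)
base_p <- 4     # covariate count held fixed while sample size varies (arbitrary)

# Vary sample size with the number of covariates held fixed
by_n <- data.frame(n = c(500, 1000, 2000, 4000), p = base_p)
by_n$seconds <- mapply(time_one, by_n$n, by_n$p)

# Vary the number of covariates with the sample size held fixed
by_p <- data.frame(n = base_n, p = c(2, 4, 8, 12))
by_p$seconds <- mapply(time_one, by_p$n, by_p$p)

print(by_n)
print(by_p)
```

Running the same grid under the old and new OpenMx builds and plotting `seconds` against `n` and against `p` would give a comparison of the kind shown in the two PDFs.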

@mhunter1
Contributor Author

The next step is to investigate the internal Newton-Raphson optimization routine. Check whether it is taking excessively small steps, possibly because the gradient needs to be scaled differently.
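
One way to check this would be to log the step length and gradient norm at every iteration. The sketch below does that for a generic, undamped Newton-Raphson loop on a toy objective. It is not OpenMx's internal routine. `newton_trace()`, `make_toy()`, and the objective f(x) = sum(exp(a * x) - a * x) are all made up for illustration, with `a` acting as a per-parameter scale factor.

```r
# Generic Newton-Raphson with a per-iteration trace (illustrative only).
newton_trace <- function(grad, hess, start, tol = 1e-8, max_iter = 100) {
  x <- start
  step_norm <- numeric(0)
  grad_norm <- numeric(0)
  for (i in seq_len(max_iter)) {
    g <- grad(x)
    gn <- sqrt(sum(g^2))
    if (gn < tol) break            # converged on gradient norm
    step <- -solve(hess(x), g)     # full Newton step, no damping
    x <- x + step
    step_norm <- c(step_norm, sqrt(sum(step^2)))
    grad_norm <- c(grad_norm, gn)
  }
  list(par = x,
       trace = data.frame(iter = seq_along(step_norm), step_norm, grad_norm))
}

# Toy objective f(x) = sum(exp(a * x) - a * x), minimized at x = 0.
make_toy <- function(a) {
  list(
    grad = function(x) a * (exp(a * x) - 1),
    hess = function(x) diag(a^2 * exp(a * x), nrow = length(x))
  )
}

well  <- make_toy(c(1, 1))    # both parameters on the same scale
badly <- make_toy(c(1, 10))   # second parameter on a 10x steeper scale

fit_well  <- newton_trace(well$grad,  well$hess,  start = c(1, 1))
fit_badly <- newton_trace(badly$grad, badly$hess, start = c(1, 1))

print(fit_well$trace)
print(fit_badly$trace)
```

In this toy case the badly scaled run should need more iterations from the same starting point. A similar trace from the real optimizer, showing a long run of short steps, would support the scaling hypothesis, though other causes would still need to be ruled out.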
