Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

logLik() vs. lavInspect(fit_fixed_x, "loglik.casewise") #226

Open
TDJorgensen opened this issue Jan 10, 2022 · 1 comment
Open

logLik() vs. lavInspect(fit_fixed_x, "loglik.casewise") #226

TDJorgensen opened this issue Jan 10, 2022 · 1 comment

Comments

@TDJorgensen
Copy link
Contributor

From the forum:

When missing = "fiml.x" and fixed.x = TRUE, the sum of casewise log-likelihoods returned by lavInspect(fit_fixed_x, "loglik.casewise") does not match the model's log-likelihood returned by logLik().

mod <- ' x1 ~ x4 + x5
x7 ~ x1  '
dat <- HolzingerSwineford1939
dat[1:10, "x1"] <- NA
dat[11:20, "x5"] <- NA
dat[21:30, "x7"] <- NA

fit_fixed_x <- sem(mod, data = dat, missing = "fiml.x", fixed.x = TRUE)
loglik_i <- lavInspect(fit_fixed_x, "loglik.casewise")
sum(loglik_i)
#> [1] -871.7387

logLik(fit_fixed_x)
#> 'log Lik.' -858.8469 (df=7)

The remainder of the thread notes that logLik() calls lav_mvnorm_h1_loglik_samplestats() within lav_mvnorm_missing_loglik_samplestats(), whereas lavInspect(fit_fixed_x, "loglik.casewise") calls lav_mvnorm_missing_llik_casewise().

@ecmerkle
Copy link
Contributor

Thanks for filing the issue, and here are a few more thoughts:

I looked at how the model log-likelihood (from logLik()) is being computed for the model above. It uses a covariance matrix of x variables (x.cov) to subtract off a fixed.x log-likelihood that is the same across all cases. But it seems like, if some x variables are missing, then the fixed.x log-likelihood that we subtract off should differ for each case, depending on which x variables are observed. I modified lav_mvnorm_missing_loglik_samplestats() to still use x.cov, but to only use the variables that were observed for each case (see diff below). Then the value of logLik() is much closer to, but not the same as, the sum of casewise log-likelihoods.

I suspect that the sum of casewise log-likelihoods is closer to "correct", because I am not sure that x.cov is a sufficient statistic here. But I think it would require more extensive edits to get logLik() to work that way. I am not sure about this, though, and generally find fixed.x = TRUE to be one of my least favorite model settings!

@@ -137,11 +137,30 @@ lav_mvnorm_missing_loglik_samplestats <- function(Yp          = NULL,
         stopifnot(!is.null(x.cov))
         # Note: x.cov should be identical to Sigma[x.idx, x.idx]
         #       so we don't really need x.cov
-        N <- sum(sapply(Yp, "[[", "freq"))
-        loglik.x <- lav_mvnorm_h1_loglik_samplestats(sample.cov  = x.cov,
-                                                     sample.nobs = N)
+        all.x.idx.obs <- all(sapply(Yp, "[[", "var.idx")[, x.idx])
 
-        loglik <- loglik - loglik.x
+        if(all.x.idx.obs) {
+            N <- sum(sapply(Yp, "[[", "freq"))
+            loglik.x <- lav_mvnorm_h1_loglik_samplestats(sample.cov  = x.cov,
+                                                         sample.nobs = N)
+
+            loglik <- loglik - loglik.x
+        } else {
+            for(p in seq_len(pat.N)) {
+                # observed variables for this pattern
+                obs.idx <- which(Yp[[p]]$var.idx)
+
+                # observed x variables for this pattern
+                x.obs.idx <- match(obs.idx, x.idx, nomatch = 0)
+              
+                if(any(x.obs.idx > 0)){
+                    loglik.x <- lav_mvnorm_h1_loglik_samplestats(sample.cov = x.cov[x.obs.idx, x.obs.idx, drop = FALSE],
+                                                                 sample.nobs = Yp[[p]]$freq)
+                    loglik <- loglik - loglik.x
+                }
+            }
+        }   
+              
     }
 
     loglik

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants