`logLik()` vs. `lavInspect(fit_fixed_x, "loglik.casewise")` #226

TDJorgensen · 2022-01-10T22:04:09Z

When missing = "fiml.x" and fixed.x = TRUE, the sum of casewise log-likelihoods returned by lavInspect(fit_fixed_x, "loglik.casewise") does not match the model's log-likelihood returned by logLik().

mod <- ' x1 ~ x4 + x5
x7 ~ x1  '
dat <- HolzingerSwineford1939
dat[1:10, "x1"] <- NA
dat[11:20, "x5"] <- NA
dat[21:30, "x7"] <- NA

fit_fixed_x <- sem(mod, data = dat, missing = "fiml.x", fixed.x = TRUE)
loglik_i <- lavInspect(fit_fixed_x, "loglik.casewise")
sum(loglik_i)
#> [1] -871.7387

logLik(fit_fixed_x)
#> 'log Lik.' -858.8469 (df=7)

The remainder of the thread notes that logLik() calls lav_mvnorm_h1_loglik_samplestats() within lav_mvnorm_missing_loglik_samplestats(), whereas lavInspect(fit_fixed_x, "loglik.casewise") calls lav_mvnorm_missing_llik_casewise().

The text was updated successfully, but these errors were encountered:

ecmerkle · 2022-01-16T17:50:53Z

Thanks for filing the issue, and here are a few more thoughts:

I looked at how the model log-likelihood (from logLik()) is being computed for the model above. It uses a covariance matrix of x variables (x.cov) to subtract off a fixed.x log-likelihood that is the same across all cases. But it seems like, if some x variables are missing, then the fixed.x log-likelihood that we subtract off should differ for each case, depending on which x variables are observed. I modified lav_mvnorm_missing_loglik_samplestats() to still use x.cov, but to only use the variables that were observed for each case (see diff below). Then the value of logLik() is much closer to, but not the same as, the sum of casewise log-likelihoods.

I suspect that the sum of casewise log-likelihoods is closer to "correct", because I am not sure that x.cov is a sufficient statistic here. But I think it would require more extensive edits to get logLik() to work that way. I am not sure about this, though, and generally find fixed.x = TRUE to be one of my least favorite model settings!

@@ -137,11 +137,30 @@ lav_mvnorm_missing_loglik_samplestats <- function(Yp          = NULL,
         stopifnot(!is.null(x.cov))
         # Note: x.cov should be identical to Sigma[x.idx, x.idx]
         #       so we don't really need x.cov
-        N <- sum(sapply(Yp, "[[", "freq"))
-        loglik.x <- lav_mvnorm_h1_loglik_samplestats(sample.cov  = x.cov,
-                                                     sample.nobs = N)
+        all.x.idx.obs <- all(sapply(Yp, "[[", "var.idx")[, x.idx])
 
-        loglik <- loglik - loglik.x
+        if(all.x.idx.obs) {
+            N <- sum(sapply(Yp, "[[", "freq"))
+            loglik.x <- lav_mvnorm_h1_loglik_samplestats(sample.cov  = x.cov,
+                                                         sample.nobs = N)
+
+            loglik <- loglik - loglik.x
+        } else {
+            for(p in seq_len(pat.N)) {
+                # observed variables for this pattern
+                obs.idx <- which(Yp[[p]]$var.idx)
+
+                # observed x variables for this pattern
+                x.obs.idx <- match(obs.idx, x.idx, nomatch = 0)
+              
+                if(any(x.obs.idx > 0)){
+                    loglik.x <- lav_mvnorm_h1_loglik_samplestats(sample.cov = x.cov[x.obs.idx, x.obs.idx, drop = FALSE],
+                                                                 sample.nobs = Yp[[p]]$freq)
+                    loglik <- loglik - loglik.x
+                }
+            }
+        }   
+              
     }
 
     loglik

yrosseel added the possible_bug label Jun 20, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`logLik()` vs. `lavInspect(fit_fixed_x, "loglik.casewise")` #226

`logLik()` vs. `lavInspect(fit_fixed_x, "loglik.casewise")` #226

TDJorgensen commented Jan 10, 2022

ecmerkle commented Jan 16, 2022

logLik() vs. lavInspect(fit_fixed_x, "loglik.casewise") #226

logLik() vs. lavInspect(fit_fixed_x, "loglik.casewise") #226

Comments

TDJorgensen commented Jan 10, 2022

ecmerkle commented Jan 16, 2022

`logLik()` vs. `lavInspect(fit_fixed_x, "loglik.casewise")` #226

`logLik()` vs. `lavInspect(fit_fixed_x, "loglik.casewise")` #226