Output from augment.merMod is affected by missing values in data or newdata variables that are not used in the model. #123

Open
JohannesNE opened this issue Mar 3, 2022 · 1 comment

@JohannesNE

Calling augment.merMod with data or newdata that contains missing values sometimes gives a warning, even when the missing values are in variables the model does not use. The problem seems to be related to how .mu, .offset and the remaining columns to their right are calculated.

library(lme4)
#> Loading required package: Matrix
library(broom.mixed)

d <- sleepstudy 

d$a <- 1
d$a[10] <- NA

fm1 <- lmer(Reaction ~ Days + (Days | Subject), d)

augment(fm1)
#> # A tibble: 180 × 14
#>    Reaction  Days Subject .fitted  .resid   .hat .cooksd .fixed   .mu .offset
#>       <dbl> <dbl> <fct>     <dbl>   <dbl>  <dbl>   <dbl>  <dbl> <dbl>   <dbl>
#>  1     250.     0 308        254.   -4.10 0.229  0.00496   251.  254.       0
#>  2     259.     1 308        273.  -14.6  0.170  0.0402    262.  273.       0
#>  3     251.     2 308        293.  -42.2  0.127  0.226     272.  293.       0
#>  4     321.     3 308        313.    8.78 0.101  0.00731   283.  313.       0
#>  5     357.     4 308        332.   24.5  0.0910 0.0506    293.  332.       0
#>  6     415.     5 308        352.   62.7  0.0981 0.362     304.  352.       0
#>  7     382.     6 308        372.   10.5  0.122  0.0134    314.  372.       0
#>  8     290.     7 308        391. -101.   0.162  1.81      325.  391.       0
#>  9     431.     8 308        411.   19.6  0.219  0.106     335.  411.       0
#> 10     466.     9 308        431.   35.7  0.293  0.571     346.  431.       0
#> # … with 170 more rows, and 4 more variables: .sqrtXwt <dbl>, .sqrtrwt <dbl>,
#> #   .weights <dbl>, .wtres <dbl>

# .mu is NA when `a` is NA

augment(fm1, data = d)
#> Warning in indices[which(stats::complete.cases(original))] <- seq_len(nrow(x)):
#> number of items to replace is not a multiple of replacement length
#> # A tibble: 180 × 15
#>    Reaction  Days Subject     a .fitted  .resid   .hat .cooksd .fixed   .mu
#>       <dbl> <dbl> <fct>   <dbl>   <dbl>   <dbl>  <dbl>   <dbl>  <dbl> <dbl>
#>  1     250.     0 308         1    254.   -4.10 0.229  0.00496   251.  254.
#>  2     259.     1 308         1    273.  -14.6  0.170  0.0402    262.  273.
#>  3     251.     2 308         1    293.  -42.2  0.127  0.226     272.  293.
#>  4     321.     3 308         1    313.    8.78 0.101  0.00731   283.  313.
#>  5     357.     4 308         1    332.   24.5  0.0910 0.0506    293.  332.
#>  6     415.     5 308         1    352.   62.7  0.0981 0.362     304.  352.
#>  7     382.     6 308         1    372.   10.5  0.122  0.0134    314.  372.
#>  8     290.     7 308         1    391. -101.   0.162  1.81      325.  391.
#>  9     431.     8 308         1    411.   19.6  0.219  0.106     335.  411.
#> 10     466.     9 308        NA    431.   35.7  0.293  0.571     346.   NA  <- .mu and all columns to the right are now NA
#> # … with 170 more rows, and 5 more variables: .offset <dbl>, .sqrtXwt <dbl>,
#> #   .sqrtrwt <dbl>, .weights <dbl>, .wtres <dbl>

d$a <- NA

# We do not get the warning, but .mu is now all NA
augment(fm1, data = d)
#> # A tibble: 180 × 15
#>    Reaction  Days Subject a     .fitted  .resid   .hat .cooksd .fixed   .mu
#>       <dbl> <dbl> <fct>   <lgl>   <dbl>   <dbl>  <dbl>   <dbl>  <dbl> <dbl>
#>  1     250.     0 308     NA       254.   -4.10 0.229  0.00496   251.    NA
#>  2     259.     1 308     NA       273.  -14.6  0.170  0.0402    262.    NA
#>  3     251.     2 308     NA       293.  -42.2  0.127  0.226     272.    NA
#>  4     321.     3 308     NA       313.    8.78 0.101  0.00731   283.    NA
#>  5     357.     4 308     NA       332.   24.5  0.0910 0.0506    293.    NA
#>  6     415.     5 308     NA       352.   62.7  0.0981 0.362     304.    NA
#>  7     382.     6 308     NA       372.   10.5  0.122  0.0134    314.    NA
#>  8     290.     7 308     NA       391. -101.   0.162  1.81      325.    NA
#>  9     431.     8 308     NA       411.   19.6  0.219  0.106     335.    NA
#> 10     466.     9 308     NA       431.   35.7  0.293  0.571     346.    NA
#> # … with 170 more rows, and 5 more variables: .offset <dbl>, .sqrtXwt <dbl>,
#> #   .sqrtrwt <dbl>, .weights <dbl>, .wtres <dbl>

Created on 2022-03-03 by the reprex package (v2.0.0)
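
The warning text points at a single assignment, `indices[which(stats::complete.cases(original))] <- seq_len(nrow(x))`. Below is a minimal base-R sketch of what that line does, assuming `original` is the data passed to augment() and `x` holds the per-observation columns taken from the model. The toy data frames and values are hypothetical; only the assignment itself is copied from the warning.

```r
# Hypothetical stand-ins: `original` plays the role of the data passed to
# augment(), `x` the columns taken from the fitted model (one row per
# observation used in the fit).
original <- data.frame(y = 1:5, a = c(1, 1, NA, 1, 1))
x <- data.frame(.mu = c(10, 20, 30, 40, 50))

indices <- rep(NA_integer_, nrow(original))

# Same assignment as in the warning. complete.cases() looks at every column,
# so the NA in the unused column `a` removes row 3 from the target positions.
# Four positions then receive five values, which triggers the warning.
indices[which(stats::complete.cases(original))] <- seq_len(nrow(x))
indices
#> [1]  1  2 NA  3  4

# Row 3 becomes NA, and in this toy example the later rows are shifted by one.
aligned <- x[indices, , drop = FALSE]
aligned$.mu
#> [1] 10 20 NA 30 40

# When `a` is NA everywhere, no row is complete. Nothing is assigned, no
# warning is raised, and every index stays NA.
original_all_na <- data.frame(y = 1:5, a = NA)
indices_all_na <- rep(NA_integer_, nrow(original_all_na))
indices_all_na[which(stats::complete.cases(original_all_na))] <- seq_len(nrow(x))
indices_all_na
#> [1] NA NA NA NA NA
```

If this reading is right, it would account for both the warning with a single NA and the silent all-NA result with `d$a <- NA`.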

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.1.0 (2021-05-18)
#>  os       Pop!_OS 21.10               
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language en_US:en                    
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       Europe/Copenhagen           
#>  date     2022-03-03                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source                              
#>  assertthat    0.2.1      2019-03-21 [3] CRAN (R 4.1.0)                      
#>  backports     1.4.1      2021-12-13 [1] CRAN (R 4.1.0)                      
#>  boot          1.3-28     2021-05-03 [4] CRAN (R 4.1.0)                      
#>  broom         0.7.12     2022-01-28 [1] CRAN (R 4.1.0)                      
#>  broom.mixed * 0.2.9.2    2022-03-03 [1] Github (bbolker/broom.mixed@8423ebd)
#>  cli           3.2.0      2022-02-14 [1] CRAN (R 4.1.0)                      
#>  codetools     0.2-18     2020-11-04 [4] CRAN (R 4.1.0)                      
#>  crayon        1.5.0      2022-02-14 [1] CRAN (R 4.1.0)                      
#>  DBI           1.1.1      2021-01-15 [3] CRAN (R 4.1.0)                      
#>  digest        0.6.29     2021-12-01 [1] CRAN (R 4.1.0)                      
#>  dplyr         1.0.8      2022-02-08 [1] CRAN (R 4.1.0)                      
#>  ellipsis      0.3.2      2021-04-29 [1] CRAN (R 4.1.0)                      
#>  evaluate      0.14       2019-05-28 [3] CRAN (R 4.1.0)                      
#>  fansi         1.0.2      2022-01-14 [1] CRAN (R 4.1.0)                      
#>  fastmap       1.1.0      2021-01-25 [3] CRAN (R 4.1.0)                      
#>  forcats       0.5.1      2021-01-27 [1] CRAN (R 4.1.0)                      
#>  fs            1.5.2      2021-12-08 [1] CRAN (R 4.1.0)                      
#>  furrr         0.2.3      2021-06-25 [1] CRAN (R 4.1.0)                      
#>  future        1.24.0     2022-02-19 [1] CRAN (R 4.1.0)                      
#>  generics      0.1.2      2022-01-31 [1] CRAN (R 4.1.0)                      
#>  globals       0.14.0     2020-11-22 [1] CRAN (R 4.1.0)                      
#>  glue          1.6.2      2022-02-24 [1] CRAN (R 4.1.0)                      
#>  highr         0.9        2021-04-16 [3] CRAN (R 4.1.0)                      
#>  htmltools     0.5.2      2021-08-25 [1] CRAN (R 4.1.0)                      
#>  knitr         1.37       2021-12-16 [1] CRAN (R 4.1.0)                      
#>  lattice       0.20-44    2021-05-02 [4] CRAN (R 4.1.0)                      
#>  lifecycle     1.0.1      2021-09-24 [1] CRAN (R 4.1.0)                      
#>  listenv       0.8.0      2019-12-05 [3] CRAN (R 4.1.0)                      
#>  lme4        * 1.1-27.1   2021-06-22 [1] CRAN (R 4.1.0)                      
#>  magrittr      2.0.2      2022-01-26 [1] CRAN (R 4.1.0)                      
#>  MASS          7.3-54     2021-05-03 [4] CRAN (R 4.1.0)                      
#>  Matrix      * 1.3-4      2021-06-01 [4] CRAN (R 4.1.0)                      
#>  minqa         1.2.4      2014-10-09 [1] CRAN (R 4.1.0)                      
#>  nlme          3.1-152    2021-02-04 [4] CRAN (R 4.1.0)                      
#>  nloptr        1.2.2.2    2020-07-02 [1] CRAN (R 4.1.0)                      
#>  parallelly    1.30.0     2021-12-17 [1] CRAN (R 4.1.0)                      
#>  pillar        1.7.0      2022-02-01 [1] CRAN (R 4.1.0)                      
#>  pkgconfig     2.0.3      2019-09-22 [3] CRAN (R 4.1.0)                      
#>  purrr         0.3.4      2020-04-17 [3] CRAN (R 4.1.0)                      
#>  R6            2.5.1      2021-08-19 [1] CRAN (R 4.1.0)                      
#>  Rcpp          1.0.8      2022-01-13 [1] CRAN (R 4.1.0)                      
#>  reprex        2.0.0      2021-04-02 [1] CRAN (R 4.1.0)                      
#>  rlang         1.0.1      2022-02-03 [1] CRAN (R 4.1.0)                      
#>  rmarkdown     2.11.3     2021-11-04 [1] Github (rstudio/rmarkdown@ebf0d09)  
#>  rstudioapi    0.13       2020-11-12 [1] CRAN (R 4.1.0)                      
#>  sessioninfo   1.1.1      2018-11-05 [3] CRAN (R 4.1.0)                      
#>  stringi       1.7.6      2021-11-29 [1] CRAN (R 4.1.0)                      
#>  stringr       1.4.0      2019-02-10 [3] CRAN (R 4.1.0)                      
#>  styler        1.5.1.9000 2021-07-15 [1] local                               
#>  tibble        3.1.6      2021-11-07 [1] CRAN (R 4.1.0)                      
#>  tidyr         1.2.0      2022-02-01 [1] CRAN (R 4.1.0)                      
#>  tidyselect    1.1.2      2022-02-21 [1] CRAN (R 4.1.0)                      
#>  utf8          1.2.2      2021-07-24 [1] CRAN (R 4.1.0)                      
#>  vctrs         0.3.8      2021-04-29 [1] CRAN (R 4.1.0)                      
#>  withr         2.4.3      2021-11-30 [1] CRAN (R 4.1.0)                      
#>  xfun          0.29       2021-12-14 [1] CRAN (R 4.1.0)                      
#>  yaml          2.2.1      2020-02-01 [3] CRAN (R 4.1.0)                      
#> 
#> [1] /home/johannes/R/x86_64-pc-linux-gnu-library/4.1
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library
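
A possible workaround, sketched under the assumption that the row matching only looks at the columns of the data frame passed in: hand augment() just the variables that appear in the model formula, then add the unused columns back afterwards. The names `model_vars`, `d_model` and `aug` are illustrative and not part of broom.mixed.

```r
library(lme4)
library(broom.mixed)

d <- sleepstudy
d$a <- 1
d$a[10] <- NA

fm1 <- lmer(Reaction ~ Days + (Days | Subject), d)

# Names of the variables that appear in the model formula
model_vars <- all.vars(formula(fm1))

# Keep only those columns, so the NA in `a` is never seen by augment()
d_model <- d[, model_vars, drop = FALSE]
aug <- augment(fm1, data = d_model)

# Unused columns can be added back afterwards; the row order is unchanged
aug$a <- d$a

# Check whether .mu is still affected
anyNA(aug$.mu)
```

This only sidesteps the issue on the caller's side and does not change how augment.merMod handles unused variables.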
@JohannesNE

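Related: should these values (.mu and the columns to its right), which are drawn directly from the model object, be included at all when newdata is used? In the example below, .fitted follows the reversed newdata while .mu stays in the original row order.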

library(lme4)
#> Loading required package: Matrix
library(broom.mixed)

d <- sleepstudy 

fm1 <- lmer(Reaction ~ Days + (Days | Subject), d)

d_rev <- d[nrow(d):1,]

d_rev_aug <- augment(fm1, newdata = d_rev)

d_rev_aug[, c(".fitted", ".mu")]
#> # A tibble: 180 × 2
#>    .fitted   .mu
#>      <dbl> <dbl>
#>  1    369.  254.
#>  2    358.  273.
#>  3    346.  293.
#>  4    334.  313.
#>  5    322.  332.
#>  6    311.  352.
#>  7    299.  372.
#>  8    287.  391.
#>  9    275.  411.
#> 10    264.  431.
#> # … with 170 more rows

Created on 2022-03-03 by the reprex package (v2.0.0)
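
Until that is decided, one option is to drop the model-derived columns by hand after calling augment() with newdata. The sketch below assumes the affected columns are the ones listed in the output of the first example (.mu, .offset, .sqrtXwt, .sqrtrwt, .weights, .wtres); `model_cols` and `d_rev_clean` are illustrative names.

```r
library(lme4)
library(broom.mixed)

d <- sleepstudy
fm1 <- lmer(Reaction ~ Days + (Days | Subject), d)

d_rev <- d[nrow(d):1, ]
d_rev_aug <- augment(fm1, newdata = d_rev)

# If .mu were aligned with newdata it would match .fitted row by row.
# This compares .fitted with .mu in reversed order instead.
all.equal(as.numeric(d_rev_aug$.fitted), rev(as.numeric(d_rev_aug$.mu)))

# Columns that appear to be copied from the model object rather than
# computed from newdata
model_cols <- c(".mu", ".offset", ".sqrtXwt", ".sqrtrwt", ".weights", ".wtres")

# Keep everything else
d_rev_clean <- d_rev_aug[, setdiff(names(d_rev_aug), model_cols)]
names(d_rev_clean)
```

Using setdiff() on the names means the code still runs if a future version stops returning some of these columns.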
