Cleanup multiple.md #13

Open
wants to merge 1 commit into
base: main

@dmbates (Contributor) commented Sep 29, 2021

  • one line per sentence
  • do not capitalize section or subsection titles
  • add a few of the plots described in the text

@dmbates dmbates requested a review from palday September 29, 2021 15:50
Comment on lines +214 to +239
Some presentations of mixed-effects models, especially those related to *multilevel modeling* [@MLwiNUser:2000] or *hierarchical linear models* [@Rauden:Bryk:2002], leave the impression that one can only define random effects with respect to factors that are nested.
This is the origin of the terms "multilevel", referring to multiple, nested levels of variability, and "hierarchical", also invoking the concept of a hierarchy of levels.
To be fair, both those references do describe the use of models with random effects associated with non-nested factors, but such models tend to be treated as a special case.

The blurring of mixed-effects models with the concept of multiple, hierarchical levels of variation results in an unwarranted emphasis on "levels" when defining a model and leads to considerable confusion.
It is perfectly legitimate to define models having random effects associated with non-nested factors.
Random effects tend to be defined with respect to nested factors only for two reasons: such cases occur frequently in practice, and some of the computational methods for estimating the parameters in the models can only be easily applied to nested factors.

This is not the case for the methods used in the MixedModels package.
Indeed, nothing special is done for models with random effects for nested factors.
When random effects are associated with multiple factors, exactly the same computational methods are used whether the factors form a nested sequence, are partially crossed, or are completely crossed.

There is, however, one aspect of nested grouping factors that we should emphasize, which is the possibility of a factor that is *implicitly nested* within another factor.
Suppose, for example, that the sample factor was defined as having three levels instead of 30, with the implicit assumption that sample is nested within batch.
It may seem silly to try to distinguish 30 different samples with only three levels of a factor but, unfortunately, data are frequently organized and presented like this, especially in textbooks.
The factor cask in the data is exactly such an implicitly nested factor.
If we cross-tabulate cask and batch we get the impression that the cask and batch factors are crossed, not nested.
If we know that the cask should be considered as nested within the batch then we should create a new categorical variable giving the batch-cask combination, which is exactly what the sample factor is.
A simple way to create such a factor is to use the interaction operator, '`&`', on the factors.
It is advisable, but not necessary, to drop unused levels of the interaction from the set of all possible levels of the resulting factor.
(An "unused level" is a combination that does not occur in the data.)
A convenient code idiom is
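
The cross-tabulation and the combined factor can be illustrated with a minimal sketch in plain Julia.
This is not the idiom from the text: the toy `batch` and `cask` vectors and the `crosstab` helper are hypothetical, and string concatenation stands in for the interaction operator.

```julia
# Hypothetical toy data: two batches, each with casks labelled "a", "b", "c".
batch = ["A", "A", "A", "B", "B", "B"]
cask  = ["a", "b", "c", "a", "b", "c"]

# Count the observations for each (batch, cask) pair.
function crosstab(f, g)
    counts = Dict{Tuple{eltype(f),eltype(g)},Int}()
    for pair in zip(f, g)
        counts[pair] = get(counts, pair, 0) + 1
    end
    return counts
end

tab = crosstab(batch, cask)

# Every cask label occurs in every batch, so the two factors look crossed.
# An explicit batch-cask factor makes the nesting visible.
# Only observed combinations are created, so there are no unused levels to drop.
sample = string.(batch, ":", cask)
nlevels = length(unique(sample))
```

Here `cask` alone has three distinct labels, while the combined factor has one level per batch-cask pair.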

In a small data set like this one, we can quickly detect a factor being implicitly nested within another factor and take appropriate action.
In a large data set, perhaps hundreds of thousands of test scores for students in thousands of schools from hundreds of school districts, it is not always obvious if school identifiers are unique across the entire data set or just within a district.
If you are not sure, the safest thing to do is to create the interaction factor, as shown above, so you can be confident that levels of the district:school interaction do indeed correspond to unique schools.
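
A quick check for whether school identifiers are reused across districts might look like the following sketch in plain Julia.
The `district` and `school` vectors and the `districts_per_school` helper are hypothetical, chosen only to illustrate the idea.

```julia
# Hypothetical records: one (district, school) pair per test score.
district = ["D1", "D1", "D2", "D2", "D2"]
school   = ["S1", "S2", "S1", "S3", "S3"]

# Collect the set of districts in which each school identifier occurs.
function districts_per_school(district, school)
    seen = Dict{eltype(school),Set{eltype(district)}}()
    for (d, s) in zip(district, school)
        push!(get!(seen, s, Set{eltype(district)}()), d)
    end
    return seen
end

seen = districts_per_school(district, school)

# Identifiers found in more than one district are unique only within a district.
ambiguous = sort([s for (s, ds) in seen if length(ds) > 1])

# The safe choice: an explicit district-school factor.
district_school = string.(district, ":", school)
```

If `ambiguous` is empty the school identifiers are already unique across the data set; otherwise the combined factor is needed to distinguish the schools.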
Member

❤️

@palday (Member) left a comment

LGTM: feel free to squash and merge when ready

2 participants