Background
A year or two ago, I was running some large-scale power simulations of reaction times on a remote compute cluster. I was sampling and creating thousands upon thousands of artificial experiments, adding artificial effects to these datasets and testing the convergence rate, singular fit rate, power, and Type I error rate of various lmer models in capturing these effects.
In one study, I was measuring a full factorial interaction between an added (and artificial) binary factor and a naturally occurring binary factor, for both directions of a simple interaction between the two. In my code (due to laziness), I ran simulations equally for both directions of the interaction effect, duplicating these simulations even when the effect size of the interaction was 0. Per dataset, the only difference between these duplicated simulations was that the natural effect was given a contrast code of <-1, 1> in one model and <1, -1> in the other. Since the added interaction effect size was 0, the datasets themselves were identical between these simulations.
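To make the setup concrete, here is a minimal sketch of what one such duplicated pair of runs looked like. The data generation, noise model, and column names are illustrative stand-ins (the original simulation code is no longer available); the point is that the only thing differing between the two data frames is the sign of the natural factor's contrast column.

```r
set.seed(123)

# One artificial "experiment" at the smallest size: 8 subjects x 8 items = 64 rows,
# with the added interaction effect set to 0, so the RTs themselves are identical
# across both runs (the noise model here is purely illustrative)
dat <- expand.grid(Subject = factor(1:8), Item = factor(1:8))
dat$added   <- ifelse(as.integer(dat$Subject) %% 2 == 0, 1, -1)  # artificial binary factor
dat$natural <- ifelse(as.integer(dat$Item)    %% 2 == 0, 1, -1)  # natural binary factor, coded <-1, 1>
dat$RT      <- 500 + rnorm(nrow(dat), sd = 50)

# The duplicated run: exactly the same data, but the natural factor coded <1, -1>
dat_flipped <- transform(dat, natural = -natural)
```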
Mathematically, this should not change the outcome of the model beyond flipping the signs of some of the effects. But recently, while writing up my results, I noticed substantial differences in the convergence rates between these simulation conditions. In the conditions with the smallest experiments (8 "subjects" by 8 "items" = 64 rows of data), the convergence rate for one contrast coding was 104/10,000 models, while for the other it was 48/10,000 models.
The differences in convergence rates were much larger than other, similar power simulations I had run.
The following image shows the data. Notice that I was comparing lmer models measuring raw RTs ('RT ~ ...') with identical models measuring log-transformed RTs ('log(RT) ~ ...'), and that the difference in convergence rates is not present for those models. The color of the points indicates which added effects are present: when 'none' of the added effects are present, the data are the same, and the only thing that differed between simulations of either direction was the sign of the contrast coding of one effect.
Technical details
Unfortunately, a lot of the meta information that is normally very helpful in filing these bug reports is vague---it's been a year or two since I ran these experiments, and the compute cluster has since gone offline, and I don't have the resources to replicate it.
The nodes in the cluster were using R versions 3.5 and 3.6, and the version of lme4 was the one right after the default optimizer was switched to nloptwrap (I remember because the convergence rates suddenly changed midway through a similar earlier set of studies). For these results, I used the 'bobyqa' optimizer, and the effects structure was of the form:

RT ~ 1 + <added effect> * <natural effect> + (1 + <natural effect> | Subject) + (1 | Item)
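For reference, a hedged sketch of the fitting call, continuing the illustrative `dat` / `dat_flipped` data frames from the earlier sketch; the `@optinfo`-based convergence check is a common lme4 idiom, not necessarily how the original script tallied convergence.

```r
library(lme4)

form <- RT ~ 1 + added * natural + (1 + natural | Subject) + (1 | Item)
ctrl <- lmerControl(optimizer = "bobyqa")

fit         <- lmer(form, data = dat,         control = ctrl)
fit_flipped <- lmer(form, data = dat_flipped, control = ctrl)

# In principle the two fits should be identical up to sign flips of the
# `natural` and `added:natural` fixed effects (and the corresponding
# random-effect correlation)
fixef(fit)
fixef(fit_flipped)

# Tally convergence warnings and singular fits for each model
sapply(list(original = fit, flipped = fit_flipped), function(m)
  c(converged = length(m@optinfo$conv$lme4$messages) == 0,
    singular  = isSingular(m)))
```

With only noise in such tiny datasets and a full random-slope structure, many fits will be singular or throw convergence warnings; the convergence tally at the end is the quantity that differed between the two codings in my simulations.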
I was wondering if something like this has ever been reported before. I don't see how the sign of a binary contrast coding can change the convergence rates by ~100%, especially when the datasets were literally identical otherwise. I'm assuming this is a bug!