Some kind of exemplar(s) for exploratory analysis (as opposed to flexible / hacked analysis) #21
Maybe also discuss the process of using a flexible/hacked pilot experiment to explore, formulate a hypothesis, and estimate noise. Then use those results to specify a confirmatory experiment. It's often tough to figure out parameter ranges and the required N without a purposefully sketchy pilot.
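A minimal sketch of the pilot-to-confirmatory step this describes, assuming Python with statsmodels (the pilot means, SD, and target power below are all made-up numbers for illustration):

```python
import math
from statsmodels.stats.power import TTestIndPower

# Hypothetical pilot results (all numbers made up): condition means and a
# pooled SD estimated from a small, purposefully sketchy pilot.
pilot_mean_a, pilot_mean_b, pilot_sd = 4.2, 5.0, 1.6

# The pilot's noise estimate gives a standardized effect size (Cohen's d)...
d = (pilot_mean_b - pilot_mean_a) / pilot_sd

# ...which lets us solve for the per-group N of the confirmatory experiment.
n_per_group = TTestIndPower().solve_power(effect_size=d, alpha=0.05, power=0.8)
print(f"pilot-based d = {d:.2f}; need ~{math.ceil(n_per_group)} per group")
```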
I think this may be beyond the scope of the FAQ. This issue basically introduces the CHI community to a new meta-method for statistical analysis. I'd think that a suitable form for communicating this is an archival publication together with a tool that people can use to do such analysis (like Jake's note on the Aligned Rank Transform).
As for a general FAQ entry on exploratory analysis guidelines, I like what Pierre mentioned in the email thread "Re: Exploratory versus confirmatory analyses in a 7th grade class". I quote this below in list form.
Another quote from Tukey's EDA book:
Taking this quote in the context of 2017, could it be that the problem is the use of confirmatory procedures to explore data? (One instance is p-hacking.)
Interesting question. I do tend to think that p-values are almost meaningless in exploratory analysis (I suppose the multiverse paper is proposing one way around that). I don't think this is out of scope, though. There is so much exploratory analysis going on at CHI that I think this is firmly within scope. I agree that a paper could probably also be written on it, but that doesn't put it out of scope.
Pilot studies are standard practice and often reported, so I'd say they're within scope. A similar practice is multi-experiment studies, where each experiment informs the next one. I've not read the multiverse paper yet, but I love the idea, although it seems more like a new direction for improving practice than something that's really widespread. We can briefly mention new and emerging ideas/directions but should probably treat them differently from widespread practices?
p-values and exploratory analysis: this could be something interesting to discuss in one of the guidelines. I think it depends on whether the p-values are cherry-picked. Say my exploratory analysis involves 20 t-tests and I report all of them (or similarly, I report 20 CIs and look at which ones don't cross zero). I think in an exploratory analysis that's fine. There's no correction for multiplicity, but each p-value taken separately is "correct", i.e., it meets its definition. It's very different from reporting only the p-values that are statistically significant, because in that case they're not correct anymore. I agree the first case still feels somehow dangerous; perhaps it's worth reminding the reader that some or many of the significant outcomes may be pure noise (and if there are only 1–3 of these, it's likely all of them).

Now I could decide against p-values and CIs because they're too misleading, and only report point estimates. Is that better? I'm not sure, as we'd now be missing information about statistical error. And suppose for some reason I chose to report only the most "impressive" point estimates and not mention the others: that seems just as problematic as the p-based cherry-picking from before. Maybe the problem is more a problem of cherry-picking than a problem of p-values?

I suppose it's possible for an exploratory analysis to be planned and to report everything graphically in the paper. The author could make crazy speculations based on the graphs, but at least everything is there for the reader to see. But would that still be called an exploratory analysis? I don't know.
The case of 20 t-tests or CIs is one I've encountered a number of times in CHI papers. Maybe an exemplar of using a multiple comparison adjustment (e.g., Bonferroni) would be useful?
As an exemplar, OK, but as a recommendation it's complicated. There are some pretty good arguments against the systematic use of adjustments.
The arguments for and against are complicated, but if we can manage to summarize the literature and distill it into general guidelines, it would be incredibly useful to CHI. I went through some of the literature already but would have to dive into it again.
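A minimal sketch of what such an exemplar could look like, assuming Python with NumPy/SciPy/statsmodels (the data, conditions, and effect sizes are invented for illustration):

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)

# Hypothetical exploratory study: 20 outcome measures compared between
# two conditions (all numbers are made up for illustration).
n_outcomes, n_per_group = 20, 30
cond_a = rng.normal(0.0, 1.0, size=(n_outcomes, n_per_group))
cond_b = rng.normal(0.2, 1.0, size=(n_outcomes, n_per_group))

# Run all 20 t-tests and keep every p-value, not just the significant ones.
p_values = np.array([
    stats.ttest_ind(cond_a[i], cond_b[i]).pvalue for i in range(n_outcomes)
])

# Bonferroni: multiply each p-value by the number of tests (capped at 1).
p_bonferroni = np.minimum(p_values * n_outcomes, 1.0)

# Holm is uniformly more powerful than Bonferroni with no extra assumptions.
reject, p_holm, _, _ = multipletests(p_values, alpha=0.05, method="holm")

for i in range(n_outcomes):
    print(f"outcome {i:2d}: p = {p_values[i]:.3f}, "
          f"Bonferroni = {p_bonferroni[i]:.3f}, Holm = {p_holm[i]:.3f}")
```

Reporting all three columns side by side would let the exemplar show both positions in this thread: each unadjusted p-value is individually "correct", while the adjusted columns make the multiplicity risk visible.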
From Pierre:
Two approaches we might make exemplars of (rough sketches of each follow the list):
The "multiverse analysis" idea: reporting many possible analyses and the conclusions that would result from them (like this: http://www.stat.columbia.edu/~gelman/research/published/multiverse_published.pdf, though I would prefer a forest plot over a histogram of p values, to emphasize estimation over testing).
"Run all the models and then combine them" approach, per McElreath (chapter 6 of http://xcelab.net/rm/statistical-rethinking/). This is use WAIC or LOO (or some measure of model performance) to make a weighted average of the models.