
Some kind of exemplar(s) for exploratory analysis (as opposed to flexible / hacked analysis) #21

Open
mjskay opened this issue May 24, 2017 · 9 comments



mjskay commented May 24, 2017

From Pierre:

Good guidelines should present planning/preregistration as a plus, not as a "must". Exploratory analyses (when labeled as such) are also a plus. Using exploratory or flexible analyses to draw major conclusions is a minus. If the flexibility is obfuscated on top of that, that's a big minus.

Gelman's distinction between exploratory and flexible/hacked analyses is a good one to make. When we do exploratory analyses, we're trying to learn things. When we hack analyses, we don't. It's the same as the distinction between advancing science and persuading others in our definition of transparent statistics.

Two approaches we might make exemplars of:

mjskay changed the title from “Some kind of exemplar(s) for exploratory analysis (as opposed to flexible / hacked)” to “Some kind of exemplar(s) for exploratory analysis (as opposed to flexible / hacked analysis)” on May 24, 2017
@steveharoz (Contributor) commented:
Maybe also discuss the process of using a flexible/hacked pilot experiment to explore, formulate a hypothesis, and estimate noise. Then use those results to specify a confirmatory experiment. It's often tough to figure out parameter ranges and the required N without a purposefully sketchy pilot.
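A minimal sketch of that pilot-to-confirmatory step, in Python with numpy and statsmodels (the language, the data, and all the numbers are hypothetical illustrations, not anything from this thread): estimate the effect size and noise from pilot measurements, then solve for the N a confirmatory experiment would need.

```python
import numpy as np
from statsmodels.stats.power import TTestIndPower

# Hypothetical pilot measurements for two conditions (e.g., task times in ms).
pilot_a = np.array([412.0, 378.0, 455.0, 430.0, 401.0])
pilot_b = np.array([388.0, 352.0, 410.0, 365.0, 379.0])

# Cohen's d from the pilot: mean difference over pooled standard deviation.
pooled_sd = np.sqrt((pilot_a.var(ddof=1) + pilot_b.var(ddof=1)) / 2)
d = (pilot_a.mean() - pilot_b.mean()) / pooled_sd

# Per-group N to detect an effect of that size at alpha = .05 with 80% power.
# Caveat: effect sizes from small pilots are noisy, so treat this as a rough
# starting point, not a guarantee.
n_per_group = TTestIndPower().solve_power(effect_size=d, alpha=0.05, power=0.8)
print(f"pilot d = {d:.2f}; suggested N per group = {int(np.ceil(n_per_group))}")
```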

@chatchavan (Collaborator) commented:

I think this may be beyond the scope of the FAQ. This issue basically introduces the CHI community to a new meta-method for statistical analysis.

I'd think that a more suitable form to communicate this is an archival publication together with a tool that people can use to do such analysis (like Jake's note on the Aligned Rank Transform).

@chatchavan (Collaborator) commented:

As for a general FAQ entry on the use of exploratory analyses, I like what Pierre mentioned in the email thread "Re: Exploratory versus confirmatory analyses in a 7th grade class". I quote it below in list form.

(+) Good guidelines should present planning/preregistration as a plus, not as a "must".
(+) Exploratory analyses (when labeled as such) are also a plus.
(-) Using exploratory or flexible analyses to draw major conclusions is a minus.
(--) If the flexibility is obfuscated on top of that, that's a big minus.

@chatchavan (Collaborator) commented:

Another quote from Tukey's EDA book:

Once upon a time, statisticians only explored. Then they learned to confirm exactly---to confirm a few things exactly, each under very specific circumstances. As they emphasized exact information, their techniques inevitably became less flexible. The connection of the most used techniques with past insights was weakened. Anything to which a confirmatory procedure was not explicitly attached was described as "mere descriptive statistics", no matter how much we had learned from it.

Taking this quote in the context of 2017, could it be that the problem is the use of confirmatory procedures to explore data? (One instance is p-hacking.)


mjskay commented May 25, 2017

Taking this quote in the context of 2017, could it be that the problem is the use of confirmatory procedures to explore data? (One instance is p-hacking.)

Interesting question. I do tend to think that p-values are almost meaningless in exploratory analysis (I suppose the multiverse paper is proposing one way around that).

I don't think this is out of scope though. There is so much exploratory analysis going on at CHI that I think this is firmly within scope. I agree that a paper could probably also be written on it, but that doesn't put it out of scope.
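For concreteness, here is a minimal sketch of the multiverse idea as described by Steegen et al.: run the same comparison under every defensible combination of analysis choices and report all of the results, rather than one cherry-picked path. Everything below (Python, the fake data, the two choice dimensions) is an illustrative assumption, not something taken from the paper or this thread.

```python
import itertools
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
a = rng.normal(1.0, 2.0, 40)  # fake measurements, condition A
b = rng.normal(0.0, 2.0, 40)  # fake measurements, condition B

# Two hypothetical analysis choices, each with alternatives:
# how to handle outliers, and which test to run.
outlier_rules = {
    "keep all": lambda x: x,
    "trim 2sd": lambda x: x[np.abs(x - x.mean()) < 2 * x.std()],
}
tests = {
    "t-test": lambda x, y: stats.ttest_ind(x, y).pvalue,
    "Mann-Whitney": lambda x, y: stats.mannwhitneyu(x, y).pvalue,
}

# Run the comparison in every "universe" and report every p-value.
for (o_name, rule), (t_name, test) in itertools.product(
        outlier_rules.items(), tests.items()):
    p = test(rule(a), rule(b))
    print(f"outliers={o_name:9} test={t_name:12} p={p:.3f}")
```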


dragice commented May 31, 2017

Pilot studies are standard practice and often reported, so I'd say it's within scope. A similar practice is multi-experiment studies, where each experiment informs the next one. I've not read the multiverse paper yet but I love the idea, although this seems more like a new direction for improving practice than something that's really widespread. We can briefly mention new and emerging ideas/directions but should probably treat them differently than widespread practices?


dragice commented May 31, 2017

p-values and exploratory analysis: this could be something interesting to discuss in one of the guidelines. I think it depends on whether the p-values are cherry-picked. Say my exploratory analysis involves 20 t-tests and I report all of them (or similarly, I report 20 CIs and look at which ones don't cross zero). In an exploratory analysis I think that's fine. There's no correction for multiplicity, but each p-value taken separately is "correct", i.e., it meets its definition. That's very different from reporting only the p-values that are statistically significant, because in that case they're no longer correct.

I agree the first case still feels somehow dangerous; perhaps it's worth reminding the reader that some or many of the significant outcomes may be pure noise (and if there are only 1–3 of them, it's likely they all are; the simulation sketch below illustrates why).
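A quick simulation sketch of that point (Python with numpy/scipy, my illustration rather than anything from the thread): running 20 t-tests on data where the null is true everywhere yields about one nominally significant result per batch at alpha = .05, so seeing 1–3 hits in 20 tests is entirely consistent with noise.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_batches, tests_per_batch = 1000, 20

hits = 0
for _ in range(n_batches):
    # 20 independent t-tests, all comparing pure noise to pure noise.
    hits += sum(
        stats.ttest_ind(rng.normal(size=20), rng.normal(size=20)).pvalue < 0.05
        for _ in range(tests_per_batch)
    )

# The expected count is alpha * 20 = 1 "significant" test per batch.
print(f"mean significant tests per batch: {hits / n_batches:.2f}")
```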

Now I could decide against p-values / CIs because they're too misleading, and only report point estimates. Is it better? I'm not sure, as we're now missing info about statistical error.

Now suppose for some reason I choose to report only the most "impressive" point estimates, and don't mention the others. That seems as problematic as the p-based cherry-picking from before. Maybe the problem is more a problem of cherry-picking than a problem of p-values?

I suppose it's possible for an exploratory analysis to be planned and to report everything graphically in the paper. The author could make crazy speculations based on the graphs but at least all is there, for the reader to see. But would that still be called exploratory analysis? I don't know.

@steveharoz (Contributor) commented:

The case with 20 t-tests or CIs is one I've encountered a number of times in CHI papers. Maybe an exemplar of using a multiple-comparison adjustment (e.g., Bonferroni) would be useful?
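A sketch of what such an exemplar might show, using statsmodels' multipletests (the tool choice and the p-values are my assumptions for illustration): adjust the whole batch of p-values at once, with Bonferroni as the simplest and most conservative option.

```python
from statsmodels.stats.multitest import multipletests

# Hypothetical p-values from a batch of 6 comparisons.
pvals = [0.003, 0.021, 0.047, 0.062, 0.120, 0.450]

# Bonferroni adjustment; method="holm" or "fdr_bh" are common,
# less conservative alternatives.
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="bonferroni")
for p, pa, r in zip(pvals, p_adj, reject):
    print(f"raw p = {p:.3f}  adjusted p = {pa:.3f}  significant: {r}")
```

With Bonferroni, only the 0.003 survives (0.003 × 6 = 0.018 < .05); that conservatism is exactly what the references in the next comment push back on.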


dragice commented May 31, 2017

As an exemplar, OK, but as a recommendation it's complicated. There are some pretty good arguments against the systematic use of adjustments:

  • Bender, Ralf, and Stefan Lange. "Adjusting for multiple testing—when and how?" Journal of Clinical Epidemiology 54.4 (2001): 343–349.
  • Stewart-Oaten, Allan. "Rules and judgments in statistics: three examples." Ecology 76.6 (1995): 2001–2009.
  • Wilson, Warner. "A note on the inconsistency inherent in the necessity to perform multiple comparisons." Psychological Bulletin 59.4 (1962): 296.

The arguments for and against are complicated, but if we can manage to summarize the literature and distill it into general guidelines, it would be incredibly useful to CHI. I went through some of the literature already, but would have to dive into it again.
