Andrew Gelman: What has happened down here is the winds have changed

Andrew Gelman writes:

To understand Fiske’s attitude, it helps to realize how fast things have changed.
As of five years ago—2011—the replication crisis was barely a cloud on the horizon.

Here’s what I see as the timeline of important events:

1960s-1970s: Paul Meehl argues that the standard paradigm of experimental psychology doesn’t work, that “a zealous and clever investigator can slowly wend his way through a tenuous nomological network, performing a long series of related experiments which appear to the uncritical reader as a fine example of ‘an integrated research program,’ without ever once refuting or corroborating so much as a single strand of the network.”

Psychologists all knew who Paul Meehl was, but they pretty much ignored his warnings. For example, Robert Rosenthal wrote an influential paper on the “file drawer problem” but if anything this distracts from the larger problems of the find-statistical-signficance-any-way-you-can-and-declare-victory paradigm.

1960s: Jacob Cohen studies statistical power, spreading the idea that design and data collection are central to good research in psychology, and culminating in his book, Statistical Power Analysis for the Behavioral Sciences, The research community incorporates Cohen’s methods and terminology into its practice but sidesteps the most important issue by drastically overestimating real-world effect sizes.

1971: Tversky and Kahneman write “Belief in the law of small numbers,” one of their first studies of persistent biases in human cognition. This early work focuses on resarchers’ misunderstanding of uncertainty and variation (particularly but not limited to p-values and statistical significance), but they and their colleagues soon move into more general lines of inquiry and don’t fully recognize the implication of their work for research practice.

1980s-1990s: Null hypothesis significance testing becomes increasingly controversial within the world of psychology. Unfortunately this was framed more as a methods question than a research question, and I think the idea was that research protocols are just fine, all that’s needed was a tweaking of the analysis. I didn’t see general airing of Meehl-like conjectures that much published research was useless.

2006: I first hear about the work of Satoshi Kanazawa, a sociologist who published a series of papers with provocative claims (“Engineers have more sons, nurses have more daughters,” etc.), each of which turns out to be based on some statistical error. I was of course already aware that statistical errors exist, but I hadn’t fully come to terms with the idea that this particular research program, and others like it, were dead on arrival because of too low a signal-to-noise ratio. It still seemed a problem with statistical analysis, to be resolved one error at a time.

2008: Edward Vul, Christine Harris, Piotr Winkielman, and Harold Pashler write a controversial article, “Voodoo correlations in social neuroscience,” arguing not just that some published papers have technical problems but also that these statistical problems are distorting the research field, and that many prominent published claims in the area are not to be trusted. This is moving into Meehl territory.

2008 also saw the start of the blog Neuroskeptic, which started with the usual soft targets (prayer studies, vaccine deniers), then started to criticize science hype (“I’d like to make it clear that I’m not out to criticize the paper itself or the authors . . . I think the data from this study are valuable and interesting – to a specialist. What concerns me is the way in which this study and others like it are reported, and indeed the fact that they are repored as news at all,” but soon moved to larger criticisms of the field. I don’t know that the Neuroskeptic blog per se was such a big deal but it’s symptomatic of a larger shift of science-opinion blogging away from traditional political topics toward internal criticism.

2011: Joseph Simmons, Leif Nelson, and Uri Simonsohn publish a paper, “False-positive psychology,” in Psychological Science introducing the useful term “researcher degrees of freedom.” Later they come up with the term p-hacking, and Eric Loken and I speak of the garden of forking paths to describe the processes by which researcher degrees of freedom are employed to attain statistical significance.


* Yet Fiske doesn’t seem to have any issue with fluffy TED talks. Apparently TED provides the quality control she mentions.

* Amy Cuddy’s speaker fees are in tier 6–that is, $40,001 and up.

Yikes. Well, that would create a bit of an incentive…

* I would say Fiske isn’t using subterfuge–she’s just incompetent (but a full professor at Princeton!). When incompetence is pointed out, she reacts like an academic–she attempts to silence the source or use ad hominem attacks. But here’s the nice thing–she has to do it publicly, rather than pick up the phone (which is the standard method in academic political science). That’s because she can’t pick up the phone to silence you.

>Look. I’m not saying these are bad people. Sure, maybe they cut corners here or there, or make some mistakes, but those are all technicalities—at least, that’s how I’m guessing they’re thinking. For Cuddy, Norton, and Fiske to step back and think that maybe almost everything they’ve been doing for years is all a mistake . . . that’s a big jump to take. Indeed, they’ll probably never take it. All the incentives fall in the other direction.

Solzhenitsyn says that when you have spent your life establishing a lie, what is required is not equivocation but rather a dramatic self-sacrifice (in relation to Ehrenburg– I see no chance of that happening in any social science field–tenure means Fiske will be manning (can I use that phrase?) the barricades until they cart her off in her 80’s. Thinking machines will be along in 20-30 years and then universities can dismantle the social sciences and replace them with those.

* Paul Romer, a former academic who now is chief economist at the World Bank thinks that macroecnomics is a science in failure mode, and thinks that this parallels the evolution of science in general. You can read his arguments at:

The gist of it is that economists have cooked up fancy models involving variables that have no measurable counterpart in the real world, and then use these models to draw conclusions that reflect nothing more than the arbitrary assumptions made to identify the model. Not being familiar with the models he criticizes, I can’t assess his claims, but they sound quite plausible. He has been sounding this alarm for quite a while now, and has published numerous papers which you can easily find by Googling the term “mathiness” (which he coined.)

About Luke Ford

I've written five books (see My work has been followed by the New York Times, the Los Angeles Times, and 60 Minutes. I teach Alexander Technique in Beverly Hills (
This entry was posted in College. Bookmark the permalink.