March 14, 2008

The Cult of Statistical Significance
There is a new book called The Cult of Statistical Significance, by Stephen T. Ziliak and Deirdre N. McCloskey, which seems to me to be an important one. The book shows how many scientific disciplines rely far too much on the concept of statistical significance. I have read the book and I find it convincing. The authors show how the focus on statistical significance has drawn attention away from 'real' significance. In other words: the focus on statistical significance often means that researchers fail to ask whether their findings matter.
In statistics, a result is called statistically significant if it is unlikely to have occurred by chance. So testing for statistical significance asks whether an observed effect can be distinguished from chance variation. It tries to answer that question by looking at how precisely the effect can be measured. It says nothing about how strong and important the effect is. And this latter question, about the effect size, is much more important from both a scientific and a practical perspective. Statistical significance does not imply that an effect is important, and lack of statistical significance does not mean that an effect is unimportant.
You may ask: how can an effect be important if it is not statistically significant? The answer has to do with HOW a statistical significance test tries to answer the question of whether an effect does or does not exist, which is by looking at HOW PRECISELY the (presumed) effect can be measured. There are circumstances in which an effect is important yet cannot be measured precisely. This would be the case when there is a lot of variability in the effect. When an effect is strong YET highly variable (for instance ranging between 30 and 70), a statistical significance test says the effect cannot be measured precisely, which can lead to the conclusion: not statistically significant.
At the same time, a weaker effect with lower variability (for instance ranging between 4 and 5) could be measured more precisely, which might lead to the conclusion 'statistically significant'.
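To make the contrast concrete, here is a minimal sketch in Python. The numbers are hypothetical and invented purely for illustration (they are not taken from the book, and they are spread more widely than the ranges mentioned above so that the point shows up in a sample of only six). The sketch assumes a simple one-sample t-test against a null effect of zero, and the helper names `t_statistic` and `is_significant` are my own.

```python
import math
import statistics

# Hypothetical per-subject effects, invented for illustration only.
strong_but_variable = [-60, 150, -20, 170, 90, -30]  # average effect: 50
weak_but_precise = [4.0, 5.0, 4.5, 4.2, 4.8, 4.5]    # average effect: 4.5

# Two-sided 5% critical value of Student's t with 5 degrees of freedom
# (both samples have n = 6). Hard-coded to keep the sketch dependency-free.
T_CRITICAL_DF5 = 2.571


def t_statistic(sample):
    """One-sample t statistic against a null effect of zero."""
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return statistics.mean(sample) / standard_error


def is_significant(sample, critical=T_CRITICAL_DF5):
    """True if the sample mean differs from zero at the 5% level."""
    return abs(t_statistic(sample)) > critical


for name, sample in [
    ("strong but variable", strong_but_variable),
    ("weak but precise", weak_but_precise),
]:
    print(
        f"{name}: mean effect = {statistics.mean(sample):.1f}, "
        f"t = {t_statistic(sample):.2f}, "
        f"significant = {is_significant(sample)}"
    )
```

With these made-up numbers, the effect that is more than ten times larger on average fails the test, while the small one passes it easily. The test rewards precision of measurement, not size of effect.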
Mind you, the book is NOT a plea against quantitative research or statistical analysis. On the contrary: it is a plea for doing it, and doing it right, by bringing the focus back to effect sizes in social science.