 Lynn D. Torbeck
Statisticians and non-statisticians daily select the level of statistical significance to be used for decisions, experimental
designs, data collection, sample sizes, and other formal analysis. They usually choose 5%. Why 5%? Note this definition in
a well-known dictionary: "Significance level: The level of probability at which it is agreed that the null hypothesis will be
rejected. Conventionally set at 0.05" (1).
Thus, the Cambridge Dictionary of Statistics gives 5% as the definition of the significance level. Why 5%? Why not some other
value? Is it acceptable to be wrong 5% of the time? If we choose another value, what value(s) should it be: 10%, 1%, or 0.1%?
Other writers have reflected on this as well. "Why is the 0.05 level of significance used as the decision point to reject
the null hypothesis? Why not 0.06 level of significance? Actually, the 0.05 level of significance is used because of tradition.
[Sir] R. A. Fisher (2), the founder of modern statistical methods, chose this value and other scientists have accepted the
choice" (3).
In his landmark 1926 paper, "The Arrangement of Field Experiments" (4), Sir Ronald presented his logic for selecting significance
levels: "It will illustrate the meaning of tests of significance if we consider for how many years the [farm] produce (i.e., results)
should have been recorded in order to make the evidence convincing. First, if the experimenter could say that in twenty years
experience with uniform treatment the difference in favour of the acre treated with manure had never before touched 10 per
cent, the evidence would have reached a point which may be called the verge of significance; ... This level, which we may
call the 5 per cent. point, would be indicated, though very roughly, by the greatest chance deviation observed in twenty successive
trials. ... If one in twenty does not seem high enough odds, we may, if we prefer it, draw the line at one in fifty (the 2
per cent. point), or one in a hundred (the 1 per cent. point)." Apparently, Sir Ronald was not fixated on 5%.
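Fisher's "one in twenty," "one in fifty," and "one in a hundred" translate directly into the familiar 5%, 2%, and 1% levels. The short Python sketch below makes that arithmetic explicit and, as an illustration only, converts each level into a cutoff in standard deviations. It assumes a normally distributed test statistic and a two-sided test; neither assumption comes from the passage quoted above, and the function name is hypothetical.

```python
from statistics import NormalDist


def two_sided_critical_value(alpha: float) -> float:
    """Return z such that P(|Z| > z) = alpha for a standard normal Z.

    Assumption for illustration: a normal test statistic and a two-sided test.
    """
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


# Fisher's three "points", expressed as odds of one in N.
for one_in in (20, 50, 100):
    alpha = 1.0 / one_in
    z = two_sided_critical_value(alpha)
    print(f"1 in {one_in:>3}: alpha = {alpha:.2f}, |z| > {z:.2f}")
```

Under these assumptions, the three levels correspond to deviations of roughly 1.96, 2.33, and 2.58 standard deviations from the expected value.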
Fisher worked at the Rothamsted Experimental Station located in Harpenden, England, from 1919 to 1933 (2, 5). While there, he
studied crop yields and animal husbandry using statistics and the theory of experimental designs that he developed. An incorrect
decision affected only the distribution of manure on crop fields or the care and feeding of pigs and honey bees. Being wrong
for 5% of the decisions wouldn't seem to be a major problem. Also, Fisher was not legalistic in his use of significance levels.
Being pragmatic, he was much more concerned with the practical impact or practical significance of the results. It should
be noted that Fisher developed many of the valuable statistical tables used for significance testing, and he chose levels
of 0.05 and 0.01. Thus, in effect, he forced the rest of the world to go along with his choices regardless of the application.
Other statisticians have pointed out that the selected level of 5% determines how often we will wrongly reject a true null hypothesis.
"In rejecting the null hypothesis, the sampler faces the possibility that he is wrong. Such is the risk always run by those
who test hypotheses and rest decisions on the tests. ... As a matter of practical convenience, probability levels of 5%
(0.05) and 1% (0.01) are commonly used in deciding whether to reject the null hypothesis. ... This use of 5% and 1% levels
is simply a working convention. There is merit in the practice, followed by some investigators, of reporting in parentheses
the probability ..." (6).
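The risk of being wrong can be made concrete with a small simulation. The Python sketch below repeatedly tests a null hypothesis that is in fact true and counts how often it is rejected anyway. The setup is an assumption chosen for illustration, not something taken from the sources quoted here: samples of 10 standard normal values, a two-sided z-test with known standard deviation, and a hypothetical function name.

```python
import random
from statistics import NormalDist, fmean


def false_rejection_rate(alpha=0.05, n_experiments=20_000, sample_size=10, seed=1926):
    """Estimate how often a true null hypothesis (mean = 0) is rejected.

    Assumptions for illustration: data are standard normal, sigma = 1 is
    known, and the test is a two-sided z-test at level alpha.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    rejections = 0
    for _ in range(n_experiments):
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        # z statistic for H0: mean = 0 with known sigma = 1
        z = fmean(sample) * sample_size ** 0.5
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_experiments


print(f"5% level: {false_rejection_rate(0.05):.3f}")
print(f"1% level: {false_rejection_rate(0.01):.3f}")
```

The estimated rates should land close to 0.05 and 0.01. In this setup the chosen level is, by construction, the long-run fraction of true null hypotheses that get rejected.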
"The question arises: at what probability level does a deviation become statistically significant? There is no rational probability
level at which possibility ceases and impossibility begins, but it is conventional to regard a probability of 0.05 as the
critical level of significance" (7).
Thus, the hallowed 5% significance level was born from a crop experiment and a manure spreader. I am sure this gives reassurance
to those benefitting from the next analysis.
Lynn D. Torbeck is a statistician at Torbeck and Assoc., 2000 Dempster Plaza, Evanston, IL 60202, tel. 847.424.1314, Lynn@Torbeck.org, http://www.torbeck.org/.
References
1. B.S. Everitt, The Cambridge Dictionary of Statistics (Cambridge University Press, Cambridge, UK, 1998) p. 305.
2. J.F. Box, R.A. Fisher: The Life of a Scientist (John Wiley & Sons, New York, NY, 1978).
3. J.F. Zolman, Biostatistics, Experimental Design and Statistical Inference (Oxford University Press, Oxford, 1993) p. 84.
4. R.A. Fisher, Jrnl of the Ministry of Agric., 33, 504 (1926).
5. J.L. Folks, Ideas of Statistics (John Wiley & Sons, New York, NY, 1981) p. 245.
6. G.W. Snedecor and W.G. Cochran, Statistical Methods, 6th ed. (Iowa State Press, Ames, IA, 1971) p. 27.
7. L.H.C. Tippett, The Methods of Statistics, 4th ed. (John Wiley & Sons, New York, NY, 1951) p. 89.