Lynn D. Torbeck
Reading the newspaper, I get very frustrated with contradictory studies. Hormones are good for you; hormones are bad for you.
Vitamin C is good for you; no, wait, it doesn't help. Classical music will make our children smarter, or not. Such studies
are well intentioned but have limitations not noted in the newspaper or maybe even in the original report. How can we make
sense of such studies?
We see many references to statistics in the popular press, including surveys, polls, and allusions to "scientific studies."
Many people talk about these statistical results, but are reluctant to do the statistics themselves. When it comes down to
it, can those who talk statistics really "walk the walk?"
It's not really that hard, and I encourage you to use basic statistical concepts in your everyday activities. If you do, others
may wonder whether you could be a statistician. A good friend of mine has over the years learned many of these basic concepts
and developed a clever knack for asking the right questions in a meeting that go to the heart of data collection, analysis,
and interpretation. His most common statement is, "Bring data and graphs!"
My friend is on his way to walking the statistician walk. He's realized that statisticians think differently about data and
the studies that generate them than do most people. He's discovered that in a sense, applied statisticians are data detectives,
looking for clues to solve difficult problems.
Of course, the first question is always: "What is the real question?" Initial attempts to diagnose a problem will surface
a litany of excuses, reasons, symptoms, and folklore. The wise problem-solver continues to ask questions until the real root
statistical question is revealed. At this point, the problem is half-solved.
The second question is: "Are there data available?" If so, how much, and how were they collected? Act like a newspaper reporter
and ask the "4 W's": "Who? What? Where? When?" The data must be representative or they are worthless. The only true way
to know whether the data are representative is to know exactly how the data were collected or generated. Thus, there is a
need for tightly written standard operating procedures for sampling, data collection, training, and retraining.
An often overlooked but important issue is the reportable value. This is defined as the end result of the measurement method
as documented. Most crucially, it is the value or result that is compared with the specification criterion and the value most
often used for statistical analysis. The operational definition should be clear to all working on the project. That then raises
the question: "What are the specification criteria, and how were they determined?" Don't hesitate to ask what historical data
were used and what statistical analysis was performed.
I noted that statisticians are like detectives. The good ones are always looking for sources of variation and ways to eliminate
or minimize it. The best way to start an investigation of variation is to plot the data on graphs. Use histograms,
Pareto plots, time plots, boxplots, control charts, and scatter plots. It is amazing how many projects can be solved quickly
and simply by collecting good data and making a graph with cause and effect on the same page (see Ref. 1 for a classic example
using Challenger data).
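As one small illustration of the graphs listed above, here is a minimal sketch of a Pareto tabulation in Python, printed as a text bar chart. The function name pareto_table and the deviation log are hypothetical, invented only for this example; real data would come from your own records.

```python
from collections import Counter


def pareto_table(causes):
    """Return (cause, count, cumulative_percent) rows, most frequent cause first."""
    counts = Counter(causes)
    total = sum(counts.values())
    rows = []
    running = 0
    for cause, count in counts.most_common():
        running += count
        rows.append((cause, count, round(100.0 * running / total, 1)))
    return rows


# Hypothetical deviation log, made up purely for illustration.
deviation_log = (
    ["labeling"] * 12 + ["weighing"] * 5 + ["mixing"] * 2 + ["other"] * 1
)

for cause, count, cum_pct in pareto_table(deviation_log):
    bar = "#" * count  # one mark per occurrence
    print(f"{cause:<10} {count:>3} {cum_pct:>6.1f}%  {bar}")
```

Sorting the causes by frequency and showing the cumulative percentage puts the few categories that account for most of the problems at the top of the page.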
Keep looking for patterns in the data. When nothing is going on, the data look random. To the extent that we can find nonrandom
patterns in the data, we have the clues we need to find solutions to the problem. The variation is the message. Practice this
every day, and you, too, can walk like a statistician.
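One way to look for nonrandom patterns is sketched below in Python, under stated assumptions: an individuals control chart with limits estimated from the average moving range, plus one commonly used run rule (eight consecutive points on the same side of the center line). The function names, the run length of eight, and the data are illustrative choices, not a method prescribed in this column.

```python
from statistics import mean


def individuals_limits(values):
    """Lower limit, center line, and upper limit for an individuals chart.

    Sigma is estimated from the average moving range; 2.66 is the usual
    constant 3 / d2, with d2 = 1.128 for moving ranges of two points.
    """
    center = mean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar


def nonrandom_signals(values, run_length=8):
    """Return (indices outside the limits, indices where a one-sided run
    first reaches run_length points)."""
    lower, center, upper = individuals_limits(values)
    outside = [i for i, v in enumerate(values) if v < lower or v > upper]
    runs = []
    side, streak = 0, 0
    for i, v in enumerate(values):
        current = (v > center) - (v < center)  # +1 above, -1 below, 0 on the line
        if current != 0 and current == side:
            streak += 1
        else:
            side, streak = current, (1 if current != 0 else 0)
        if streak == run_length:
            runs.append(i)
    return outside, runs


# Hypothetical measurements with a small upward shift halfway through.
values = [10.0, 10.2, 9.9, 10.1, 9.8, 10.1, 10.0, 9.9,
          10.5, 10.7, 10.6, 10.4, 10.6, 10.5, 10.7, 10.6]
outside, runs = nonrandom_signals(values)
print("points outside limits:", outside)
print("run signals at indices:", runs)
```

With these made-up numbers no single point falls outside the limits, yet the run rule flags both halves of the series, which is the kind of clue a shift in the process leaves behind.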
Lynn D. Torbeck is a statistician at Torbeck and Assoc., 2000 Dempster Plaza, Evanston, IL 60202, tel. 847.424.1314, Lynn@Torbeck.org, http://www.torbeck.org/.
Reference
1. E. Tufte, Visual Explanations (Graphics Press, Cheshire, CT, 1997), p. 45.
Additional reading
1. D. Huff, How to Lie with Statistics (W.W. Norton, New York, NY, 1954).
2. J.A. Paulos, Innumeracy (Vintage Books, New York, NY, 1988).
3. L. Gonick and W. Smith, The Cartoon Guide to Statistics (Harper Perennial, New York, NY, 1993).