Walk Like a Statistician

Published in: Pharmaceutical Technology, Volume 34, Issue 11 (11-02-2010)

Applied statisticians are forever searching for the enemy of quality: variability.

Reading the newspaper, I get very frustrated with contradictory studies. Hormones are good for you; hormones are bad for you. Vitamin C is good for you; no, wait, it doesn't help. Classical music will make our children smarter, or not. Such studies are well intentioned but have limitations not noted in the newspaper or maybe even in the original report. How can we make sense of such studies?

Lynn D. Torbeck

We see many references to statistics in the popular press, including surveys, polls, and allusions to "scientific studies." Many people talk about these statistical results but are reluctant to do the statistics themselves. When it comes down to it, can those who talk statistics really "walk the walk"?

It's not really that hard, and I encourage you to use basic statistical concepts in your everyday activities. If you do, others may wonder whether you could be a statistician. Over the years, a good friend of mine has learned many of these basic concepts and developed a knack for asking questions in meetings that go to the heart of data collection, analysis, and interpretation. His most common statement is, "Bring data and graphs!"

My friend is on his way to walking the statistician walk. He's realized that statisticians think differently about data and the studies that generate them than do most people. He's discovered that in a sense, applied statisticians are data detectives, looking for clues to solve difficult problems.


Of course, the first question is always: "What is the real question?" Initial attempts to diagnose a problem will surface a litany of excuses, reasons, symptoms, and folklore. The wise problem-solver continues to ask questions until the real root statistical question is revealed. At this point, the problem is half-solved.

The second question is: "Are there data available?" If so, how much, and how were they collected? Act like a newspaper reporter and ask the four W's: "Who?", "What?", "Where?", and "When?" The data must be representative or they are worthless. The only true way to know whether the data are representative is to know exactly how the data were collected or generated. Thus, there is a need for tightly written standard operating procedures for sampling, data collection, training, and retraining.

An often overlooked but important issue is the reportable value. This is defined as the end result of the measurement method as documented. Most crucially, it is the value or result that is compared with the specification criterion and the value most often used for statistical analysis. The operational definition should be clear to all working on the project. That then raises the question: "What are the specification criteria, and how were they determined?" Don't hesitate to ask what historical data were used and what statistical analysis was performed.

I noted that statisticians are like detectives. The good ones are always looking for sources of variation and ways to eliminate or minimize them. The best way to start an investigation to study variation is to plot the data on graphs. Use histograms, Pareto plots, time plots, boxplots, control charts, and scatter plots. It is amazing how many projects can be solved quickly and simply by collecting good data and making a graph with cause and effect on the same page (see Ref. 1 for a classic example using Challenger data).
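As one small illustration of the graphs listed above, here is a minimal sketch of a text-only Pareto table in Python. The deviation categories and counts are hypothetical, invented purely for the example, and the function name pareto_table is my own. The sketch tallies each category, sorts from most to least frequent, and reports the cumulative percentage, so the "vital few" sources stand out at the top.

```python
from collections import Counter

# Hypothetical deviation log (invented for illustration):
# one entry per recorded deviation.
deviations = [
    "labeling", "weighing", "labeling", "documentation", "labeling",
    "weighing", "equipment", "labeling", "documentation", "labeling",
]

def pareto_table(categories):
    """Return (category, count, cumulative percent) rows, largest count first."""
    counts = Counter(categories).most_common()
    total = sum(count for _, count in counts)
    rows, running = [], 0
    for category, count in counts:
        running += count
        rows.append((category, count, 100.0 * running / total))
    return rows

# Print a simple text bar chart alongside the cumulative percentage.
for category, count, cum_pct in pareto_table(deviations):
    bar = "#" * count
    print(f"{category:<14} {count:>2} {cum_pct:6.1f}%  {bar}")
```

A plotting library would give a nicer picture, but even a table like this puts the largest contributors on one page, which is often enough to direct the investigation.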

Keep looking for patterns in the data. When nothing is going on, the data look random. To the extent that we can find nonrandom patterns in the data, we have the clues we need to find solutions to the problem. The variation is the message. Practice this every day, and you, too, can walk like a statistician.
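One simple way to put a number on "the data look random" is a runs test about the median. The sketch below is only one of several possible checks and is not a method prescribed in this column. It assumes the observations are in time order, uses a normal approximation that is rough for short series, and runs on hypothetical data. Too few runs suggests drift or shifts; too many suggests alternation.

```python
import math
import statistics

def runs_test_z(values):
    """Runs test about the median; returns (number of runs, z statistic)."""
    median = statistics.median(values)
    # Classify each point as above (True) or below (False) the median.
    # Points equal to the median are dropped, a common convention.
    signs = [v > median for v in values if v != median]
    n_above = sum(signs)
    n_below = len(signs) - n_above
    n = n_above + n_below
    if n_above == 0 or n_below == 0 or n < 3:
        raise ValueError("need at least 3 points, some on each side of the median")
    # A new run starts wherever consecutive points fall on opposite sides.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    expected = 2 * n_above * n_below / n + 1
    variance = (2 * n_above * n_below * (2 * n_above * n_below - n)) / (n**2 * (n - 1))
    z = (runs - expected) / math.sqrt(variance)
    return runs, z

# Hypothetical measurements in time order, showing a steady upward drift.
drifting = [9.1, 9.3, 9.2, 9.6, 9.8, 10.1, 10.3, 10.2, 10.6, 10.9]
runs, z = runs_test_z(drifting)
print(f"runs = {runs}, z = {z:.2f}")
```

A z value far from zero in either direction is a clue worth following up with a time plot; it is not a verdict on its own.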

Lynn D. Torbeck is a statistician at Torbeck and Assoc., 2000 Dempster Plaza, Evanston, IL 60202, tel. 847.424.1314, Lynn@Torbeck.org, www.torbeck.org.

Reference


1. E. Tufte, Visual Explanations (Graphics Press, Cheshire, CT, 1997), p. 45.

Additional reading

1. D. Huff, How to Lie with Statistics (W.W. Norton, New York, NY, 1954).

2. J.A. Paulos, Innumeracy (Vintage Books, New York, NY, 1988).

3. L. Gonick and W. Smith, The Cartoon Guide to Statistics (Harper Perennial, New York, NY, 1993).