Part I of this article appeared in the July 2010 issue of Pharmaceutical Technology and discussed experimental design planning (1). This article, Part II, addresses design and analysis in statistical design
of experiments (DoE). Part III, to be published in the September 2010 issue of Pharmaceutical Technology, will cover how to evaluate a design space.
Design space is part of the US Food and Drug Administration's quality initiative for the 21st century, which seeks to move
toward a new paradigm for pharmaceutical assessment as outlined in the International Conference on Harmonization's quality
guidelines Q8, Q9, and Q10. The statistics required for design-space development play an important role in ensuring the robustness
of this approach.
This article provides concise answers to frequently asked questions (FAQs) related to the statistical aspects of determining
a design space as part of quality-by-design (QbD) initiatives. These FAQs reflect the experiences of a diverse group of statisticians
who have worked closely with process engineers and scientists in the chemical and pharmaceutical development disciplines,
grappling with issues related to the establishment of a design space from a scientific, engineering, and risk-based perspective.
Questions 1–7 appeared in Part I of this series (1). The answers provided herein, to Questions 8–22, constitute basic information
regarding statistical considerations and concepts and will be beneficial to a scientist working to establish a design space
in collaboration with a statistician.
Statistical experimental design
The following questions address types of experiments and appropriate design choices. The selection of the design depends on
the development stage, available resources, and the goal of the experiment. The type and size of the design for an experiment
depend on what questions need to be answered by the study. In general, there are three types of experiments: screening experiments
to select factors for further experimentation or to demonstrate robustness, interaction experiments to further study interactions
between factors of interest, and optimization experiments to more carefully map a region of interest.
Q8: Why are one-factor-at-a-time (OFAT) designs misleading?
A: OFAT experimentation is effective if the error in the measurements is small compared with the difference one desires
to detect and if the factors do not interact with one another. If either condition is violated, the OFAT methodology will
require more resources (experiments, time, and material) to estimate the effect of each factor of interest. In general, the
interactions between factors are not estimable from OFAT experiments. Even when there are no interactions, a fractional factorial
experiment often requires fewer resources and may provide information on the variability. Experimenters may be able to estimate
interactions based on a series of experiments that were not originally designed for that purpose, but this approach may require
more sophisticated statistical analyses.
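The point can be illustrated with a small numerical sketch. The response surface below is invented for illustration (the coefficients, including a strong A*B interaction, are assumptions, not data from any real process); it shows how a noise-free OFAT sequence settles on the wrong factor settings, while a four-run 2^2 full factorial estimates all three effects, including the interaction, from its contrasts.

```python
# Hypothetical response in coded units (-1/+1), chosen for illustration:
# y = 10 + 2*A + 3*B + 4*A*B  (strong A*B interaction, no noise)
def y(a, b):
    return 10 + 2 * a + 3 * b + 4 * a * b

# OFAT from baseline (-1, -1): vary A first, then B at the "best" A
base = y(-1, -1)     # 9
step_a = y(1, -1)    # 5  -> A = -1 looks better, so keep A = -1
step_b = y(-1, 1)    # 7  -> B = -1 looks better, so keep B = -1
ofat_best = max(base, step_a, step_b)  # OFAT stops at 9 ...
true_best = y(1, 1)                    # ... but the optimum is 19

# 2^2 full factorial: four runs cover all sign combinations
runs = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
obs = [y(a, b) for a, b in runs]

# Effect estimates via contrasts: mean of "+" runs minus mean of "-" runs
eff_a = (obs[1] + obs[3] - obs[0] - obs[2]) / 2
eff_b = (obs[2] + obs[3] - obs[0] - obs[1]) / 2
eff_ab = (obs[0] + obs[3] - obs[1] - obs[2]) / 2  # not estimable from OFAT
```

With the same number of runs, the factorial design recovers the main effects (4 and 6) and the interaction (8), whereas the OFAT sequence, misled by the interaction, recommends the corner with the lowest attainable optimum.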
Q9: If I have a model based on mechanistic understanding, should I use experimental design?
A: Experimental design can provide an efficient and effective means for estimating mechanistic model parameters. Further, experimental
design can be used to minimize the number of runs required to estimate model coefficients. The underlying functional form
of the surface, including interactions, may be known so that the interest is focused on estimating the model coefficients.
Random variation should also be estimated and incorporated into the mechanistic models; the incorporation of error is often
omitted from these models. General mechanistic understanding can still be used to select factors for evaluation, assist with
identifying interactions, and assure that the model is appropriate. The appropriate design to accomplish the goals listed
above may differ from factorial designs and may result in levels of factors that are not equally spaced.
Another advantage of using an experimental design is that an empirical model can be fit to the data and compared with the
mechanistic model as part of model verification (see Questions 10–13).
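As a sketch of how a designed experiment feeds parameter estimation for a mechanistic model, consider first-order degradation kinetics, C(t) = C0*exp(-k*t). The parameter values and sampling times below are invented for illustration; the log transform linearizes the model so that ordinary least squares on the designed time points recovers the mechanistic parameters directly.

```python
import math

# Assumed mechanistic model: first-order degradation C(t) = C0 * exp(-k*t)
# True parameters (invented for this sketch): C0 = 100, k = 0.05 per hour
true_c0, true_k = 100.0, 0.05
times = [0, 4, 8, 16, 24]  # designed sampling times (hours), not equally spaced
conc = [true_c0 * math.exp(-true_k * t) for t in times]  # noise-free "data"

# Log-transform linearizes the model: ln C = ln C0 - k*t,
# so simple least squares estimates the slope (-k) and intercept (ln C0)
n = len(times)
ly = [math.log(c) for c in conc]
tbar = sum(times) / n
ybar = sum(ly) / n
slope = (sum((t - tbar) * (v - ybar) for t, v in zip(times, ly))
         / sum((t - tbar) ** 2 for t in times))
k_hat = -slope
c0_hat = math.exp(ybar - slope * tbar)
```

In practice the data would carry random error, and the spacing of the time points (the design) governs how precisely k and C0 can be estimated; comparing this mechanistic fit with an empirical model fit to the same runs is one form of the model verification mentioned above.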