Q21: How do I construct a design space when I have multiple important responses?
Figure 7: Assay design space for Factors A and B. The red points are the experimental design points; note that although the
axes extend from –1.5 to +1.5 coded units, the experimental factors only spanned -1.41 to +1.41 coded units.
A: If each response can be adequately modeled using univariate analyses, the simplest approach to dealing with multiple responses
is to overlay contour plots of the fitted model for each response. The overlay plot will indicate regions where each mean
response is within its required bounds.
To illustrate, the example used in Question 19 can be extended to include two responses: Assay and Degradate 1. The design
space can be constructed using the results from the DoE with the factors (A and B) and two quality characteristics. The analysis
of the data from the DoE found that the responses could be adequately modeled using the following equations and that these
responses were not highly correlated:
Assay = 96.9 + 3.2*Factor A - 17*Factor B - 2.2*(Factor A)^2 + 3.0*Factor A*Factor B
Degradate 1 = 0.82 - 0.30*Factor A + 0.49*Factor B + 0.72*Factor A*Factor B
When there is more than one quality characteristic in the design space, the use of overlay plots is helpful. An overlay plot
is created by superimposing the contour plots for each quality characteristic with the required quality bounds (e.g., specifications).
A potential design space can often be found where all the mean quality characteristics are simultaneously within the requirements.
Figure 8 provides a potential window of operability for Assay and Degradate 1. For this overlay plot, the requirements for
assay are 95.0% to 105.0% and Degradate 1 is NMT 1.00%. The yellow region indicates the settings of Factors A and B that meet
both of these requirements simultaneously. The red points included on the plot are the experimental design points; note that
the axes extend from -1.5 to +1.5 coded units although the experimental space is from -1.41 to +1.41 coded units.
Figure 8: Overlay plot of assay (%) and degradate 1 (%) for Factors A and B where the bounds for assay are 95.0%–105.0% and
the upper bound for degradate 1 is 1.00%. The red points represent the observed results.
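The overlay can also be evaluated numerically. The following is a minimal sketch, not the authors' procedure: it uses the two fitted equations with the coefficients exactly as printed in this article, checks the stated requirements (assay 95.0% to 105.0%, Degradate 1 NMT 1.00%) on a grid over -1.5 to +1.5 coded units, and keeps the points where both mean responses comply. The function names and the 31-point grid are assumptions chosen for illustration, and the sketch considers mean responses only, with no allowance for uncertainty.

```python
def assay(a, b):
    """Fitted mean Assay (%) in coded units; coefficients as printed in the article."""
    return 96.9 + 3.2 * a - 17 * b - 2.2 * a ** 2 + 3.0 * a * b


def degradate1(a, b):
    """Fitted mean Degradate 1 (%) in coded units; coefficients as printed."""
    return 0.82 - 0.30 * a + 0.49 * b + 0.72 * a * b


def meets_requirements(a, b):
    """True when both mean responses are within the stated bounds."""
    return 95.0 <= assay(a, b) <= 105.0 and degradate1(a, b) <= 1.00


def overlay_grid(lo=-1.5, hi=1.5, steps=31):
    """Return the (A, B) grid points that satisfy both requirements.

    The grid range matches the plot axes; the grid density is an
    illustrative assumption.
    """
    width = (hi - lo) / (steps - 1)
    levels = [lo + i * width for i in range(steps)]
    return [(a, b) for a in levels for b in levels if meets_requirements(a, b)]


feasible = overlay_grid()
print(f"{len(feasible)} of {31 * 31} grid points meet both requirements")
print("Center point (0, 0) meets both requirements:", meets_requirements(0.0, 0.0))
```

Grid points outside the experimental range of -1.41 to +1.41 coded units are extrapolations and should be read with the same caution as the corresponding corners of the overlay plot.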
With multiple important responses, it is valuable to understand the correlation structure of the responses. A correlation
analysis can help determine if each response should be assessed separately or if the responses should be analyzed together.
If a set of the responses is highly correlated, it may be possible to eliminate some responses from the analysis, recognizing
that each of the correlated responses contains the same information. As a result, one response can be chosen to represent
the set of responses, and it can be analyzed using univariate methods. If the results are moderately correlated, analysis
methods that take the correlation structure into account may be used, such as multivariate analysis of variance (MANOVA), the Bayesian
interval approach referenced in Question 26b (in Part III of this series), or principal component analysis (PCA) if the linear
combinations make scientific sense (e.g., particle size data). These multivariate techniques can provide effective analyses,
but come at the cost of increased complexity. If, for all practical purposes, the responses are not correlated, it is possible
to employ univariate analyses and use simple combinations (such as overlays and desirability functions) of the univariate
models to construct the design space (noting the need to account for uncertainty due to variability). In any case, potential
correlation between responses must be explored for scientific meaning and understanding.
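As a rough illustration of the correlation check described above, the sketch below computes a sample Pearson correlation between two responses and maps it to one of the three analysis routes. The article does not define numerical cutoffs for "highly" or "moderately" correlated, so the thresholds `high=0.9` and `low=0.3` are purely hypothetical, as are the function names and the made-up response values.

```python
from math import sqrt


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


def suggest_strategy(r, high=0.9, low=0.3):
    """Map a correlation to an analysis route.

    The cutoffs are hypothetical placeholders, not values from the article.
    """
    size = abs(r)
    if size >= high:
        return "analyze one representative response"
    if size >= low:
        return "consider a multivariate method"
    return "use univariate models and combine them"


# Made-up values for two generic responses, for illustration only.
response_1 = [4.1, 5.3, 3.8, 6.0, 5.1, 4.6]
response_2 = [2.2, 1.9, 2.8, 2.1, 2.5, 2.0]

r = pearson(response_1, response_2)
print(f"r = {r:.2f}: {suggest_strategy(r)}")
```

Any cutoff of this kind is only a screening aid; the scientific meaning of a correlation still has to be examined, as noted above.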
Q22: Can I use PCA/PLS to analyze data from a statistically designed experiment?
A: The first choice of statistical method for analyzing data from a multivariate DoE would be a multiple linear regression (MLR)
model. Mathematically, partial least squares (PLS) can be used in the analysis of a DoE, but there is some question surrounding this tool's benefit.
PLS is a "latent variable" approach for analyzing multivariate data when there is a correlation structure within the data. If the
responses are independent (i.e., uncorrelated), and the data has originated from an orthogonal DoE (e.g., full-factorial,
fractional-factorial), then PLS will have no mathematical advantage compared to performing the analysis one response at a
time using MLR. In fact, there may be a disincentive to using PLS because if the response variables are all independent, PLS
will require the same number of latent variables as responses.
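To make "one response at a time using MLR" concrete, here is a minimal sketch for a two-level full-factorial design in two factors. Because the coded model columns (intercept, A, B, A*B) are mutually orthogonal, the least-squares coefficients for each response reduce to the column-by-response sums divided by the number of runs, and each response is fitted without reference to the others. The response values are made up for illustration and the function names are assumptions; the sketch does not implement PLS.

```python
# Runs of a 2^2 full-factorial design in coded units (A, B).
runs = [(-1, -1), (1, -1), (-1, 1), (1, 1)]


def model_columns(a, b):
    """Model terms for one run: intercept, A, B, and the A*B interaction."""
    return [1, a, b, a * b]


X = [model_columns(a, b) for a, b in runs]


def columns_orthogonal(matrix):
    """True when every pair of distinct columns has a zero dot product."""
    p = len(matrix[0])
    return all(
        sum(row[i] * row[j] for row in matrix) == 0
        for i in range(p)
        for j in range(i + 1, p)
    )


def fit_one_response(matrix, y):
    """Least-squares coefficients for one response.

    Valid only for orthogonal +/-1 columns, where X'X = n*I and the
    normal equations reduce to b = X'y / n.
    """
    n = len(matrix)
    p = len(matrix[0])
    return [sum(row[j] * yi for row, yi in zip(matrix, y)) / n for j in range(p)]


# Made-up responses, one value per run, for illustration only.
responses = {
    "response_1": [11.5, 14.5, 4.5, 9.5],
    "response_2": [0.5, 0.7, 1.5, 1.3],
}

print("Columns orthogonal:", columns_orthogonal(X))
for name, y in responses.items():
    coefs = fit_one_response(X, y)
    print(name, [round(c, 3) for c in coefs])
```

The shortcut `b = X'y / n` is specific to orthogonal two-level designs with this saturated model; a general MLR fit would solve the full normal equations instead.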
If the responses are correlated, PLS could be used, but there are several other preferred approaches:
- Perform the analysis on each response separately. This is the easiest and most interpretable approach. If some responses
are highly correlated, the factors that are significant and their interpretations will be similar.
- Perform PCA on the responses and analyze the principal components separately. The individual components may serve as an interpretable
summary of the original responses (e.g., particle-size distribution data). Furthermore, the new components are independent
of each other. An exploratory PCA on responses may be useful to identify those responses that are likely to have similar univariate
models under the first approach.
- Use other multivariate methods, such as MANOVA or the Bayesian interval approach referenced in Question 26b, which will
appear in Part III of this series.
1 "Factor" is synonymous with "x," input, variable. A process parameter can be a factor as can an input material. For simplicity
and consistency, "factor" is used throughout the paper. "Response" is synonymous with "y" and output. Here, "response" is
either the critical quality attribute (CQA) or the surrogate for the CQA. For consistency, "response" is used throughout the paper.
2 For continuity throughout the series, Figures and Tables are numbered in succession. Figure 1 and Table I appeared in Part
I of this article series.
Stan Altan is a senior research fellow at Johnson & Johnson Pharmaceutical R&D in Raritan, NJ. James Bergum is associate director of nonclinical biostatistics at Bristol-Myers Squibb Company in New Brunswick, NJ. Lori Pfahler is associate director, and Edith Senderak is associate director, scientific staff, both at Merck and Co. in West Point, PA. Shanthi Sethuraman is director of chemical product R&D at Lilly Research Laboratories in Indianapolis. Kim Erland Vukovinsky* is director of nonclinical statistics at Pfizer, MS 8200-3150, Eastern Point Rd., Groton, CT 06340, tel. 860.715.0916, firstname.lastname@example.org. At the time of this writing, all authors were members of the Pharmaceutical Research and Manufacturers of America (PhRMA)
Chemistry, Manufacturing, and Controls Statistics Experts Team (SET).
*To whom all correspondence should be addressed.
Submitted: Jan. 12, 2010. Accepted: Jan. 27, 2010.
1. S. Altan et al., Pharm. Technol. 34 (7) 66–70 (2010).
The authors wish to thank Raymond Buck, statistical consultant; Rick Burdick, Amgen; Dave Christopher, Schering-Plough; Peter
Lindskoug, AstraZeneca; Tim Schofield and Greg Stockdale, GSK; and Ed Warner, Schering-Plough, for their advice and assistance
with this article.
See Part III of this article series