Statistical Considerations in Design Space Development (Part II of III) - Pharmaceutical Technology

The authors discuss the statistical tools used in experimental planning and strategy and how to evaluate the resulting design space and its graphical representation.

Pharmaceutical Technology
Volume 34, Issue 8, pp. 52-60

Figure 7: Assay design space for Factors A and B. The red points are the experimental design points; note that although the axes extend from -1.5 to +1.5 coded units, the experimental factors only spanned -1.41 to +1.41 coded units.
Q21: How do I construct a design space when I have multiple important responses?

A: If each response can be adequately modeled using univariate analyses, the simplest approach to dealing with multiple responses is to overlay contour plots of the fitted model for each response. The overlay plot will indicate regions where each mean response is within its required bounds.

To illustrate, the example used in Question 19 can be extended to include two responses: Assay and Degradate 1. The design space can be constructed using the results from the DoE with the factors (A and B) and two quality characteristics. The analysis of the data from the DoE found that the responses could be adequately modeled using the following equations and that these responses were not highly correlated:
Assay = 96.9 + 3.2*(Factor A) - 1.7*(Factor B) - 2.2*(Factor A)^2 + 3.0*(Factor A)*(Factor B)

Degradate 1 = 0.82 - 0.30*(Factor A) + 0.49*(Factor B) + 0.72*(Factor A)*(Factor B)

Figure 8: Overlay plot of assay (%) and degradate 1 (%) for Factors A and B where the bounds for assay are 95.0%–105.0% and the upper bound for degradate 1 is 1.00%. The red points represent the observed results.
When there is more than one quality characteristic in the design space, the use of overlay plots is helpful. An overlay plot is created by superimposing the contour plots for each quality characteristic with the required quality bounds (e.g., specifications). A potential design space can often be found where all the mean quality characteristics are simultaneously within the requirements. Figure 8 provides a potential window of operability for Assay and Degradate 1. For this overlay plot, the requirement for assay is 95.0% to 105.0%, and for Degradate 1 it is NMT 1.00%. The yellow region indicates the settings of Factors A and B that meet both of these requirements simultaneously. The red points included on the plot are the experimental design points; note that the axes extend from -1.5 to +1.5 coded units although the experimental space is from -1.41 to +1.41 coded units.
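As an illustration, the overlay region can be computed numerically from the two fitted mean models. The following is a minimal Python sketch, not the authors' software: it evaluates the equations on a grid of coded factor settings and flags the points where both mean responses meet the acceptance criteria, which is the yellow region of Figure 8. Note that it uses only the mean predictions and ignores model uncertainty, a simplification discussed later in this series.

```python
import numpy as np

# Fitted mean models from the DoE, in coded units (coefficients as read
# from the equations above).
def assay(a, b):
    return 96.9 + 3.2*a - 1.7*b - 2.2*a**2 + 3.0*a*b

def degradate1(a, b):
    return 0.82 - 0.30*a + 0.49*b + 0.72*a*b

# Grid over the experimental region (-1.41 to +1.41 coded units).
a, b = np.meshgrid(np.linspace(-1.41, 1.41, 141),
                   np.linspace(-1.41, 1.41, 141))

# Acceptance criteria: assay 95.0-105.0%, Degradate 1 NMT 1.00%.
ok = (assay(a, b) >= 95.0) & (assay(a, b) <= 105.0) & (degradate1(a, b) <= 1.00)

# 'ok' marks the overlay region where both mean responses pass; a filled
# contour plot of 'ok' over (a, b) would reproduce the shading of Figure 8.
print(f"{ok.mean():.0%} of the grid meets both criteria")
```

The same grid approach extends directly to more than two responses: each additional quality characteristic simply contributes another Boolean condition to the intersection.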

With multiple important responses, it is valuable to understand the correlation structure of the responses. A correlation analysis can help determine if each response should be assessed separately or if the responses should be analyzed together. If a set of the responses is highly correlated, it may be possible to eliminate some responses from the analysis, recognizing that each of the correlated responses contains the same information. As a result, one response can be chosen to represent the set of responses, and it can be analyzed using univariate methods. If the results are moderately correlated, analysis methods that take the correlation structure into account may be used, such as multivariate analysis of variance (MANOVA), the Bayesian interval approach referenced in Question 26b (in Part III of this series), or principal component analysis (PCA) if the linear combinations make scientific sense (e.g., particle size data). These multivariate techniques can provide effective analyses, but come at the cost of increased complexity. If, for all practical purposes, the responses are not correlated, it is possible to employ univariate analyses and use simple combinations (such as overlays and desirability functions) of the univariate models to construct the design space (noting the need to account for uncertainty due to variability). In any case, potential correlation between responses must be explored for scientific meaning and understanding.
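A simple way to carry out such a correlation screen is to compute the pairwise correlation matrix of the responses and flag the pairs above a chosen threshold. The sketch below uses hypothetical response data (eight runs, three responses) and an illustrative 0.9 cutoff; neither comes from the article's DoE, and the appropriate cutoff is a scientific judgment, not a standard.

```python
import numpy as np

# Hypothetical DoE results: rows are eight experimental runs, columns
# are three measured responses (values invented for illustration).
responses = np.array([
    [96.1, 0.81, 12.0],
    [98.4, 0.62, 13.1],
    [95.2, 0.95, 11.5],
    [99.0, 0.55, 12.2],
    [97.3, 0.74, 10.9],
    [96.8, 0.79, 12.8],
    [98.9, 0.58, 12.4],
    [95.6, 0.90, 11.2],
])

# Pairwise Pearson correlations; rowvar=False treats columns as variables.
corr = np.corrcoef(responses, rowvar=False)

# Flag response pairs whose |r| exceeds the cutoff as "highly correlated"
# -- candidates to be represented by a single response in the analysis.
high = [(i, j) for i in range(3) for j in range(i + 1, 3)
        if abs(corr[i, j]) > 0.9]
print(high)
```

In this fabricated data set, responses 0 and 1 track each other closely while response 2 is only moderately related, so the screen would suggest analyzing one representative of the first pair together with response 2.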

Q22: Can I use PCA/PLS to analyze data from a statistically designed experiment?

A: The first choice of statistical method for analyzing data from a multivariate DoE would be a multiple linear regression (MLR) model. Mathematically, PLS can be used in the analysis of a DoE, but there is some question surrounding this tool's benefit. PLS is a "latent variable" approach to analyzing multivariate data when there is a correlation structure within the data. If the responses are independent (i.e., uncorrelated) and the data originated from an orthogonal DoE (e.g., full-factorial, fractional-factorial), then PLS will have no mathematical advantage compared to performing the analysis one response at a time using MLR. In fact, there may be a disincentive to using PLS because if the response variables are all independent, PLS will require the same number of latent variables as responses.
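For an orthogonal design, the one-response-at-a-time MLR fit is straightforward: because the model columns are mutually orthogonal, ordinary least squares reduces to independent projections onto each effect. A minimal sketch (the design is a 2^2 full factorial with two center points; the response values are hypothetical):

```python
import numpy as np

# Orthogonal 2^2 full-factorial design with two center points (coded units).
A = np.array([-1.0, -1.0, 1.0, 1.0, 0.0, 0.0])
B = np.array([-1.0, 1.0, -1.0, 1.0, 0.0, 0.0])

# Hypothetical measured response for one quality characteristic.
y = np.array([95.8, 92.1, 101.9, 104.2, 97.0, 96.8])

# MLR: fit y = b0 + b1*A + b2*B + b12*A*B by ordinary least squares.
X = np.column_stack([np.ones_like(A), A, B, A * B])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2, b12 = coef
print([round(c, 3) for c in coef])
```

With a second response, the same `X` is reused and only `y` changes, which is exactly the "one response at a time" analysis the answer describes.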

If the responses are correlated, PLS could be used, but there are several other preferred approaches:

  • Perform the analysis on each response separately. This is the easiest and most interpretable approach. If some responses are highly correlated, the factors that are significant and their interpretations will be similar.
  • Perform PCA on the responses and analyze the principal components separately. The individual components may serve as an interpretable summary of the original responses (e.g., particle-size distribution data). Furthermore, the new components are independent of each other. An exploratory PCA on the responses may be useful to identify those responses that are likely to have similar univariate models under the first approach.
  • Use other multivariate methods, such as MANOVA or the Bayesian interval approach referenced in Question 26b, which will appear in Part III of this series.
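The PCA option above can be sketched with a decomposition computed from the singular values of the standardized responses. The data are hypothetical (two responses deliberately driven by the same factor effect, as with two related size percentiles); the point is that when responses are highly correlated, nearly all of their joint variation lands in the first component, which can then be modeled univariately.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2^2 factorial with center points (coded units).
A = np.array([-1.0, -1.0, 1.0, 1.0, 0.0, 0.0])

# Two highly correlated responses: both driven by the same underlying
# effect of Factor A, plus small independent noise.
y1 = 50 + 5 * A + rng.normal(0, 0.1, 6)
y2 = 100 + 10 * A + rng.normal(0, 0.2, 6)
Y = np.column_stack([y1, y2])

# PCA on the column-centered, scaled responses via SVD.
Z = (Y - Y.mean(axis=0)) / Y.std(axis=0)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
explained = s**2 / np.sum(s**2)

# Nearly all variation lies in the first component, so a univariate
# analysis of that single component summarizes both responses.
print(explained[0])
```

When the first few components explain most of the variation and have a scientific interpretation, each component can be modeled against the factors with ordinary MLR, which is the analysis strategy the bullet describes.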

1 "Factor" is synonymous with "x," input, or variable. A process parameter can be a factor, as can an input material. For simplicity and consistency, "factor" is used throughout the paper. "Response" is synonymous with "y" and output. Here, "response" is either the critical quality attribute (CQA) or the surrogate for the CQA. For consistency, "response" is used throughout the paper.
2 For continuity throughout the series, Figures and Tables are numbered in succession. Figure 1 and Table I appeared in Part I of this article series.

Additional reading

1. International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use, Q8(R1), Pharmaceutical Development, Step 5, November 2005 (core) and Annex to the Core Guideline, Step 5, November 2008.

2. International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use, Q9, Quality Risk Management, Step 4 , November 2005.

3. International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use, Q10, Pharmaceutical Quality System, Step 5, June 2008.

4. Peterson, J. J. (2004). "A Posterior Predictive Approach to Multiple Response Surface Optimization", Journal of Quality Technology, 36, 139-153.

5. Potter, C., et al., "A Guide to EFPIA's Mock P.2 Document," Pharm. Technol. (2006).

6. Glodek, M., Liebowitz, S., McCarthy, R., McNally, G., Oksanen, C., Schultz, T., Sundararajan, M., Vorkapich, R., Vukovinsky, K., Watts, C., and Millili, G. "Process Robustness: A PQRI White Paper", Pharmaceutical Engineering, November/December 2006.

7. Box, G.E.P., W.G. Hunter, and J.S. Hunter (1978). Statistics for Experimenters: An Introduction to Design, Analysis and Model Building. John Wiley and Sons.

8. Montgomery, D.C. (2001). Design and Analysis of Experiments. John Wiley and Sons.

9. Box, G.E.P., and N.R. Draper (1969). Evolutionary Operation: A Statistical Method for Process Improvement. John Wiley and Sons.

10. Cox, D.R. (1992). Planning of Experiments. John Wiley and Sons.

11. Cornell, J. (2002). Experiments with Mixtures: Designs, Models, and the Analysis of Mixture Data, 3rd Edition. John Wiley and Sons.

12. Duncan, A.J. (1974). Quality Control and Industrial Statistics, Richard D. Irwin, Inc., Homewood, IL.

13. Myers, R.H. and Montgomery, D.C. (2002). Response Surface Methodology: Process and Product Optimization Using Designed Experiments. John Wiley and Sons.

14. Montgomery, D.C. (2001). Introduction to Statistical Quality Control, 4th Edition. John Wiley and Sons.

15. del Castillo, E. (2007). Process Optimization: A Statistical Approach. Springer, New York.

16. Khuri, A. and Cornell, J.A. (1996). Response Surfaces, 2nd Edition, Marcel Dekker, New York.

17. MacGregor, J. F. and Bruwer, M-J. (2008). "A Framework for the Development of Design and Control Spaces", Journal of Pharmaceutical Innovation, 3, 15-22.

18. Miró-Quesada, G., del Castillo, E., and Peterson, J.J. (2004). "A Bayesian Approach for Multiple Response Surface Optimization in the Presence of Noise Variables", Journal of Applied Statistics, 31, 251-270.

19. Peterson, J. J. (2004). "A Posterior Predictive Approach to Multiple Response Surface Optimization", Journal of Quality Technology, 36, 139-153.

20. Peterson, J. J. (2008). "A Bayesian Approach to the ICH Q8 Definition of Design Space", Journal of Biopharmaceutical Statistics, 18, 958-974.

21. Stockdale, G. and Cheng, A. (2009). "Finding Design Space and a Reliable Operating Region using a Multivariate Bayesian Approach with Experimental Design", Quality Technology and Quantitative Management (in press).

Stan Altan is a senior research fellow at Johnson & Johnson Pharmaceutical R&D in Raritan, NJ. James Bergum is associate director of nonclinical biostatistics at Bristol-Myers Squibb Company in New Brunswick, NJ. Lori Pfahler is associate director, and Edith Senderak is associate director, scientific staff, both at Merck and Co. in West Point, PA. Shanthi Sethuraman is director of chemical product R&D at Lilly Research Laboratories in Indianapolis. Kim Erland Vukovinsky* is director of nonclinical statistics at Pfizer, MS 8200-3150, Eastern Point Rd., Groton, CT 06340, tel. 860.715.0916. At the time of this writing, all authors were members of the Pharmaceutical Research and Manufacturers of America (PhRMA) Chemistry, Manufacturing, and Controls Statistics Experts Team (SET).

*To whom all correspondence should be addressed.

Submitted: Jan. 12, 2010. Accepted: Jan. 27, 2010.


1. S. Altan et al., Pharm. Technol. 34 (7) 66–70 (2010).


The authors wish to thank Raymond Buck, statistical consultant; Rick Burdick, Amgen; Dave Christopher, Schering-Plough; Peter Lindskoug, AstraZeneca; Tim Schofield and Greg Stockdale, GSK; and Ed Warner, Schering-Plough, for their advice and assistance with this article.


See Part III of this article series

