Statistical Considerations in Design Space Development (Part I of III)

July 2, 2010
Pharmaceutical Technology, Volume 34, Issue 7

The authors discuss the statistical tools used in experimental planning and strategy and how to evaluate the resulting design space and its graphical representation.

Design space is part of the US Food and Drug Administration's quality initiative for the 21st century, which aims to move toward a new paradigm for pharmaceutical assessment as outlined in the International Conference on Harmonization's quality guidelines Q8, Q9, and Q10. The statistics required for design-space development play an important role in ensuring the robustness of this approach.

The objective of this article is to provide concise answers to frequently asked questions (FAQs) related to the statistical aspects of determining a design space as part of quality-by-design (QbD) initiatives. These FAQs reflect the experiences of a diverse group of statisticians who have worked closely with process engineers and scientists in the chemical and pharmaceutical development disciplines, grappling with issues related to the establishment of a design space from a scientific, engineering, and risk-based perspective. The answers provided herein constitute basic information regarding statistical considerations and concepts, and will be beneficial to a scientist working to establish a design space in collaboration with a statistician (see Figure 1).

Figure 1: Areas of common statistical questions regarding the development of a design space. (FIGURE IS COURTESY OF THE AUTHORS)

In this series, the authors discuss the statistical tools used in the following: experimental planning and strategy, including the use, analysis, and interpretation of historical information; the analysis of a statistically designed experiment (DoE); defining a design space based on the results of the DoE; and evaluating the resulting design space and its graphical representation. The concept of risk is interwoven throughout the FAQs to emphasize the notion that risk management is inherent in virtually all development decisions (including all phases of development and clinical trials) and that the application of statistical thinking and statistical methods is intended to investigate, mitigate, and control risk.

The role of statistics with respect to the design space is grouped into sections as follows:

  • Experimental design planning (Questions 1-7, Part I of III)

  • The design in statistical DoE (Questions 8-18, Part II of III)

  • The analysis in statistical DoE (Questions 19-22, Part II of III)

  • Presenting a design space (Questions 23-24, Part III of III)

  • Evaluating a design space (Questions 25-29, Part III of III).

The set of questions and answers is not meant to be exhaustive; additional relevant questions could be posed that go into greater detail or scope in studying a design space. Nevertheless, these questions cover a broad enough range and detail to be useful to the practicing scientist or process engineer.

Overview

The goal of FDA's QbD initiatives and current good manufacturing practice (CGMP) for the 21st century is to more effectively and consistently produce high-quality products through better scientific understanding and risk assessment of drug-product formulation and drug-product and drug-substance manufacture. Quality systems are modified based on designing quality into the process and product throughout the life cycle. Early in the design of the manufacturing process, it is recommended that a risk assessment be performed by a cross-functional team to identify the critical quality attributes (CQAs), the process parameters believed to affect the CQAs, and the effect of incoming-material quality on the manufacturing process.

According to ICH guidelines Q8(R2), Q9, and Q10, a CQA is defined as a physical, chemical, biological, or microbiological property or characteristic that should be within an appropriate limit, range, or distribution to ensure the desired product quality. A design space is a multidimensional combination of input variables (e.g., material attributes), their interactions, and process parameters that have been demonstrated to provide assurance of quality. A statistical and/or mechanistic model can be used to establish a design space. In addition to the establishment of a design space, a target operating setting within the design space is defined to ensure a certain desired level of consistency of the product and process performance during routine operations. As the operating range of process variables approaches the boundary defining a design space, the risk of failing a specification may increase. Whether this risk is significant (i.e., affects quality and, thus, safety and efficacy) depends on the proximity of the operating variables' ranges to the design space boundary and on the ability of the control strategy to detect and mitigate the risk. The execution of an operating plan, including an appropriate control strategy and appropriate process monitoring, is essential to the success of the overall process and product performance.

Statistics is the science of making decisions in the face of uncertainty. Statistical thinking and methods thereby bring established tools and approaches to the determination of a design space, and help to maintain a process that is in control and capable of producing high-quality product. Statistical experimental design is one especially useful tool for establishing a design space in conjunction with risk-based and other modeling tools. This tool provides an effective and efficient way to simultaneously test for factor effects and interactions and to describe causal relationships between process parameters or input materials and the quality attributes.

Experimental design planning

Developing a design space requires a significant level of planning. A critical aspect of QbD is to understand which factors* affect the responses† associated with the CQAs and to specify the operating regions that meet product requirements. Data mining and risk assessment contribute to the selection of factors and responses and to the planning of a DoE. In the following FAQ section, the authors explore the role of historical data, preliminary runs, DoE, responses, factors, and factor ranges in developing a design space.

Q1a: What is the role of prior knowledge, such as historical data, in developing a design space?

A: A starting point for statistical experimental design is the review of available historical information. Historical data might consist of information obtained from previous commercialized products and processes or literature and fundamental scientific understanding. Some possible sources of useful information include: exploratory laboratory-trial data, analytical data, stability data, batch-release data, regular production batches, and deviation data such as factors that do not meet proven acceptable ranges or operating ranges or responses that fail to meet control limits or specifications.

Q1b: How and what type of information can be gleaned from historical data?

A: Historical data can provide valuable information in the planning of designed studies. As part of planning, it is critical to identify appropriate responses, that is, those linked to the quality attributes, and the factors that could contribute to variability in the identified quality attributes. Historical data also can provide information on relationships between factors and responses that can be used to set up experimental studies. For example, information from unit operations or on equipment capability could provide a starting point for factors and factor ranges to include in the designed studies. Although analysis of historical data may provide information on relationships, it should be understood that the structure of the data may affect the conclusions of the analysis. Relationships found in historical data can do no more than point to an association; they cannot be understood as causal. When the historical database is extensive, a variety of tools can be applied to understand relationships, including principal component analysis (PCA) and partial least squares (PLS).
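As a simple illustration of exploring an extensive historical database, the sketch below runs a PCA on a hypothetical table of batch records using only numpy (a singular value decomposition of the centered data). The data, its dimensions, and the deliberately correlated pair of columns are all invented for illustration.

```python
import numpy as np

# Hypothetical historical batch records: rows are batches, columns are
# process measurements (e.g., temperature, water content, stir rate).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
X[:, 3] = X[:, 0] + 0.1 * rng.normal(size=30)  # two correlated measurements

# Center the data, then compute PCA via singular value decomposition.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)  # fraction of variance per component

scores = Xc @ Vt.T  # batch coordinates in principal-component space
print(np.round(explained, 3))
```

A scatter plot of the first two columns of `scores` would show how batches cluster, and the loadings in `Vt` would show which measurements move together, which is the kind of associational (not causal) structure discussed above.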

Q1c: What problems can arise when using historical data?

A: It is important to exercise caution when mining historical data because historical data are restricted to observed variation in levels versus forced controlled variation in levels from a planned statistical design. Gaps in knowledge, therefore, are likely. Problems can arise due to the nature of observational data. No analysis can make up for deficiencies in the structure and quality of the data. Below are a few critical points that can arise when analyzing historical data.

  • Multicollinearity: In observational data in which two or more factors are highly correlated, it is difficult to identify which factor(s) are affecting the response. No analysis can remedy this problem.

  • Missing factors: Important factors may not be recorded in the data, and if the recorded factors are correlated with unrecorded causal factors, a partial relationship may be mistakenly proposed. A common mistake is to attribute causality when only associational relationships can be proposed until a confirmatory experiment is carried out.

  • Missing data and imbalance: Historical data can frequently suffer from a relatively large proportion of missing information. Prediction in areas where there is no information or extrapolation to areas where no data exists can be highly misleading.

  • Precision of the data: Over-rounded data can lead to misleading conclusions.

  • Range of factor levels: Data covering a wider range are more likely to reveal a relationship. If the inputs haven't been varied across an appropriate range, the relationship won't be detected.

The relationships established by historical-data analysis should be confirmed by appropriate DoEs before defining or expanding a design space.
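The multicollinearity problem in the first bullet above can be demonstrated numerically. In this invented example, two historical factors move almost in lockstep, and only one of them truly drives the response; a least-squares fit cannot apportion the effect between them, although their combined effect is estimated well.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
# Two nearly collinear historical factors: x2 tracks x1 almost exactly.
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)
y = 2.0 * x1 + rng.normal(scale=0.5, size=n)  # only x1 truly drives y

# Fit y ~ intercept + x1 + x2 by ordinary least squares.
A = np.column_stack([np.ones(n), x1, x2])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# The individual x1 and x2 coefficients are unstable, but their sum
# (the effect of moving both factors together) is estimated well.
print(np.round(coef, 2), round(coef[1] + coef[2], 2))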

Q2: What role do experimental design and DoE play in establishing a design space?

A: There are underlying mathematical models to all scientific endeavors: compound properties, formulation development, and process development. Good approximate models can be developed to understand the effect of process parameters and material inputs on formulation and processing quality attributes, so that acceptable outcomes can be assured. One efficient and effective means to determine these approximate models, which are causal and not merely correlative, is through DoE. Causality in the relationship between factors and responses is a consequence of having observed specific changes in the responses as factor levels are varied. Applying DoE principles in conjunction with mechanistic understanding through the use of first principles, when available, provides a model-based scientific understanding of the system and process. A design space can be thought of as a summary of such understanding (i.e., a "region of goodness"). Other desirable DoE properties include maximizing information with a minimum number of runs, exploring interactions between factors efficiently, allowing model testing, and the ability to apply randomization and blocking principles to minimize bias.
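A minimal sketch of these DoE properties, with entirely hypothetical factors and a simulated response: a 2^3 full factorial in coded units estimates all three main effects and all two-factor interactions from only eight runs, and because the design columns are orthogonal, each effect is estimated independently of the others.

```python
import itertools
import numpy as np

# A 2^3 full factorial in coded units (-1, +1) for three hypothetical
# factors (e.g., temperature, stir rate, reagent equivalents).
runs = np.array(list(itertools.product([-1, 1], repeat=3)), dtype=float)

# Simulated response: main effects for factors 0 and 1 plus a 0x1
# interaction, with small measurement noise.
rng = np.random.default_rng(2)
y = (10 + 2.0 * runs[:, 0] - 1.5 * runs[:, 1]
     + 1.0 * runs[:, 0] * runs[:, 1]
     + rng.normal(scale=0.1, size=8))

# Model matrix: intercept, three main effects, three two-factor interactions.
X = np.column_stack([
    np.ones(8), runs[:, 0], runs[:, 1], runs[:, 2],
    runs[:, 0] * runs[:, 1], runs[:, 0] * runs[:, 2], runs[:, 1] * runs[:, 2],
])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(coef, 2))
```

The fitted coefficients recover the simulated effects closely, including the interaction, which one-factor-at-a-time experimentation could not have separated from the main effects.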

Q3: How should responses (e.g., impurity level, particle size) be selected and what are the consequences if the responses are not appropriate or not well defined?

A: The first step in designing an experiment is to decide on the purpose of the study. The purpose may be to study the effect of certain factors on the responses, to estimate a predictive model that relates factors to responses, to screen factors, or to optimize the process. Once there is agreement on the purpose of the study, determine the most appropriate responses to measure.

Choosing responses that may be the CQAs or closely related to them is fundamental. It is therefore essential to decide on the candidate process parameters and CQAs as early as possible. It is important to think critically about which responses to measure so that during the statistical analysis one does not discover that a key response was not collected. In addition, there could be significant consequences if certain important data attributes are not considered when choosing the relevant responses. Some important considerations and consequences are listed in Table I.

Table I: Considerations in selecting responses when developing a design space.

Q4a: How should factors be selected?

A: Typically, Ishikawa charts or fishbone diagrams are useful in listing potential factors that could explain the variability in the key responses. Following a risk-based approach, one can choose a subset of potentially important primary factors. These factors can be classified as:

  • Controllable (e.g., equivalents of starting material, processing speed)

  • Mixture (i.e., when the independent factors are proportions of different components of a blend [e.g., amounts of excipients])

  • Blocking (i.e., when the experiment is carried out across several groups of runs [e.g., days might be a block when the experimental runs are carried out across several days])

  • Measurable, but not controlled or controlled within a range (e.g., amount of water in the reagent, % loss on drying)

  • Noise or nuisance (e.g., ambient temperature or humidity, that cannot be controlled and may not be capable of being measured): These factors are not accounted for in the statistical model and their overall effect is contained within the random residual variability estimate.

Q4b: What are the consequences if factors are not selected properly?

A: Ultimately, the factors selected for a DoE are those that experts involved in the risk assessment and historical review suggest could have an effect on the responses. It is possible for the selection to be incorrect, with the effect of the error varying from situation to situation, as outlined below.

  • It may be that for a particular study, important factors or their interactions were not included. They are important in the sense that had they been included, they would have shown substantial effects on the response(s). Because design space is limited to the region defined by the factor ranges considered in the study, the effect of factors not included in the study is unknown. For factors held constant during the study, additional trials would be needed to evaluate what effect, if any, they have on the response.

  • Factors that are not controlled in the initial study (i.e., noise or nuisance factors) may affect the ability to accurately estimate and understand the effects of the factors that were studied. The effect of a factor that was not studied may appear later when it does vary. As a result, problem-solving work may be necessary, leading to project delays. Although special designs can be constructed to address noise factors, that topic is beyond the scope of this article.

  • Including a factor in a DoE and finding that it has no effect on the responses may appear to be a waste of resources. In fact, there may be great value in learning about this lack of sensitivity because this factor can be set to minimize cost or increase convenience.

Q5: What is an appropriate number of factors to study in a designed experiment?

A: There is no strict requirement on the number of factors to be included in a study. The number of factors must be balanced against the goal of the study (i.e., optimization or effect estimation; see Questions 8-18 in Part II of this series) and the information required for establishing a design space, versus any time or resource constraints imposed on the experimenters. Fishbone diagrams and a risk-based approach can help classify factors as those with a high probability of affecting the responses of interest, those with the potential to affect them, and those very unlikely to affect them. Time and resources are typically determined by the number of factors, because considering more factors, or desiring a more detailed understanding of their effects (e.g., response surface estimation), leads to a larger experiment.

Q6a: How should the ranges for each factor be selected?

A: The ranges should be set in relation to:

  • Feasibility at the commercial scale

  • Basic knowledge of the product, including modeling

  • If feasible, capability of laboratory simulation and modeling.

There may be other constraints related to the practical or physical limitations of the equipment. If the ranges are too wide, the observed effect may be too large, thereby swamping the effects of other factors. Such an effect also could provide little information about the region of interest, or fall outside a linear region, thereby making modeling more complex or even leading to immeasurable responses. If the ranges are too narrow, it may not be possible to explore the region of interest sufficiently; the effect may be too small relative to the measurement variability to provide good model estimates; or, in some cases, the effect may not be observed at all. In addition, there may be some factors that cannot be simulated at a laboratory scale or do not scale up well (e.g., mixing, feed rates). Those factors need to be identified, and other possible ways to account for their effect on the responses should be considered.
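The too-narrow-range problem can be made concrete with a small simulation. Under the invented setup below (a true unit slope and fixed measurement noise), halving the factor range multiplies the standard deviation of the estimated effect; with the same noise and number of runs, a narrow range yields a much less precise estimate.

```python
import numpy as np

rng = np.random.default_rng(3)

def slope_sd(half_range, n_sim=2000, noise=1.0):
    """Monte Carlo standard deviation of the fitted slope when the
    factor is varied over [-half_range, +half_range] in 8 runs."""
    x = np.array([-half_range, half_range] * 4)  # two levels, 4 reps each
    estimates = []
    for _ in range(n_sim):
        y = 1.0 * x + rng.normal(scale=noise, size=x.size)  # true slope = 1
        estimates.append(np.polyfit(x, y, 1)[0])
    return np.std(estimates)

narrow, wide = slope_sd(0.5), slope_sd(2.0)
print(round(narrow, 2), round(wide, 2))  # narrow range gives a noisier estimate
```

This is the trade-off described above: widening the range improves the precision of the estimated effect, but only so long as the response stays measurable and approximately linear over the widened region.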

Q6b: Would it help to carry out preliminary runs before conducting a DoE?

A: Preliminary experiments might be useful for establishing experimental ranges. Also, the design may contain certain combinations of factors that are not feasible to execute. The design should be thoroughly reviewed by a multidisciplinary team before execution. It is a good practice to explore the factor-level combinations using a risk-based approach before embarking on the traditional random order used for executing the design. Sometimes, performing one or two experimental runs representing the extreme of the design can provide key information not only about the process but also about the appropriate use of resources. If the process does not perform for these settings, it may be prudent to change some of the factor ranges and redesign the study to ensure that informative response data can be obtained from each trial.

Q7: Is it better to run one big DoE study or several small DoE studies to determine a design space?

A: Running several small experiments versus one large experimental study depends on, but is not limited to, the following:

  • The purpose of the study

  • The availability of raw materials and other resources

  • The amount of available prior information (data mining, historical information, or one-factor-at-a-time studies) or basic scientific knowledge

  • The amount of time it takes to perform the study (including set-up and runs).

Manufacturing processes generally consist of unit operations, each of which contains several factors to evaluate. Each operation could be analyzed separately, or a single design could be used to study several unit operations (see Question 15 in Part II of this series).

There are numerous pros and cons to consider when deciding which approach to take. For example, one advantage of conducting a larger experiment is that interactions between more factors can be evaluated. It is possible to create a design in which factors that are expected to interact with one another are included in the same DoE and factors that do not interact with each other are included in another smaller design. On the other hand, if something goes wrong during the experiment, a smaller study approach could save resources. Smaller designs are also useful when final ranges for the factors have not been determined. If the factor levels are too wide and a large experiment is performed, there is an increased risk that many of the experiments could fail. However, running a small experiment usually requires the factors not included in the design to be fixed at a specific level. Therefore, any interaction between these factors and those in the experiment cannot be examined.

*Factor is synonymous with "x," input, and variable. A process parameter can be a factor, as can an input material. For simplicity and consistency, factor will be used throughout the paper.

†Response is synonymous with "y" and output. Here, response is either the critical quality attribute (CQA) or a surrogate for the CQA. For consistency, response will be used throughout the paper.

Additional reading

1. International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use, Q8(R1), Pharmaceutical Development, Step 5, November 2005 (core) and Annex to the Core Guideline, Step 5, November 2008.

2. International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use, Q9, Quality Risk Management, Step 4, November 2005.

3. International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use, Q10, Pharmaceutical Quality System, Step 5, June 2008.

4. Peterson, J.J. (2004). "A Posterior Predictive Approach to Multiple Response Surface Optimization," Journal of Quality Technology, 36, 139-153.

5. Potter, C., et al. (2006). "A Guide to EFPIA's Mock P.2 Document," Pharmaceutical Technology.

6. Glodek, M., Liebowitz, S, McCarthy, R., McNally, G., Oksanen, C., Schultz, T., Sundararajan, M., Vorkapich, R., Vukovinsky, K., Watts, C., and Millili, G. Process Robustness: A PQRI White Paper, Pharmaceutical Engineering, November/December 2006.

7. Box, G.E.P, W.G. Hunter, and J.S. Hunter (1978). Statistics for Experimenters: An Introduction to Design, Analysis and Model Building. John Wiley and Sons.

8. Montgomery, D.C. (2001). Design and Analysis of Experiments. John Wiley and Sons.

9. Box, G.E.P., and N.R. Draper (1969). Evolutionary Operation: A Statistical Method for Process Improvement. John Wiley and Sons.

10. Cox, D.R. (1992). Planning of Experiments. John Wiley and Sons.

11. Cornell, J. (2002). Experiments with Mixtures: Designs, Models, and the Analysis of Mixture Data, 3rd Edition. John Wiley and Sons.

12. Duncan, A.J. (1974). Quality Control and Industrial Statistics, Richard D. Irwin, Inc., Homewood, IL.

13. Myers, R.H., and Montgomery, D.C. (2002). Response Surface Methodology: Process and Product Optimization Using Designed Experiments. John Wiley and Sons.

14. Montgomery, D.C. (2001). Introduction to Statistical Quality Control, 4th Edition. John Wiley and Sons.

15. del Castillo, E. (2007). Process Optimization: A Statistical Approach. Springer, New York.

16. Khuri, A., and Cornell, J.A. (1996). Response Surfaces, 2nd Edition. Marcel Dekker, New York.

17. MacGregor, J. F. and Bruwer, M-J. (2008). "A Framework for the Development of Design and Control Spaces", Journal of Pharmaceutical Innovation, 3, 15-22.

18. Miró-Quesada, G., del Castillo, E., and Peterson, J.J. (2004). "A Bayesian Approach for Multiple Response Surface Optimization in the Presence of Noise Variables," Journal of Applied Statistics, 31, 251-270.

19. Peterson, J. J. (2004). "A Posterior Predictive Approach to Multiple Response Surface Optimization", Journal of Quality Technology, 36, 139-153.

20. Peterson, J. J. (2008). "A Bayesian Approach to the ICH Q8 Definition of Design Space", Journal of Biopharmaceutical Statistics, 18, 958-974.

21. Stockdale, G. and Cheng, A. (2009). "Finding Design Space and a Reliable Operating Region using a Multivariate Bayesian Approach with Experimental Design", Quality Technology and Quantitative Management (in press).

Acknowledgments

The authors wish to thank Raymond Buck, statistical consultant; Rick Burdick, Amgen; Dave Christopher, Schering-Plough; Peter Lindskoug, AstraZeneca; Tim Schofield and Greg Stockdale, GSK; and Ed Warner, Schering-Plough, for their advice and assistance with this article.

Stan Altan is a senior research fellow at Johnson & Johnson Pharmaceutical R&D in Raritan, NJ. James Bergum is associate director of nonclinical biostatistics at Bristol-Myers Squibb Company in New Brunswick, NJ. Lori Pfahler is associate director, and Edith Senderak is associate director/scientific staff, both at Merck and Co. in West Point, PA. Shanthi Sethuraman is director of chemical product research and development at Lilly Research Laboratories in Indianapolis. Kim Erland Vukovinsky* is director of nonclinical statistics at Pfizer, MS 8200-3150, Eastern Point Rd., Groton, CT 06340, tel. 860.715.0916, kim.e.vukovinsky@pfizer.com. At the time of this writing, all authors were members of the Pharmaceutical Research and Manufacturers of America (PhRMA) Chemistry, Manufacturing, and Controls Statistics Experts Team (SET).

*To whom all correspondence should be addressed.

Submitted: Jan. 12, 2010. Accepted: Jan. 27, 2010.

See Part II of this article series.

See Part III of this article series.