Q17: When developing a design space, one tends to increase batch size from small laboratory batches to medium and large production
batches. Because production-size batches are more expensive, how can one minimize the resources used to develop the final
design space?
Table II^{2}: Advantages and challenges in the use of laboratoryscale models.

A: A significant amount of resources is required to establish a detailed design space using large-scale batches. A laboratory-scale
model is beneficial for understanding the chemistry and the effect of other scale-independent factors on the desired
responses. Most of the effort to define ranges and evaluate factors should be done at the laboratory scale(s). One strategy
is to reduce the number of factors and the number of batches as the batch size increases while retaining the most important
factors. Using laboratory-scale models has certain advantages and challenges (see Table II^{2}).
One challenge when using a laboratory-scale model is the inability to simulate the impact of scale-dependent factors. Some
factors may be known to depend on batch size. In other cases, mechanistic knowledge may enable a scale-independent factor
to be identified in place of a more obvious scale-dependent factor. Increasing the batch size from laboratory scale
to a medium size may identify other scale-dependent factors that need to be studied at production scale.
There are a few options for addressing scaleup:
 (a) Identify the least desirable, yet acceptable, combination(s) of the factors that have been studied in the laboratory. The
least desirable combination for each critical quality attribute (CQA), that is, the one likely to give the least desired
response, is determined from scientific knowledge and laboratory data. In addition, identify the operating target
combination obtained from the laboratory-scale model. Run the worst-case combination(s), the operating target combination,
and preferably the best-case combination(s) in duplicate at the large scale. If the combined experimental runs provide results
that pass the specifications, then the ranges from the large scale can be used to establish the ranges of the design space
for the factors.
 (b) Identify the scale-dependent factors and their potential ranges based on laboratory and medium-size batches. Perform a very
small designed experiment with these factors (at two levels) with the other factors at their worst combination(s) so that
any potential interaction of the scale-dependent and scale-independent factors can be studied at large scale.
 (c) Generate scale-dependent data on intermediate scales using approach (b) above and perform approach (a) on the actual production
scale. This approach might be useful when material cost is high.
 (d) Augment a design space over time using information from batches that are made at large scale. The data should include deviation
data and data from other small designed studies that may have been performed to gain more knowledge about the process in real
time.
 (e) Use mechanistic models that can provide a good understanding of scale-dependent rates. These models allow one to change factor
settings at large scale to ensure the desired response. Typical examples include heat transfer and mass transfer
in chemical reactions. Equipment and scale effects can be studied using rate kinetics studies in the laboratory or by performing
in silico experiments on a mechanistic model. These can provide an optimum combination for large-scale production.
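As an illustration of approach (b), the small designed experiment can be sketched as a two-level full factorial in the scale-dependent factors, with every scale-independent factor held at its worst-case setting. The factor names and levels below are hypothetical and chosen only for illustration; in practice the design would be selected with a statistician.

```python
from itertools import product

# Hypothetical scale-dependent factors with (low, high) levels; not from the source.
scale_dependent = {
    "agitation_rpm": (50, 80),
    "heat_up_time_min": (30, 60),
}

# Hypothetical worst-case settings of the scale-independent factors,
# as they might be identified from laboratory-scale work.
worst_case = {"reagent_equiv": 1.05, "temperature_C": 70}


def large_scale_runs(scale_dependent, worst_case):
    """Return a 2^k full factorial in the scale-dependent factors,
    with the scale-independent factors fixed at their worst case."""
    names = list(scale_dependent)
    runs = []
    for levels in product(*(scale_dependent[n] for n in names)):
        run = dict(zip(names, levels))
        run.update(worst_case)
        runs.append(run)
    return runs


runs = large_scale_runs(scale_dependent, worst_case)
for run in runs:
    print(run)
```

With two scale-dependent factors this gives four large-scale runs; each added factor doubles the count, which is why the number of factors carried to large scale should be kept small.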
Q18: Are there any other statistical principles to consider when designing the experiment that might reduce the number of batches
required or improve identification and/or estimation of important factors and their effects?
A: There are several statistical principles that can be used to ensure that the important factors are identified and their effects
are estimated without ambiguity, as outlined below.
 Randomization refers to running experiments in such a way that the order in which the experimental runs are performed does
not induce systematic bias in the results. To accomplish randomization, each run should be treated as though it is the first;
that is, all factors should be reset before each run. Any divergence from complete randomness for convenience in running the
experiment introduces a potential change into the analysis approach. Divergence from traditional randomization can often be
economically desirable or physically necessary, but should be discussed with a statistician before running the experiment.
For example:
 Splitting the batch into sub-batches. In a DoE, there could be factors applied to the parent batch and other factors applied
to the sub-batches. An example would be to manufacture tablet batches with different combinations of factors (e.g., mixing
time, lubricant level, mixing speed) and then split the core tablets from each batch into sub-batches for coating, which would
have its own set of factors (e.g., spray rate, pan speed). These types of designs are called split-plots and are analyzed
in a different way than a factorial experiment because the error structure is more complicated.
 Performing an experiment in a convenient manner. It may be convenient to make all of the high-temperature batches first and
the low-temperature batches afterward. However, this arrangement can introduce a split-plot structure and other biases into
the experiment and lead to a misleading conclusion.
 Not planning the order of analytical (or other) testing. It is important to ensure that measurement sources of variability
are not confounded with the effects being studied. Consider the following examples demonstrating the impact of analytical
variability on the samples from the experimental phase. Suppose that six batches were manufactured in the experiment and sent
to the analytical department for dissolution testing. Assume that each batch is tested in its own analytical run, using six
tablets from that batch. The batch results are now confounded with the analytical run. A
better approach would be to perform six analytical runs with six tablets per run but, within each run, use one tablet from
each batch. This approach will separate the batch-to-batch variability from the analytical variability, thereby improving
the comparison between batches. Analytical runs could also be confounded with factor effects. If samples generated from all
of the high levels of a factor (e.g., temperature) in an experiment are tested in one analytical run and the low levels in
another analytical run, it is possible that a significant difference between the high and low levels is not due to a temperature
effect but rather to day-to-day variation in the analytical method (i.e., the factor temperature is confounded
with analytical days). Instead, one might randomly assign the samples from the experimental study to the different analytical
runs.
 A sampling strategy, defined before the experiment is run, determines where and how many samples should be obtained to maximize
information. Collecting samples throughout the execution of a given experimental run provides significant information that
can be used to understand mechanisms across the run. Preliminary experiments are useful for developing the sampling strategy.
This exercise also helps to identify what type of analytical methods would be beneficial for obtaining accurate and precise
information about the process.
 Replication refers to running one or more independent repeats of the same experimental condition or setting to provide reliable
estimates and/or conclusions from the data. Replication should not be confused with repeated measurements of the same sample
or measurements of multiple samples taken at the same experimental condition. To have adequate ability or power to distinguish
factor effects and estimate with good precision the coefficients of the statistical model, it is essential to have sufficient
replication of the design points in the study. Most of the time, the center design point of the statistical study is replicated.
However, there may be situations in which replication of the extreme design points provides precision information across the
statistical design. In general, the number and location of replicates depend on the size of the effects to be detected, the
desired statistical power, and the range of the factors in the study.
 Statistical thinking refers to recognizing that all work occurs in a series of interconnected processes. Before running a
designed experiment, the process and potential sources of variability should be discussed, understood, and controlled as much
as possible. Diligence in this manner will aid in assuring more accurate and precise data. The scientist should be well trained
and knowledgeable about the process flow and the equipment. In this case, performing a preliminary experiment(s) can be helpful.
Learning how to operate the equipment while conducting the experiment adds a confounding factor to the experiment, which may
adversely affect the conclusions. In addition, because the analysis usually depends on analytical results, the methods should
have adequate accuracy and precision before performing an experiment.
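The randomization, replication, and analytical-testing points above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a substitute for a design produced with a statistician: the two-factor design, the number of center-point replicates, the batch labels, and the function names are all hypothetical.

```python
import random


def randomized_run_order(design_points, n_center_replicates, seed=None):
    """Return the design points plus independent center-point replicates
    in a random run order."""
    rng = random.Random(seed)
    k = len(design_points[0])
    # Each center point is a separate, independently executed run (true
    # replication), not a repeated measurement of one run.
    runs = list(design_points) + [(0,) * k] * n_center_replicates
    rng.shuffle(runs)
    return runs


def blocked_analytical_runs(batches, n_runs, seed=None):
    """Each analytical run tests one tablet from every batch, in random order,
    so batch differences are not confounded with run-to-run variation."""
    rng = random.Random(seed)
    plan = []
    for _ in range(n_runs):
        order = list(batches)
        rng.shuffle(order)
        plan.append(order)
    return plan


# Hypothetical 2^2 factorial in coded units with three center-point replicates.
factorial = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
run_order = randomized_run_order(factorial, n_center_replicates=3, seed=1)

# Six batches, six analytical runs, one tablet per batch in each run.
batches = ["B1", "B2", "B3", "B4", "B5", "B6"]
plan = blocked_analytical_runs(batches, n_runs=6, seed=1)

for i, order in enumerate(plan, start=1):
    print(f"analytical run {i}: {order}")
```

Fixing the seed only makes the plan reproducible for documentation; the order is still random with respect to the factors being studied.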
Analysis of DoE
This section provides an overview and interpretation of the analysis of DoE data, beginning with the simple single-response,
single-factor case, progressing to the single-response, multiple-factor case, and concluding with multiple responses
and multiple factors. Analysis of variance (ANOVA) along with partial least squares (PLS) analyses are discussed.
Q19: How do I analyze the data and define the design space for one response and one factor?
Figure 2: Categorical Factor X and continuous product attribute Y.

A: The analysis of an experiment that studies the effect of a single material attribute or process factor, X, on a single response,
Y, depends on whether X and Y are categorical or continuous and whether the relationship is described by a well-defined
mechanistic model or an empirical model. Typically, a design space has more than one CQA and more than one factor, but this
example is useful for understanding more complex situations. Figure 2^{2} shows an example where X is categorical. Figure 3 shows
an example where X is continuous and the relationship between X and Y is described by a mechanistic model.
When X is categorical as shown in Figure 2, statistical intervals on the response are constructed for different values of
the categorical factor. Figure 2 shows a box-plot display of the response for each of the X values (A, B, C, D). Box
plots summarize the data for each X value, showing the spread and center of the response at the value of X. The design space
would include all the X values where the statistical limits on Y provide acceptable quality.
Figure 3: Arrhenius relationship.

When X is continuous, functional relationships are fitted between X and Y as shown in Figure 3. When the relationship between
X and Y is not well-defined, or is too complicated, empirical models may be used to approximate the underlying relationship
between X and Y. First-order and second-order polynomial models are frequently used and have been found to be very useful
in approximating the relationship between X and Y over limited ranges of X.
Figure 4: Empirical model of degradate (Y) on Factor B (Degradate = 0.72 + 0.57 * Factor B).

Figure 4 illustrates an example of an empirical model of Y on X. A linear regression equation is used to model how the level
of a degradate changes with varying levels of Factor B. If the specification for degradate is not more than (NMT) 1.00%, then
–1.00 to 0.49 may be defined as a design space for Factor B. This is shown as the green region in Figure 4. In this region,
the predicted values from the regression equation meet specification. Note that there is some uncertainty associated
with the region just defined. Evaluation of the design space is discussed in Question 25, which will be covered in Part III
of this series.
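The upper edge of this region follows directly from the fitted line in Figure 4. Setting the predicted degradate equal to the specification limit and solving for Factor B gives B = (1.00 – 0.72) / 0.57 ≈ 0.49. The short Python sketch below reproduces that arithmetic; the function names are illustrative, the lower bound of –1.00 is taken from the text as given, and the calculation ignores the uncertainty in the fitted coefficients.

```python
def predicted_degradate(factor_b):
    """Fitted line from Figure 4: Degradate = 0.72 + 0.57 * Factor B."""
    return 0.72 + 0.57 * factor_b


def upper_limit_for_spec(spec_limit, intercept=0.72, slope=0.57):
    """Solve intercept + slope * B = spec_limit for B (assumes a positive slope)."""
    return (spec_limit - intercept) / slope


b_max = upper_limit_for_spec(1.00)
print(round(b_max, 2))  # 0.49

# Predictions at the edges of the stated design space (-1.00 to 0.49).
for b in (-1.00, 0.49):
    print(b, round(predicted_degradate(b), 4))
```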
Q20: Once I have data from the designed experiment, how do I define a design space for one response and multiple factors?
A: The design space can be built using a statistical model that results from the DoE that has been executed. The particular
model employed will vary depending on the design and will be subject to the appropriate caveats. For example, a screening
design may produce a model with linear terms and may assume no curvature. An interaction design may assume no quadratic effects.
An optimization experiment is an approximation to a more complicated response function. As described in Questions 8–18, a
risk-based approach is used to determine the DoE structure, which in turn determines the analysis.
The modeling step of the analysis is often an iterative process where the statistically significant model terms that contribute
to explaining the response are determined. The final fitted model can be used to generate interaction profiles and contour
plots to help visualize and understand the effect of the factors on the response. An interaction profile shows how the response
changes as one factor changes at given levels of another factor. A contour plot is a two-dimensional graph of two factors
and the fitted response. Vertical and horizontal axes of the contour plots represent factors from the DoE while the lines
on the contour plot connect points in the factor plane that have the same response value (y), thereby producing a surface
similar to a topographic map. The contour lines show peaks and valleys for the quality characteristic over the region studied
in the DoE. When there are more than two factors in the experiment, contour plots can be made for several levels of the other
factors (see Figures 5 and 6).
Figure 5: Assay interaction profile depicting the interaction effect between Factors A and B on assay. The subgraph on the
left shows how the assay (y-axis) changes from the low to high level of A (x-axis). The red line indicates the assay values
for the low level of B (–1.41 coded units) and the blue line indicates assay values for the high level of B (1.41 coded units).
The subgraph on the right shows levels of B on the x-axis and the low and high levels of A coded as red and blue lines. This
subgraph provides insight into how assay is related to the level of Factor A and shows that this relationship depends on the level
of Factor B.

Figure 6: Assay contour plot showing the expected assay responses for different combinations of Factors A and B. Red points
are the experimental design points; note that although the axes extend from –1.5 to +1.5 coded units, the experimental factors
only spanned –1.41 to +1.41 coded units. Color is used to indicate the value of assay, with red representing lower values and
blue indicating higher values.

The same fitted equation that is used to draw Figures 5 and 6,
Assay = 96.9 + 3.2 * A – 1.7 * B + 3.0 * A * B – 2.2 * A^{2}
can then be used to determine the factor ranges that would produce acceptable results, that is,
results that meet specifications. Assuming no variability, Figure 7 shows an example of the region (in yellow) for Factors
A and B where the assay values are predicted to be at least 95%. The levels of Factors A and B are presented in coded/standardized
form, with –1 representing the low level of the factor and +1 representing the high level of the factor in the study. This coding
allows the factors to be viewed on a common scale that is not dependent on units of measure. Because this DoE is a response
surface design, the experiment includes star points that extend from the center of the design to the –1.41 and +1.41 levels.
As in Question 19, there can be variability and some degree of uncertainty associated with the edges of a design space
region (see Figure 7). The use of a probability region to define a design space to protect against the uncertainty of meeting
specification at the edges of the region described in Figure 7 is discussed in Question 26, which will appear in Part III
of this series.
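A region like the one in Figure 7 can be approximated by evaluating the fitted model over a grid of coded factor settings and keeping the points that meet the assay limit. The Python sketch below does this under two assumptions: the fitted equation is read as Assay = 96.9 + 3.2 * A – 1.7 * B + 3.0 * A * B – 2.2 * A^{2} (the minus signs on the B and A^{2} terms are inferred, since they are not legible in the extracted text), and prediction uncertainty is ignored, so the result is a point-prediction region only. The function names and the grid resolution are illustrative.

```python
def predicted_assay(a, b):
    """Fitted model in coded units (signs of the B and A^2 terms are assumed)."""
    return 96.9 + 3.2 * a - 1.7 * b + 3.0 * a * b - 2.2 * a**2


def acceptable_region(limit=95.0, lo=-1.41, hi=1.41, steps=29):
    """Return grid points (a, b) whose predicted assay is at least `limit`."""
    step = (hi - lo) / (steps - 1)
    grid = [lo + i * step for i in range(steps)]
    return [(a, b) for a in grid for b in grid if predicted_assay(a, b) >= limit]


region = acceptable_region()
print(predicted_assay(0.0, 0.0))  # 96.9 at the center point
print(len(region), "of", 29 * 29, "grid points have predicted assay >= 95")
```

Because the model includes an A * B interaction, the acceptable range for one factor depends on the setting of the other, so the region cannot in general be described by two independent ranges.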
