Statistical Considerations in Design Space Development (Part I of III) - Pharmaceutical Technology

Statistical Considerations in Design Space Development (Part I of III)
The authors discuss the statistical tools used in experimental planning and strategy and how to evaluate the resulting design space and its graphical representation.

Pharmaceutical Technology
Volume 34, Issue 7, pp. 66-70

Experimental design planning

Developing a design space requires a significant level of planning. A critical aspect of QbD is to understand which factors* affect the responses† associated with the CQAs and to specify the operating regions that meet product requirements. Data-mining and risk assessment contribute to the selection of factors and responses and to the planning of a DoE. In the following FAQ section, the authors explore the role of historical data, preliminary runs, DoE, responses, factors, and factor ranges in developing a design space.

Q1a: What is the role of prior knowledge, such as historical data, in developing a design space?

A: A starting point for statistical experimental design is the review of available historical information. Historical data might consist of information obtained from previously commercialized products and processes, from the literature, or from fundamental scientific understanding. Possible sources of useful information include exploratory laboratory-trial data, analytical data, stability data, batch-release data, regular production batches, and deviation data, such as factors that do not meet proven acceptable ranges or operating ranges or responses that fail to meet control limits or specifications.

Q1b: How and what type of information can be gleaned from historical data?

A: Historical data can provide valuable input for the planning of designed studies. As part of planning, it is critical to identify appropriate responses (that is, those linked to the quality attributes) and the factors that could contribute to variability in those quality attributes. Historical data also can provide information on relationships between factors and responses that can be used to set up experimental studies. For example, information from unit operations or on equipment capability could provide a starting point for the factors and factor ranges to include in the designed studies. Although analysis of historical data may provide information on relationships, the structure of the data may affect the conclusions of the analysis. A relationship found in historical data can do no more than point to an association; it cannot be interpreted as causal. When the historical database is extensive, a variety of tools can be applied to understand relationships, including Principal Component Analysis (PCA) and Partial Least Squares (PLS).
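As a rough illustration of how PCA summarizes correlated historical records, the sketch below extracts the first principal component from a small table of batch data by applying power iteration to the correlation matrix. The factor names and values are hypothetical and are not taken from the article, and the function assumes complete records with no constant columns. With a real database, an established statistics package would normally be used instead of hand-written code.

```python
import math
import random

def first_principal_component(rows, iterations=200):
    """Return (unit direction, fraction of variance explained) for the
    first principal component of standardized data.

    Assumes complete numeric records and no constant columns.
    """
    n, p = len(rows), len(rows[0])
    # Standardize each column so that measurement units do not dominate.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    # Correlation matrix of the standardized columns.
    corr = [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
            for a in range(p)]
    # Power iteration: repeated multiplication converges to the eigenvector
    # with the largest eigenvalue. A seeded positive start keeps it repeatable.
    rng = random.Random(0)
    v = [rng.uniform(0.5, 1.5) for _ in range(p)]
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue; the trace of a correlation
    # matrix equals p, so eigenvalue / p is the fraction of variance explained.
    eigenvalue = sum(v[a] * sum(corr[a][b] * v[b] for b in range(p))
                     for a in range(p))
    return v, eigenvalue / p

# Hypothetical batch records: (water_amount, mixing_time, inlet_temp).
# The first two factors were historically adjusted together.
batches = [
    (1.0, 1.2, 5.0), (2.0, 1.9, 3.0), (3.0, 3.1, 2.0), (4.0, 4.2, 6.0),
    (5.0, 4.8, 6.0), (6.0, 6.1, 2.0), (7.0, 7.0, 3.0), (8.0, 7.9, 5.0),
]

direction, explained = first_principal_component(batches)
print("loadings:", [round(x, 3) for x in direction])
print("fraction of variance explained:", round(explained, 3))
```

Large loadings of similar size on two factors indicate that they moved together in the historical record. That is an association only, so the component cannot say which of the two factors, if either, drives a response.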

Q1c: What problems can arise when using historical data?

A: It is important to exercise caution when mining historical data because such data are restricted to the variation in factor levels that happened to occur, as opposed to the deliberately controlled variation imposed by a planned statistical design. Gaps in knowledge, therefore, are likely. Problems can arise due to the nature of observational data, and no analysis can make up for deficiencies in the structure and quality of the data. The following are a few critical problems that can arise when analyzing historical data.

  • Multicollinearity: In observational data in which two or more factors are highly correlated, it is difficult to identify which factor(s) are affecting the response. No analysis can remedy this problem.
  • Missing factors: Important factors may not be recorded in the data, and if the recorded data are correlated with unrecorded causal factors, a partial relationship may be mistakenly proposed. A common mistake is to attribute causality when associational relationships are all one can propose until a confirmatory experiment is carried out.
  • Missing data and imbalance: Historical data frequently suffer from a relatively large proportion of missing information. Prediction in areas where there is no information, or extrapolation to areas where no data exist, can be highly misleading.
  • Precision of the data: Over-rounded data can lead to misleading conclusions.
  • Range of factor levels: Data with a wider range may provide knowledge on a relationship. If the inputs haven't been varied across an appropriate range, the relationship won't be detected.
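Two of the points above, multicollinearity and extrapolation, lend themselves to simple screening checks before any model is fitted. The sketch below is a minimal illustration under stated assumptions: the factor names and values are hypothetical, and the variance inflation factor is computed with the two-factor formula 1/(1 - r^2), where r is the Pearson correlation between the factors. Models with more factors require a regression of each factor on the others.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def two_factor_vif(x, y):
    """Variance inflation factor for a model with exactly two factors."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r * r)

def outside_observed_range(history, setting):
    """Names of factors whose proposed value lies outside the historical
    minimum and maximum. This checks each factor separately, so it does
    not detect untested combinations of individually in-range values."""
    return [name for name, value in setting.items()
            if not (min(history[name]) <= value <= max(history[name]))]

# Hypothetical historical records (illustrative values only).
history = {
    "water_amount": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "mixing_time": [1.2, 1.9, 3.1, 4.2, 4.8, 6.1, 7.0, 7.9],
    "inlet_temp": [5.0, 3.0, 2.0, 6.0, 6.0, 2.0, 3.0, 5.0],
}

# Factors that were changed together give a VIF far above 1.
vif_confounded = two_factor_vif(history["water_amount"], history["mixing_time"])
# Uncorrelated factors give a VIF of about 1.
vif_separable = two_factor_vif(history["water_amount"], history["inlet_temp"])

# Flag a proposed setting that would require extrapolation.
proposed = {"water_amount": 9.5, "mixing_time": 5.0}
flagged = outside_observed_range(history, proposed)

print("VIF, water vs. mixing time:", round(vif_confounded, 1))
print("VIF, water vs. inlet temp:", round(vif_separable, 1))
print("factors outside observed range:", flagged)
```

A high value for the first pair means the historical record cannot separate the effects of the two factors, however the data are analyzed. A flagged factor marks a setting for which any prediction from the historical data would be an extrapolation.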

The relationships established by historical-data analysis should be confirmed by appropriate DoEs before defining or expanding a design space.

