Quality Issues for Multiregional Clinical-Trial Materials

Published in: Pharmaceutical Technology, Volume 35, Issue 10, 10-02-2011

The authors examine risk management relating to the quality issues of clinical-trial materials and discuss areas that would benefit from additional consideration and harmonization.

Conducting clinical trials is an integral part of developing new, safe, and efficacious pharmaceutical products. Exposing study volunteers to an experimental drug, particularly in the early stages of development, is challenging because of the limited information on the functional performance of the drug candidate. Regulatory guidances and formal control and approval procedures are in place to reduce risks and safety threats to study volunteers. This article discusses the management of risks in the pharmaceutical quality of clinical-trial supplies, particularly the challenges that occur for early-stage clinical-trial materials for which knowledge of the drug compound is limited. The basic dilemma, which seems difficult to fully circumvent, is that one is gaining knowledge about the importance of quality attributes through the development process, yet the application of GMPs is predicated on that knowledge. The authors examine this dilemma and its resolution in several areas: step-wise modifications of test products, the concept of "representative," genotoxic impurities, specifications and acceptance criteria, and stability studies.

Basic dilemma

The principles that ensure that modern pharmaceutical products are produced to be safe and efficacious are codified in cGMPs. In the US, these requirements are set forth under the Federal Food, Drug, and Cosmetic Act, specifically in Title 21 of the Code of Federal Regulations (CFR), Parts 210 and 211 (1–3). In Europe, cGMPs are specified in EudraLex Volume 4, and similar standards exist in most countries (4). In large part, existing regulations do not distinguish between clinical supplies and approved commercial products. One notable, but limited, exception is FDA's approach to certain Phase I clinical supplies.


In 2008, FDA finalized a rule exempting certain Phase I clinical supplies from 21 CFR Part 211 and simultaneously issued a guidance describing cGMP expectations for these clinical drug products (5). Although the guidance relaxes some expectations for clinical supplies compared with those for approved commercial products, it retains most of the quality systems and components described in 21 CFR Part 211, including those for validated test procedures, specifications with acceptance criteria, and stability studies conducted on representative samples of the clinical supplies.

This situation creates a dilemma for industry and regulators: one is still gaining knowledge about quality attributes during development, but the application of cGMPs is based on the very knowledge that is being acquired. There is more uncertainty about what is important to quality in early development than later in development, as a product design matures and moves to commercial approval. This knowledge gap means that it is impossible to implement the same level of cGMPs at Phase Ia as for a commercial drug product.

Although authorities generally do not expect the same level of cGMPs throughout development, there is little specific regulatory guidance on expectations for early development or for the various stages of development. As a result, International Conference on Harmonization (ICH) guidances on quality and other guidelines that are designed for commercial products are applied unevenly and unpredictably to early-development clinical trial materials because the consideration of quality varies from reviewer to reviewer and company to company. This inconsistency affects issues relating to specifications, shelf-life and stability, starting materials and the description of an API synthesis, and potential impurities.

Specifications. Formal specifications are required for the release of clinical-drug products and related materials, yet the complete relevant knowledge for establishing meaningful specifications will not exist in the early-development stages, including information needed for selecting tests and the associated acceptance criteria. Some examples include tests for metal catalysts used early in an API synthesis, requirements for residual solvents, and dissolution requirements for solid oral formulations.

Shelf-life and stability. Establishing an initial shelf-life for clinical supply requires extrapolation from a previous representative batch based primarily on limited accelerated testing. Adhering strictly to the ICH stability expectations for establishing and extending shelf-life would be a serious impediment to product development. Again, little guidance exists concerning a practical approach to establishing the initial shelf-life of a prototype product that is likely to undergo significant refinements before a mature product design is achieved. Although most regulators expect concurrent stability studies to be conducted on representative samples of the clinical-trial material, this testing is of limited value with respect to ensuring the quality of the batch and frequently is of little value to the developer.

Starting materials and description of the API synthesis. A commercial API synthesis is, in part, defined by the regulatory approval of designated starting materials. This designation has significant implications because it provides a detailed disclosure of the synthesis process and defines where cGMPs apply. Agreement on starting materials typically occurs before the manufacture of registration batches and is based on a detailed understanding of the factors that influence API quality, in particular the impurity profile.

At the early stages in process development of an API synthesis, there is no regulatory approved starting material designation, and there is a limited amount of information regarding the source or even the absolute identity of impurities appearing or potentially present in the final API. In addition, the route of synthesis is changing and evolving as the requirements shift from laboratory-scale quantities to a commercially viable synthesis. The net result is vague regulatory expectations regarding the level of detail and the extent of disclosure of the chemical steps leading to the final API. As a result, there are often additional requests by regulatory authorities that result in further expenditure of resources by the applicant and regulatory authority or potential delays of clinical trials.

Potential impurities. Although the concept of an actual impurity is reasonably well-defined in ICH Q3A (R2) Impurities in New Drug Substances, the concept of what constitutes a potential impurity is not (6). Decisions on designation of potential impurities, including degradation products, drive many activities and issues for the API and drug product. These decisions include the development of analytical procedures, the designation of API starting materials, the development of the API synthesis, the description of the synthesis, formulation development, and the design of stability protocols. Additionally, any process designating potential impurities must consider the level of concern. Recent emphasis on genotoxic impurities complicates the issue because one must consider whether a particular compound is potentially genotoxic and whether it is present at levels of concern, which are typically in the parts-per-million range. Industry and regulators can improve the efficiency of the pharmaceutical-development process by considering these issues and establishing related best practices to the benefit of the industry and patients.

The concept of "representative"

A key concept of the pharmaceutical industry for introducing change is an assumption that some entity is representative of future materials produced by a proposed process. For example, the registration of a commercial product includes some number of so-called registration batches of API and drug product that are intended to be equivalent in quality and representative of the future quality of the commercial product. An application for a change to an approved commercial process typically includes a collection of analytical results from the product considered to be representative of future production, together with appropriate statistical comparison to the existing product. The term representative does not equate with identical. For example, registration batches may be produced at pilot scale rather than at full production scale and still are considered to be representative of the commercial-scale process. For late-stage and commercial scenarios, ICH Q1A (R2) Stability Testing of New Drug Substances and Products defines what can be considered representative (7).

During all phases of development, this same strategy of using representative materials to justify change, justify reduced stability studies, and establish or extend shelf-life is frequently used and can be a valid and efficient development tool. As change is inherent in development, the ability to use representative materials to initially predict future performance is essential. The same criteria that define representative for a commercial process, however, are not transferable to early development. Again, the lack of regulatory guidance or established industrywide best practices relevant to development leads to unpredictable or uneven acceptance of the use of representative materials by regulatory authorities. There is a clear benefit to developing best practices and/or regulatory guidance that define the concept of representative by stage of development.

Genotoxic impurities

Compounds that have been shown to induce genetic mutations, chromosomal breaks, and/or chromosomal rearrangements are considered genotoxic and have the potential to cause cancer in humans. Regulatory authorities consider that exposures to even low levels of these impurities may be of significant concern. Current guidances that address issues related to impurities include ICH Q3A (R2) Impurities in New Drug Substances and Q3B (R2) Impurities in New Drug Products, and the guideline on the limits of genotoxic impurities (GTIs) from EMA's Committee for Medicinal Products for Human Use (6, 8, 9).

The ICH guidances define an impurity as any component of the drug substance or drug product other than the chemical entity that makes up the drug substance or an excipient in the drug product. Depending on the quantity of drug substance or drug product to which a patient is exposed, these guidances recommend thresholds for the identification, reporting, and qualification of impurities. The identification limits provided in ICH Q3A (R2) and ICH Q3B (R2), however, may not be acceptable for genotoxic or carcinogenic impurities.


The EMA guideline recommends that the level for GTIs should be "as low as reasonably practicable" (ALARP). According to this principle, manufacturers should strive to achieve the lowest levels of genotoxic or carcinogenic impurities that are technically feasible and/or levels that convey no significant cancer risk. For example, alternative synthetic routes that do not lead to genotoxic residues should be used if practical. However, according to EMA, the ALARP principle does not need to be applied to impurities that do not exceed the threshold of toxicological concern (TTC) (10).

The ICH guidances on impurities do not apply to drug substances or drug products used during clinical research. Issues regarding the presence of genotoxic or carcinogenic impurities, however, often occur during clinical development. In cases where the presence of an impurity with genotoxic or carcinogenic potential is identified or where such an impurity may be expected based on the synthetic pathway, steps should be taken during clinical development to address safety concerns associated with these impurities. The EMA guideline provides recommendations for acceptable exposure thresholds during clinical development as well as for marketing applications.

The concept of TTC for genotoxic compounds was developed for the EMA guideline and a draft FDA guideline, Genotoxic and Carcinogenic Impurities in Drug Substances and Products: Recommended Approaches (9–11). The guidelines stipulate that exposure to a GTI must be below 1.5 μg/day, which represents a 1 in 100,000 incremental lifetime cancer risk. This level does not apply to the most potent carcinogens or to GTIs for which safe limits already exist based on evidence of a threshold mechanism for genotoxicity. The TTC comes from an analysis of 730 compounds and related carcinogenicity data in the Carcinogenic Potency Database housed at the University of California, Berkeley (12). Originally proposed by regulators as a 0.15-μg limit on carcinogenic impurities in food, the value was adjusted upward for pharmaceuticals because of the benefit gained from taking medications. Even so, the 1.5 μg limit is about 1000 times lower than typical thresholds on impurities.
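To make the 1.5 μg/day figure concrete, an allowable daily intake can be converted into a concentration limit in the drug substance by dividing it by the maximum daily dose. The following is a minimal sketch of that arithmetic, assuming a simple proportional relationship; the function name and the example dose are hypothetical and are not taken from the guidelines.

```python
# Lifetime TTC cited in the text, in micrograms per day.
TTC_UG_PER_DAY = 1.5


def gti_limit_ppm(max_daily_dose_g, allowable_intake_ug_per_day=TTC_UG_PER_DAY):
    """Convert an allowable daily intake of an impurity into a ppm limit.

    Micrograms of impurity per gram of drug substance is numerically
    equal to parts per million (ppm) by mass.
    """
    if max_daily_dose_g <= 0:
        raise ValueError("maximum daily dose must be positive")
    return allowable_intake_ug_per_day / max_daily_dose_g


# Hypothetical example: a 0.5 g/day dose under the 1.5 ug/day TTC
# corresponds to a limit of 3 ppm.
limit = gti_limit_ppm(0.5)
```

The calculation shows why higher-dose products face tighter concentration limits for the same daily intake, and why analytical methods for GTIs typically need sensitivity in the parts-per-million range.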

Alternatives to the TTC approach may be used. If there are adequate safety data for a known GTI, the data can be used to set limits that may differ from the TTC. Lacking data, drug developers have the option to conduct toxicological studies with a GTI to enable an estimation of a compound-specific limit or to default to the TTC. With impurities suspected to be GTIs, drug developers may not bother to classify them depending on the expected risk, needed analytical efforts, and considerations for controlling the impurity. If no testing is done, potential GTIs must be controlled at or below the TTC. If testing finds that a potential GTI is genotoxic, either the TTC or a level based on safety data must be used. If it is not genotoxic, the impurity is handled like any impurity.
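The options in the preceding paragraph can be summarized as a small decision function. This is only an illustrative sketch of the logic as described here, not a regulatory algorithm; the enum, the function name, and the returned labels are hypothetical.

```python
from enum import Enum
from typing import Optional, Tuple


class TestStatus(Enum):
    UNTESTED = "untested"            # potential GTI, no genotoxicity testing done
    GENOTOXIC = "genotoxic"          # testing confirmed genotoxicity
    NOT_GENOTOXIC = "not_genotoxic"  # testing was negative


def control_basis(
    status: TestStatus,
    compound_specific_limit_ug_per_day: Optional[float] = None,
    ttc_ug_per_day: float = 1.5,
) -> Tuple[str, Optional[float]]:
    """Return (basis, daily-intake limit in ug/day) for an impurity.

    A limit of None means the impurity is handled like any ordinary
    impurity rather than under a genotoxicity-driven limit.
    """
    if status is TestStatus.NOT_GENOTOXIC:
        return ("ordinary impurity", None)
    if status is TestStatus.UNTESTED:
        # Without testing, the potential GTI defaults to the TTC.
        return ("TTC", ttc_ug_per_day)
    # Confirmed genotoxic: use safety data if available, else the TTC.
    if compound_specific_limit_ug_per_day is not None:
        return ("compound-specific", compound_specific_limit_ug_per_day)
    return ("TTC", ttc_ug_per_day)
```

Note that a compound-specific limit derived from safety data may be higher or lower than the TTC, which is why the function returns it unchanged rather than capping it.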

Although avoidance of GTIs as reagents, starting materials, synthetic intermediates, and byproducts in chemical processing is an important consideration, it is not always feasible or desirable. Functional groups that render starting materials and synthetic intermediates useful as reactive building blocks may also be responsible for their genotoxicity. Avoidance of mesylate or tosylate salt isolations (i.e., to avoid potential mesylate and tosylate ester GTIs) may limit opportunities for optimal purification, physical properties, stability, or bioavailability of an API. The alternative to avoiding GTIs is to assess and manage potential risk through appropriate application of chemical process design and analytical testing.

The timing for GTI assessment and testing also must be considered. During the early phases of development leading up to and including initial clinical trials, drug-candidate attrition is significant. Adding to the challenge of addressing GTIs is the fact that the chemical synthesis may be rapidly changing as it progresses toward a commercial synthetic route. Taking into account the limited availability of information on the impurity profile, especially in early development, and the likelihood that the route of synthesis may change in the course of process optimization, a staged evaluation of reasonably expected GTIs or potential genotoxic impurities (pGTIs) should be performed throughout development.

A Pharmaceutical Research and Manufacturers of America (PhRMA) white paper proposes a staged TTC approach based on clinical stage of development, dose, and duration of administration (13). The draft FDA guidance also proposes a staged TTC approach, although at stricter levels than the PhRMA white paper (11, 13). Following a staged TTC approach, drugs can contain higher levels of GTIs if they are given for shorter periods of time, with the level staged to the duration of exposure.
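A staged TTC amounts to a lookup from exposure duration to an allowable daily intake. The sketch below shows one way such a lookup could be structured. The stage table in the example is a placeholder invented purely for illustration; it does not reproduce the values in the PhRMA white paper or the draft FDA guidance, which should be consulted directly. Only the 1.5 μg/day lifetime value comes from the text.

```python
from typing import Sequence, Tuple


def staged_limit_ug_per_day(
    duration_days: float,
    stages: Sequence[Tuple[float, float]],
    lifetime_ttc_ug_per_day: float = 1.5,
) -> float:
    """Look up the allowable daily intake for a given exposure duration.

    `stages` is a sequence of (max_duration_days, limit_ug_per_day)
    pairs sorted by increasing duration. Durations longer than the
    last stage fall back to the lifetime TTC.
    """
    if duration_days <= 0:
        raise ValueError("duration must be positive")
    for max_days, limit in stages:
        if duration_days <= max_days:
            return limit
    return lifetime_ttc_ug_per_day


# PLACEHOLDER values, hypothetical and for illustration only.
EXAMPLE_STAGES = [(30, 100.0), (180, 20.0), (365, 10.0)]

short_trial_limit = staged_limit_ug_per_day(14, EXAMPLE_STAGES)
chronic_limit = staged_limit_ug_per_day(730, EXAMPLE_STAGES)
```

Passing the stage table in as an argument keeps the lookup independent of any one proposal, so the same function can be evaluated against different staged schemes.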

For pharmaceutical companies, the staged TTC greatly eases the burden of characterizing and controlling impurities during the drug-development process. In early-stage work, impurity information is limited, and analytical methods are undeveloped. As candidates advance and synthetic processes are optimized, impurities are routinely assessed, and plans are made for avoiding or controlling them. A risk assessment of the synthetic process, including starting materials, intermediates, solvents, byproducts, and impurities, identifies GTIs that are or might be present.

Regulatory agencies may accept in silico structure-activity relationship (SAR) methods instead of laboratory tests to conclude that the impurity is not genotoxic even if it has an alerting structure. Using in silico evaluations and expert opinion, GTI alert structures are identified among the compounds for which no data are available. Examples of commercial software applications used for this purpose include Casetox (MultiCASE) and Derek (Network Sciences Corp.) (14, 15). Certain chemical functional groups or structures associated with DNA reactivity are considered alerts for genotoxicity. If an impurity has a structural alert, a bacterial mutagenesis screen, such as the Ames test, can be run to confirm its genotoxicity. A negative Ames test result will overrule a structural alert, and the impurity can be considered nongenotoxic.
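The screening sequence described above, a structural alert followed by an optional Ames test, can be sketched as a simple classification function. The function name and labels are hypothetical. The treatment of an Ames result obtained without a structural alert is an assumption (the experimental result is taken to prevail), since the text does not address that case.

```python
from typing import Optional


def classify_impurity(has_structural_alert: bool,
                      ames_positive: Optional[bool] = None) -> str:
    """Classify an impurity from in silico and Ames screening results.

    ames_positive is None when no Ames test has been run.
    """
    if ames_positive is not None:
        # Experimental data decide; a negative Ames result overrules
        # a structural alert. (Assumption: a positive result also
        # prevails when no alert was flagged.)
        return "genotoxic" if ames_positive else "nongenotoxic"
    if has_structural_alert:
        # Alert without confirmatory testing: potential GTI (pGTI).
        return "potential genotoxic (pGTI)"
    return "nongenotoxic"
```

For instance, `classify_impurity(True, ames_positive=False)` yields "nongenotoxic", reflecting the statement that a negative Ames result overrules the alert.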

Although seemingly clear-cut, pGTI evaluations have become a significant issue because there is no standard, agreed-upon process for performing an SAR evaluation. Some European regulatory agencies do not use any in silico methods because the systems are too costly to run and maintain. These agencies rely instead on simple structural alerts. On the other hand, FDA has multiple systems, some of which are proprietary and inaccessible to drug developers.

Although this staged approach provides significant clarity to controlling pGTIs, there are still several areas where further guidance would be beneficial. These areas are the use of scientific justification in lieu of actual testing for pGTIs, the scope of search for pGTIs in the synthetic scheme (i.e., raw materials, intermediates, byproducts, the synthetic steps before the final drug substance), testing methodology, and level of validation of these methods.

An online PhRMA survey was created to serve as a benchmark of industry standards and practices regarding genotoxic impurities (16, 17). The survey was sent to 22 different pharmaceutical companies with 15 companies responding to the survey. Some key results included:

  • 85% of respondents said they evaluate the synthetic process for pGTIs at the preclinical stage.

  • 74% of respondents develop analytical test methods for all identified pGTIs.

  • 82% of respondents stated they control pGTIs in nongenotoxic oncology drugs.

  • 96% of respondents that follow a risk-based assessment for deciding not to test for a pGTI consider the number of steps back from the final API where the pGTI originates.

  • 91% of respondents used the staged TTC approach from the PhRMA white paper for setting specifications for pGTIs during clinical development.

The most common rationale for not testing for a pGTI was consideration of where the pGTI was introduced into the synthesis and whether the pGTI is reactive enough to be eliminated in downstream chemistry or processing (acyl chlorides are one example). Differences in approaches to controlling pGTIs exist in several areas: the point identified in the synthetic process to begin monitoring for pGTIs; the use of limit tests versus quantitative reporting of pGTIs; validation of methods to control pGTIs; and, in cases where more than one pGTI is possible, the use of individual or collective limits. In addition, because of time constraints, when a pGTI has been identified, analytical methods may have to be developed and controls implemented before confirmation through mutagenicity testing.

Assessment and control of pGTIs in drug development is challenging because of the evolving nature of the synthetic process, variable points of entry of pGTIs in the process, and the need for analytical measurements with adequate selectivity and sensitivity. When applying the staged TTC approach, consideration should be given to the drug product's clinical-development stage, the maximum duration of drug administration at that stage, the proposed indication (e.g., a life-threatening condition versus a less serious condition), the patient population (e.g., adults versus children), and the structural similarity of an impurity to a compound of known carcinogenic potency.

Establishing specifications

The quality of a commercial pharmaceutical product is set by a well-designed, understood, and executed manufacturing process using high-quality raw materials. This built-in quality is verified at release and on stability by random sampling and testing of critical product characteristics against established specifications that contain specific acceptance criteria for each characteristic. A natural aim is to control the quality of an investigational pharmaceutical product in a corresponding way, but this quality control is more challenging for several reasons.

In-vivo/in-vitro correlation. Ideally, one would like to know how a change in an in vitro characteristic influences a desirable or undesirable response in vivo. Using such knowledge, together with clinical judgment about where to draw the line between acceptable and unacceptable in vivo responses, it is fairly straightforward to develop appropriate control limits for in vitro characteristics. Unfortunately, these relations often are not fully understood even for products on the market, and the likelihood of being able to develop specifications using this approach is even lower earlier in development.

Lack of information. For a product that is close to registration, several batches already have been studied in Phase I–III development. The recorded release and stability in vitro characteristics of successful earlier clinical batches, together with batches manufactured under similar conditions as the clinical batches, can be used as a basis for statistically established limits for different critical quality attributes (CQAs). However, the earlier in the development process, the more limited this pool of historical knowledge becomes. Some CQAs might not even have been recorded for early batches, either because an attribute was not recognized as critical or because no appropriate analytical methodology was available to evaluate it.

Change to product. A basis for developing specifications based on previous knowledge is that earlier batches are representative of recently manufactured batches or, at a minimum, that it is understood how these batches relate to each other. During development, changes to raw materials, excipients, scale, packaging, and formulation occur frequently. Some data from previous batches, therefore, may be irrelevant or difficult to interpret.

Method development. Characterizing a batch of product requires sophisticated analytical methods, which take time to develop, optimize, and validate. Batches used during development often are characterized using different methods of increasing complexity and performance. Moreover, characterization of development batches and clinical-trial material may be done using different equipment in different laboratories. As a result, in vitro data for earlier batches may not be fully comparable to the corresponding data for later batches.

Subjectivity of regulatory authorities in different regions. Considering all the issues associated with developing specifications during the early stages of development, it often comes down to collecting all data and scientific information available, judging the value of different pieces against each other, and making the best overall subjective judgment, taking all relevant regulatory guidance into account. The fact that a subjective element is involved can cause issues for releasing clinical-trial material in different regions because the guidelines and regulators' judgment may differ from the sponsor's. As a consequence, the tightest release specification over all regions often must be used. This need to simultaneously satisfy requirements in several regions in some cases leads to unreasonable situations, especially when the requirements are incompatible.

Stability studies

Clinical-trial materials are by their nature experimental items that undergo evolution as the potential drug candidate progresses through clinical development. Because they are experimental, there is a certain amount of risk originating from the relative lack of information regarding functional performance and how critical dosage-form attributes might be changing over time.

From the standpoint of the formulation scientist, two principal risk-management goals need to be achieved during a clinical trial. First, the safety of participating study volunteers must be ensured by minimizing controllable safety threats. Second, if an investigational compound fails to meet clinical-study objectives, whether for efficacy or toxicological reasons, the failure should result from an intrinsic property of the compound itself and not from a failure in performance of the dosage form.

The risk-management exercise is driven by the general interest in moving to human trials as soon as possible to validate a proof of concept for pharmacological activity and determine if a particular compound is worthy of further consideration. As a consequence, early-development timelines typically are short, meaning that only a limited amount of information is available on dosage-form performance as a function of time and storage conditions. The uncertainty from the lack of stability data is partially balanced by the fact that early clinical trials are short in duration, involve a small number of subjects, and have a high level of control over the storage and handling of the clinical-trial materials. As illustrated in Table I, when a compound moves to the next stages of clinical development, the amount of data collected on performance increases, thereby resulting in reduced uncertainty in some respects. In parallel, the length of the trial and the number of subjects involved increase while control over the test articles diminishes, thereby somewhat offsetting the benefit of the larger data set.

Table I. Comparison of various features: clinical investigation at different stages and extent of control over distributed trial supplies.

The statistical tools and evaluation criteria for stability data collected at the end stages of development, especially in support of marketing authorizations, are well developed. They primarily consist of linear least-squares techniques with hypothesis testing on slopes to determine whether a significant change over time is occurring for critical parameters. Poolability of slopes and intercepts is used to determine if product characteristics are reproducible between batches.
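As a minimal sketch of the slope test mentioned above, the following fits an ordinary least-squares line to a single batch's stability data and returns the t-statistic for the slope, which can be compared against a critical value from a t-table with n − 2 degrees of freedom. The function name and the example data are hypothetical, and the sketch does not cover poolability testing across batches.

```python
import math


def slope_t_statistic(times, values):
    """Fit values = intercept + slope * time by least squares.

    Returns (slope, intercept, t_statistic), where t_statistic tests
    the null hypothesis that the slope is zero (n - 2 degrees of freedom).
    """
    n = len(times)
    if n != len(values) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    # Residual sum of squares and standard error of the slope.
    sse = sum((v - (intercept + slope * t)) ** 2 for t, v in zip(times, values))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    if se_slope == 0:
        t_stat = 0.0 if slope == 0 else math.copysign(math.inf, slope)
    else:
        t_stat = slope / se_slope
    return slope, intercept, t_stat


# Hypothetical assay results (% of label claim) at 0 to 12 months.
months = [0, 3, 6, 9, 12]
assay = [100.0, 99.5, 99.1, 98.4, 98.0]
slope, intercept, t_stat = slope_t_statistic(months, assay)

# Two-sided 5% critical value for 3 degrees of freedom (from a t-table).
T_CRIT = 3.182
significant_change = abs(t_stat) > T_CRIT
```

With only a handful of time points, as is typical in early development, the degrees of freedom are small and the critical value is large, which illustrates why such tests have limited power at that stage.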

These tools are less useful at the earlier stages in development because of the limited duration of available stability data. Formulations typically are undergoing rapid evolution and are produced in very limited quantities, so homogeneity across batches cannot be ensured. Opportunities to use neural networks in cases such as these have been explored but have not been widely implemented (18). Other multivariate techniques, such as cluster analysis or pattern-recognition methods, which capitalize on previous work with related compounds or formulations would be worthy of further exploration.

Conclusion


There is considerable opportunity to improve drug development without compromising the strong safeguards that mitigate risk to subjects in clinical trials, including in the areas discussed here that would benefit from additional consideration and harmonization. Although broad regulatory requirements exist for materials going into clinical trials, more defined best practices that reflect the stage of development and risk-management practices would be a benefit to industry, regulators, and patients. The vetting of such best practices should be through an open process that involves these same stakeholders. To be most effective, best practices should be sufficiently detailed yet flexible and subject to continuous improvement. The current philosophy of regulatory agencies has been to set a general framework for regulatory expectations through formal guidance documents but not to supply the level of detail implied herein. As a result, the prevailing belief is that drug developers, either individually or collectively, should justify approaches that satisfy these general expectations. Further clarification through industrywide best practices, however, would be tremendously beneficial. The vehicle for establishing these best practices needs to be carefully considered and can include pharmacopeial standards, publications by industry consortia, or other standard-setting bodies.

Dennis Sandell* is director of S5 Consulting, Järnåkravägen 3, SE-222 25, Lund, Sweden, tel. +46 46 150703, dennis@s5consulting.com. Terrence Tougas is a highly distinguished research fellow, Dennis O'Connor is clinical supplies officer, and Steve Horhota is a highly distinguished research fellow, all with Boehringer Ingelheim.

*To whom correspondence should be directed.

References


1. Federal Food, Drug, and Cosmetic Act, 1938.

2. Code of Federal Regulations, Title 21, Food and Drugs (Government Printing Office, Washington, DC), Part 210, "Current Good Manufacturing Practice in Manufacturing, Processing, Packing, or Holding of Drugs; General."

3. Code of Federal Regulations, Title 21, Food and Drugs (Government Printing Office, Washington, DC), Part 211, "Current Good Manufacturing Practice for Finished Pharmaceuticals."

4. EudraLex, Rules Governing Medicinal Products in the European Union, Volume 4 (Brussels), "EU Guidelines to Good Manufacturing Practice Medicinal Products for Human and Veterinary Use."

5. FDA, Draft Guidance for Industry: CGMP for Phase I Investigational Drugs (Rockville, MD, July 2008).

6. ICH, Q3A (R2) Impurities in New Drug Substances (2006).

7. ICH, Q1A (R2) Stability Testing of New Drug Substances and Products (2003).

8. ICH, Q3B (R2) Impurities in New Drug Products, Step 4 version (2006).

9. EMA, Guideline on the Limits of Genotoxic Impurities (London, June 2006).

10. EMA, Questions & Answers on the CHMP Guideline on the Limits of Genotoxic Impurities (London, Sept. 2010).

11. FDA, Draft Guidance for Industry: Genotoxic and Carcinogenic Impurities in Drug Substances and Products: Recommended Approaches (Rockville, MD, Dec. 2008).

12. University of California, Berkeley, The Carcinogenic Potency Database (2010).

13. L. Muller et al., Regul. Toxicol. Pharmacol. 44 (3), 198–211 (2006).

14. MultiCASE, Multicase Inc. Bioactivity Software, www.multicase.com, accessed May 20, 2010.

15. Network Sciences Corp., DEREK, NetSci: Software for Computer Assisted Molecular (Drug) Design, www.netsci.org, accessed May 20, 2010.

16. PhRMA, Industry Survey on Genotoxic Impurities (Washington, DC, 2008).

17. S. Colgan et al., Regulatory Rapporteur 7 (5), 23–29 (2010).

18. S. Ibric et al., J. Pharm. Pharmacol. 59 (5), 745–750 (2007).