Predicting long-term storage stability of a given protein and formulation is desirable for effective screening and optimization early in the development process. Because proteins can aggregate by multiple routes during storage, multiple types of measurement should be made to probe different aspects of protein behavior.
Therapeutic proteins are a powerful and versatile way to treat a wide range of diseases. Commercial products such as Remicade (infliximab), Avastin (bevacizumab), and Humira (adalimumab) treat a wide range of indications, and a significant proportion of drugs currently under development are proteins. When compared to conventional small-molecule drugs, however, the delicate nature of proteins makes them difficult to use as commercial pharmaceutical products. To be a successful pharmaceutical, efficacy alone is not sufficient; the medicine must also be robust enough to retain this efficacy and remain safe for patients during extended periods of storage over multiple years.
The challenge to produce protein medicines that are stable over extended periods has further increased in recent years due to pressure to move away from lyophilized formulations, which need reconstituting by the patient, toward more convenient liquid formulations, which can be directly injected. In addition, an increasingly competitive commercial landscape means that there are considerable pressures to bring products to market faster and for these products to have longer shelf lives.
Aggregation as a degradation route
Proteins in solution can degrade by means of several mechanisms during extended storage, and a common degradation route is aggregation of the protein over time. Recently, this route has been a particular focus for biopharmaceutical developers and regulatory authorities due to increasing concerns that the presence of even relatively small amounts of aggregated protein may lead to unwanted immune responses in patients, thus potentially resulting in adverse reactions or loss of efficacy over time.
Biopharmaceutical developers have two main strategies at their disposal to create liquid biopharmaceutical formulations with long shelf lives and a resistance to the formation of aggregates: engineering of molecules for aggregation resistance and the inclusion of additives that inhibit the formation of aggregates in the solution (i.e., formulation). There are, however, many possible ways to engineer a protein, as well as many different additives and combinations of additives that could potentially be used. How should a development scientist determine which of these many options will be the most effective at ensuring aggregation resistance during extended storage? One option is to try a range of protein engineering and formulation options, store them in the refrigerator, and come back in three years to measure the level of aggregates formed. Three years, however, is a long time to wait, especially if none of the protein engineering or formulations provide adequate aggregation resistance. What is needed is a rapid means of predicting at the start of the study, with a reasonable level of confidence, which molecules and formulations will resist the formation of aggregates. Such a method could be used by protein engineers and formulation scientists early in the development process to screen many molecules and formulations to identify the best options and ensure those taken forward will ultimately be suitable for use in a commercially viable product. Real-time storage-stability studies of the selected molecules and formulations will still need to be performed to satisfy the regulators, such as FDA and the European Medicines Agency, but early use of a predictive screening tool has the potential to dramatically reduce the risk of such studies failing.
Biopharmaceutical developers have historically performed a range of activities to try to predict and optimize the long-term aggregation stability of therapeutic proteins early in development. The scope and reliability of these efforts, however, have been restricted somewhat by the lack of suitable technologies and a limited understanding of aggregation processes. For example, methods such as differential scanning calorimetry (DSC), traditionally used to experimentally screen the stability of proteins in a range of buffer compositions, use a relatively large amount of often scarce protein sample, which limits the number of conditions that can be tested. Additionally, DSC only probes one of a number of possible routes to aggregate formation and therefore may not be predictive if an alternative mechanism dominates.
Predicting aggregation during long-term storage
In an ideal world, a computational tool would be available to predict which molecules and formulations will have optimal stability and how they will behave over extended periods of time. A range of computational tools that aim to help predict the aggregation propensity of proteins is available, and these are undoubtedly useful. Therapeutic proteins, however, are typically large and complex molecules, and the industry’s knowledge and understanding of aggregation mechanisms, of the effect of the solvent environment on these molecules, and of the exact mechanism of action of many excipients remain incomplete. This incomplete knowledge currently limits the practical use of such computational tools.
Empirical methods that can be applied early in the development process and that will provide the information necessary to make predictions about long-term storage stability must, therefore, be used. A fundamental question is: what experimental measurements of candidate molecules and formulations will, for example, predict the level of aggregation in solution after multiple years stored at 4 °C? Unfortunately for the biopharmaceutical development scientist, there is currently not a single, unambiguous answer to that question. There are, however, a number of attempts to tackle the problem reported in the scientific literature.
A comprehensive article by Weiss et al. from the University of Delaware reviews the state of the art in 2008 and reveals that the science of predicting the long-term storage stability of proteins was something of a work in progress at the time (1). Despite some progress in the intervening years, it remains so today. A number of more recent publications, however, have tackled this difficult question and proposed some answers. This article presents a short review of some of the most notable proposed approaches to predicting long-term aggregation propensity of protein solutions.
Aggregate formation route determines predictive measurements
There are a number of routes by which monomer proteins in solution can come together to form aggregates and, to make accurate predictions of long-term aggregation behavior, measurements must be found that probe in some way the crucial steps along this pathway to aggregation. Two pathways by which protein molecules can aggregate are illustrated schematically in Figure 1. In the first pathway, some of the proteins in solution unfold either partially or fully so that aggregation-competent regions, such as hydrophobic residues, are exposed and cause the proteins to stick together; this is described as “non-native aggregation.” In the second pathway, the protein molecules retain their correctly folded, native conformation, but have aggregation-competent regions on their surface, such as localized charged regions or hydrophobic patches, that cause the proteins to stick together and aggregate. The aggregation rate-limiting steps for these two pathways are quite different, and, therefore, different experimentally measurable parameters may be potentially useful, depending on which pathway is dominant for a particular molecule or formulation.
Lack of conformational stability in aggregation
First consider the case of non-native aggregation processes, in which the conformational stability of the protein may play a role. One of the most widely used, experimentally determined parameters for screening molecules and formulations for stability is the temperature at which the protein is observed to unfold, which is the protein melting temperature (Tm). Tm may be determined experimentally by applying a temperature ramp to the protein in the solvent of interest and identifying the temperature at which the protein unfolds using either calorimetric methods, such as DSC, or spectroscopic methods, such as intrinsic protein fluorescence, as illustrated in Figure 2.
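As a rough illustration of how a Tm value might be pulled from thermal-ramp data, the sketch below takes the temperature at which the measured signal changes fastest. This steepest-slope estimate is only one simple option and assumes a single, sigmoidal unfolding transition; the function name and the synthetic curve are hypothetical and are not taken from any of the studies cited here.

```python
import math

def estimate_tm(temps_c, signal):
    """Return the temperature (deg C) at which the signal changes fastest.

    Uses central differences on interior points. Assumes temps_c is
    strictly increasing and the curve has one sigmoidal transition.
    """
    if len(temps_c) != len(signal) or len(temps_c) < 3:
        raise ValueError("need at least three paired points")
    best_t, best_slope = None, -1.0
    for i in range(1, len(temps_c) - 1):
        slope = (signal[i + 1] - signal[i - 1]) / (temps_c[i + 1] - temps_c[i - 1])
        if abs(slope) > best_slope:
            best_t, best_slope = temps_c[i], abs(slope)
    return best_t

# Synthetic two-state unfolding curve (illustrative only):
# midpoint set at 70 deg C, transition width 2 deg C.
temps = [float(t) for t in range(40, 96)]
signal = [1.0 / (1.0 + math.exp(-(t - 70.0) / 2.0)) for t in temps]
tm = estimate_tm(temps, signal)
```

With real fluorescence or DSC traces, noise would normally call for smoothing or a fitted unfolding model rather than a raw finite difference.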
Tm has been used as a crucial stability-indicating metric for candidate screening and formulation development by the majority of large biopharmaceutical developers for a number of years, and many reports of its application in high-throughput formulation screening have been presented (1-9). Conceptually, a higher Tm will mean that, in a refrigerator at 4 °C, there will be statistically fewer unfolded protein molecules in solution and, therefore, less chance of the formation of non-native aggregates over time. As a means of predicting long-term storage stability, this may be useful in cases for which the Tm of the protein is particularly low, but for comparing proteins or formulations for which the Tm is relatively high, the number of unfolded proteins predicted by this parameter will be very low indeed. An interesting illustration of both the potential and the limitations of the predictive power of Tm was reported by Goldberg et al. from MedImmune, who observed a correlation between Tm and aggregation stability during storage at 40 °C for two monoclonal antibodies, but not for a third (2). The stability of the third antibody appeared to be predicted by its rate of aggregation when heated to 70 °C. Tm is, therefore, a useful prescreening tool to identify particularly conformationally stable or unstable molecules or formulations, but may not, on its own, be predictive of long-term storage stability for all samples. It is worth noting, however, that Tm may be useful in understanding other aspects of protein stability relevant to its use as a biopharmaceutical, such as how the protein will behave at temperatures higher than 4 °C, due to unplanned temperature excursions during storage, the higher temperatures experienced during manufacture, and the elevated temperatures in vivo.
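The conceptual link between Tm and the unfolded population at 4 °C can be made concrete with a simple two-state equilibrium estimate. The sketch below is an illustration under strong assumptions, not a method from the cited work: it treats unfolding as a reversible two-state process with a temperature-independent enthalpy (the heat-capacity change on unfolding is ignored), and the enthalpy value is hypothetical.

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def fraction_unfolded(temp_c, tm_c, delta_h):
    """Two-state fraction unfolded at temp_c for a protein with midpoint tm_c.

    Assumes a temperature-independent unfolding enthalpy delta_h (J/mol),
    so the heat-capacity change on unfolding is ignored.
    """
    t = temp_c + 273.15
    tm = tm_c + 273.15
    delta_g = delta_h * (1.0 - t / tm)       # free energy of unfolding, J/mol
    k_unfold = math.exp(-delta_g / (R * t))  # equilibrium constant [U]/[N]
    return k_unfold / (1.0 + k_unfold)

ASSUMED_DELTA_H = 400_000.0  # J/mol, hypothetical value for illustration only

# Predicted unfolded fraction at 4 deg C for two hypothetical Tm values.
f_60 = fraction_unfolded(4.0, 60.0, ASSUMED_DELTA_H)
f_70 = fraction_unfolded(4.0, 70.0, ASSUMED_DELTA_H)
```

Under these assumptions both fractions are vanishingly small, which is consistent with the point that Tm discriminates poorly between samples whose Tm is already high.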
Table 1: Measurements with the potential to predict long-term storage stability.

Probes of protein conformational stability
Type of measurement | Parameter | Technique
Equilibrium thermal unfolding | Thermal unfolding midpoint, Tm | Fluorescence, differential scanning calorimetry
Equilibrium denaturant unfolding | Denaturant unfolding midpoint, D1/2 | Circular dichroism (CD)
Time-dependent thermal unfolding | Thermal unfolding rate(s), KT unfolding |
Time-dependent denaturant unfolding | Denaturant unfolding rates, KD unfolding |

Probes of protein colloidal stability
Type of measurement | Parameter | Technique
 | Second virial coefficient, A2 | Static light scattering, self-interaction chromatography
Protein solubility by precipitation | Precipitation midpoint, [precipitant] | Ammonium sulfate or polyethylene-glycol precipitation with static light scattering, turbidity
Protein diffusion interaction | Diffusion interaction parameter, kD | Dynamic light scattering

Probes of combined colloidal and conformational stability
Type of measurement | Parameter | Technique
Thermal scanning aggregation | Aggregation onset temperature (temperature at which aggregation rate exceeds a certain value) | Static light scattering, size-exclusion chromatography–high-performance liquid chromatography (SEC–HPLC)
Time-dependent, thermally induced aggregation | | Static light scattering, turbidity, SEC–HPLC
High-temperature aggregation kinetics as a predictive tool
An alternative or additional approach to the use of Tm as a predictor of storage stability (summarized in Table 1) is to observe the kinetics of aggregation when the sample is held at a temperature that accelerates the formation of aggregates. Such measurements essentially probe a combination of the time-dependent rates of unfolding and the aggregation of these proteins. Christopher Roberts and his group at the University of Delaware have demonstrated the value of measuring aggregation rates at elevated temperatures for rapid formulation and candidate screening (1, 10-12), and these authors emphasize that understanding the rate-limiting step in the non-native aggregation pathway is important to predicting behavior at lower temperatures and longer times. One approach to identifying the aggregation rate-limiting step is to try to fit various mechanistic aggregation models to accelerated (elevated temperature) aggregation data to identify the model that fits best. Once this best fit has been established, the selected model may be used to predict behavior at lower temperatures and longer times. Kayser et al. have also demonstrated the application of this approach (13).
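To show the general shape of a fit-then-extrapolate workflow, the sketch below fits the simplest possible temperature model, a plain Arrhenius relation, to aggregation rate constants measured at elevated temperatures and extrapolates to 5 °C. This is a deliberately simplified stand-in and not one of the mechanistic models used by the cited authors; protein aggregation is often not Arrhenius over wide temperature ranges, which is exactly why identifying the rate-limiting step matters. All numerical values are hypothetical.

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def fit_arrhenius(temps_c, rates):
    """Least-squares fit of ln(k) = ln(A) - Ea/(R*T); returns (ln_A, Ea in J/mol)."""
    xs = [1.0 / (t + 273.15) for t in temps_c]
    ys = [math.log(k) for k in rates]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, -slope * R

def predict_rate(ln_a, ea, temp_c):
    """Rate constant predicted by the fitted Arrhenius relation at temp_c."""
    return math.exp(ln_a - ea / (R * (temp_c + 273.15)))

# Hypothetical accelerated data generated from assumed parameters
# (Ea = 100 kJ/mol, ln A = 30); real data would come from SEC or light scattering.
true_ln_a, true_ea = 30.0, 100_000.0
temps = [40.0, 50.0, 60.0]
rates = [math.exp(true_ln_a - true_ea / (R * (t + 273.15))) for t in temps]

ln_a, ea = fit_arrhenius(temps, rates)
k_5c = predict_rate(ln_a, ea, 5.0)  # extrapolated rate at refrigerator temperature
```

In practice, a poor fit or curvature in the plot of ln(k) against 1/T would be a warning that a different step limits the rate at low temperature and that the extrapolation should not be trusted.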
The Delaware group of Roberts et al. has also described a practical method that predicts the aggregation behavior of an immunoglobulin G (IgG) molecule of interest (11, 12). A thermal ramp was applied to the IgG in a range of solution conditions, and the amount of aggregation was monitored as a function of temperature. The authors determined that the temperature at which the rate of increase in aggregate content exceeded a certain set value (obtained in an experiment taking approximately an hour) was a useful predictor of aggregation behavior of the same sample when stored at 40 °C for 40 days. Experimentally, there are a number of possible routes to obtaining this kind of data in a high-throughput screening environment, with optical methods being particularly well suited to this type of application. Figure 3 shows typical data obtained using intrinsic protein fluorescence to monitor time-dependent unfolding and static light scattering to monitor the corresponding rate of aggregation.
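An onset temperature of this kind could be extracted from thermal-ramp data along the lines sketched below. The sketch assumes a constant ramp rate, so that the slope of the aggregation signal per degree stands in for the rate of increase in aggregate content; the threshold value, function name, and synthetic trace are hypothetical, since the set value used in the cited work is not given here.

```python
def aggregation_onset(temps_c, agg_signal, rate_threshold):
    """Return the first temperature at which the signal slope exceeds the threshold.

    The slope is the change in signal per degree between consecutive points.
    Returns the upper temperature of the first interval whose slope exceeds
    rate_threshold, or None if the threshold is never exceeded.
    """
    for i in range(len(temps_c) - 1):
        slope = (agg_signal[i + 1] - agg_signal[i]) / (temps_c[i + 1] - temps_c[i])
        if slope > rate_threshold:
            return temps_c[i + 1]
    return None

# Synthetic light-scattering trace (illustrative only): flat up to 60 deg C,
# then rising quadratically as aggregates form.
temps = [float(t) for t in range(40, 81)]
signal = [0.0 if t <= 60.0 else 0.01 * (t - 60.0) ** 2 for t in temps]

onset = aggregation_onset(temps, signal, rate_threshold=0.1)
```

Comparing onset temperatures across formulations measured with the same ramp rate and threshold would then give a relative ranking, not an absolute shelf-life prediction.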
Protein-protein interactions as a predictive tool
The experimental approaches described so far largely probe non-native aggregation mechanisms arising due to attractive forces between fully or partially unfolded protein molecules in solution. As discussed earlier, attractive interactions between native proteins in solution can also potentially lead to the formation of aggregates, and it is therefore interesting to measure the strength and nature (attractive or repulsive) of these interactions for candidate proteins or formulations. The resistance to aggregation due to native protein-protein interactions in solution is often referred to as the “colloidal stability” of the protein. A number of experimental methods are available to determine this stability, including self-interaction chromatography and dynamic light scattering. Static light scattering arguably provides the most accessible and developed method for measuring protein-protein interactions in solution and requires only the protein concentration-dependent light-scattering intensity from the protein of interest in the solution of interest. By combining these data with suitable physical and instrumental constants, one can generate a graph, known as a Debye plot, from which a value called the second virial coefficient (also known as A2 or B22) can be extracted (14). The sign of this value indicates whether the protein-protein interactions are attractive (a negative value) or repulsive (a positive value), while the magnitude of the value indicates the strength of the interaction. Generally, one might expect that molecules or formulations with net repulsive protein-protein interactions will be more resistant to aggregation. A number of studies have integrated use of the second virial coefficient, or its equivalent, into their formulation screening approaches (7, 14-16) and, like Tm, it has proven to have a useful predictive value for some proteins, but not for all.
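In its usual dilute-solution form, the Debye plot is a straight line, Kc/R = 1/Mw + 2·A2·c, where c is the protein concentration, R is the excess Rayleigh ratio, K is the optical constant, and Mw is the molecular weight. The sketch below fits that line by least squares to recover A2 from the slope and Mw from the intercept. The function name and the data are hypothetical and were generated from assumed values rather than measured.

```python
def fit_debye_plot(conc_g_per_ml, kc_over_r):
    """Fit Kc/R = 1/Mw + 2*A2*c by least squares; returns (Mw, A2).

    With c in g/mL and Kc/R in mol/g, Mw is in g/mol and A2 in mol*mL/g^2.
    """
    n = len(conc_g_per_ml)
    mean_c = sum(conc_g_per_ml) / n
    mean_y = sum(kc_over_r) / n
    scc = sum((c - mean_c) ** 2 for c in conc_g_per_ml)
    scy = sum((c - mean_c) * (y - mean_y) for c, y in zip(conc_g_per_ml, kc_over_r))
    slope = scy / scc
    intercept = mean_y - slope * mean_c
    return 1.0 / intercept, slope / 2.0

# Hypothetical dilution series generated from assumed values:
# Mw = 150,000 g/mol (antibody-sized) and A2 = 1e-4 mol*mL/g^2.
assumed_mw, assumed_a2 = 150_000.0, 1.0e-4
conc = [0.001, 0.002, 0.004, 0.008]  # g/mL
kc_r = [1.0 / assumed_mw + 2.0 * assumed_a2 * c for c in conc]

mw, a2 = fit_debye_plot(conc, kc_r)
interaction = "repulsive" if a2 > 0 else "attractive"
```

The linear form holds only at low concentrations; curvature in the plot indicates that higher-order terms are no longer negligible.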
It is also worth noting that the aggregation rates of unfolded monomer proteins will be determined, in part, by the nature of interactions between these species. For example, attractive hydrophobic forces caused by exposure of the hydrophobic residues could potentially be overcome by repulsive electrostatic forces if all the molecules carry a significant net charge.
Protein solubility might hold the key
An alternative approach to investigating the effect of native protein-protein interactions is to determine the solubility of the candidate protein in the solution of interest. An interesting study was recently presented by Banks et al. from Amgen, which experimented with two strategies to formulate an IgG (17). One approach sought to stabilize the conformation of the protein, and the other sought to improve its colloidal stability. The effects of the formulations on conformational stability were evaluated with DSC, and the effect on protein solubility was investigated using ammonium sulfate precipitation. Importantly, the formulations were also stored at 4 °C for nearly a year, and the rate of aggregate formation was measured to directly assess the effectiveness of the formulation in preventing aggregation during long-term storage. The key result of this study was that, for the molecules and formulations studied, the effect of the formulation on the solubility of the protein, rather than the effect on the conformational stability, was key to improving the long-term aggregation resistance of the protein at 4 °C.
Complications at high protein concentrations
To be delivered subcutaneously, many therapeutic proteins need to be supplied at very high concentrations, often 100 mg/mL or more. At these high concentrations, the protein molecules are very close to one another, and the nature of the dominant forces acting between them can change from long-range forces to shorter-range forces. This change has the potential to further complicate the process of predicting shelf life because, in some instances, there may be different processes limiting the rate of aggregation for dilute and concentrated solutions.
An interesting example of this phenomenon was reported by Kumar and colleagues from Abbott, who studied two molecules, a monoclonal antibody (IgG1) and a dual variable domain antibody (DVD-Ig), and obtained a range of analytical data at low and high (up to 150 mg/mL) concentrations (18). The data were correlated with the rate of aggregate formation when the samples were stored at 5 °C for an extended period. The results suggested that, for the monoclonal antibody studied, the second virial coefficient (A2) was a good predictor of aggregation behavior for both the low and high concentration samples. However, for the DVD-Ig molecule, although the second virial coefficient was a good predictor at low protein concentrations, it failed at high concentrations, for which the thermal conformational stability was found to be a better predictor. The authors rationalized this as being due to the interactions leading to aggregation of the IgG at both high and low concentrations and of the DVD-Ig at low concentration being electrostatic and, therefore, relatively long-range in nature. In contrast, at high concentrations of DVD-Ig, short-range, attractive hydrophobic forces became the dominant interactions, leading to aggregation during quiescent storage. The authors hypothesized that, in conditions where the DVD-Ig molecule had low thermal conformational stability, there may be a greater chance of the normally buried hydrophobic residues becoming more exposed. This would enhance attractive hydrophobic interactions and explain the apparent correlation between Tm and aggregation at high concentrations.
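One simple way to ask which rapid measurement tracks storage behavior for a given molecule is a rank correlation between each candidate predictor and the observed aggregation rate across formulations. The sketch below uses Spearman's rank correlation for untied data; it is a generic illustration and not the analysis used in the cited study, and every number in it is invented.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman_rho(x, y):
    """Spearman rank correlation for paired data without ties."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired values")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))

# Hypothetical screening results for five formulations (invented numbers).
a2_values = [2.0, 1.0, -0.5, 0.5, -1.0]     # second virial coefficient, arbitrary units
tm_values = [70.0, 72.0, 68.0, 69.0, 71.0]  # melting temperature, deg C
agg_rates = [0.1, 0.3, 0.8, 0.5, 1.2]       # aggregate growth during storage, % per month

rho_a2 = spearman_rho(a2_values, agg_rates)
rho_tm = spearman_rho(tm_values, agg_rates)
```

In this invented data set, A2 ranks the formulations in exactly the reverse order of their aggregation rates while Tm barely tracks them, which is the kind of pattern that would point to colloidal stability as the dominant factor. With so few formulations, any such correlation should be treated as a hint rather than a conclusion.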
As can be seen from the previous discussions, a number of rapid analytical measurements have been used to successfully predict protein-aggregation behavior during long-term storage at refrigerator temperatures (2-8 °C). Currently, however, no single measurement can be considered predictive of long-term storage stability for all proteins in all formulations and at all protein concentrations. To improve the predictive capabilities of early-stage protein candidate and formulation screening, multiple measurement types are needed to probe different potential pathways to aggregation. These measurements can be used to optimize protein candidates and formulations. In addition, these measurements can be used to investigate the dominant mechanisms that lead to aggregation for a particular molecule, so that strategies to alleviate these may be rationally designed. Measurements probing conformational stability, such as Tm and time-dependent rates of thermal unfolding, as well as measurements probing colloidal stability (e.g., second virial coefficients from static light scattering or solubility from ammonium sulfate precipitation) should all be performed to give a full picture of the mechanisms which may have an impact on long-term storage stability. Some measurements, such as aggregation rates at elevated temperatures approaching the Tm of the protein, might be expected to contain a convolution of information about conformational and colloidal stability (e.g., rates of unfolding and rates of aggregation of unfolded species).
Conveniently, recent advances in analytical instrumentation mean that many of the measurements described above and in the literature referenced in this article can be rapidly performed in an automated, high-throughput manner using modest protein sample volumes. Some of these instruments can obtain multiple conformational and colloidal stability parameters from a single sample, further streamlining the process of obtaining the empirical information needed to make accurate predictions. These technologies make it feasible to obtain a wide range of potential predictive parameters routinely and cost-effectively as part of the protein candidate and formulation screening and optimization workflow. Although this still may not provide guaranteed predictions of what will happen to every product after three years in a refrigerator, it should greatly reduce the risk of any unpleasant, and expensive, surprises in the future.
About the Author
Simon Webster is cofounder and former chief scientific officer of Avacta Analytical, Wetherby, UK.