Rethinking Limits in Cleaning Validation

Published in: Pharmaceutical Technology, 10-02-2015, Volume 39, Issue 10

An integrated approach can improve the efficiency of cleaning validation studies.

Abstract: Cleaning validation programs must have cleaning limits, worst-case residues to validate, and recovery factors to accurately determine how clean the equipment must be. If a program is to be robust, it must also address the question of which products should be validated, and how to test residue levels accurately, to assure compliance with the defined cleaning validation limits. This article presents an integrated approach that can establish the necessary information in an efficient, compliant manner.

An effective cleaning validation program requires substantial up-front work to withstand regulatory scrutiny. Acceptable residue limits (ARLs) must be defined prior to any cleaning validation and development work (1,2). The ARL is the level to which product residues must be removed to assure patient safety and that the subsequent product manufactured on the cleaned equipment will not be contaminated.

A number of factors go into the determination of the ARL, including product dosage levels, batch sizes, and equipment product contact surface areas (3). The dosage and batch size parameters are typically well defined for a product before it gets to commercial production. The equipment product contact surface areas, however, require more planning and more steps to execute. Many vendors do not supply product contact surface areas or even dimensions of the product contact surfaces. Measurements and calculations of each piece of equipment’s product contact surfaces must be completed, documented, and reviewed to determine the ARL of the product. Although it can be time-consuming to calculate the product contact surface area for a piece of equipment, the resulting figure only changes when the equipment is altered, which would only occur under a documented change control.

In addition to establishing a calculated ARL, the first criterion of a cleaning procedure is that the equipment be visibly clean after the cleaning process. The visible residue limit (VRL) is the level below which a residue is not visible, under defined viewing conditions. It can be a valuable tool when applied to a cleaning validation program. Viewing distance, viewing angle, and viewing light level must be defined for the facility and applied to the determination of a VRL.

VRLs have been used for a number of cleaning validation applications (4,5). Once established, VRLs can be implemented for multiple aspects of a cleaning validation program (6), including routine confirmation that the equipment is cleaned to appropriate levels before changeover to a new product. One approach to cleaning validation is to validate every product manufactured at a site. This approach, however, is impractical at multi-product facilities.

To streamline validation efforts, products can be grouped together by therapeutic family or based on the hardest-to-clean products manufactured on the equipment. Defining the worst-case product can itself be a challenge. Solubility of the API is sometimes used as the criterion for determining “hardest to clean,” but this approach ignores the product excipients, which can often be more difficult to remove from equipment than the API.

Cleanability of the product residue more closely approximates the actual cleaning procedure necessary to result in acceptably clean equipment. Cleanability study conditions can range from simple immersion to vigorous cleaning actions (7-10). As long as the cleanability parameters are consistent, they can define a relative cleanability ranking of the site product residues.

Even with an executed cleanability study, before validation studies can be executed, cleaning development studies must define and confirm the cleaning conditions necessary to clean product residue effectively, down to an acceptable level. Cleaning development studies define the critical cleaning parameters and their ranges, which will assure the consistent capabilities of the cleaning process. The critical cleaning parameters can include detergent definition and concentration, water temperature, cleaning action (e.g., impingement, cascade, or manual scrubbing), contact time, and rinse times. The conditions established during cleaning development should be confirmed under an executed protocol prior to validation activities.

Finally, recovery factors must be established to demonstrate that swab or rinse samples taken after cleaning are representative of the cleanliness of the equipment (1, 2, and 11). To accomplish this, a known amount of the residue of interest is first spiked onto a coupon of the material of construction representing the manufacturing equipment. The spiked residue is recovered, either by using a swab wet with a solvent to dissolve the residue, or with a known volume of purified water to represent the final rinse of the cleaning procedure. The recovered samples are then tested to determine the amount of recovered residue. The recovery factor is the amount of residue recovered compared to the amount of residue spiked onto the coupon.

Although the data are typically generated for these cleaning validation parameters in separate efforts, a coordinated study can determine all of the factors using available data and an efficient set of experiments. This approach will save time and resources, and result in a more aligned set of cleaning limits, soils to be validated, and recovery factors for the cleaning validation studies.



Acceptable residue limit determinations
The ARL must be determined prior to cleaning development, analytical method validation, visual limit determination, and recovery factor studies. The subsequent data and recovery factors are all established relative to the ARL of the residue of interest. The primary goal of a health-based ARL is patient safety, and the recommended method to define a health-based calculation incorporates the acceptable daily exposure (ADE) (12), which is the amount of material one can ingest on a continued daily basis without harmful pharmacologic effect. Although the ADE approach is preferred as being more scientifically sound, the ADE-based limit can be compared to the currently used 1/1000th minimum-daily-dose approach (3) during the transition period to the use of ADEs, to determine whether additional validation work is necessary.

The factors that go into an ARL calculation include the dose and the batch size of the next product, which determine the degree to which any carryover is spread among subsequent product dosages. Dividing by the product contact surface area of the manufacturing equipment assumes an even distribution of any worst-case carryover residue. This assumption of even residue distribution is addressed by swabbing worst-case, hardest-to-clean, and critical equipment locations; that is, those locations where residue is most likely to build up, or where residue could be transferred to a small number of subsequent doses.

The swab area and the recovery factor enable one to relate the result for a single sample to a total residue amount in the manufacturing equipment. The health-based calculation employing the ADE is shown in Equation 1:

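\[
\frac{\mathrm{ADE}\ (\mathrm{mg/day})}{\mathrm{MDD}\ (\mathrm{doses/day})} \times \frac{\mathrm{BS}\ (\mathrm{doses})}{\mathrm{SA}\ (\mathrm{cm^2})} \times M\ (\mathrm{cm^2/swab}) \times \mathrm{RF} = \mathrm{ARL}\ (\mathrm{mg/swab}) \qquad \text{(Eq. 1)}
\]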

where:
ADE is the acceptable daily exposure of Product A, the product being cleaned (in mg/day)
MDD is the maximum daily dose of the subsequent Product B (in doses/day)
BS is the batch size of Product B (in number of doses)
SA is the product contact surface area of the equipment train (in cm²)
M is the swab area (25 cm²/swab)
RF is the recovery factor (e.g., 0.90 for a 90% recovery factor)
ARL is the acceptable residue limit (in mg/swab).
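
As an illustration only, Equation 1 can be written as a short Python function. The function name, the argument names, and the numerical inputs in the example call are hypothetical and are not taken from the case study; only the form of the calculation follows Equation 1.

```python
def acceptable_residue_limit(ade_mg_per_day, mdd_doses_per_day, batch_size_doses,
                             surface_area_cm2, swab_area_cm2=25.0,
                             recovery_factor=1.0):
    """Return the ARL in mg/swab, following Equation 1.

    ade_mg_per_day     ADE of the product being cleaned (Product A)
    mdd_doses_per_day  maximum daily dose of the next product (Product B)
    batch_size_doses   batch size of Product B, in doses
    surface_area_cm2   product contact surface area of the equipment train
    swab_area_cm2      area covered by one swab (25 cm2 in the article)
    recovery_factor    fraction recovered, e.g. 0.90 for 90%
    """
    positive_inputs = (ade_mg_per_day, mdd_doses_per_day, batch_size_doses,
                       surface_area_cm2, swab_area_cm2)
    if any(value <= 0 for value in positive_inputs):
        raise ValueError("ADE, MDD, BS, SA and M must all be positive")
    if not 0 < recovery_factor <= 1:
        raise ValueError("recovery factor must be in the range (0, 1]")
    return ((ade_mg_per_day / mdd_doses_per_day)
            * (batch_size_doses / surface_area_cm2)
            * swab_area_cm2
            * recovery_factor)


# Hypothetical inputs, chosen only to show the arithmetic.
example_arl = acceptable_residue_limit(
    ade_mg_per_day=0.1,
    mdd_doses_per_day=4,
    batch_size_doses=500_000,
    surface_area_cm2=100_000,
    swab_area_cm2=25.0,
    recovery_factor=0.90,
)
print(f"ARL = {example_arl:.2f} mg/swab")
```

Because the surface area appears in the denominator, a larger equipment train lowers the per-swab limit, while a larger batch of the next product raises it.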

The intent is to consider the health-based ARL and the VRL together to satisfy regulatory requirements for cleaning, so that the patient is safe and the equipment is visually clean. Although the VRL for residues should be related to the health-based cleaning limit, as well as to the analytical limit of detection (LOD), the VRLs can be determined prior to defining the final ARL, because the VRL is more an experimentally determined physical characteristic of the API or product, established under defined conditions, than a relative characteristic based on other factors.

Visible residue limit studies
VRLs must be determined using well-defined viewing parameters to better transfer the implementation of the VRLs to the production equipment and to limit subjectivity. The viewing variables associated with studying visible residue must be defined, and then experimental parameters for the study can be established. The parameters considered are:

  • Equipment material of construction

  • Light intensity

  • Viewing distance

  • Viewing angle

  • Observer subjectivity

  • Solvent effects.

Stainless steel is an obvious choice for surface material, because more than 95% of manufacturing equipment surfaces at a typical pharmaceutical manufacturing site are made from this material. Representative stainless-steel coupons are used for spotting purposes in the laboratory setting.

In addition to stainless steel, other widely used materials of construction (MOC) include:

  • Polytetrafluoroethylene (PTFE), including Teflon

  • High-density polyethylene (HDPE)

  • Low-density polyethylene (LDPE)

  • Polycarbonates, including Lexan

  • Glass, which can be addressed as part of the cleanability determinations.

Although the VRLs for Lexan and glass are comparable to that of stainless steel (13), VRLs for the remaining MOCs would be higher than the VRL for stainless steel. Cleanability provides a much simpler answer to the question of VRLs for different MOCs.

Lighting conditions in the manufacturing plant typically differ from room to room. The light intensity is measured in each room of the plant and in the wash area to determine a range. For consistency, the light measurement is taken in the same location in each room, for example in the center of each room, approximately four feet from the ground. The light level in a typical pharmaceutical manufacturing plant generally ranges from 200 to 1000 lux.

The viewing distance and viewing angle are based on the manufacturing equipment that is used at the site. Larger pieces of equipment can often be viewed at a distance of no greater than 10 feet, and, if the equipment is disassembled for cleaning, the viewing angles are marginally limited for visual inspections.

Suspensions of the products are prepared in methanol at known concentrations of the API and spiked onto stainless-steel coupons. For products compressed from common formulation blends, the highest potency product is used for the VRL determination. For the remaining products, the single strength manufactured at the site is used for VRL determination.



The spiked coupons are allowed to air dry, and the distance, angle, and light-level viewing parameters are set. Site personnel view the spiked coupons from multiple distances and angles. The VRLs are determined at a distance of two feet. Observations of the spiked coupons at increasing viewing distances and viewing angles establish the limits within which the VRLs can still be seen. If the observers are not able to see the VRLs at the distance of the larger equipment, the use of VRLs is limited to those pieces of equipment that can be viewed within the established VRL viewing limits. Literature references (6) have shown that most VRLs can be detected from 10 feet. Any seeming inconsistency is likely a result of very low VRLs that were determined experimentally. Some VRLs can be detected at levels as low as 0.1 µg/cm², compared to the literature average of 1.1 µg/cm². The viewing angle should be greater than 30° and the light level should be greater than 200 lux (6).
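
The viewing limits above can be captured as a simple check applied before a visual inspection is accepted. The sketch below is only an illustration: the function name is hypothetical, and the 10-foot default distance is an assumption taken from the literature value quoted above, which each site would replace with the limit established in its own VRL study.

```python
def viewing_conditions_acceptable(distance_ft, angle_deg, light_lux,
                                  max_distance_ft=10.0,
                                  min_angle_deg=30.0,
                                  min_light_lux=200.0):
    """Return True if an inspection falls within the VRL viewing limits.

    The angle and light level must be strictly greater than their minimums;
    the distance must not exceed the site-established maximum (assumed here
    to be 10 feet).
    """
    return (distance_ft <= max_distance_ft
            and angle_deg > min_angle_deg
            and light_lux > min_light_lux)


# Hypothetical inspection: 4 feet away, 45 degree angle, 500 lux.
print(viewing_conditions_acceptable(4.0, 45.0, 500.0))
```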

The ARL and VRL will indicate how clean the equipment needs to be. Table I shows the spiking preparations and target concentrations for one test. The next step is to determine which product(s) or residue(s) to validate.

Cleanability/cleaning development studies
For efficiency and economy, the initial cleaning development work can be conducted at laboratory scale in three phases. These laboratory-scale process and cleaner studies (PACE evaluation) were executed in the technical laboratory at STERIS Corporation, St. Louis, Missouri. The Phase 1 studies challenge a worst-case set of conditions. Baked-on residues are cleaned using different cleaning agents, cleaning mechanisms, times and temperatures. Once the optimal cleaning agent and conditions have been identified, the cleaning parameters are challenged in Phase 2 studies, to determine the minimum times and concentrations necessary to achieve clean equipment. In the Phase 3 study, the minimum cleaning parameters identified in Phase 2 studies are used to clean the worst-case soil identified in Phase 1 studies, from the MOCs that make up most of the product contact surfaces of the equipment at the facility.

In Phase 1 of the study, to create worst-case conditions, cleanability studies are performed on the products and blends manufactured at the facility, to determine which cleaning agent will adequately remove product residue. The study results will provide cleaning conditions, including the concentration of detergent to be used, critical cleaning parameters, time for rinse, etc.

In the Phase 1 cleanability study, dry, clean stainless-steel coupons are weighed on an analytical balance (±0.1 mg) to obtain their pre-coating weight; they are then coated with 3-5 mL of 10% w/v slurry or 3-5 grams of sample. The coated coupons are baked at 57 °C for 4 hours, air dried overnight, and weighed again on an analytical balance. The coated surface area is measured, the dry coating weight calculated, and the “loading” of the sample, in milligrams of dried residue per square centimeter, is determined. The spiked coupons can be cleaned by agitated immersion, spray wash (11 psi), or cascading flow, or scrubbed manually using a nylon-bristled brush; in addition to the cleaning technique, the type and concentration of detergent, the cleaning temperature, and the cleaning time are recorded.
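
The loading calculation described above is a weight difference divided by the coated area. A minimal sketch follows; the function name and the example weights and area are hypothetical.

```python
def residue_loading_mg_per_cm2(pre_coating_weight_mg, coated_dry_weight_mg,
                               coated_area_cm2):
    """Return the dried residue loading on a coupon in mg/cm2."""
    if coated_area_cm2 <= 0:
        raise ValueError("coated area must be positive")
    dry_coating_weight_mg = coated_dry_weight_mg - pre_coating_weight_mg
    if dry_coating_weight_mg < 0:
        raise ValueError("coated weight is lower than the pre-coating weight")
    return dry_coating_weight_mg / coated_area_cm2


# Hypothetical coupon: 300 mg of dried residue spread over 60 cm2.
loading = residue_loading_mg_per_cm2(25_000.0, 25_300.0, 60.0)
print(f"loading = {loading:.1f} mg/cm2")
```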

After cleaning, the coupons are removed and visually observed for cleanliness; each side of each coupon is then rinsed, first with tap water for 10 seconds at a flow rate of 0.5 gal/min and then with de-ionized water. Each coupon is then examined for a water break-free (WBF) surface, after which it is dried and weighed on an analytical balance to determine the post-cleaning weight.

A coupon is considered to be clean if it is visually clean, water break-free, and if its pre-coating weight and post-cleaning weight are equal (0.0 mg residue). WBF is a qualitative test that indicates the cleanliness of a metal surface. On a clean surface, free from organic residue, water sheets evenly without breaks in the water film as it runs from the surface of the metal panel. The results of a Phase 1 case study are shown in Table II, with the worst-case soils designated. Two soils are designated as worst case: Product A is manufactured in dedicated equipment, and Product B is manufactured in multi-use equipment with the remainder of the site product portfolio.
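
The three acceptance criteria for a cleaned coupon can be combined into a single pass/fail check. The sketch below is illustrative: the class and function names are hypothetical, and treating the two weights as "equal" when they differ by less than half of the 0.1 mg balance readability is an assumption, not a rule stated in the study.

```python
from dataclasses import dataclass


@dataclass
class CouponResult:
    visually_clean: bool
    water_break_free: bool
    pre_coating_weight_mg: float
    post_cleaning_weight_mg: float


def coupon_is_clean(result, balance_readability_mg=0.1):
    """Apply the three criteria: visually clean, WBF, and 0.0 mg residue.

    Assumption: the weights are considered equal when the difference would
    read as 0.0 mg on a balance with the given readability.
    """
    residue_mg = result.post_cleaning_weight_mg - result.pre_coating_weight_mg
    weights_equal = abs(residue_mg) < balance_readability_mg / 2
    return result.visually_clean and result.water_break_free and weights_equal


# Hypothetical coupons.
print(coupon_is_clean(CouponResult(True, True, 25_000.0, 25_000.0)))
print(coupon_is_clean(CouponResult(True, True, 25_000.0, 25_000.3)))
```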

Phase 2 study
Phase 2 studies further evaluate the effectiveness of the cleaning agents that demonstrated positive results in Phase 1. These tests are run under conditions that more realistically reflect what would be experienced in actual use. The major difference in the Phase 2 coupon preparation is that the coupons are air dried only for 24 hours, rather than dried in an oven. Phase 2 cleanability studies are performed to provide a more focused look at critical cleaning process parameters such as time, temperature, concentration of detergent, and cleaning agent, and to demonstrate the ruggedness of the cleaning parameters identified in Phase 1 studies. The two worst cases from Phase 1 testing, one product and one blend, are tested in Phase 2 to minimize testing resources, and the results are shown in Table III.

Phase 3 study
In Phase 3 studies, the minimum cleaning parameters identified in Phase 2 studies are used to clean the worst-case soil identified in Phase 1 studies from the MOCs that make up most of the product contact surfaces of the equipment at the facility. The worst-case product is applied to different MOC coupons, air dried for 24 hours, and cleaned using agitated immersion at the previously identified cleaning parameters. The results are shown in Table IV.

If all of the materials of construction are cleaned for the worst-case soil under the same cleaning conditions, it can be concluded that, if the equipment’s stainless-steel surfaces are clean, the other equipment surfaces are cleaned to the same level of cleanliness. An acceptable visual inspection of the stainless-steel surfaces would provide confidence that the other surfaces such as PTFE, HDPE, or LDPE, on which spots would be more difficult to detect, are clean to the same acceptable level.



Swab recovery studies
Swab sampling is the preferred technique to determine equipment cleanliness because it is direct surface sampling and targets hard-to-clean and critical locations such as tablet press tooling. Swab sampling is generally more sensitive than rinse sampling because of the larger volumes associated with final rinses. An accurate swab sample requires that a swab recovery factor be established. The swab recovery factor is established by spiking a known amount of the API or product formulation onto a material of construction coupon, letting it dry, and swabbing the coupon to recover the residue.

To execute a swab recovery study, the parameters of the study must first be defined. These parameters include the coupon MOC, swab area, swab manufacturer and model, number of swabs, swab solvent, swab technique, the extraction solvent, and the swab extraction parameters. Once these parameters are defined, they can often be applied across the APIs and products that require recovery factors. The last remaining parameter is the level of analyte to recover (i.e., the amount of analyte to spike on the coupons).

The logical level at which to perform a recovery is at the cleaning limit itself, because the cleaning limit is the pass/fail point of the residue test. For relatively safe products, the health-based cleaning limits are often quite high, for example greater than 1 to 10 mg/swab, which most likely would overload the swab and result in low recoveries. These levels also would be easily visible, which would fail the cleaning before the swabs are even taken. An efficient level for recoveries would be around the VRL level, because samples were already made for the VRL determination. For example, samples can be prepared at 5.0, 7.5, 10, and 12.5 μg of API/swab, slightly higher than the reported average VRL of 1.1 μg/cm². These levels are also close to the levels that one would expect to see after cleaning. The risk is that these levels are also close to the limit of quantitation (LOQ) of the analytical test method and could result in low recoveries with high % relative standard deviations (RSDs).

The swab recovery samples are prepared using the suspensions prepared for the VRL study. The 5.0-, 7.5-, 10-, and 12.5-μg samples are spiked onto stainless-steel coupons using 50, 75, 100, and 125 μL, respectively, of the 100-μg/mL suspensions. The spiked coupons are allowed to dry. For swabbing, each sample container is labeled, recording the API/product, amount (µL), volume (mL), name, and date.
The swab is wetted with methanol, water, or another appropriate solvent that will dissolve the API. Any excess solvent is removed by pressing the swab against the side of the sample container to wring it out of the tip. The swab area is at least 25 cm², and care must be taken to ensure that all the area covered by the dried residue is swabbed.
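
The spike volumes quoted above follow directly from the target mass and the suspension concentration. A minimal sketch of that arithmetic is shown here; the function name is hypothetical, while the 100-μg/mL concentration and the four target levels are those given in the text.

```python
def spike_volume_uL(target_mass_ug, suspension_conc_ug_per_mL):
    """Return the volume of suspension (in microliters) that delivers the
    target mass of analyte onto the coupon."""
    if suspension_conc_ug_per_mL <= 0:
        raise ValueError("suspension concentration must be positive")
    # Multiply before dividing: ug * (1000 uL/mL) / (ug/mL) gives uL.
    return target_mass_ug * 1000.0 / suspension_conc_ug_per_mL


for target_ug in (5.0, 7.5, 10.0, 12.5):
    volume = spike_volume_uL(target_ug, suspension_conc_ug_per_mL=100.0)
    print(f"{target_ug} ug -> {volume:.0f} uL")
```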

The swabbing technique is shown in Figure 1. Using the flat side of the swab, slight pressure is applied to the swab stick, and full contact is made with the coupon. The coupon should be swabbed, using a back-and-forth motion, for approximately 10 seconds. Then, one should flip the swab over and swab the coupon in a perpendicular direction, using a back-and-forth motion, for approximately 10 seconds. Finally, one should snap the swab head into the sample container and close the container. The swabbing is repeated for each coupon and the samples submitted for analysis. The validated high-performance liquid chromatography (HPLC) test methods used for analysis should be specific for each analyte.

The swab recovery results should be greater than 70% for stainless steel based on historic data (14). The variability (%RSD) should be less than 10%. However, performing recoveries at levels close to the analytical method LOQ can result in lower-than-expected recoveries, with higher %RSDs. The results will be affected by a number of factors:

  • Low spike levels near the LOQ of the analytical methods

  • Experience level of the staff performing the recoveries

  • Robustness of the extraction parameters

  • Size of the swab head.
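
The recovery and variability criteria above can be evaluated from replicate results as sketched below. The function names and the replicate values are hypothetical, and the use of the sample standard deviation (n - 1) for %RSD is an assumption about how the variability would be calculated.

```python
from statistics import mean, stdev


def percent_recoveries(recovered_ug, spiked_ug):
    """Return each replicate's recovery as a percentage of the spiked amount."""
    if spiked_ug <= 0:
        raise ValueError("spiked amount must be positive")
    return [100.0 * amount / spiked_ug for amount in recovered_ug]


def percent_rsd(values):
    """Return the relative standard deviation (%), using the sample stdev."""
    return 100.0 * stdev(values) / mean(values)


def recovery_acceptable(recovered_ug, spiked_ug,
                        min_mean_recovery_pct=70.0, max_rsd_pct=10.0):
    """Check the mean recovery (> 70%) and variability (< 10% RSD) criteria."""
    recoveries = percent_recoveries(recovered_ug, spiked_ug)
    return (mean(recoveries) > min_mean_recovery_pct
            and percent_rsd(recoveries) < max_rsd_pct)


# Hypothetical triplicate recoveries from 10-ug spikes.
replicates_ug = [8.1, 7.9, 8.4]
recoveries = percent_recoveries(replicates_ug, spiked_ug=10.0)
print(f"mean recovery = {mean(recoveries):.1f}%, RSD = {percent_rsd(recoveries):.1f}%")
print("acceptable:", recovery_acceptable(replicates_ug, spiked_ug=10.0))
```

The mean recovery expressed as a fraction (for example, 0.81 for 81%) is the RF term used in Equation 1.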

A larger swab head would be expected to retain slightly more residue than a smaller swab head of the same material. Ideally, the swab recovery spike levels are around the VRL, but these levels are likely to be too low for quantitation by the HPLC methods after swab recovery and extraction. That is why the swab spike levels are targeted slightly above the method LOQs. The proximity of the spike levels and the method LOQs could contribute to both low recoveries and high variability if small, variable amounts of analyte adhere to the swab. Also, the lower HPLC area counts near the LOQ could contribute to the higher %RSD compared to comparably spread data with greater HPLC area counts.

Typical validated HPLC methods have LOQs of approximately 1 µg/mL. If the methods can be optimized with real samples prior to validation, a lower LOQ can often be established, and better recovery conditions identified. The effort required for optimization, however, may not be worth marginal improvements to the recovery data, especially when compared to the calculated ADE-based cleaning limit. Although the personnel involved in the recoveries are a factor, their contribution to low and variable results is considered minor compared to analytical issues (14).



Rinse-recovery studies
Rinse-recovery studies are conducted using the materials and previously prepared solutions or suspensions to offer the flexibility to use rinse sampling. Although rinse sampling is considered indirect surface sampling, it covers all of the product contact surface area and more easily samples those areas that are inaccessible to swab sampling. In addition, rinse samples are easier to take and more efficient to test. Rinse sampling, however, is generally less sensitive than swab sampling because of the larger volumes associated with final rinses. The rinse-recovery factor is established by spiking a known amount of the API or product onto a material of construction coupon, letting it dry, and rinsing the coupon to recover the residue.

To execute a rinse-recovery study, the parameters of the study must first be defined. These parameters include the level of analyte to recover, the coupon MOC, the rinse area, the rinse solvent, and the rinse volume. Once these parameters have been defined, they can often be applied across the APIs and products that require recovery factors.

To be consistent with the swab recoveries, the spike level for the rinse recoveries can be set at 10 µg of API or product onto a 25-cm² area of stainless-steel coupon. Final rinses are all done with purified water, and a volume of 10-25 mL is used, making sure to keep the final solution at or above the method LOQ.

The rinse recovery results should be more than 70% for stainless steel. The variability (%RSD) should be less than 10%. However, performing rinse recoveries at levels close to the analytical method LOQ can result in lower-than-expected recoveries with high %RSDs. The results will be affected by a number of factors, including the low recovery levels near the LOQ of the analytical methods and the volume of the rinse water. Too small a volume will not remove the residue, and too large a volume will dilute the residue below detectable levels. For the residues with high %RSD, as well as those for which no quantitative values are obtained, an investigation should be conducted to determine a root cause.

Although the described studies serve to establish the necessary background data for the cleaning validation effort, there are several issues that could be addressed differently. The VRLs are established under laboratory conditions, and recent VRL levels obtained averaged 0.1 µg/cm². This raises the concern that the data might not translate to full-size equipment in the manufacturing plant. Future work would answer that question by taking spiked coupons and placing them inside actual equipment, or under equivalent conditions, to confirm the laboratory-generated data.

Very low VRLs also raise the question of how the VRLs are defined. Past work had defined the VRL concentration by dividing the amount of material spiked by the entire surface area of the circle formed, even though most of the material forms a ring, leaving the middle of the circle empty. This is not a concern as long as the VRL determinations have been defined consistently. With the VRL defined using this approach, however, the VRL of residues could approach or even be lower than the LOQ of the analytical method. Defining the VRL as the area of only the ring and not the entire circle might be a better, more realistic approach, and could relate more closely to the recovery factors and the analytical test method LOQ.
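
The effect of the two definitions can be illustrated with a short calculation. In this sketch the function names and the spot dimensions are hypothetical, and the ring-based value assumes, as a simplification, that all of the spiked material sits in the ring, whereas the text says only that most of it does.

```python
import math


def vrl_over_full_circle(spiked_ug, outer_diameter_cm):
    """VRL (ug/cm2) using the entire area of the circle formed by the spot."""
    return spiked_ug / (math.pi * (outer_diameter_cm / 2) ** 2)


def vrl_over_ring(spiked_ug, outer_diameter_cm, inner_diameter_cm):
    """VRL (ug/cm2) using only the ring area.

    Simplifying assumption: all of the residue is located in the ring.
    """
    if not 0 <= inner_diameter_cm < outer_diameter_cm:
        raise ValueError("inner diameter must be smaller than outer diameter")
    ring_area_cm2 = math.pi * ((outer_diameter_cm / 2) ** 2
                               - (inner_diameter_cm / 2) ** 2)
    return spiked_ug / ring_area_cm2


# Hypothetical spot: 1 ug of residue, 3 cm across, with a 2.5 cm empty center.
print(f"full circle: {vrl_over_full_circle(1.0, 3.0):.2f} ug/cm2")
print(f"ring only:   {vrl_over_ring(1.0, 3.0, 2.5):.2f} ug/cm2")
```

For the same spiked amount, the ring-based definition always gives a higher value, because the residue is divided by a smaller area.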

The LOQ of the HPLC analytical test methods should be optimized for cleaning validation samples. A lower LOQ would alleviate some of the concern with the VRL levels and probably would improve the variability of the swab and rinse recovery studies. The added value must be deemed significant enough, however, to expend the additional resources for this work. Coordination with the testing laboratories is essential during cleaning development work.

The rinse-recovery study levels of 10 µg are based on swab recovery levels. Because the rinse volumes are higher than the swab extraction volumes, the resulting rinse-recovery concentrations are lower. To ensure accurate data, the rinse recovery samples should be spiked based on the final concentration of solution (µg/mL) of the rinse-recoveries.
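
The dilution effect described above can be sketched as follows. The function names are hypothetical, and the 1-µg/mL LOQ is the approximate value quoted earlier for typical validated HPLC methods, used here only as an assumed target.

```python
def rinse_concentration_ug_per_mL(spiked_ug, rinse_volume_mL,
                                  recovery_fraction=1.0):
    """Expected concentration of the rinse sample for a given spike."""
    if rinse_volume_mL <= 0:
        raise ValueError("rinse volume must be positive")
    return spiked_ug * recovery_fraction / rinse_volume_mL


def required_spike_ug(target_conc_ug_per_mL, rinse_volume_mL,
                      recovery_fraction=1.0):
    """Spike amount needed to reach a target rinse concentration."""
    if not 0 < recovery_fraction <= 1:
        raise ValueError("recovery fraction must be in the range (0, 1]")
    return target_conc_ug_per_mL * rinse_volume_mL / recovery_fraction


ASSUMED_LOQ_UG_PER_ML = 1.0  # approximate typical LOQ; method-specific in practice

for volume_mL in (10.0, 25.0):
    conc = rinse_concentration_ug_per_mL(10.0, volume_mL)
    spike = required_spike_ug(ASSUMED_LOQ_UG_PER_ML, volume_mL)
    print(f"{volume_mL:.0f} mL rinse: 10 ug spike gives {conc:.1f} ug/mL; "
          f"{spike:.0f} ug needed to reach the assumed LOQ")
```

With complete recovery, a 10-µg spike reaches the assumed LOQ only at the 10-mL rinse volume; at 25 mL the expected concentration falls below it, which is why sizing the spike from the final solution concentration is the more reliable approach.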

Background data for a cleaning validation program can be generated in an efficient, coordinated effort for a pharmaceutical manufacturing facility. Using this approach, the ARL calculations and the cleanability data are combined to define the worst-case product for cleaning. The VRL, swab recoveries, and rinse recoveries are established using a single set of product suspensions, which illustrate the relationship among the three factors. These studies clearly demonstrate that a cleaning validation program can be established or revalidated using an efficient, coordinated effort to establish the necessary background information.

The author gratefully acknowledges Amanda Deal, Steve Robbins and Paul Lopolito of STERIS Corporation for their support, and for conducting the PACE studies as part of this development project case study.

1. FDA, Guide to Inspection of Validation of Cleaning Processes (Division of Field Investigations, Office of Regional Operations, Office of Regulatory Affairs, Washington, D.C., July 1993).
2. EU, Guidelines for Good Manufacturing Practice for Medicinal Products for Human and Veterinary Use, Annex 15: Qualification and Validation (2014).
3. G. L. Fourman and M. V. Mullen, Pharm. Technol., 17 (4), pp. 54-60 (1993).
4. R. J. Forsyth and V. Van Nostrand, Pharm. Technol., 28 (10), pp. 58-72 (2004).
5. R. J. Forsyth and V. Van Nostrand, Pharm. Technol., 29 (10), pp. 152-161 (2005).
6. R. J. Forsyth, Pharm. Technol., 33 (3), pp. 102-111 (2009).
7. A. J. Canhoto, R. Azadan, et al., J. Val. Technol., 11 (1), pp. 6-15 (2004).
8. A. J. Canhoto, R. Azadan, “A Scientific Approach to the Selection of Cleaning Validation Worst-Case Soils for Biopharmaceutical Manufacturing,” in PDA - Cleaning and Cleaning Validation 1, (2009).
9. N. Rathore, W. Qi, C. Chen, and W. Ji, BioPharm Inter., 22 (3) (2009).
10. C. Chen, N. Rathore, W. Ji, and A. Germansderfer, BioPharm Inter., 23 (2) (2010).
11. Health Canada, Cleaning Validation Guidelines (Guide-0028) (2008).
12. D. L. Conine, B. D. Naumann, and L. H. Hecker, Qual. Assur.: Good Pract., Regul. & Law, 1, pp. 171-180 (1992).
13. R. J. Forsyth, K. Bader, and K. Jordan, Pharm. Technol., 37 (10), pp. 2-6 (2013).
14. R. J. Forsyth, J. C. O’Neill, and J. L. Hartman, Pharm. Technol., 31 (10), pp. 102-116 (2007).

Article Details
Pharmaceutical Technology
Vol. 39, No. 10
Pages: 52–60

Citation: When referring to this article, please cite it as R. Forsyth, “Rethinking Limits in Cleaning Validation,” Pharmaceutical Technology 39 (10) 2015.