The two primary factors for statistical analysis were the material of construction and the test substance. The data were analyzed
to determine whether there were any trends or relationships that could be leveraged to reduce testing without compromising quality.
The statistical methods used included analysis of variance to compare recovery factors across substances and materials. Variance
components analysis estimated the variance attributable to sites and to repeated tests. Least squares means provided estimates of average recovery factors
adjusted for differences in materials and substances, along with confidence intervals around those averages. Exploratory data analysis
identified the primary causes of differences in recovery factors.
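As an illustration of why least squares means matter for unbalanced data, the sketch below compares plain per-material averages with least squares means. It assumes a cell-means model in which every material-substance combination has at least one result, so the least squares mean for a material is the unweighted average of its substance cell means. The records, material names, and function names are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (material, substance, RF) records; NOT data from the study.
records = [
    ("steel", "A", 90), ("steel", "A", 92), ("steel", "A", 88),
    ("steel", "A", 90), ("steel", "B", 70),
    ("glass", "A", 86), ("glass", "B", 64), ("glass", "B", 66),
    ("glass", "B", 68), ("glass", "B", 66),
]

def raw_means(rows):
    """Plain average RF per material, ignoring which substances were tested."""
    by_material = defaultdict(list)
    for material, _substance, rf in rows:
        by_material[material].append(rf)
    return {m: mean(values) for m, values in by_material.items()}

def ls_means(rows):
    """Least squares mean per material under a cell-means model.

    Each material's value is the unweighted average of its substance
    cell means, so heavily tested substances do not dominate.
    Assumption: every material-substance cell has at least one result.
    """
    cells = defaultdict(list)
    for material, substance, rf in rows:
        cells[(material, substance)].append(rf)
    materials = sorted({m for m, _ in cells})
    substances = sorted({s for _, s in cells})
    result = {}
    for m in materials:
        cell_means = []
        for s in substances:
            if (m, s) not in cells:
                raise ValueError(f"empty cell: {m} / {s}")
            cell_means.append(mean(cells[(m, s)]))
        result[m] = mean(cell_means)
    return result

raw = raw_means(records)
adjusted = ls_means(records)
print("raw means:", raw)
print("least squares means:", adjusted)
```

In this made-up example, steel was tested mostly with the easily recovered substance A and glass mostly with substance B, so the raw means (86 versus 70) overstate the material effect. The least squares means (80 versus 76) weight each substance equally and show a much smaller difference.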
Results and discussion
Assumptions in recovery-data analysis. Although standardized methods for recovery exist throughout the company, each site that conducted recovery studies implemented
these procedures with different equipment, personnel, and slight modifications to the methods and swab technique. For the
purposes of this analysis, these differences were all combined into a site-to-site variance component. Specific causes of
variation among sites were not explored.
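A minimal sketch of how a site-to-site variance component can be separated from repeat-test variance is shown below. It assumes a one-way random-effects model with site as the only grouping factor and uses the classical ANOVA (method-of-moments) estimator with an effective group size for unequal numbers of tests per site. This is a simplification of the analysis described in the article, which also involved materials and substances; the site names, values, and function name are hypothetical.

```python
from statistics import mean

# Hypothetical RF values grouped by site; NOT data from the study.
rf_by_site = {
    "site_1": [82, 78, 80],
    "site_2": [90, 94],
    "site_3": [70, 74, 72, 72],
}

def variance_components(groups):
    """ANOVA (method-of-moments) estimates for a one-way random-effects model.

    Returns (between_site_variance, within_site_variance).
    """
    k = len(groups)
    sizes = [len(values) for values in groups.values()]
    n_total = sum(sizes)
    grand_mean = sum(sum(values) for values in groups.values()) / n_total

    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    ss_within = sum(
        (x - mean(values)) ** 2
        for values in groups.values()
        for x in values
    )
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Effective number of tests per site when the sites are unbalanced.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)

    # A negative moment estimate is truncated at zero.
    between = max(0.0, (ms_between - ms_within) / n0)
    return between, ms_within

between_var, within_var = variance_components(rf_by_site)
print(f"site-to-site variance: {between_var:.2f}")
print(f"repeat-test variance:  {within_var:.2f}")
```

Because all site-specific differences (equipment, personnel, technique) are lumped into a single term, a large site-to-site estimate in such a model says that sites differ but not why.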
This data analysis was performed with available data from routine recovery studies at each site. No attempt was made to
collect the data according to a structured design, so the data were not balanced across materials, substances, or sites. For example, a particular
product may have been studied on only one material at one site. This lack of balance made it difficult to separate recovery differences
based on materials from recovery differences attributed to substances or sites. Analysis of variance on unbalanced data is
analogous to building a table with legs of unequal height. Although differences may be detected between substances or materials,
they are more difficult to support if the data structure is not well balanced across test conditions.
Analysis of data from such undesigned data sets can identify differences between substances and materials, but it does not
provide strong support for cause-and-effect relationships. Follow-up designed experiments will test the hypotheses generated
from this initial data exploration.
The data set used for this analysis, shown in Table I, consisted of 1262 RF values obtained for 48 different substances tested
on 29 different materials. The mean RF was 80, and individual values ranged from 3 to 154. Although the majority of the substances tested were
products (APIs and formulated drug substances), detergents were also tested. There were 1072 RF values available for 42 products
and 190 RF values available for 6 different detergents.
Table I: Detergents and products tested.
The average RF was 13 units higher for detergents than for products. This result agreed with the concept that detergents are
typically formulated to be easily removed from surfaces. The standard deviations for the two groups were similar: 22 units
for detergents and 19 for products. The RF ranged from 30 to 154 for detergents and from 3 to 107 for products, as shown in Figure
1, in which each color represents a different material group.
Figure 1: Product recoveries by material. (FIGURE 1: MERCK & CO. INC.)
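As a rough consistency check on the summary figures above, the group means implied by the overall mean (80), the group sizes (1072 product and 190 detergent values), and the 13-unit difference can be backed out with simple arithmetic. The sketch below assumes the reported values are exact, although they are rounded in the text, so the results are approximate. The function name is hypothetical.

```python
def implied_group_means(overall_mean, n_a, n_b, diff_b_minus_a):
    """Solve n_a*a + n_b*(a + diff) = (n_a + n_b)*overall_mean for a.

    Returns (mean_a, mean_b), where mean_b = mean_a + diff_b_minus_a.
    """
    total = n_a + n_b
    mean_a = overall_mean - n_b * diff_b_minus_a / total
    return mean_a, mean_a + diff_b_minus_a

# Inputs are the rounded summary values reported in the text.
product_mean, detergent_mean = implied_group_means(80, 1072, 190, 13)
print(f"implied product mean:   {product_mean:.1f}")
print(f"implied detergent mean: {detergent_mean:.1f}")
```

Under these assumptions the implied means are roughly 78 for products and 91 for detergents. The overall mean sits close to the product mean because products account for about 85% of the RF values.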
The recovery range for detergents was higher than expected. All of the high results were assayed using total organic carbon
(TOC) analyzers, whereas the product data were generated using high-performance liquid chromatography. Several detergents had
a low carbon load, and at low assay levels, TOC can yield high recovery values accompanied by high assay variability (5). Historically,
detergent-residue assays have been far enough below the ARL that the high recoveries and increased variability were not considered
factors requiring additional work.