Use of Artificial Neural Networks and Genetic Algorithms - Experiences from a Tablet Formulation

May 1, 2004
Elizabeth Colbourn, Nils-Olof Lindberg
Pharmaceutical Technology Europe
Volume 16, Issue 5

This article describes the formulation of a tablet for a specific purpose, primarily using fractional or full factorial designs. The formulation work generated a matrix that was processed by two software packages based on neural networks. When the dataset was divided into smaller subsets, the agreement between the predicted and observed tablet properties of the optimized formulations was reasonable.

Tablets - the most common of all dosage forms - are solid preparations each containing a single dose of one or more active substances. The active pharmaceutical ingredients (APIs) are combined with excipients; common excipients are fillers (diluents), binders, disintegrants, glidants and lubricants. Other less common excipients include colours, sweeteners, flavouring substances and substances capable of modifying the behaviour of the preparations in the body (such as buffers).

In this article, a tablet formulation for a specific purpose - produced by direct compression - was investigated. Artificial neural networks (ANN) and genetic algorithms have been applied previously in the formulation of tablets.1-3 When models generated using ANN have been compared with those generated using statistical techniques, the ANN results were either equivalent or better. ANN performed better particularly when applications were multidimensional in nature and when variables exhibited interdependencies. ANN are able to deal with complex applications in which data are fuzzy and non-linear; they can be used where rules are extremely difficult to develop but there is a large amount of historical data.4

Formulation work

During this formulation work, designed experiments were generally used. Most of the experiments were fractional or full factorial designs with the centre point replicated three times.
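As a minimal sketch of this kind of design (not the authors' actual design, whose factors are not listed here), a two-level full factorial in coded units with three replicated centre points can be enumerated as follows. The factor names are hypothetical.

```python
from itertools import product


def full_factorial_with_centre(factors, n_centre=3):
    """Two-level full factorial in coded units (-1, +1) plus replicated centre points."""
    # One run for every combination of low (-1) and high (+1) levels.
    corner_runs = [dict(zip(factors, levels))
                   for levels in product((-1, 1), repeat=len(factors))]
    # The centre point (all factors at 0) is repeated n_centre times.
    centre_runs = [{name: 0 for name in factors} for _ in range(n_centre)]
    return corner_runs + centre_runs


# Hypothetical factor names, for illustration only.
design = full_factorial_with_centre(
    ["filler_amount", "binder_amount", "compression_force"])
print(len(design))  # 2**3 corner runs + 3 centre runs = 11
```

A fractional factorial design would keep only a chosen fraction of the corner runs, which is what keeps the number of experiments manageable as factors are added.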

With traditional approaches to experimental design, the options are to use ordinary variables, such as particle size and viscosity, as design factors for each excipient, or to include the excipients in a qualitative factor for each excipient class (such as filler or binder). Each has its disadvantages. With the former, keeping the number of experiments at an acceptable level requires using only a few variables for each excipient. With the latter, the number of necessary experiments increases drastically when the qualitative design factor contains many excipients.

The principal properties of the ingredients have been used as a means of reducing the number of runs in factorially designed experiments.5 The principal properties were calculated for different excipient classes such as binders and disintegrants; therefore, the excipients were characterized by means of Fourier transform infrared (FTIR) and near infrared (NIR) spectroscopy in multiple variables. For each excipient class, separate principal component analysis models were fitted. The important information was extracted in a few principal components or principal properties, and these were assumed to reflect real differences in excipient properties. Consequently, the excipients can be compared and related to a continuous scale of principal properties under these assumptions.
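The extraction of a principal property can be sketched as follows, assuming each excipient is described by one row of spectral intensities. The numbers are invented for illustration, only the first principal component is computed (by power iteration), and no spectral pre-processing is applied; the cited work may well have proceeded differently.

```python
import math


def first_principal_component(rows, n_iter=200):
    """Return (loading vector, scores) for the first principal component."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix of the mean-centred data.
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration; the start vector is an arbitrary choice and must not
    # be orthogonal to the dominant eigenvector.
    v = [float(j + 1) for j in range(p)]
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Scores: projection of each centred row onto the loading vector.
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centred]
    return v, scores


# Invented "spectra": four excipients measured at three wavelengths.
spectra = [
    [1.0, 2.0, 0.5],
    [2.0, 4.1, 0.4],
    [3.0, 5.9, 0.6],
    [4.0, 8.0, 0.5],
]
loading, scores = first_principal_component(spectra)
print(scores)  # one value per excipient on a continuous scale
```

Each score places an excipient on a continuous scale, which is what allows a class of excipients to enter a design as a quantitative rather than a qualitative factor.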

The main excipients employed in the formulation work were divided into three classes: fillers, binders and disintegrants. Excipients can have dual functions and were classified according to both functions. The API was also characterized by FTIR and NIR, and was described by one principal component.

The formulation work generated a great amount of data. Approximately 250 formulation tests were made. With many formulations more than one compression force was used, generating a matrix with approximately 600 rows (records). There were 75 variables, of which 60 were input variables (independent variables, descriptors, factors) and 15 were output variables (dependent variables, response variables). The disintegration time and the resistance to crushing (crushing strength) are the most important properties of this kind of tablet. Other responses were also measured but are not discussed here.

Generally, the different excipients were described by their principal properties; that is, they are described by quantitative variables in the calculations. For some of the studies, however, the different excipients were described as qualitative variables. The software used can handle both alternatives.

Software

Calculations were performed with the INForm software package (Intelligensys Ltd, UK). Multilayer perceptron neural networks are used to generate models that link inputs to outputs. Genetic algorithms are then employed for optimization in the multidimensional space to produce a tablet with specific desired properties. The advantage of ANN is that they can generate models for non-linear problems, without prior knowledge of the functional form of the relationship.
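The internals of INForm are not described here, so the following is only a sketch of the general idea: a genetic algorithm searches the input space for a formulation whose predicted properties are close to the desired ones. The function `predicted_properties` is a hypothetical stand-in for a trained neural network, and the two inputs, the coefficients and the target values are invented.

```python
import random


def predicted_properties(x):
    """Hypothetical stand-in for a trained network: maps two coded inputs
    (each in [0, 1]) to (disintegration time, crushing strength)."""
    binder, force = x
    disintegration = 20 + 60 * binder * force
    strength = 30 + 50 * force + 20 * binder
    return disintegration, strength


def fitness(x, target):
    """Negative squared distance between predicted and desired properties."""
    return -sum((p - t) ** 2 for p, t in zip(predicted_properties(x), target))


def genetic_search(target, pop_size=40, generations=60, seed=0):
    rng = random.Random(seed)
    pop = [[rng.random(), rng.random()] for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        pop.sort(key=lambda x: fitness(x, target), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Gaussian mutation, clipped to the allowed range [0, 1].
            child = [min(1.0, max(0.0, g + rng.gauss(0, 0.05))) for g in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda x: fitness(x, target))


# Invented target: disintegration time 35, crushing strength 65.
best = genetic_search(target=(35.0, 65.0))
print(best, predicted_properties(best))
```

The optimizer only ever queries the model, so the quality of the optimized formulation depends entirely on how well the model predicts in the region being searched.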

A disadvantage is that models are created but no simple mathematical function is presented to the user; this is the "black box" model. The multiple correlation coefficient (goodness of fit, R2) indicates whether a model is good or poor. Within each subset, 90% of the data (the training data) were automatically selected for developing models, whereas the remaining 10% (the test data) were kept back and used for validation of the model. R2 values greater than 0.85 indicate that the model (developed using the training data) is very good, even if R2 = 0.85 has been considered to be an arbitrary acceptance limit.6 It is important not to over-fit the data, that is, not to model the noise in the data along with the relationships.
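The validation step can be sketched as follows. This assumes the usual coefficient of determination and a random hold-out; how the software actually selects its test records is not stated here, and the observed and predicted values are invented.

```python
import random


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def split_records(records, test_fraction=0.1, seed=0):
    """Randomly hold back a fraction of the records as test data."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    # Truncate towards zero, but always keep at least one test record.
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# A subset of 39 records leaves only three for testing.
train, test = split_records(list(range(39)))
print(len(train), len(test))

# Invented observed and predicted values for three test records.
r2 = r_squared([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
print(r2)
```

With so few test records, R2 for the test data changes markedly depending on which records happen to be held back, and it can even become negative when the predictions are worse than simply using the mean of the observed values.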

Calculations were also performed with the FormRules software package (Intelligensys) to determine the key variables that affected the formulation. The software is based on neurofuzzy logic, a technology that combines the learning capabilities of an associative memory network with the capabilities of fuzzy logic to express complex concepts in a simple linguistic form. Rules, with an associated confidence level, are derived from the models, giving what has been termed a "grey box" model. Using neurofuzzy logic, the software selects the important variables and develops parsimonious models: it starts with the simplest model and then tries progressively more complex ones to see which fits the data best while remaining simple. Statistical criteria are used to select a model that balances simplicity with goodness of fit. No optimization is possible with this software.

Results and discussion

During the formulation work, more than one type of filler (water-soluble and water-insoluble) and more than one binder were tested. Some excipient classes were excluded while others were included. This made the matrix non-homogeneous from a formulation point of view and caused missing inputs. As INForm cannot handle missing inputs (cause-and-effect relationships are missing), many rows in the matrix had to be excluded.

Not until the matrix was divided into smaller subsets that were homogeneous regarding some of the major constituents, such as the water-soluble filler (filler1) and the main binder (binder1), was there a reasonable agreement between predicted and observed tablet properties of the optimized formulations (Table I). In formulation 1, from a subset containing 39 records, there was a water-soluble filler (filler1) and only one binder (binder1). The major excipients were filler1 and binder1. Formulations 2 and 3, obtained from a subset with 71 records, also contained a water-insoluble filler (filler2) and a second binder (binder2) compared with formulation 1.
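Dividing the matrix into homogeneous subsets can be sketched as grouping the records by which major constituents they contain, so that no record within a subset has a missing input. The record layout and the amounts below are invented; only the names filler1, binder1, filler2 and binder2 come from the text.

```python
from collections import defaultdict


def split_by_constituents(records, keys):
    """Group records by which of the listed constituents are present."""
    subsets = defaultdict(list)
    for rec in records:
        # A constituent counts as present if it has a value in the record.
        present = tuple(k for k in keys if rec.get(k) is not None)
        subsets[present].append(rec)
    return dict(subsets)


# Invented records; amounts are placeholders.
records = [
    {"filler1": 60.0, "binder1": 5.0},
    {"filler1": 55.0, "binder1": 4.0},
    {"filler1": 40.0, "binder1": 3.0, "filler2": 20.0, "binder2": 2.0},
]
subsets = split_by_constituents(
    records, ["filler1", "binder1", "filler2", "binder2"])
for constituents, rows in subsets.items():
    print(constituents, len(rows))
```

Each resulting subset can then be modelled separately, at the cost of having fewer records available for training and testing.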

In formulation 3, where the concept with principal properties was applied, filler2 and binder2 with the principal properties suggested by the software were not commercially available. Instead, excipients with similar principal properties were used; the excipients in these two formulations are similar but not identical as the amounts of the excipients were different.

The model for the disintegration time of formulation 1 was better than that of crushing strength (Table I). There was an excellent agreement between predicted and observed disintegration time. The model of crushing strength was poor according to the R2 value for the test data. Only three records were withheld for the test data because of the size of the data set. R2 for the test data tended to depend on which records were kept back for testing. Various options were tried but none resulted in a good value of R2 for the test data. Consequently the records that gave a good model for the disintegration time were selected; these results are summarized in Table II. Considering the difference between the actual and predicted data, the difference between the predicted and observed results for the crushing strength in Table I was acceptable.

In formulation 2, acceptable models were obtained for disintegration time and crushing strength (Table I). The model for disintegration time was somewhat better, with an acceptable agreement between predicted and observed responses.

Formulation 3 was based on the same subset as formulation 2 but principal properties were used as input variables to the neural network. The model for disintegration time was better according to the R2 value for the test data (Table III). There was a large difference between predicted disintegration times using the principal properties suggested by the software and using the principal properties of the excipients that were actually used in the formulation.

The disintegration time was influenced by two principal properties (Table IV): the first one of filler2 (Insolfi1) and the third one of binder2 (Bindpp3). However, input variables other than the principal properties dominated the influence on disintegration time. There was an acceptable agreement between observed and predicted disintegration time.

The crushing strength of formulation 3 was not affected by any of the principal properties, as is clear from Table IV. Consequently, the difference between the predicted results based on the principal properties suggested by the software and those based on the principal properties of the actually used excipients was small (Table III). However, the discrepancy between observed and predicted crushing strength was too large, as the model was poor.

In Table IV the figure within parentheses represents the number of membership functions, where 2 means two functions: high and low; and 3 means three functions: high, medium and low. It is obvious from Table IV that only disintegration time is affected by input variables with principal properties, Bindpp3 and Insolfi1. As the difference between predicted and observed disintegration times of formulation 3 was acceptable (Table III), the difference between actual and theoretical principal properties was not very influential in this case.
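The meaning of two or three membership functions can be illustrated with a small sketch. The shape of the functions used by the software is not stated here; linear (triangular) functions on an input scaled to the range 0 to 1 are assumed.

```python
def memberships(x, n_functions):
    """Degrees of membership for an input x scaled to [0, 1].

    Assumes linear (triangular) membership functions.
    """
    x = min(1.0, max(0.0, x))
    if n_functions == 2:
        return {"low": 1.0 - x, "high": x}
    if n_functions == 3:
        return {
            "low": max(0.0, 1.0 - 2.0 * x),
            "medium": 1.0 - abs(2.0 * x - 1.0),
            "high": max(0.0, 2.0 * x - 1.0),
        }
    raise ValueError("only 2 or 3 membership functions are sketched here")


print(memberships(0.25, 2))  # {'low': 0.75, 'high': 0.25}
print(memberships(0.25, 3))  # {'low': 0.5, 'medium': 0.5, 'high': 0.0}
```

An input value thus belongs partly to several sets at once, and rules of the form "if the input is high then the response is low" are weighted by these degrees of membership.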

Conclusions

During this formulation work, designed experiments were generally used. Most of the experiments were fractional or full factorial designs with the centre point replicated three times. When the dataset containing all of the formulations from the formulation work was divided into smaller subsets (which were more homogeneous regarding the major constituents), the agreement between predicted and observed tablet properties of the optimized formulations that were determined using ANN and genetic algorithms was reasonable. With formulation 1, with a water-soluble filler and one binder, the agreement between predicted and observed values of disintegration time was excellent; for crushing strength it was acceptable.

Formulation 2, which also contained a water-insoluble filler and a second binder compared with formulation 1, had acceptable agreement between predicted and observed disintegration time and crushing strength. With formulation 3, based on the same subset of experiments as formulation 2 but using principal properties as input variables, there was an acceptable agreement between observed and predicted disintegration time. However, the discrepancy for the crushing strength was too large. Generally, the neural networks, together with genetic algorithms for optimization, have performed satisfactorily in producing tablets with specific desired properties.

References

1. R.C. Rowe and R.J. Roberts, Intelligent Software for Product Formulation, (Taylor & Francis, London, UK, 1998) pp 161-166.

2. E.A. Colbourn and R.C. Rowe, "Modelling and Optimization of a Tablet Formulation using Neural Networks and Genetic Algorithms," Pharm. Technol. Eur. 8(9), 46-55 (1996).

3. R.C. Rowe and E.A. Colbourn, "Applications of Neural Computing in Formulation," Pharmaceutical Visions, May, 4-7 (2002).

4. R.C. Rowe and R.J. Roberts, Intelligent Software for Product Formulation, (Taylor & Francis, London, UK, 1998) pp 64-66.

5. J. Gabrielsson et al., "Multivariate Methods in the Development of a New Tablet Formulation," Drug Dev. Ind. Pharm. 29, 1053-1075 (2003).

6. A.P. Plumb, "Evaluation of Artificial Intelligence in the Modelling and Optimization of Pharmaceutical Formulations," Masters Thesis, University of Bradford, Bradford, UK (2000).