Virtual screening using chemical feature-based pharmacophores and virtual molecule libraries

April 1, 2006
Thierry Langer

Gerhard Wolber

Pharmaceutical Technology Europe

Pharmaceutical Technology Europe, Volume 18, Issue 4

The discovery of suitable lead structures for new drugs from an inexhaustibly large reservoir of theoretically possible compounds is one of the biggest challenges for the pharmaceutical industry. In the last few years, combinatorial chemistry methods have been developed to synthesize a large number of diverse new chemical entities (NCEs), which may subsequently be tested for biological activity in vitro. Although combinatorial chemistry is extremely efficient from a technical point of view, the number of compounds that can be produced using this technology still covers only a small fraction of chemical space. Modern combinatorial libraries may contain up to several million molecules, while overall chemical space is estimated to comprise more than 10^60 small molecules when considering all reasonable molecules with a molecular weight of up to 500.
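To put these numbers in proportion, the short calculation below compares a library of a few million compounds with the estimated size of chemical space. The library size of five million is an assumed illustrative value standing in for "several million"; it is not taken from a specific library.

```python
from fractions import Fraction

chemical_space = 10**60      # estimate quoted in the text
library_size = 5 * 10**6     # assumption: stands in for "several million" compounds

# Exact ratio first, converted to a float only for display.
coverage = Fraction(library_size, chemical_space)
print(f"fraction of chemical space covered: {float(coverage):.1e}")
```

Even under this generous assumption, the library samples a vanishingly small part of the space, which is why focused selection matters more than sheer library size.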

The experimental effort required to measure the biological activity of large libraries, however, is still considerable. Furthermore, these combinatorial libraries are typically designed for high diversity and not for a specific target, resulting in poor hit rates in the in vitro screening process. Computer-aided drug design methods may be able to overcome these difficulties: by simulating the synthesis of chemical compounds and their biological affinity for a macromolecular target, computational methods help medicinal chemists to decide which molecules to synthesize first.

The filtering of large molecule libraries by computational methods, based on discrimination functions that permit the selection and prioritization of promising candidates, has been termed 'virtual screening' (VS). Starting from several million compounds, such a VS procedure should be capable of reducing the number of molecules to be examined experimentally to several hundred candidates. Moreover, the selected compounds should constitute more promising lead candidates than the 'random' hits found by screening a combinatorial library.

VS techniques and 3D pharmacophores

Each screening technique, regardless of whether it is a high-throughput in vitro technique or a computational approach, is assessed by the time and related cost it takes to process a molecule. Therefore, it is essential to reduce the search space with efficient and fast techniques at the beginning, advancing to more accurate and time-consuming methods in later stages, where fewer molecules are left. Obviously, the paradigm that a compound library has to show maximum diversity is no longer valid; a library must be focused, compounds must be synthetically feasible, and, most importantly, libraries must conform to specific constraints regarding absorption, distribution, metabolism, elimination and toxicity (ADMET), such as oral bioavailability or blood-brain-barrier permeability.
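The benefit of running cheap filters first can be illustrated with a minimal sketch of a staged screen. All names (`Stage`, `run_cascade`), the per-molecule costs and the toy scores are assumptions made for illustration; they do not describe any particular screening software.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Stage:
    name: str
    cost_seconds: float            # assumed compute cost per molecule
    keep: Callable[[dict], bool]   # returns True if the molecule passes


def run_cascade(molecules: Sequence[dict], stages: List[Stage]) -> Tuple[List[dict], float]:
    """Apply the stages in order; return the survivors and the total compute time."""
    survivors = list(molecules)
    total_cost = 0.0
    for stage in stages:
        # Every molecule still in play is charged the cost of this stage.
        total_cost += stage.cost_seconds * len(survivors)
        survivors = [m for m in survivors if stage.keep(m)]
    return survivors, total_cost


# Toy library: each molecule is a dict of hypothetical precomputed values.
library = [{"id": i, "mw": 200 + 5 * i, "fit": (i * 37) % 100 / 100} for i in range(100)]

cheap_first = [
    Stage("descriptor filter", 0.001, lambda m: m["mw"] <= 500),
    Stage("3D screen", 20.0, lambda m: m["fit"] >= 0.8),
]
expensive_first = list(reversed(cheap_first))

hits_a, cost_a = run_cascade(library, cheap_first)
hits_b, cost_b = run_cascade(library, expensive_first)
print(len(hits_a), len(hits_b), cost_a < cost_b)
```

Both orderings return the same hits because the filters are independent, but the cheap-first ordering spends the expensive stage only on molecules that survived the inexpensive one.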

Figure 1

Most of these requirements can be fulfilled by calculating topological descriptors allowing the approximate calculation of physicochemical properties. Our VS process, therefore, starts by examining these descriptors, which can be calculated very efficiently and are thus applied to the first screening stage to filter an existing library.
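As an illustration of such a first-stage filter, the sketch below applies Lipinski's rule of five to precomputed descriptors. The article does not say which descriptors or thresholds the authors use, so the rule of five is a stand-in chosen because it is a widely known oral-bioavailability heuristic; the `Descriptors` class and the three compounds with their values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    name: str
    mol_weight: float   # molecular weight in g/mol
    logp: float         # calculated octanol-water partition coefficient
    h_donors: int       # number of hydrogen-bond donors
    h_acceptors: int    # number of hydrogen-bond acceptors


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Count rule-of-five violations and accept up to max_violations of them."""
    violations = sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])
    return violations <= max_violations


# Hypothetical compounds with made-up descriptor values.
compounds = [
    Descriptors("hypothetical_A", 342.4, 2.1, 2, 5),   # no violation
    Descriptors("hypothetical_B", 612.7, 6.3, 1, 8),   # weight and logP violated
    Descriptors("hypothetical_C", 455.0, 5.4, 3, 9),   # logP violated only
]
kept = [c.name for c in compounds if passes_rule_of_five(c)]
print(kept)
```

Because the filter only compares a handful of numbers per molecule, it can be applied to millions of entries before any 3D method is invoked.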

The next, more time-consuming step is to use the 3D structure of the target and/or its ligand. Three-dimensional structure-based drug discovery is usually tightly associated with docking algorithms, which place a small organic molecule in the 3D structure of the binding site of a macromolecule and subsequently score the optimal position. Docking programs, however, are rather time-consuming: a single experiment takes from 20 s to several minutes for each molecule, so their use for VS purposes is limited, even if expensive computer clusters or grids are used for the calculation [1–3].

Figure 2

A different approach, the pharmacophore model of drug–target interaction, has been widely used over the past decades to describe biological activity in computational approaches, and it can be incorporated into a VS strategy. A pharmacophore (pharmacophore model, pharmacophoric pattern) is defined as the ensemble of steric and electronic interactions of a ligand that is necessary to trigger biological activity at a specific target. Following this definition, a chemical feature in a pharmacophoric pattern does not describe a functional group but an abstract, and thus universal, chemical functionality such as a hydrogen-bond acceptor or donor, a hydrophobic contact or a charge interaction. Typically, pharmacophores are used to characterize the chemical feature sets shared by several comparable compounds that show biological activity at one specific target.
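The idea of a pharmacophore as a set of abstract features in space can be made concrete with a minimal sketch. It assumes that the ligand conformation has already been aligned to the model's coordinate frame, which real screening tools have to solve by conformer generation and alignment; the `Feature` class, the `matches` function, the coordinates and the tolerance of 1.0 angstrom are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class Feature:
    kind: str                # e.g. "acceptor", "donor", "hydrophobic"
    position: Point          # coordinates in angstroms
    tolerance: float = 1.0   # match radius; only read on model features


def matches(model: List[Feature], ligand: List[Feature],
            used: FrozenSet[int] = frozenset()) -> bool:
    """True if every model feature pairs with a distinct ligand feature
    of the same kind lying within the model feature's tolerance."""
    if not model:
        return True
    first, rest = model[0], model[1:]
    for i, candidate in enumerate(ligand):
        if i in used or candidate.kind != first.kind:
            continue
        if math.dist(first.position, candidate.position) <= first.tolerance:
            # Backtrack if this pairing leaves a later feature unmatched.
            if matches(rest, ligand, used | {i}):
                return True
    return False


model = [
    Feature("acceptor", (0.0, 0.0, 0.0)),
    Feature("donor", (3.0, 0.0, 0.0)),
    Feature("hydrophobic", (0.0, 4.0, 0.0)),
]
ligand_ok = [
    Feature("acceptor", (0.3, 0.2, 0.0)),
    Feature("donor", (3.4, 0.1, 0.0)),
    Feature("hydrophobic", (0.2, 4.5, 0.1)),
]
ligand_missing_donor = ligand_ok[:1] + ligand_ok[2:]

print(matches(model, ligand_ok), matches(model, ligand_missing_donor))
```

Note that the check never looks at functional groups: two chemically different ligands match the same model as long as they present the same kinds of feature in the same places.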

Novel software tools have been devised to derive 3D pharmacophores directly from a known macromolecular structure, for example a protein structure determined experimentally by X-ray crystallography (Figure 1) [4]. While the accuracy of structure-based 3D pharmacophore screening is comparable to that of docking, it is considerably faster: using the pharmacophore screening platform Catalyst (Accelrys Inc., San Diego, CA, USA), VS of several tens of thousands of molecules can be performed in minutes on standard hardware.
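A rough calculation shows what this difference means in practice. The docking time of 20 s per molecule is the lower bound quoted above; the library size of 50,000 molecules and the pharmacophore run time of 10 minutes are assumptions standing in for "several tens of thousands" and "minutes", so the resulting speed-up is indicative only.

```python
n_molecules = 50_000          # assumption: "several tens of thousands"
docking_seconds_each = 20     # lower bound quoted in the text
pharmacophore_minutes = 10    # assumption: stands in for "minutes"

docking_seconds = n_molecules * docking_seconds_each
docking_days = docking_seconds / 86_400          # seconds per day
speedup = docking_seconds / (pharmacophore_minutes * 60)

print(f"docking on a single CPU: {docking_days:.1f} days")
print(f"assumed speed-up of the pharmacophore screen: about {speedup:.0f}x")
```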

Figure 3

Virtual compound libraries

Although VS is a valuable tool for examining libraries with real chemical molecules, it can also be used for de novo drug design by examining virtual libraries consisting of artificial compounds. Criteria for assessing the quality of a virtual library include:

  • similarity to ligands for a specific target

  • maximum diversity within the similarity constraints

  • conformity to the correct ADMET rules such as oral bioavailability or blood-brain-barrier permeability

  • synthetic accessibility
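The first two criteria can be quantified in many ways. The sketch below uses the Tanimoto coefficient on set-based fingerprints, which is a common choice but is not named in the article; the fingerprints are hypothetical sets of feature indices, and the function names are illustrative.

```python
from itertools import combinations
from typing import FrozenSet, List

Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Shared features divided by all features present in either molecule."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def max_similarity_to_references(candidate: Fingerprint,
                                 references: List[Fingerprint]) -> float:
    """Similarity of a candidate to its closest known ligand."""
    return max(tanimoto(candidate, ref) for ref in references)


def mean_pairwise_distance(library: List[Fingerprint]) -> float:
    """Average Tanimoto distance (1 - similarity) over all pairs, as a diversity score."""
    pairs = list(combinations(library, 2))
    if not pairs:
        return 0.0
    return sum(1.0 - tanimoto(a, b) for a, b in pairs) / len(pairs)


# Hypothetical fingerprints: each integer stands for one structural feature.
known_ligand = frozenset({1, 2, 3, 4})
candidate_1 = frozenset({1, 2, 3, 5})
candidate_2 = frozenset({1, 2, 6, 7})

print(max_similarity_to_references(candidate_1, [known_ligand]))
print(mean_pairwise_distance([candidate_1, candidate_2]))
```

A library design step could then require each candidate to stay above a similarity threshold to known ligands while maximizing the mean pairwise distance among the candidates that remain.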

Library generation tools address these challenges by providing a balanced set of chemical building blocks and combining their various reactive sites using a Monte Carlo algorithm to create novel structures with maximum diversity within the user-specified scope (Figures 2 and 3). ADMET filters include oral bioavailability, blood-brain-barrier permeability, lead-likeness and drug-likeness [5]. Such tools can be used simply to generate diverse libraries that suggest new ideas for lead structures, or even to simulate the products of specific reactions such as the Biginelli condensation, as shown in Figure 4 [6].
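The random combination of building blocks can be sketched as follows. This is a simplified illustration, not the algorithm of any specific library generator: it only draws random, non-repeating combinations from reagent pools and applies no reaction rules or ADMET filters. The three pools mirror the three components of a Biginelli condensation (aldehyde, β-keto ester, urea), but the reagent labels and the function `sample_library` are hypothetical placeholders.

```python
import random
from typing import Dict, List, Set, Tuple


def sample_library(pools: Dict[str, List[str]], n_products: int,
                   seed: int = 0, max_attempts: int = 10_000) -> List[Tuple[str, ...]]:
    """Draw random building-block combinations until n_products unique ones are found."""
    rng = random.Random(seed)     # seeded so that runs are reproducible
    sites = sorted(pools)         # fixed order of the component classes
    seen: Set[Tuple[str, ...]] = set()
    products: List[Tuple[str, ...]] = []
    attempts = 0
    while len(products) < n_products and attempts < max_attempts:
        attempts += 1
        combo = tuple(rng.choice(pools[site]) for site in sites)
        if combo not in seen:     # skip combinations already generated
            seen.add(combo)
            products.append(combo)
    return products


# Placeholder reagent labels, one pool per component class.
pools = {
    "aldehyde": ["ald_1", "ald_2", "ald_3", "ald_4"],
    "ketoester": ["ke_1", "ke_2", "ke_3"],
    "urea": ["urea_1", "urea_2"],
}

virtual_library = sample_library(pools, n_products=10, seed=42)
for product in virtual_library:
    print(" + ".join(product))
```

In a fuller implementation, each sampled combination would be converted into a product structure and passed through the descriptor and ADMET filters before it enters the virtual library.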

Structure-based pharmacophore modelling combined with virtual library generation is a valuable tool for early-stage drug discovery. Three-dimensional pharmacophores can be used not only to select compounds that show desired biological effects, but also to exclude compounds that show an undesired side-effect. The hits obtained by VS will enable the pharmaceutical industry to find more promising candidates at all stages of drug discovery: molecules that have a higher probability of reaching clinical phases and a lower risk of failing in later development stages.

Figure 4

Gerhard Wolber and Thierry Langer are executive partners at Inte:Ligand, Austria.


References

1. B. Kramer, M. Rarey and T. Lengauer, Proteins: Structure, Function and Genetics 37(2), 228–241 (1999).

2. G. Jones et al., J. Mol. Biol. 267, 727–748 (1997).

3. C.M. Venkatachalam et al., J. Mol. Graphics Modell. 21(4), 289–307 (2003).

4. G. Wolber and T. Langer, J. Chem. Inf. Model. 45(1), 160–169 (2005).

5. T.I. Oprea, J. Comput. Aid. Mol. Des. 14(3), 251–264 (2000).

6. A. Stadler and C.O. Kappe, J. Comb. Chem. 3(6), 624–630 (2001).
