1. Introduction
In medicinal chemistry lead optimisation, there has been an increased focus on reducing the high attrition rates of candidate drugs, with about 90 percent of clinical trials resulting in failure [
1]. Much of this attention has centred upon optimising the Absorption, Distribution, Metabolism, Excretion, and Toxicity (ADMET) profiles of molecules. Absorption has received particular attention, and designing drugs that possess a high degree of oral bioavailability has therefore become a priority. This has led to the development of simple filters to identify compounds that satisfy this requirement. Starting with Lipinski’s seminal ‘Rule of 5’ criteria [
2], a number of filters have been designed to identify compounds with desirable ADMET profiles. These filters include the Ghose, Veber, Muegge, and Egan filters [
3,
4,
5,
6]. These ‘rule of thumb’ approaches are easy to understand and apply, and therefore enable medicinal chemists to readily incorporate them into their drug design workflows. The art of medicinal chemistry is multiparametric in nature and lead optimisation is a complicated business. This is because teams need to take account not only of potency, but also of factors such as drug stability, toxicity, patentability, level of binding to plasma proteins, and pharmacodynamics. Consequently, methods aimed at predicting drug properties like pharmacokinetic profiles are attractive to drug hunters.
Learning about the profile of compounds as early as possible can also help to save on research expenditure, given the traditionally high attrition rates seen in drug development. Estimates of pre-launch research and development costs range from several hundred million to multiple billions of US dollars.
Lipinski postulated that in most cases, orally active compounds tend to violate no more than one of the following criteria: a molecular weight of less than or equal to 500 g mol⁻¹, a lipophilicity (logP) of less than or equal to 5, no more than 5 hydrogen-bond donors, and no more than 10 hydrogen-bond acceptors [
2]. It is interesting to note that these rules were originally framed as a way of selecting compounds that are more likely to be orally bioavailable, rather than to solve any wider medicinal chemistry problems. This seminal research inspired further development of these criteria. These iterations included the aforementioned Ghose filter, which constrained the molecular weight to between 160 and 480 g mol⁻¹ and included a criterion of molar refractivity between 40 and 130 [
3]. Veber’s rules included a criterion for a topological polar surface area (TPSA) no greater than 140 Å², and no more than 10 rotatable bonds [
5]. Egan’s ‘boiled egg’ filter further constrained the logP and TPSA criteria to no greater than 5.88 and 131.6 Å² respectively for GI absorption, as well as including an additional set of criteria for drugs that are required to permeate through the blood-brain barrier [
6]. Muegge’s filter further restricted the criteria to a molecular weight of 200 g mol⁻¹, a logP between 2 and 5, a TPSA no greater than 150 Å², less than or equal to 15 rotatable bonds, and no greater than 4 rings [
4].
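As a minimal sketch of how such rule-of-thumb filters can be applied, the Python below encodes only the Lipinski and Veber thresholds quoted above. The `Descriptors` container, the function names, and the example values are hypothetical and introduced purely for illustration; the descriptor values are assumed to have been calculated elsewhere.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors for one compound (hypothetical container)."""
    mol_weight: float        # g/mol
    logp: float              # calculated logP
    h_bond_donors: int
    h_bond_acceptors: int
    tpsa: float              # topological polar surface area, square angstroms
    rotatable_bonds: int


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four Rule of 5 criteria are violated."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])


def passes_lipinski(d: Descriptors) -> bool:
    """Rule of 5 as stated in the text: no more than one violation."""
    return lipinski_violations(d) <= 1


def passes_veber(d: Descriptors) -> bool:
    """Veber criteria as stated in the text: TPSA <= 140 and <= 10 rotatable bonds."""
    return d.tpsa <= 140 and d.rotatable_bonds <= 10


# Hypothetical compound, for illustration only (not taken from the source)
example = Descriptors(mol_weight=606.0, logp=5.1, h_bond_donors=2,
                      h_bond_acceptors=8, tpsa=139.0, rotatable_bonds=9)
print(lipinski_violations(example), passes_lipinski(example), passes_veber(example))
```

The binary pass or fail outcome is what makes these filters easy to apply, and it is also why a compound sitting just inside or just outside a threshold is treated identically to one far from it.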
These filters are a highly valuable tool in the toolkit of a medicinal chemist; however, many researchers working in drug discovery have come to over-rely on such filters, almost treating them as set in stone. This has reduced the exploration of molecular space, as chemists are confined to the rules set forth by druglikeness filters. Although a number of chemists have begun to explore beyond rule of five chemical space [
7], our understanding of this space is still relatively limited. It is also important to understand how overreliance on druglikeness filters may adversely affect the progress of drug development projects. The concept of molecular obesity has been proposed to illustrate a recent trend in the drug design process that artificially inflates the lipophilicity of compounds (in attempts to increase potency) [
8]. Although this may have the desired effect of increasing drug potency, the increasing rate of drug attrition in clinical trials has partially been attributed to molecular obesity. In other words, the inappropriate use of lipophilicity may lead to the progress of potential drug candidates becoming stunted.
However, these approaches are only suited for compounds that need to be absorbed through the gastrointestinal (GI) tract. It is important to flip modern ways of thinking around the properties needed for orally bioavailable molecules when considering agents suitable for non-systemic circulation. Also, there are a number of pathological targets that reside within the intestinal lumen, and thus require drugs exhibiting a low likelihood of passing through the GI tract. These include bacteria such as
Vibrio cholerae [
9] which produce absorption and secretion altering toxins, as well as
Shigella and
E. coli [
10,
11] which result in the destruction of the colonic epithelium. Several different methodologies have been developed for the design of ‘non-systemic’ drugs; however, this class of compounds is typically characterised by high molecular weights, resulting in difficult and expensive synthetic methodologies [
12]. The ability to design small (molecular weight <500 Daltons) molecules with ‘non-systemic characteristics’ could facilitate the development of new drug classes suitable for non-systemic exposure. However, although standard druglikeness filters are effective at predicting druglikeness, they are not able to predict non-druglikeness with the same accuracy [
13]. So, there is a need to develop new filters that can evaluate non-systemic drugs, and the identification of novel chemical descriptors may make this possible. The concept of developing drugs with high gut residence is a vital tool in the treatment of diseases located in the small intestine, such as
Clostridium difficile (
C. diff) infections [
14,
15,
16,
17], and in our research into the discovery of non-systemic, fructose-selective aryl boronic acid scavengers for the treatment of fructose malabsorption. Drugs can pass through the small intestine in different ways [
18]. Most often, this is achieved via passive diffusion as described by Fick’s law. In the context of drug absorption through the small intestine, lipid diffusion is the most important type of diffusion in determining a drug’s permeability, hence the importance of logP in determining a drug’s oral bioavailability. Alternatively, drugs can be absorbed via carrier-mediated membrane transporters that utilise both active transport and facilitated diffusion. Unlike passive diffusion, active transport can move drugs from areas of low concentration to areas of higher concentration; however, this process requires energy. Actively transported drugs are required to form a complex with the carrier molecule. One example of facilitated diffusion is shown by the behaviour of the antidiabetic drug Metformin, which forms a complex with organic cation transporter 1 (OCT1) before passage through the small intestine.
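For reference, the passive diffusion described above can be written using Fick’s first law. The notation is standard textbook notation rather than the source’s: J is the flux, D the diffusion coefficient, C the concentration, and x the distance across the membrane. The second line is the usual thin-membrane approximation, in which h is the membrane thickness and K the membrane/water partition coefficient; it is included only as a sketch of why lipophilicity (and hence logP) influences passive permeation.

```latex
% Fick's first law: flux is proportional to the concentration gradient
J = -D \, \frac{\mathrm{d}C}{\mathrm{d}x}

% Thin-membrane approximation (assumed steady state across thickness h):
% a larger partition coefficient K gives a larger apparent permeability P
J \approx \frac{D K}{h} \left( C_{\text{lumen}} - C_{\text{blood}} \right)
  = P \left( C_{\text{lumen}} - C_{\text{blood}} \right)
```

Because the flux follows the concentration gradient, passive diffusion alone cannot move a drug from low to high concentration, which is the contrast drawn with active transport.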
Lovering’s seminal paper introduced the chemical descriptor fraction of sp³ carbon atoms (fSP3), as shown by the formula in Figure 1. The concept of ‘flatness’ [
19] has encouraged the design of more complex compounds that emulate natural products more closely than counterparts that adhere solely to traditional druglikeness filters, such as Lipinski’s Rule of Five [
2]. A select number of compounds are present in the literature with low fSP3, resulting in poor oral bioavailability despite a low molecular weight. These compounds are ideal for the treatment of non-systemic targets. Lovering discovered that the average fSP3 for marketed drugs is 0.47, and 0.36 for hit compounds. The introduction of fSP3 was intended to guide the design of drugs with increased oral bioavailability, by taking into account drug design trends found in natural products. One of these trends is the incorporation of spirocycles. These compounds feature two molecular rings that share one atom between them, resulting in an increase in fSP3 relative to compounds bearing two rings and sharing two atoms between them. Spirocycles have the added benefit of potential chirality [
20], resulting in increased affinity for protein binding and specificity [
21]. A number of recent compound libraries have been developed that incorporate spirocyclic groups in order to increase fSP3 [
22,
23].
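The descriptor itself is a simple ratio: the number of sp³-hybridised carbon atoms divided by the total number of carbon atoms. The sketch below computes it from atom counts supplied by hand; the function name `fsp3` and the example molecules are chosen here for illustration and are not taken from the source.

```python
def fsp3(n_sp3_carbons: int, n_carbons: int) -> float:
    """Fraction of sp3-hybridised carbons: sp3 carbon count / total carbon count."""
    if n_carbons <= 0:
        raise ValueError("compound must contain at least one carbon atom")
    if not 0 <= n_sp3_carbons <= n_carbons:
        raise ValueError("sp3 carbon count must lie between 0 and the total carbon count")
    return n_sp3_carbons / n_carbons


# Benzene: 6 carbons, none sp3 (fully 'flat')
print(fsp3(0, 6))            # 0.0
# Cyclohexane: 6 carbons, all sp3
print(fsp3(6, 6))            # 1.0
# Toluene: 7 carbons, 1 sp3 (the methyl group)
print(round(fsp3(1, 7), 2))  # 0.14
```

In practice the atom counts would come from a cheminformatics toolkit rather than being entered manually; RDKit, for example, exposes this descriptor as `CalcFractionCSP3` in `rdMolDescriptors`.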
In this paper, we present the following new hypothesis: that fSP3 can be applied as an indicator for drugs with poor bioavailability. This may lead to the development of new chemical motifs for the design of non-systemic small molecules that are localised in the gut. The introduction of new metrics that are able to predict non-systemic exposure may lead to the development of new ADMET filters that can accurately identify non-systemic drugs, enabling medicinal chemists to develop suitable compound libraries. We look forward to the drug discovery community helping us to further validate this hypothesis.
Flaws in Lipinski’s Rule of Five and Quantifying Druglikeness
Lipinski’s seminal Rule of Five (Ro5) has provided medicinal chemists with a methodology to develop compounds with predicted high oral bioavailability [
2]. However, this rule is not perfect, and despite Lipinski’s indications that his filter should be used only as a ‘rule of thumb’, it is treated by many medicinal chemists as absolute. This results in an over-restriction of chemical space in drug design when compared with the larger structural variety found in natural products, which may lead to potential therapeutic strategies being omitted in drug discovery [
24]. No filter can be perfect at identifying orally bioavailable compounds, and Benet showed that while only 2.8 percent of compounds with high solubility and permeability violate Ro5, just 13.6 percent of compounds with low solubility and permeability violate Ro5 [
13]. This indicates that although Ro5 is an effective indicator for compounds with high oral bioavailability, it performs less well as an indicator for non-systemic compounds, which suggests a need for new metrics for the discovery of non-systemic compounds. It is also important to recognise that as drug discovery incorporates more computationally led methodologies, such as de novo design [
25] and the application of Machine Learning (ML) models, there is a need to move on from binary (pass or fail) filters to more quantitative methods for evaluating drug-likeness, such as Bickerton’s Quantitative Estimate of Drug-likeness (QED) [
26]. Quantitative methods enable the druglikeness of different compounds to be compared and placed in greater context. This methodology is effective at quantifying oral bioavailability; however, it is not as effective at identifying non-systemic compounds, and it is not as readily applicable as Ro5 and other simpler druglikeness filters. This is because the metrics used to identify orally bioavailable compounds are not as effective at identifying non-systemic compounds. Benet determined that 14.3 percent of highly permeable oral drugs violate Ro5, and 11.9 percent of highly soluble drugs violate the filter, indicating there is scope to evolve the filter [
13]. The fraction of sp³ carbon atoms has been used to develop new metrics, such as MCE-18 [
27], shown below in Figure 2, which aims to identify novelty and lead potential in molecules. MCE-18 stands for Medicinal Chemistry Evolution, 2018, and represents a useful tool.
MCE-18 aims to iterate upon the initial formula for fSP3 by addressing a major weakness of fSP3, namely that it fails to identify out-of-plane carbon atoms. As a result, the three-dimensional shape of a molecule is not always described by fSP3 [
24].
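To illustrate the difference between a binary filter and a quantitative score discussed in this section, the sketch below combines per-property ‘desirability’ values into a single number using a geometric mean, which is the general form used by QED-style scores. The desirability values are hypothetical placeholders; the fitted desirability functions of the published QED method are not reproduced here, and the function name is an assumption made for illustration.

```python
import math


def geometric_mean_score(desirabilities):
    """Combine per-property desirabilities (each in (0, 1]) into one score."""
    values = list(desirabilities)
    if not values:
        raise ValueError("at least one desirability value is required")
    if any(not 0 < v <= 1 for v in values):
        raise ValueError("desirabilities must lie in the interval (0, 1]")
    # exp(mean(ln d_i)) is the geometric mean of the d_i
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical desirability profiles for two compounds (illustration only)
balanced = [0.9, 0.8, 0.7]
one_poor_property = [0.9, 0.9, 0.1]

# A single poor property pulls the geometric mean down sharply,
# so the two compounds can be ranked rather than simply passed or failed.
print(geometric_mean_score(balanced) > geometric_mean_score(one_poor_property))
```

A continuous score of this kind allows compounds to be ranked against one another, which a pass or fail rule cannot do.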
Design of non-systemic compounds
A number of approaches have been discovered for the design of non-systemic drugs, resulting from Chamot’s proposal for five classes of non-systemic drugs: I) sequestering agents; II) ligands of soluble intestinal enzymes; III) enzymes; IV) minimally absorbed and rapidly metabolised drugs; and V) ligands of apical targets [
12]. Each of these classes presents a different approach to the design of non-systemic drugs, by preventing absorption through the GI tract. Examples of classes I-II are shown in
Figure 3.
Class I compounds, also referred to as sequestering agents, bind to small molecules within the lumen of the small intestine, resulting in the formation of an insoluble complex that passes through the GI tract and is eventually eliminated in the faeces. These compounds are typically synthesised as crosslinked polymeric materials of varying morphologies. Contact with intestinal fluids results in the resin swelling, and in the exchange of endogenous ions from the GI lumen with counter ions from the resin. Compounds in this class include bile acid sequestrants, such as Cholestyramine, Colestipol, and Colesevelam [
28,
29,
30].
Class II compounds bind to host proteins in the intestinal lumen, inhibiting digestive enzymes; this blocks the digestion of nutrients into easily absorbable molecules and so reduces nutrient absorption. Example compounds in this class include the appetite suppressants Orlistat [
31] and Sibutramine [
32], both of which are lipase inhibitors, and the disaccharidase inhibitors Acarbose, and Miglitol [
33,
34]; these are also shown in
Figure 3. Both Acarbose and Miglitol are used to regulate levels of glucose in the blood of patients affected by diabetes.
Class III compounds compensate for enzyme deficiencies in the host GI tract or metabolise bacterial and metabolic toxins. This class of compounds has been used in the administration of porcine pancreatic enzymes for the treatment of Cystic Fibrosis [
35]. Although offering a useful approach for the treatment of some conditions, this class of compounds falls more into the realm of biologics, and thus approaches related to the design of small molecules are not as applicable for this class of compounds.
Class IV compounds target the inner wall of the GI tract. The parent drugs of this type are broken down quickly by first-pass metabolism in the enterocytes or the liver, resulting in a low concentration in the bloodstream. These compounds have also been referred to as ‘soft drugs’. Buchwald published a thorough review of this class of rapidly metabolised compounds [
36], such as the β-blockers Esmolol and Landiolol [
37,
38]. It is important to note that these compounds may have a high degree of GI tract permeability; it is their rapid metabolism that leads to a low concentration of the drug found in blood.
Class V compounds target membrane proteins on the gut epithelia’s surface. These have a wide range of applications, including transporter inhibition, G-protein coupled-receptor binding, and tight junction permeability modulation. These classifications aim to provide detail for the various types of targets that exist prior to intestinal absorption. However, the classification system does not indicate how the compounds achieve non-druglikeness. There is a need for the creation of classes of small molecular scaffolds with poor druglikeness in order to investigate non-systemic targets and distribution.
A potential application of non-systemic compounds is the treatment of bacterial infections residing primarily within the lumen of the small intestine. Compounds in this category must be taken orally; however, excessively high oral bioavailability, and subsequent absorption into the bloodstream, will hinder their therapeutic effect. One example of this kind of infection is caused by the bacterium C. diff [
14], often occurring because of the changes in intestinal microbiota caused by antibiotic use [
15]. Current treatment of C. diff is carried out using very large, polar molecules such as Vancomycin (molecular weight = 1447.30 g mol⁻¹; Topological Polar Surface Area (TPSA) = 530.490 Å²), and Fidaxomicin (molecular weight = 1056.425 g mol⁻¹; TPSA = 266.660 Å²) [
17,
42]. However, recent research into a new, smaller compound, Ridinilazole, [
16] has shown promise in initial clinical trials for the treatment of
C. diff. Paradoxically, this drug (shown in Figure 4) is not absorbed through the small intestine despite passing all traditional druglikeness filters. Summit Therapeutics developed this compound and demonstrated superior performance and clinical response rates versus Vancomycin, the standard of care for this type of infection [
16]. Although drug absorption is complex and cannot be explained using a single metric [
43], it is worth considering alternative physicochemical properties to identify drugs with the potential for poor oral bioavailability. Ridinilazole has an fSP3 of 0, and it is possible that its high degree of ‘flatness’ may be a key factor in its surprisingly low absorption. This flatness may lower the surface area of the molecule relative to molecules of a similar molecular weight and atom count, which may in turn reduce solubility through the formation of a weaker solvation shell.
An additional application for non-systemic compounds is represented by Cholecystokinin 1 receptor (CCK1R) agonists. CCK is a hormone secreted into the gastrointestinal lumen by duodenal endocrine I-cells after the consumption of food, resulting in a number of physiological responses related to the feeling of satiation [
44,
45]. It has therefore been determined that CCK1R modulation could be used in the treatment of obesity [
46,
47]. Since CCK1R is expressed on the epithelium of the GI tract, it is necessary to develop agonists with poor GI tract permeation. GI181771X (MW = 606, cLogP = 5.1, cLogD at pH 6.5 = 4.1, TPSA = 139 Å², fSP3 = 0.15) was found to act as a CCK1R agonist, while also possessing low oral bioavailability (0.4%) in rats [
48]. Further developments identified CE-326597 (MW = 595, cLogP = 6.3, LogD at pH 7.4 = 4.9, fSP3 = 0.19) [
49], which was also found to have poor oral bioavailability in rats (F = 1%). It is worth noting that the very low fSP3 values of these compounds may play a significant role in their poor oral bioavailability, despite their other physicochemical properties meeting, or coming close to meeting, the values prescribed by traditional druglikeness metrics.
The treatment of irritable bowel syndrome often requires drugs with a high degree of gut residence. One such example is Otilonium bromide (cation: MW = 484, cLogP = 4.0, TPSA = 65 Å², rotatable bonds = 17, fSP3 = 0.52) [
50], an antispasmodic agent which inhibits muscarinic receptors, resulting in the relaxation of GI smooth muscle. The poor permeation of Otilonium bromide (F < 3%) has been attributed to the presence of a charged nitrogen atom as part of a quaternary ammonium group. The therapeutic space around the design of drugs to alleviate GI complications is of great interest to us.
The compounds listed herein show that the fSP3 parameter, among other physicochemical properties, presents a highly useful indicator for poor oral bioavailability that goes beyond the descriptors used by traditional druglikeness filters. There is a need for further study to ascertain the ability of these properties to describe druglikeness in the development of non-systemic drugs, thereby aiding the design of such compounds. There is limited understanding of the role of fSP3 in medicinal chemistry, especially in the context of oral bioavailability. However, one relevant study was conducted by Hirata et al. [
51] in the development of orally bioavailable RORγ inhibitors, finding that, as shown in Table 1, substituents with high fSP3 values correlated with increased potency and higher liver permeance, which, despite being less relevant to our work, are noteworthy observations. Iusupov et al. also incorporated fSP3 in their drug design philosophy, introducing spirocyclic substituents to increase fSP3 [
52]. This resulted in more sterically rigid molecules that are better able to undergo stereochemically precise interactions with their targets, owing to a reduction in the degrees of rotational freedom. These studies show the potential of considering fSP3 in developing more potent lead compounds; however, there is still much room for further development of the concept, and there is no literature regarding whether low values of fSP3 can result in a reduction of oral bioavailability.
Case study – Phenylboronic acid Fructose Scavengers
Phenylboronic acids are known to bind readily to 1,2-cis diols, such as those found in monosaccharides [
53]. This property presents the opportunity to utilise phenylboronic acids as sugar sequestering agents in the treatment of gastrointestinal disorders, such as fructose malabsorption [
54]. In general, boronic acids have been widely studied for medicinal chemistry purposes [
55]. Since the build-up of fructose is located within the gastrointestinal lumen itself, it is necessary that the sequestering agent has limited absorption through the small intestine and remains in the lumen to bind to fructose. With this requirement in mind, we aimed to explore pre-existing chemical databases to identify reported and novel phenylboronic acids with potential for poor oral bioavailability, using fSP3 as an indicator. A substructure search of phenylboronic acid on the ChEMBL database [
56,
57,
58,
59,
60] resulted in 1728 compounds being found. The removal of pinacol-protected compounds and benzoboroxoles, which are beyond the scope of our study, left 571 phenylboronic acid compounds to be analysed. Since we were exploring beyond Ro5 chemical space to further promote non-systemic behaviour [
61], we filtered the compounds to select those with a molecular weight between 500 and 700, leaving 85 compounds.
O’Donovan postulated that a pKa less than 4 is desired for permeation of compounds with a molecular weight in our chosen range [
62]; however, lower pKa values are needed to promote good Lewis acidity for sugar binding at physiological pH [
63]. Setting a final filter for compounds with a pKa range between 4 and 8, using the pKa prediction software OPERA [
64] resulted in a final 48 compounds for semi-empirical sugar binding analysis to identify the most suitable candidates for non-systemic, fructose-selective scavenging. Our sugar recognition work will be reported on in due course, and we plan to check other compound databases to identify potential chemical starting points.
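The successive filtering steps described above (removal of out-of-scope chemotypes, a molecular weight window of 500 to 700, and a predicted pKa window of 4 to 8) can be sketched as follows. This is an illustrative outline only: the `Candidate` record, its field names, and the example entries are hypothetical, the range boundaries are assumed to be inclusive, and the pKa values are assumed to have been predicted beforehand by a separate tool.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One database hit with precomputed properties (hypothetical record)."""
    name: str
    mol_weight: float             # g/mol
    predicted_pka: float          # supplied by an external pKa predictor
    is_pinacol_protected: bool = False
    is_benzoboroxole: bool = False


def select_candidates(compounds, mw_range=(500.0, 700.0), pka_range=(4.0, 8.0)):
    """Apply the scope, molecular weight, and pKa filters in sequence."""
    selected = []
    for c in compounds:
        # Step 1: drop chemotypes outside the scope of the study
        if c.is_pinacol_protected or c.is_benzoboroxole:
            continue
        # Step 2: keep only beyond-Ro5 molecular weights (bounds assumed inclusive)
        if not mw_range[0] <= c.mol_weight <= mw_range[1]:
            continue
        # Step 3: keep only compounds within the predicted pKa window
        if not pka_range[0] <= c.predicted_pka <= pka_range[1]:
            continue
        selected.append(c)
    return selected


# Hypothetical entries, for illustration only
hits = [
    Candidate("cpd-A", 545.2, 6.9),
    Candidate("cpd-B", 412.0, 7.1),                              # too light
    Candidate("cpd-C", 610.5, 9.2),                              # pKa too high
    Candidate("cpd-D", 580.3, 5.5, is_pinacol_protected=True),   # out of scope
]
print([c.name for c in select_candidates(hits)])  # ['cpd-A']
```

Keeping each filter as a separate step makes it straightforward to report how many compounds remain after each stage, as is done in the text.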
The aforementioned 48 compounds are listed in
Table S1 (see supporting information). A range of heterocyclic rings, carbocyclic rings, and functional groups are represented in the listed molecules. In addition, a couple of these compounds bear two boronic acid motifs.