DBAASP stores data on 1565 ribosomal and 464 non-ribosomal SPs, which are peptides with length in the interval of 6-25 AA. Among these, 482 ribosomal and 367 non-ribosomal peptides are cyclized by intrachain covalent bonds.
3.2.3. Ribosomal short cyclic peptides
In contrast to non-ribosomal SCPs, ribosomal SCPs mainly use Cys residues for cyclization. To create rings, Cys participates in disulfide and sulfide (thioether) bond formation. In over 80% of ribosomal SCPs, rings are formed through the creation of cystines (CST), aminovinyl cysteines (AVC), and lanthionines (LAN). Such rings are not formed in non-ribosomal SCPs (see Table 3).
Peptides with rings created by amide bonds represent only about 20% of the ribosomal SCPs in DBAASP, whereas over half of non-ribosomal SCPs are cyclized through amide bonds. Also, as shown in Table 3, lactone rings (LCNs) are not represented in ribosomal SCPs, while they constitute about 40% of non-ribosomal SCPs.
One additional difference, concerning five-membered rings, is of note. In ribosomal SCPs, five-membered rings (such as OXZ and OXZN) are created by ether bonds, while in non-ribosomal SCPs, five-membered rings are mainly closed by thioether bonds, forming rings such as THZ, THZN, and TZD. Only THZ is a shared five-membered ring, appearing in both non-ribosomal and ribosomal SCPs.
3.2.3.1. Intrachain bonds and created rings in ribosomal peptides
Disulfide bond. In the majority of ribosomal SCPs, rings are formed through disulfide bonds. Figure 4 shows the length distribution of ribosomal disulfide-bonded peptides in DBAASP with lengths of up to 50 AA. There are 1100 such peptides, which constitute the majority of all cyclic ribosomal peptides (1269) of that length.
Approximately 360 ribosomal disulfide-bonded peptides have lengths within the interval of 6-25 AA; that is, they are short. In this section, we focus on disulfide-bonded short ribosomal cyclic peptides (RSCPs).
The amino acid composition of the RSCPs from DBAASP, compared to the composition of UniProt [23] sequences (Figure 5), reveals that RSCPs are more hydrophobic (with a higher abundance of phenylalanine and isoleucine) and more basic (due to the higher abundance of lysine) than the ‘average protein’. RSCPs also contain more cysteines and glycines than the ‘average protein’. The presence of glycine is likely related to the flexibility required to close the ring; at the same time, glycine can promote the ‘chameleonic’ ability of the peptide. The presence of cysteine is crucial for cyclization, which is important for activity and chemical stability.
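A comparison of this kind can be sketched in a few lines of Python. The block below is only an illustration: the function names, the pseudocount, and the toy sequences are assumptions made for the example, and the background set stands in for whatever reference composition (for example, one derived from UniProt) is actually used.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(sequences):
    """Return the fractional amino acid composition of a set of sequences."""
    counts = Counter()
    for seq in sequences:
        # Non-standard symbols are ignored in this sketch.
        counts.update(aa for aa in seq.upper() if aa in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard residues found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def log2_enrichment(sample, background, pseudo=1e-6):
    """log2(sample / background) per residue; positive means enriched in the sample."""
    return {
        aa: math.log2((sample[aa] + pseudo) / (background[aa] + pseudo))
        for aa in AMINO_ACIDS
    }


# Toy data only: hypothetical peptides and a hypothetical background set.
peptides = ["GCKKFC", "CIGKC"]
reference = ["ADEELLSTVA", "MNPQRSTLLE"]
enr = log2_enrichment(composition(peptides), composition(reference))
for aa in sorted(enr, key=enr.get, reverse=True)[:5]:
    print(aa, round(enr[aa], 2))
```

The pseudocount only prevents division by zero for residues absent from one of the sets; with real data sets of reasonable size it has little effect on the ranking.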
The distribution of cysteines along the peptide chain is also of interest, as it indicates the size of the cycles that can be closed. The analysis of the distribution of i-spaced amino acid pairs (DiSAAP), focusing specifically on cysteine pairs, in RSCPs of DBAASP reveals a statistically significant abundance of pairs spaced by 5 and 8 AA, while pairs spaced by 0 and 1 AA are scarce compared to a random distribution (Figure 6). These findings can be explained by the role of Cys residues. Cys can be the source of intrachain bonds and, through post-translational modification, it can stabilize certain peptide structures. Given the geometry of the side chain, it is difficult for pairs of Cys residues spaced by 0 or 1 residues to form a disulfide bond because of steric hindrance [24]. Therefore, the occurrence of neighbouring Cys pairs, or of those separated by one residue, is limited. On the other hand, the abundance of Cys pairs spaced by 5 and 8 AA suggests the presence of cyclized loops consisting of 7 and 10 AA, which characterize many disulfide-bonded RSCPs in DBAASP. For instance, the abundance of Cys pairs spaced by 5 AA points to the prevalence of ‘Rana box’-containing peptides (RBPs) among RSCPs of DBAASP. RBPs, such as Brevinins (DBAASPR_1009), Nigrocins (DBAASPR_1542), etc., contain small C-terminal loops closed into rings by disulfide bridges and N-terminal linear tails containing more than 7 AA [25]. Consequently, the overall chain topology of such peptides resembles a lasso. DBAASP also contains peptides with bigger disulfide-bonded loops that differ in amino acid composition from RBPs [26] or are situated at the N-terminus of the peptide [27].
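The counting step behind such a spacing analysis can be sketched as follows. This is a minimal illustration, not the DiSAAP procedure itself: it assumes that spacing means the number of residues lying strictly between two cysteines (so a spacing of 5 corresponds to a 7 AA loop), that all Cys pairs in a chain are counted, and it omits the comparison against a random distribution. The function names and sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cys_pair_spacings(sequence):
    """Count Cys-Cys pairs by the number of residues lying between them."""
    positions = [i for i, aa in enumerate(sequence.upper()) if aa == "C"]
    # For cysteines at indices i < j, the spacing is j - i - 1.
    return Counter(j - i - 1 for i, j in combinations(positions, 2))


def spacing_profile(sequences):
    """Sum the per-peptide spacing counts over a set of sequences."""
    total = Counter()
    for seq in sequences:
        total += cys_pair_spacings(seq)
    return total


# Toy sequences, made up for the example.
toy_peptides = ["FLPKAAGCKISKKC", "CRRLCYKQRCVTYC"]
profile = spacing_profile(toy_peptides)
for spacing in sorted(profile):
    # A loop closed by the pair contains both cysteines: spacing + 2 residues.
    print(f"spacing {spacing}: {profile[spacing]} pair(s), loop of {spacing + 2} AA")
```

To judge over- or under-representation, the observed profile would still have to be compared with an expectation, for instance one obtained by shuffling each sequence many times.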
Moreover, when the linear tail is absent, the peptide chain topology resembles a hairpin. Small size, rigid structure, stability to proteases, and a wide spectrum of biological functions make hairpin AMPs an attractive molecular basis for drug design [28]. Many disulfide-bonded RSCPs from DBAASP, such as Protegrins (DBAASPR_672), Tachyplesins (DBAASPR_2229), Arenicins (DBAASPR_2025), Θ-defensins (DBAASPR_823), etc., have a hairpin chain topology.
Figure 13.
Thus, among disulfide-bonded ribosomal peptides of DBAASP, about 290 contain a single disulfide bond, allowing us to envision their structure. These peptides can adopt either a ‘lasso-like’ (LL) structure (Figure 7A), with a C- or N-terminal cyclized loop, or a ‘hairpin-like’ (HL) structure, formed when the cysteines are located near the peptide termini (Figure 7B).
To define the type of structure, the parameters s and p have been introduced. For a peptide of length L, with the first (C_1) and second (C_2) cysteines at positions n1 and n2 (n1 < n2), respectively, the values of s and p are calculated as:
Relying on the values of s and p, the correspondence of the peptides to the LL or HL structure has been assessed. If s≥0.5 and either p≤0.3 or p≥3, or n1=1, then the peptide structure is defined as LL; otherwise (when s<0.5), the peptide is defined as HL. In the case of LL peptides, we distinguish N-LL (when L–n2>n1) and C-LL (when L–n2<n1). Among ribosomal SCPs that are cyclized by one disulfide bond, the majority have been defined as C-LL, and only one as N-LL. The number of HL peptides is about 60 (see Table 4). The loops of LL peptides are about 7-8 AA in size (including the Cys residues), while HL loops are a little larger, about 10-11 AA. Such macrocyclic peptides have some conformational freedom and, consequently, flexibility.
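The geometric quantities used here (loop size, tail lengths, and the N-LL versus C-LL comparison) can be illustrated with a short sketch. It deliberately does not compute s or p and does not decide between LL and HL; it only applies the L–n2 versus n1 comparison, so the orientation label is meaningful only for a peptide already classified as LL. The function name, the returned keys, and the sequences are hypothetical.

```python
def single_disulfide_geometry(sequence):
    """Describe a peptide with exactly two cysteines (one disulfide bond)."""
    seq = sequence.upper()
    cys = [i + 1 for i, aa in enumerate(seq) if aa == "C"]  # 1-based positions
    if len(cys) != 2:
        raise ValueError("expected exactly two cysteines")
    n1, n2 = cys
    length = len(seq)

    # Orientation rule as stated in the text; only valid for LL peptides.
    if length - n2 > n1:
        orientation = "N-LL"
    elif length - n2 < n1:
        orientation = "C-LL"
    else:
        orientation = None  # the tie case is not covered by the rule

    return {
        "L": length,
        "n1": n1,
        "n2": n2,
        "loop_size": n2 - n1 + 1,   # residues in the ring, cysteines included
        "n_tail": n1 - 1,           # residues before the first cysteine
        "c_tail": length - n2,      # residues after the second cysteine
        "orientation_if_LL": orientation,
    }


# Toy sequences, made up for the example.
print(single_disulfide_geometry("FLPKAAGCKISKKC"))     # loop at the C-terminus
print(single_disulfide_geometry("CKISKKCFLPKAAGAAA"))  # loop at the N-terminus
```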
The structures of about 70 peptides among disulfide-bonded ribosomal SCPs are stabilized by two or more disulfide bonds. Various combinations of pairing can be considered for a given number n=2k (k=1, 2, …, m) of cysteines (C) that participate in the formation of k disulfide bonds. The number (N) of variants of pairing to form k disulfide bonds is assessed by the formula:
N = (2k)! / (2^k · k!)
while the structures created as a result can be classified into three groups (Figure 8). If the cysteines of a peptide with k disulfide bonds are enumerated in order of increasing position along the chain as C_1, C_2, C_3, …, C_n, then pairing schemes that define three different groups of structures can be presented. For example, pairing the cysteines according to the scheme C_1–C_n, C_2–C_{n-1}, …, C_{n/2}–C_{n/2+1} (Figure 8A) leads to ‘ladder-like’ (LD) multicyclic structures. Another scheme, C_1–C_2, C_3–C_4, …, C_{n-1}–C_n (Figure 8B), forms structures consisting of ‘strings of rings’ (SR) closed by disulfide bonds. All other variants of pairing cause the crossing of rings formed by disulfide bonds (Figure 8C), and thus, the formation of structures with ‘crossed-rings’ (CR).
Consequently, disulfide bonds can create three kinds of multicyclic structures: LD, SR, and CR. For k=2, each kind of structure is represented by a single variant of pairing. As k increases, the number of pairing variants that give CR increases, while the number of variants corresponding to LD and SR always equals one. Therefore, if we consider the pairing of cysteines as a random process, we can expect CR structures to prevail in disulfide-bonded peptides. Indeed, in the majority of peptides in DBAASP with 3 or more disulfide bonds, the pairing of cysteines results in crossing of cycles, that is, CR structures are formed, while other types of structures appear rarely (see Table 4). In contrast, in short peptides with 2 disulfide bonds, the LD structure, which has a hairpin topology, markedly prevails over the other two types, and this cannot be considered the result of a random process. Non-randomness is also evident in the fact that the C-LL type of structure prevails over N-LL for peptides with one disulfide bond. Apparently, simple structures, like LL and HL, are self-sufficient to perform defense functions. The abundance of Gly and Pro relative to the ‘average protein’ is apparently linked with the necessity to promote such simple structures, which can be the building blocks to form longer and more structurally complex AMPs.
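The counting argument can be checked by brute force for small k. The sketch below assumes that the number of pairings is the standard count of perfect matchings of 2k items, (2k)!/(2^k · k!), enumerates every pairing, and sorts it into the three groups using the two schemes given above; every pairing that matches neither scheme is assigned to CR, following the grouping used in the text. All function names are hypothetical, and the classification is intended for k ≥ 2 (for k=1 the two schemes coincide).

```python
from math import factorial


def count_pairings(k):
    """Number of ways to pair 2k cysteines into k disulfide bonds."""
    return factorial(2 * k) // (2 ** k * factorial(k))


def enumerate_pairings(items):
    """Yield every perfect matching of an even-sized sorted list as a list of pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in enumerate_pairings(remaining):
            yield [(first, partner)] + tail


def classify(pairing, n):
    """Return 'LD', 'SR', or 'CR' for a pairing of cysteines numbered 1..n."""
    pairs = set(pairing)
    ladder = {(i, n + 1 - i) for i in range(1, n // 2 + 1)}  # C_1-C_n, C_2-C_{n-1}, ...
    string = {(i, i + 1) for i in range(1, n, 2)}            # C_1-C_2, C_3-C_4, ...
    if pairs == ladder:
        return "LD"
    if pairs == string:
        return "SR"
    return "CR"  # remainder, grouped as CR as in the text


def tally(k):
    """Count the pairings of 2k cysteines that fall into each group."""
    n = 2 * k
    counts = {"LD": 0, "SR": 0, "CR": 0}
    for pairing in enumerate_pairings(list(range(1, n + 1))):
        counts[classify(pairing, n)] += 1
    return counts


for k in (2, 3, 4):
    print(k, count_pairings(k), tally(k))
```

Under these assumptions the remainder group grows quickly with k while LD and SR stay at one variant each, which is the basis of the random-pairing expectation discussed above.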
Thioether bond (Sulfide). DBAASP stores more than 85 ribosomal peptides with thioether bonds, ranging in length from 6 to 38 AA. These peptides are rich in Cys, Ser, and Thr. Thioether bonds are formed between the side chains of Cys and the side chains of Thr or Ser (or their derivatives), as in lantibiotics [29], or between the side chain of Cys and the main-chain carbonyl group of the neighbouring amino acid, as in peptides containing a thiazole ring. Thioether bonds predominantly form the amino acid lanthionine (see Table 1), and consequently, lantibiotics predominate among thioether bond-containing peptides. In lantibiotics, residues participating in side chain-to-side chain bonds are spaced by 2-5 AA along the chain, and thus, they form small rings of 4-7 AA (including Cys and Ser or Thr). Such rings are rarely crossed to form CR structures. For example, Nisin contains 5 rings, of which 3 do not cross each other; only the fourth and the fifth are crossed. The majority of lantibiotics have one or more inter-crossed rings in their structures. However, there are peptides that contain a string of uncrossed rings, such as Enterocin W-beta (DBAASPR_1475).
Along with thioether bonds, lantibiotics use disulfide bonds (e.g., Enterocin W-alpha – DBAASPR_1474) and head-to-tail amide bonds (e.g., Subtilosin-A – DBAASPR_6076) for cyclization. Unlike other lantibiotics, Subtilosin-A has a ladder-like structure. The number of rings varies from 3 to 7.
Another post-translational modification links the side chains of Cys with the side chains of Thr or Ser by TIE bonds and results in the creation of aminovinyl cysteine (AVC) rings (Table 1). Often, such rings appear in combination with lanthionine rings. However, there are some peptides containing AVC rings without LAN rings, such as Microvionin (DBAASPR_21973), Goadvionin B2 (DBAASPR_21974), Thioholgamides (DBAASPR_21975 and DBAASPR_21976), JBIR-140 (DBAASPR_21979), and Thioviridamide (DBAASPR_21978). The structures adopted by these peptides resemble the LL type, because the Cys participating in the formation of AviCys is the C-terminal amino acid.
As mentioned above, the side chain of Cys can also be linked with the carbonyl group of the neighbouring amino acid through a TIE bond to create an aromatic, unsaturated thiazole (THZ) ring. The number of THZ-containing peptides in DBAASP is not high. Only 8 peptides, such as Micrococcins, Patellamides, Bottromycins, etc., have been identified (see Table 3). The structure of peptides containing more than one thiazole ring can be presented as a ‘string of rings’.
Among the peptides with TIE bonds in DBAASP, about 70 have a length of less than 25 AA and can be included in the set of ribosomal SCPs (
Table 3).
Amide bond. Amide bonds are used more rarely than disulfide bonds to close rings in ribosomal SCPs. Sixty-six SCPs are cyclized into NCB rings, while 30 peptides have LAC rings formed by side chain-to-main chain bonding (Table 3). It is worth noting that, although LAC rings in both ribosomal and non-ribosomal SCPs are closed by isopeptide bonds, they are distinguished by the groups of side chain and main chain atoms participating in bond formation. As mentioned above, LAC rings of the non-ribosomal SCPs and USCPs are closed by bonding between the amine group of the side chains of basic residues (such as diaminopropionic acid – DAP, diaminobutyric acid – DAB, Lys, ornithine, 3-aminotetradecanoic acid, 3-aminohexadecanoic acid, etc.) and the C-terminal carboxyl group of the main chain. In contrast, in the ribosomal SCPs, the amine group of the N-terminus forms an isopeptide bond with the side chain carboxyl group of a glutamic or aspartic acid residue to create a LAC ring of 7-9 AA. Such peptides form a family of ribosomally synthesized and post-translationally modified peptides called ‘lasso’ peptides (LP) [30]. Biosynthetic gene clusters (BGCs) for lasso peptides are present in many bacterial genomes [31]. Often, an additional ring is formed in the C-terminal tail by disulfide bonds. The C-terminal tail is trapped within the ring by bulky amino acids, disulfide bridges, or both [30]. Of note, due to the frequent occurrence of Asp and Glu, LPs are mainly anionic.
Amine, imine, ether, ester, and carbonyl bonds. In addition to disulfide, thioether, and amide bonds, other bonds, such as amine, imine, ester, and carbonyl bonds, are used to close rings in ribosomal SCPs. However, cyclic peptides with such rings are scarce in DBAASP. As mentioned above, one reason for this scarcity could be that the data currently in DBAASP are insufficient.
As in the case of non-ribosomal SCPs, amine bonds are used to create LAC rings in ribosomal SCPs. Five such SCPs can be found in DBAASP: Dynobactin A (DBAASPR_20296) and type B lantibiotics of the Duramycin (DBAASPR_19104) and Cinnamycin (DBAASPR_21774) families. In contrast to non-ribosomal SCPs, where side chain-to-main chain bonding takes place, in ribosomal SCPs, amine bonds link the side chains of Lys and Ser (forming lysinoalanine in lantibiotics, e.g., in Duramycin B – DBAASPR_19105) or of His and Tyr (between the imidazole Nε2 of His and the β-carbon of Tyr in Dynobactin A).
LAC rings are created in two SCPs of DBAASP (Bottromycin family, DBAASPR_21130) by imine bonds that link the nitrogen of the N-terminal amine group to the main chain carbonyl carbon of 3-methyl-valine.
Carbon bonds between the 6th carbon of the indole ring of Trp and the β-carbon of another residue (Lys, Asn, or Trp) create a LAC ring in ribosomal SCPs, such as Dynobactin A (DBAASPR_20296) and peptides of the Darobactin family (DBAASPR_17389, DBAASPR_22221-22222).
In the ribosomal SCP Micrococcin P1 (DBAASPR_21120) [32], an unsaturated six-membered pyrimidine ring is formed by two carbon bonds: between the β-carbons of the first and 10th dehydroalanines (DHA) and between the carbonyl carbon of the 9th Cys and the α-carbon of the first DHA.
Ether bonds, which are not represented in non-ribosomal SCPs, participate in the creation of five-membered unsaturated rings in ribosomal SCPs. SCPs of the Patellamide family (e.g., DBAASPR_21121) contain the less aromatic oxazoline and meth-oxazoline rings, while Microcyclamide (DBAASPR_21139) and Wewakazoles (DBAASPR_21267) contain the more aromatic oxazole and meth-oxazole rings. Additionally, ether bonds between the C7 indole carbon of one tryptophan and the β-carbon of another tryptophan form a macrolactam (LAC) ring in post-ribosomally synthesized peptides of the Darobactin family (DBAASPR_17389, DBAASPR_22221-22222).