1. Introduction
Human whole saliva is a hypotonic fluid lining the oral cavity. It is composed of water (99%) and a complex mixture of organic and inorganic compounds deriving from salivary gland secretion, oral flora, oropharynx, upper airway, gastrointestinal reflux, gingival crevicular fluid, food deposits, and mucosal surface secretion containing blood-derived components [1]. Saliva is essential for multiple physiological functions, including lubrication, buffering action, maintenance of tooth integrity, chewing, initial digestion of some foods, swallowing, tissue hydration, speech, wound healing, and antibacterial and antifungal activity [2]. The dramatic sequelae observed in patients suffering from Sjögren syndrome clearly demonstrate the relevance of saliva and its components, particularly salivary proteins, in the protection of the mouth. Recent proteomic inventories report more than 3000 proteoforms in human saliva [3]. Before, during and after secretion, most salivary proteins undergo numerous post-translational modifications (PTMs), whose role has not yet been clearly elucidated. Over the last twenty-five years, our group has characterized many PTMs of the salivary proteome, mainly by applying top-down proteomic platforms to saliva from subjects of various ages and from patients affected by various pathologies. In this review, we report the PTMs most frequently observed and suggest, when possible, hypotheses on their role in the protection of the mouth. We have searched the literature on these topics exhaustively; nonetheless, we apologize in advance for any omission.
3. Phosphorylation
Phosphorylation is probably (after proteolytic cleavage) the most common PTM in human saliva. Generally, phosphorylation is a reversible PTM, catalyzed by more than 500 different human protein kinases, while de-phosphorylation is due to enzymes called phosphatases [78]. The detection of phosphorylated proteoforms in human saliva strongly depends on the proteomic pipeline used, bottom-up or top-down. Bottom-up strategies are commonly accompanied by an enrichment step in order to increase the number of phosphorylated fragments detected.
A study by the group of Oppenheim and Helmerhorst [79] used chemical derivatization with dithiothreitol (DTT) of the phospho-serine/threonine-containing peptides obtained after trypsin digestion of whole saliva samples. The DTT-phospho-peptides were enriched by covalent disulfide-thiol interchange chromatography and analysed by nanoflow liquid chromatography and electrospray ionization tandem mass spectrometry (LC-ESI-MS/MS). The specificity of DTT chemical derivatization was evaluated separately under different base-catalyzed conditions with NaOH and Ba(OH)2, blocking cysteine residues by iodoacetamide and enzymatic O-deglycosylation prior to DTT reaction. Further analysis of whole saliva samples that were subjected to either of these conditions provided supporting evidence for phosphoprotein identifications. The combined chemical strategies and mass spectrometric analyses identified 65 phosphoproteins in whole saliva; of these, 28 were based on two or more peptide identification criteria with high confidence and 37 were based on a single phospho-peptide identification. Most of the identified proteins (∼80%) were previously unknown phosphoprotein components.
A phospho-proteomic study based on a bottom-up pipeline also generated a large-scale catalog of phosphorylated protein fragments in human saliva [80]. To circumvent the wide dynamic range of phosphoprotein abundance in whole saliva, the proteomic platform combined dynamic range compression using hexapeptide beads, strong cation exchange HPLC peptide fractionation, and immobilized metal affinity chromatography prior to mass spectrometry. In total, 217 unique phospho-peptides were identified, representing 85 distinct phosphoproteins at 2.3% global FDR. From these peptides, 129 distinct phosphorylation sites were identified, of which only 57 were previously known. Cellular localization analysis revealed that salivary phosphoproteins had a distribution like that of all known salivary proteins, but with less relative representation in the “extracellular” and “plasma membrane” categories compared to salivary glycoproteins. Sequence alignment showed that phosphorylation is mainly linked to the action of the Golgi casein kinase called Fam20C (see below), but also occurred at acidic-directed, proline-directed, and basophilic kinase motifs [80].
The top-down pipelines are more conservative, giving precise information on the salivary protein substrates of kinases and on their sites of modification. All the phosphorylated proteins pertaining to the secretory pathway are phosphorylated by a pleiotropic Golgi casein kinase which remained elusive until a few years ago. An interesting study by Tagliabracci and coll. [81,82] established that the enzyme is the kinase named Fam20C. The main consensus sequence of Fam20C is a serine followed at position +2 by a negatively charged residue, either glutamic acid or phospho-serine (SXE/S(phos)). The salivary proteins and peptides modified by Fam20C in agreement with this consensus sequence are reported in Table 3.
Phosphorylation of cystatin S, whose N-terminal sequence is S1SS3KE5…, is interesting. Mono-phosphorylated cystatin S is called cystatin S1 and, owing to the presence of the glutamic acid residue in position +2, Ser3 is necessarily the first site of phosphorylation. Di-phosphorylated cystatin S is called cystatin S2, and the second phosphorylation at Ser1 is strictly hierarchical, occurring only after that of Ser3. Cystatin S1 is the most abundant in adult human saliva, and the approximate ratio between the relative percentages of the three components (S, S1 and S2) is 5:80:15, respectively [58,83]. A more complex situation concerns the phosphorylation of aPRPs, which are commonly di-phosphorylated, with Ser8 and Ser22 as the two main sites of phosphorylation [9,84]. However, mass spectrometry can also detect small amounts of non-, mono- and tri-phosphorylated aPRPs (on Ser8, Ser22 and Ser17) [9] (Figure 2). The consensus sequence recognized for Ser8 phosphorylation is the canonical S8QE, with a negatively charged residue at position +2, while the phosphorylation of Ser22 by Fam20C is ensured by the recognition of the secondary consensus sequence S(X)3-4(D/E/S(phos))3 [84]. It should be emphasized that phosphorylation of Ser22 allows the hierarchical phosphorylation of Ser17, which is located in the sequence …S17DGGDS22EQFIDEE… [9]. Hence, the mono-phosphorylated aPRPs can be phosphorylated either on Ser8 or on Ser22 (with a different proportion in favour of Ser8) [9]. The di-phosphorylated components (the most abundant) are commonly phosphorylated on Ser8 and Ser22, but a very small percentage of aPRPs phosphorylated on Ser17 and Ser22 could be present. The hierarchical phosphorylation on Ser17 is further substantiated by a study characterizing the aPRP-1 Roma-Boston Ser22(phos)→Phe variant, which was never detected as a di-phosphorylated proteoform, even using high-resolution HPLC-MS apparatus [85]. Statherin is mainly di-phosphorylated on Ser2 and Ser3; since the N-terminal sequence is DS2S3EE…, the two phosphorylation sites are independent, and the phosphorylated Ser of the mono-phosphorylated proteoform, always detectable in adult human saliva, can be either Ser2 or Ser3.
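The canonical Fam20C rule described above (a serine with glutamic acid or phospho-serine at position +2) can be illustrated with a minimal Python sketch. It is an illustration only, not the method used in the cited studies: the function name `fam20c_sites` and the way phosphorylation states are passed in are assumptions made here, only the canonical SXE/S(phos) motif is modelled (not the secondary S(X)3-4(D/E/S(phos))3 motif), and the short N-terminal strings are those quoted in the text.

```python
def fam20c_sites(seq, phospho=frozenset()):
    """Return 1-based positions of Ser residues matching S-x-(E | pSer).

    seq     : one-letter amino acid sequence (illustrative fragment)
    phospho : set of 1-based positions of already phosphorylated Ser
    (Hypothetical helper; canonical motif only.)
    """
    sites = []
    for pos, aa in enumerate(seq, start=1):
        if aa != "S" or pos + 2 > len(seq):
            continue
        plus2 = seq[pos + 1]  # residue at position pos+2 (0-based index pos+1)
        if plus2 == "E" or (plus2 == "S" and (pos + 2) in phospho):
            sites.append(pos)
    return sites


# Cystatin S N-terminus S1-S-S3-K-E5: only Ser3 qualifies at first;
# Ser1 becomes a site once Ser3 carries a phosphate (hierarchical).
print(fam20c_sites("SSSKE"))               # unphosphorylated
print(fam20c_sites("SSSKE", phospho={3}))  # after Ser3 phosphorylation

# Statherin N-terminus D-S2-S3-E-E: both serines qualify independently.
print(fam20c_sites("DSSEE"))
```

Under this simplified rule, the cystatin S fragment yields Ser3 alone until Ser3 is marked as phosphorylated, whereas both statherin serines match from the start, which mirrors the hierarchical versus independent behaviour described in the text.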
The N-terminal sequence of Hst1 is DS2HE…; about 90% of the peptide is phosphorylated on Ser2, while about 10% is detectable in adult human saliva as the non-phosphorylated peptide [49]. A top-down study by Halgand and coll. [86] evidenced a second, minor phosphorylation site on Ser20, even though this site does not respect the Fam20C consensus sequence.
The canonical consensus sequence of Fam20C is responsible for the phosphorylation of several bPRPs too. Indeed, the IB-1 and II-2 proteoforms are both phosphorylated on Ser8. Among the gPRPs Gl-1, Gl-2, Gl-3, GPA, II-1, and Cd-IIg, which have an N-terminal sequence motif similar to IB-1 and II-2 ((E/Q)XXXEDVS8QEES…, where XXX is LNE in IB-1, II-2, Gl-1, Gl-2, and Gl-3 and SSS in GPA, II-1, and Cd-IIg), Gl-2 is phosphorylated on Ser8 [23]. Phosphorylation is an almost complete event, because <1% of the non-phosphorylated forms can be detected in parotid granules and in parotid and whole saliva, and it probably occurs after the cleavage of the proprotein [6]. It can be supposed, by sequence similarity, that Gl-1 and Gl-3 also undergo the same PTM, although experimental evidence is missing. The presence of the …S8QE consensus sequence in GPA, II-1, and Cd-IIg (Group 2C of bPRPs) suggests phosphorylation of Ser8 for these bPRPs too, even though these modifications have not been experimentally evidenced until now. A second potential, but not demonstrated, phosphorylation site at Ser3 is present in the sequence of Group 2C bPRPs (<ESS3SED…). The activity of other kinases can also be revealed in human saliva by a top-down proteomic approach; among them is MAPK14, a kinase of the p38 mitogen-activated protein kinase pathway, which can partly phosphorylate the protein S100A9 on the penultimate Thr residue of its sequence [87].
3.3. Variation of phosphorylation as a function of age and for the diagnosis of different diseases
Some bottom-up studies suggested that analysis of the phospho-proteome of salivary extracellular vesicles could help either to diagnose lung cancer [88] or to distinguish oral squamous cell carcinoma patients from healthy individuals [89]. Our studies evidenced that, during the last months of foetal growth, the phosphorylation of secretory salivary proteins, and therefore the activity of Fam20C, is very low, if not completely absent [90]. It increases slowly, reaching the level observed in adults a few weeks after the normal time of delivery [91]. Interestingly, a top-down study of the salivary proteome in a group of children with autism spectrum disorders evidenced significantly lower phosphorylation levels of four salivary peptides in comparison with age- and gender-matched healthy controls, giving a clue to the molecular pathogenesis of these disorders [92].
4. Sulfation
Sulfonation (or sulfation) is usually an irreversible PTM. It consists in the transfer of a sulfate group (-SO3-) from the only known sulfate donor, i.e. 3′-phosphoadenosine-5′-phosphosulfate (PAPS), to endogenous substances such as proteins, carbohydrates, catecholamines as well as estrogenic steroids and xenobiotics [93]. 3′-phosphoadenosine-5′-phosphosulfate synthase (PAPSS) is the enzyme responsible for the biosynthesis of PAPS in two reactions: inorganic sulphate is first converted to adenosine-5′-phosphosulphate (APS) by ATP sulfurylase (EC 2.7.7.4), and this intermediate molecule is then phosphorylated by the APS kinase (EC 2.7.1.25) to form PAPS [94]. Both APS and PAPS are activated sulphuryl donors that possess a phospho-sulphate anhydride bond [95]. In prokaryotes, fungi, and plants, synthesis of PAPS is performed by two separate enzymes [94]. In the animal kingdom, however, the ATP sulfurylase and the APS kinase are encoded by the same gene and translated into a single polypeptide, which forms the dual-functional enzyme PAPSS [95]. Two isoforms of this enzyme, PAPSS1 and PAPSS2, have been characterized, which differ in their localization. Moreover, a relation between various pathological conditions and deficiencies of PAPSS (both isoforms) has been demonstrated [96]. While PAPSS1 and PAPSS2 are responsible for the bioactivation of sulphate, sulfo-conjugation reactions are catalysed by enzymes known as sulfotransferases [97]. Sulfotransferases are mainly divided into two groups, as they are either cytosolic or membrane-bound [98]. Cytosolic sulfotransferases constitute the superfamily of enzymes known as SULTs, which are involved in the sulfonation of xenobiotics and small endogenous compounds such as neurotransmitters and hormones [99]. The membrane-bound sulfotransferases are found in the Golgi apparatus and are responsible for post-translational sulfation of endogenous macromolecules such as proteins, lipids, and glycosaminoglycans [100]. Currently, 12 SULT isoforms have been identified and detected in human tissues [101]. Availability of PAPS in different tissues can highly affect the sulfation pathway by modifying the affinity of sulfotransferases [100,101,102].
In human saliva we determined that Hst1 is partly sulfated on the last four tyrosines (out of five) of its sequence [103], and until now it is the only phospho-sulfo-peptide detected in human saliva. As previously reported, phosphorylation of Hst1 is not a complete event, because in whole human saliva it is possible to detect about 10% of the non-phosphorylated peptide. Further studies performed by high-resolution HPLC-MS apparatus suggested that: a) as supposed in our work, the sulfation process is hierarchical, Tyr27 being the first residue sulfated, followed by Tyr30, Tyr34 and Tyr36; indeed, MS CID fragmentation data on the mono-sulfated (non-phosphorylated) Hst1 clearly indicated Tyr27 as the only residue involved; c) the sulfation process is strictly confined to the submandibular gland, because the sublingual gland does not express Hst1 (manuscript in preparation) and Hst1 secreted by the parotid gland is not sulfated; d) the phosphorylation and the sulfation processes are independent, because the neutral losses generated during MS/MS CID-induced fragmentation make it possible to discriminate non-phosphorylated poly-sulfated derivatives from the phosphorylated ones; f) the percentages of the different polysulfated derivatives vary considerably in human submandibular saliva; g) sulfation of Hst1 is not detectable in children until puberty; h) potential changes in physio-pathological conditions as well as during pharmacological treatments require extensive statistical analysis on a large population.
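As a rough illustration of why neutral losses and high mass resolution help to tell sulfated from phosphorylated species, the sketch below computes the monoisotopic masses of three commonly considered neutral losses. The choice of losses (SO3, HPO3, H3PO4) is an assumption based on general CID behaviour and is not taken from the cited studies; the atomic masses are standard values entered by hand, and the helper names and the tolerance are hypothetical.

```python
# Monoisotopic atomic masses in Da (standard values, typed in by hand).
MONO = {"H": 1.00782503, "O": 15.9949146, "P": 30.9737615, "S": 31.9720707}


def mass(formula):
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONO[el] * n for el, n in formula.items())


# Candidate neutral losses (assumed here for illustration).
NEUTRAL_LOSSES = {
    "SO3": mass({"S": 1, "O": 3}),           # typical of sulfo-tyrosine
    "HPO3": mass({"H": 1, "P": 1, "O": 3}),  # phosphate-derived loss
    "H3PO4": mass({"H": 3, "P": 1, "O": 4}), # typical of phospho-Ser/Thr
}


def classify_loss(observed, tol=0.003):
    """Return the names of candidate losses within tol Da of observed."""
    return [name for name, m in NEUTRAL_LOSSES.items()
            if abs(observed - m) <= tol]


for name, m in NEUTRAL_LOSSES.items():
    print(f"{name}: {m:.4f} Da")
print(classify_loss(79.957))
```

SO3 and HPO3 differ by less than 0.01 Da, so separating them by mass alone requires a high-resolution instrument, whereas a loss close to 98 Da points to phosphate directly.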
5. S-modifications
The thiol side chain of cysteine residues in proteins is characterized by high redox sensitivity. This redox sensitivity derives from the pKa of the thiol group, which, owing to the protein environment, may be partly present even at physiological pH in the thiolate form (RS-), more reactive than the thiol form (RSH). Indeed, RS- is prone to donate an electron pair to target reagents, generating the oxidized derivative sulfenic acid (RSOH), mainly present as sulfenate (RSO-), by a reversible reaction. Sulfenate may then undergo other reversible and irreversible oxidative modifications (Figure 5) [104]. For instance, it can rapidly react with another thiol (RSH), generating the corresponding disulfide (RSSR), or with GSH or free Cys to form S-glutathionyl (RSSG) or S-cysteinyl derivatives. This is not the only formation pathway for disulfide derivatives, since the latter can also be generated in a process termed thiol-disulfide exchange, in which a thiol reacts with a disulfide to generate a different thiol and disulfide.
In the presence of strong oxidants, sulfenate can undergo further oxidation and generate sequentially the sulfinic (RSO2H) and sulfonic acid (RSO3H) derivatives. The latter two modifications are regarded as irreversible and associated with oxidative damage. Sulfenate can also react with hydrogen sulfide (HS-) to form a cysteinyl persulfide (R–SS-), which can be oxidized to a cysteinyl thiosulfate (R–SSO3-). The thiol side chain of cysteine may also react with nitric oxide, resulting in the formation of nitrosothiol derivatives [105]. These modifications have been observed and characterized on specific salivary proteins such as cystatins and S100A proteins.
Cabras and coll. evidenced that in human whole saliva cystatin B is present mostly as derivatives S-modified on Cys3, the S-unmodified proteoform being rarely detectable in saliva of healthy adults. Generally, more than half of cystatin B (55%) is found S-glutathionylated, 30% is present in the dimeric form, and the remaining 15% is S-cysteinylated [106]. In an in-depth study, performed both on the acidic supernatant of whole saliva and on enriched fractions obtained by preparative RP-HPLC, the structures of all these S-modified proteoforms of cystatins were confirmed by an integrated top-down/bottom-up proteomic approach [57]. In that study, a carboxymethylated Cys3 derivative of cystatin B was also detected and characterized. Carboxymethylation, like the formation of sulfinic and sulfonic acid derivatives, is a nonenzymatic, irreversible and stable modification generated by the reaction of endogenously formed glyoxal with Cys sulfhydryl groups (Figure 7). This modification was novel not only for cystatin B but also for any other human salivary protein. It should be noted that the Cys residue at position 3 in the N-terminal region of cystatin B is involved in the formation of these and other derivatives. The N-terminal region of cystatin B is crucial for the biological function of the protein, being involved in the binding of cysteine proteinases [107]. Indeed, it has been shown that Cys3 is the most important residue for the interaction with papain and cathepsin H and is also a major contributor to cathepsin L binding [108]. Unlike in adults, S-unmodified cystatin B represented the main proteoform in saliva of preterm newborns. Interestingly, the high relative amount of unmodified cystatin B observed in preterm newborns decreased as a function of the PCA, reaching at the normal term of delivery values like those determined in at-term newborns, children, and adults [60]. As reported in section 1.3 describing the proteolytic cleavages, in the same study high relative amounts of fragments 1−53 and 54−98 of cystatin B were detected in very preterm newborns. The N-terminal 1−53 fragment was detected both unmodified and modified at the Cys3 residue, the percentages of the different forms being like those determined for the intact protein [60].
S100 proteins constitute the largest family of calcium-binding proteins, with more than 3000 related entries in the NCBI Reference Sequences Data Bank. They are EF-hand calcium-binding proteins and bind calcium via helix-loop-helix motifs, often present in multiple copies. The “S100” name originates from the solubility of the first identified S100 proteins in 100% saturated ammonium sulfate solution. The members of the EF-hand superfamily can be divided according to their calcium affinity and their ability to change conformation following binding of calcium. Intracellular functions of S100 proteins include: i) regulation of protein phosphorylation by interaction with the substrates of the kinases, through which they play a role in signal transduction; ii) regulation of enzyme activity [109].
In the extracellular milieu, S100 proteins do not function as Ca2+ sensors, as they are saturated by the mM-range Ca2+ concentration; however, they are recognized to play an important role in mediating inflammatory responses through the activation of several cell surface receptors after release from activated or necrotic cells [110]. S100A8, S100A9, and S100A12 are constitutively expressed in high amounts in neutrophils and are inducible in macrophages, cells that generate high amounts of ROS, while expression of the complex S100A8/S100A9 (calprotectin), S100A12 and S100A7 can be induced in keratinocytes, endothelial cells, and epithelial cells during inflammation [111].
Human S100A9 is encoded by a single-copy gene with two isoforms, full-length S100A9 and truncated S100A9; the latter is translated from an alternate start site at codon 4 of the full-length form and lacks the single Cys3 residue, making it less susceptible to oxidation. The studies of Lim and coll. [112,113] discuss the many pro-inflammatory functions described for S100A8 and S100A9, as well as their anti-inflammatory roles in wound healing and protection against excessive oxidative tissue damage, and suggest that oxidative modifications may act as a regulatory switch for the disparate functional roles of S100A8 and S100A9.
As far as biofluids are concerned, in a series of studies carried out to investigate possible salivary biomarkers of pathologies, both confined to the oral cavity and systemic, different oxidized derivatives of S100A8 and S100A9 proteins were revealed and characterized in both healthy and diseased subjects. The structure of these new proteoforms was established by high-resolution HPLC-ESI-MS/MS analysis of the mixture of peptides obtained by trypsin digestion of salivary fractions enriched with S100A8 and S100A9 oxidized derivatives [68]. In this study the long form of S100A9 was detected as glutathionylated at Cys3 in 15/32 subjects and as cysteinylated in only 3/32, while S100A8 was sporadically detected as unmodified S100A8 (4/32) and as the sulfonic derivative at Cys42 (5/32).
Glutathionylated and cysteinylated S100A9 derivatives were also detected in saliva of human preterm newborns [114]. Finally, the nitrosylated derivative of S100A8 was also observed, but only sporadically, in saliva from healthy adult subjects [73,74].
5.1. S-modifications of salivary proteins in pathologies
A wider variety and abundance of S-modified proteoforms has been revealed in saliva of subjects affected by various pathologies. For instance, a study performed on saliva collected from patients with schizophrenia and bipolar disorder, compared to healthy non-smoker and smoker control groups, revealed a more than 10-fold increase in salivary levels of S-cysteinylated and S-glutathionylated cystatin B, in addition to α-defensins 1-4, S100A12, and cystatin A, suggesting dysregulation of the peripheral white blood cell immune pathway associated with the pathologies [115]. S-glutathionylated cystatin B was also found at high levels in Predominantly Antibody Deficiencies [116].
In WD patients, top-down proteomic analysis of saliva revealed significantly higher levels of S100A9 and S100A8, and of some of their oxidized proteoforms, with respect to controls [68]. S100A8 oxidized at Cys42 to sulfinic acid (S100A8-SO2H) was detected only in patients, while the sulfonic derivative (S100A8-SO3H), oxidized also at Trp54, was detected both in patients and in healthy controls, although at different levels. The latter proteoform was subjected to further oxidation to give the so-called hyperoxidized S100A8, with the second oxidation located at either Met1 or Met78 (position to be determined). S100A8 was also shown to undergo glutathionylation, nitrosylation, and formation of a disulfide bridge with Cys3 of long S100A9 (S100A8/A9-SS dimer), but only in the patient group. In the same group, the homodimer proteoform of long S100A9 was characterized for the first time. Overall, the salivary proteome of WD patients reflected the oxidative stress and inflammatory conditions characteristic of the pathology, highlighting differences that could be useful clues of disease exacerbation [68].
Oxidative cross-linking via disulfide bonds of S100A9 and S100A8 has also been observed in saliva from healthy adults, and in lavage fluid from the lungs of patients with respiratory diseases, by Hoskin et al. [117]. The authors showed that reactive halogen species promote cross-linking of the non-covalent heterodimer of S100A8/S100A9 and hypothesised that the cross-linking detected in the saliva samples was most likely mediated by hypothiocyanous acid produced by lactoperoxidase. They also demonstrated that formation of the disulfide cross-linked derivative enhanced susceptibility to proteolysis by neutrophil proteases. Furthermore, Gomes et al. observed that sulfinic and sulfonic acid derivatives of monomeric S100A8, together with novel oxathiazolidine oxide/dioxide forms, were present in asthmatic sputum [118].
The salivary proteome of Alzheimer disease (AD) patients also showed elevated levels of some oxidized proteoforms of S100A8 and S100A9 with respect to age- and gender-matched healthy controls, namely the hyperoxidized proteoform of S100A8, S100A8-SNO, and glutathionylated long S100A9 [77]. This finding was not surprising, since oxidative modifications of proteins are common in neurological disorders, owing to the strongly oxidizing characteristics of the extracellular milieu caused by the generation of ROS and reactive nitric oxide species [119]. Higher levels of S100A8-SNO were also determined in MuSc patients with respect to healthy controls, while levels of mono- and di-oxidized cystatin SN, mono- and di-oxidized cystatin S1, and mono-oxidized cystatin SA were lower [73]. The reduced level of oxidized derivatives of S-type cystatins in the patients was an intriguing result. Indeed, high levels of oxidative stress markers and lower antioxidant status have been reported in saliva and plasma of MuSc patients under corticosteroid therapy by Karlik and coll. [120]. However, different results were obtained by Schipper and coll. [121], who highlighted a positive effect of immunotherapy on oxidative stress, evidenced by a reduction of oxidative stress markers in MuSc. Manconi et al. speculated that reduced S-type cystatin oxidation in MuSc patients could be related to the treatment of 32/49 patients enrolled in the study. Indeed, S100A8-SNO was found at a significantly higher level only in the group of untreated patients with respect to controls [73].
Recently, analysis of the salivary proteome of patients affected by AD compared to healthy adult and elderly controls highlighted significantly higher levels of S100A8-SNO and hyperoxidized S100A8, as well as glutathionylated S100A9 (long), in healthy adults with respect to the elderly, and in AD patients with respect to healthy elderly. The same trend was observed for both glutathionylated and dimeric cystatin B, but not for the cysteinylated proteoform, which showed a significant difference only between patients and elderly subjects [75].
6. Transglutamination
Several salivary proteins are involved in the formation of protein layers, the so-called “oral pellicles”. The general term for these layers is pellicle, but owing to the different characteristics of the coated surfaces, the enamel pellicle and the mucosal pellicle are distinct entities. These protein films have a dual role: the first, the “acquired enamel pellicle”, is important for the integrity of the tooth, and the second, the “mucosal pellicle”, for the protection of the oral mucosa. There is considerable information on the enamel pellicle: many proteins and their fragments are involved, particularly PRPs, statherin, histatins and the P-B peptide [122,123]. In contrast, only limited data are available on the mucosal pellicle [124]. This can be attributed to the difficulty of developing a standardized preparation of this latter biological structure. However, it has a completely different ultrastructure compared with the enamel pellicle. Since it comprises larger glycoproteins that retain water, it might be considered a hydrogel, and it appears to have a lower tenacity than the enamel pellicle. Maturation and turnover are influenced by the delivery of salivary proteins, by the flow of saliva and by the underlying desquamating oral epithelium. Its probable functions include lubrication and moisture retention. Furthermore, interactions between mucosal pellicle proteins and bacterial surfaces are responsible for the specificity of bacterial colonization during the earliest stage of plaque formation [125]. The in vivo pellicle is thought to be an insoluble network of proteins generated by post-secretory processing of proteins, mainly due to cross-linking. Cross-links in the formation of the oral mucosal pellicle were first demonstrated by Bradway and coll., who highlighted the existence of a network of proteins formed by components of saliva adsorbed onto the buccal epithelial cell surfaces that cover the oral mucosal surface [126]. Recently, a new proteomic protocol was optimized to investigate the proteins participating in the composition of the oral mucosal pellicle; among them, proteins of the PLUNC family were identified [127].
The oral mucosal pellicle is a thin lubricating layer generated by the binding of salivary proteins to oral epithelial cells [128]. This protein molecular network interacts with the oral epithelial-cell plasma membrane and its associated cytoskeleton and contributes to mucosal epithelial flexibility and turnover. It was demonstrated that acidic proline-rich proteins, statherin, the major histatins as well as mucins are substrates of oral transglutaminase 2 (TG2) and that they participate in cross-linking reactions [126] as putative pellicle precursor proteins. Whatever the structure of these protein networks may be, oral transglutaminases (mainly type 2 transglutaminase) are the pivotal enzymes for pellicle formation. TG2 can cross-link acidic PRP-1 and statherin in vitro [122]. TG2 is the ubiquitous tissue enzyme, also expressed in oral epithelial cells [129], which catalyzes different biological processes and generates a cross-link between two peptide chains, typically between the ε-amine of a lysine residue, acting as lone-pair donor, and a glutamine residue, the lone-pair acceptor; the reaction is accompanied by the loss of an ammonia molecule. TG2 is a Ca2+-dependent enzyme, released by oral epithelial cells, negatively modulated by GTP [129] and affected by the reversible formation of an intramolecular disulfide bridge [130]. Recently, a study of our group [131] showed that bPRPs and the P-C peptide are also potential substrates of TG2. Nonetheless, they showed very different reactivity towards monodansyl-cadaverine (used as lone-pair donor). Mass spectrometry analyses of the reaction products highlighted that the P-C, P-H and P-D (both Pro32 and Ala32 variants) peptides were active substrates of TG2, II-2 was less reactive, while the P-F and P-J peptides showed negligible activity [131]. MS characterization suggested that the consensus sequence for the linking depends more on the environment of the glutamine residue than on the donor amine. The pivotal residues characterized for P-H, II-2, and both variants of P-D evidenced …GNPQ… as the consensus sequence recognized by TG2 on bPRP peptides [131]. This consensus sequence is not present in the P-F and P-J peptides and, probably for this reason, they were poor substrates for TG2.
The P-C, P-H and P-D peptides formed cyclo-derivatives after TG2 reaction; only specific glutamine residues were involved in the cycle formation and in the reaction with monodansyl-cadaverine [131]. The stereospecificity of TG2 was first recognized on statherin, which under the action of TG2 forms a cyclic derivative involving almost exclusively Gln37 (out of 7 glutamine residues) and Lys6, the only lysine residue in the sequence (Figure 4) [132]. The detection of small amounts of cyclo-statherin in adult human whole saliva [132] and the high reactivity of secondary glutamine residues after the formation of cyclo-statherin and cyclo-P-C were suggestive of the in vivo formation of ring structures with a pivotal role in the architecture of the oral mucosal and enamel pellicles [133,134]. Whatever the molecular mechanism underlying the structure of the oral pellicles, they have a relevant role in protecting the mucosa from the high mechanical and thermal stresses that the mouth undergoes during human life.
8. Citrullination
Citrullination is a PTM consisting in the conversion of peptidylarginine to peptidyl- citrulline in a calcium-dependent reaction catalysed by peptidylarginine deiminases (PADs), a family of five isoenzymes (PAD 1-4 and 6) with tissue specific expression [
169,
170]. The reaction of a PAD enzyme with the arginine residue of a peptide/polypeptide chain forms an adduct with release of ammonia; subsequent cleavage of the adduct by a water molecule generates a carbonyl (ureido) functional group. The result of the deimination reaction is the loss of a positive charge, i.e. of the imino moiety, generating a neutral amino acid, citrulline, which lacks the strong basic character of arginine, and producing an increase in the protein/peptide molecular mass of +0.984016 Da (monoisotopic) and +0.98476 Da (average). Interestingly, the rate of peptidylarginine citrullination by PAD enzymes depends on the amino acid position along the protein/peptide sequence, with about 80-90% of citrullinated arginines positioned after aspartic acid residues. Arginines close to glutamic acid or to the N-terminal sequence tract are largely citrullinated too [
169,
170,
171].
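The mass shift of citrullination follows directly from the net change in elemental composition (one O gained, one N and one H lost per modified arginine), which evaluates to about +0.984 Da. The Python sketch below recomputes it from commonly tabulated atomic masses and also flags arginines that directly follow an aspartic acid residue. The function names are hypothetical, and the Asp-Arg rule is only the positional preference reported in the text, not a predictor of actual modification.

```python
# Commonly tabulated atomic masses (Da), rounded; assumed here for illustration.
MONO = {"H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
AVG = {"H": 1.00794, "N": 14.0067, "O": 15.9994}

# Arginine -> citrulline: net change in elemental composition is +O, -N, -H.
CITRULLINATION = {"O": +1, "N": -1, "H": -1}


def delta_mass(composition_change, masses):
    """Mass shift (Da) for a net change in elemental composition."""
    return sum(count * masses[element] for element, count in composition_change.items())


def citrullinated_mass(unmodified_mass, n_sites, masses=MONO):
    """Expected mass of a proteoform carrying n_sites citrulline residues."""
    return unmodified_mass + n_sites * delta_mass(CITRULLINATION, masses)


def arginines_after_asp(sequence):
    """1-based positions of Arg residues directly preceded by Asp."""
    seq = sequence.upper()
    return [i + 1 for i in range(1, len(seq)) if seq[i] == "R" and seq[i - 1] == "D"]


print("monoisotopic shift:", round(delta_mass(CITRULLINATION, MONO), 5))
print("average shift:", round(delta_mass(CITRULLINATION, AVG), 5))
```

Because the shift is below 1 Da per site and identical to that of deamidation, high-resolution MS and fragment-level localization are needed to assign it to a specific arginine.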
Citrullination affects the physico-chemical properties of a protein because, at neutral pH, the positive charge of arginine is lost upon modification, changing the overall charge and charge distribution of the protein and altering its isoelectric point, ionic bonds, structure, activity, and protein-protein interactions [
169,
170,
171]. The increase in hydrophobicity produced by citrullination makes it easy to distinguish the unmodified from the modified form of a small protein or peptide by reversed-phase liquid chromatography. In addition, because citrulline is not among the amino acids naturally incorporated into proteins, it was hypothesized to induce an immune response, which prompted in-depth investigation of its role in autoimmune inflammatory diseases such as rheumatoid arthritis (RA) [
172]. In fact, citrullination has been studied in relation to various physio-pathological processes in addition to RA, such as apoptosis, multiple sclerosis, Alzheimer’s disease, psoriasis, systemic lupus erythematosus, periodontitis, COVID-19, cancer, and thromboembolism [
169,
170,
171]. Citrullination has been the topic of a very recent review that outlines the diagnostic as well as the therapeutic approaches based on this PTM and the future perspectives of “citrullinome” characterization and disclosure [
172]. Studies and clinical trials now rely on the detection of anti-citrullinated protein antibodies as a high-specificity tool for RA diagnosis, even before clinical evidence of the disease [
173,
174].
Several proteins are substrates of PAD enzymes and can undergo citrullination in physiological as well as in inflammatory pathological states; they include filaggrin, keratin, fibronectin, actin, tubulin, vimentin, glial fibrillary acidic protein and histones [
173]. PAD 2 is the isoenzyme acting in salivary and secretory glands, as well as in other cell types and tissues, targeting myelin basic protein, C-X-C motif chemokines 10 and 11, vimentin, actin, glial fibrillary acidic protein, S100-A3, and histones H3 and H4 [
173].
The importance of detecting anti-citrullinated protein antibodies (ACPAs) in serum for the diagnosis of RA [
173,
175] has increased interest in extending this analysis to saliva for the characterization of citrullinated proteins.
Yasuda et al. identified citrullinated cytokeratin 13 in saliva samples of RA patients and healthy subjects by two-dimensional electrophoresis, silver staining and immunoprecipitation-western blotting [
175]. The same paper also discusses whether citrullinated cytokeratin 13 in saliva originates before or after secretion, based on calcium concentration data for salivary gland cells and secreted saliva, since PAD catalysis is calcium dependent. According to the Genotype-Tissue Expression database, the calcium-binding enzymes PAD 1-4 and 6 are expressed in human minor salivary glands [
176]. Citrullination could take place either in saliva, i.e. post-secretion, because the biofluid contains calcium levels compatible with PAD activity, or intracellularly, depending on an increase in calcium concentration. Indeed, intracellular citrullination of histones and other proteins has been recognized [
177]. Citrullination of salivary proteins could therefore occur either intracellularly, followed by secretion of the modified protein, or extracellularly in the biofluid, after secretion.
A recent paper compared the levels of citrullinated proteins in the saliva of RA patients with those of healthy controls, also studying their correlation with periodontal status and temporomandibular joint-associated disorders [
178]. Confirming previous data [
175], no differences in citrullinated protein levels were found between RA patients and healthy controls, apart from the differences in periodontal status observed between the groups. However, in RA patients the salivary anti-cyclic citrullinated peptide levels correlated with periodontal disease staging, consistent with the fact that bacteria involved in its pathogenesis produce non-physiological citrullination through a family of PAD enzymes very similar to human PADs [
179].
One study investigated the influence of inflammation from periodontal disease in RA model mice by analysing serum and saliva samples for both anti-citrullinated protein antibodies and citrullinated proteins (by gel electrophoresis) [
180]. Saliva and serum showed similar results, evidencing protein citrullination in an electrophoretic band of around 55 kDa. Periodontitis exacerbated RA symptoms, indicating a relationship between periodontitis caused by Porphyromonas gingivalis infection and RA.
Patients affected by Sjögren’s syndrome have been found positive for ACPAs and showed increased levels of the PAD 2 enzyme, suggesting a possible role of citrullination in the disease, where it could be responsible for triggering autoantigens [
180]. However, in the same study it was outlined that only approximately 7.5% of Sjögren’s syndrome patients are positive for ACPAs.
Despite the limited knowledge of the proteins that undergo citrullination in saliva, there are clear indications of the high potential of this biofluid for developing new, minimally invasive diagnostic tools of high specificity and sensitivity that target selected citrullinated proteins in different pathologies. However, deeper research is still needed to decipher the saliva citrullinome associated with physiological as well as pathological states and to clearly understand where the modification occurs, i.e. inside salivary gland cells, extracellularly, or both. To the best of our knowledge, combined top-down/bottom-up proteomic platforms have never been applied to the study of the saliva citrullinome. Their application, coupled with high-resolution mass spectrometry detection, could finally characterize the proteins subjected to citrullination in saliva over a wide range of molecular masses and abundances and, especially, could localize the modification within the sequence and better define its molecular features.
9. N-terminal modifications
The N-terminus of a polypeptide chain is a structural site that must often be modified and protected during ribosomal synthesis. The positive charge of the terminal amino group can be an obstacle to the proper folding of the protein, which starts in the nascent chain [
181]. The most common modification in humans is the excision of the initiatory methionine (iMet), which is commonly carried out by enzymes called methionine aminopeptidases (MAPs). In any case, the polypeptide chain, with or without iMet, can undergo N-terminal acetylation catalysed by enzymes called N-terminal acetyltransferases (NATs). To date, 12 NATs have been identified, possessing different compositions, substrate specificities, and modes of regulation. In humans, NATs are present as a group of six enzymes (named NAT-A to NAT-F), with slightly different specificities [
182].
Another, less common, N-terminal modification occurs when the terminal residue is either glutamic acid or glutamine, which can generate a cyclic amide, called a pyroglutamic N-terminus, with the loss of a molecule of water (Glu) or ammonia (Gln), respectively. This PTM can proceed spontaneously, at appreciable rates, or through the action of enzymes called glutaminyl cyclases, in a reaction that is commonly much faster with N-terminal glutamine than with glutamate residues [
183].
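Since the pyroglutamic N-terminus arises by loss of water (from Glu) or ammonia (from Gln), the expected mass shift can be computed from the composition of the lost molecule. The following Python sketch illustrates this; the function name is hypothetical, the atomic masses are commonly tabulated monoisotopic values, and the check only tests whether the N-terminal residue could cyclize, not whether it actually does.

```python
# Commonly tabulated monoisotopic atomic masses (Da), rounded.
MONO = {"H": 1.00782503, "N": 14.00307401, "O": 15.99491462}

# Molecule lost when the N-terminal residue cyclizes, as stated in the text.
LOST_ON_CYCLIZATION = {
    "E": {"H": 2, "O": 1},  # N-terminal Glu -> pGlu, loss of H2O
    "Q": {"N": 1, "H": 3},  # N-terminal Gln -> pGlu, loss of NH3
}


def pyroglu_mass_shift(sequence):
    """Monoisotopic shift (Da) if the N-terminal residue can cyclize, else None."""
    if not sequence:
        return None
    lost = LOST_ON_CYCLIZATION.get(sequence[0].upper())
    if lost is None:
        return None
    return -sum(count * MONO[element] for element, count in lost.items())


for n_terminal_residue in ("Q", "E", "S"):
    print(n_terminal_residue, pyroglu_mass_shift(n_terminal_residue))
```

In a top-down experiment, a proteoform with N-terminal Gln or Glu observed roughly 17.03 or 18.01 Da below its theoretical mass is therefore a candidate pyroglutamic form, to be confirmed by fragmentation.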
Our top-down mass spectrometry pipelines allowed us to verify that all three of these PTMs are widely detectable in human saliva, while C-terminal modifications are rarely detectable, probably because by the end of its synthesis a protein has assumed a folding close to its functional conformation. A few minor examples of C-terminal modifications of human salivary proteins will be reported at the end of this section.
9.1. Excision of initiatory methionine and Nt-acetylation
It is relevant to note that when the methionine residue is at the first position of a leader peptide, which is in turn responsible for the docking of the protein to the ribosome at the endoplasmic reticulum, acetylation is not carried out. This is observed for cystatins A, B, and S-type (S, SA and SN), and all the PRPs, suggesting the absence of NATs in the cisternae of the endoplasmic reticulum. In all other proteins, the action of MAPs is not mandatory and seems to depend on the type of residue present at the second position in the sequence. Therefore, Nt-acetylation may occur either on the iMet or on the first residue after iMet excision by MAPs. N-termini of proteins with small amino acid residues (Ser, Ala, Thr, Val, Gly, and Cys) at the second position are mostly processed by MAPs, and the newly generated N-termini may be acetylated by NAT-A, as observed for the serine residue of thymosins β4 and β10, of S100A7, and of the two proteoforms of small proline-rich protein 3 [
87,
184,
185,
186]. N-termini of proteins with larger amino acid residues in the second position are not cleaved by MAPs, but are potentially acetylated directly on the iMet by a variety of NATs, depending on the N-terminal sequence. NAT-B potentially acetylates Met-N-termini when the second residue is Asp, Glu or Asn. NAT-C potentially acetylates Met-N-termini when the second residue is a hydrophobic amino acid (mainly Leu, Ile, Phe). Hydrophobic termini are also recognized by NAT-E and NAT-F, suggesting that redundancy in activity exists between NATs [
187]. Moreover, NAT-F showed a broader specificity than NAT-C, acetylating Met-N-termini when the second residue is Met, Lys or Gln, and the potential to acetylate the same N-termini recognized by NAT-A in proteins in which iMet has not been cleaved [
188]. Examples among human salivary proteins are cystatin A, which has an isoleucine residue in position 2 and is probably acetylated on Met1 by NAT-C (or NAT-F), and cystatin B, which has a second methionine residue in position 2 and is acetylated on Met1, probably by NAT-F [
57,
87].
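The dependence of iMet excision and Nt-acetylation on the second residue can be summarized as a small rule table. The Python sketch below is a deliberately simplified illustration of the rules stated in this section: the function name and the returned dictionary layout are hypothetical, and real NAT specificities are broader and partly redundant. Only the first two residues are used, as given in the text for cystatin A (Met-Ile), cystatin B (Met-Met) and thymosin β4 (Ser after iMet).

```python
# Second-residue classes, taken from the rules described in the text.
SMALL_RESIDUES = set("SATVGC")   # iMet mostly excised by MAPs, then NAT-A
NAT_B_RESIDUES = set("DEN")      # Met-Asp / Met-Glu / Met-Asn
HYDROPHOBIC_RESIDUES = set("LIF")  # Met-Leu / Met-Ile / Met-Phe
NAT_F_EXTRA_RESIDUES = set("MKQ")  # Met-Met / Met-Lys / Met-Gln


def predict_nt_processing(sequence, has_leader_peptide=False):
    """Simplified prediction of iMet excision and candidate NATs."""
    seq = sequence.upper()
    if len(seq) < 2 or seq[0] != "M":
        raise ValueError("sequence must start with Met and have at least 2 residues")
    if has_leader_peptide:
        # Per the text, acetylation is not carried out on leader peptides.
        return {"imet_excised": None, "candidate_nats": []}
    second = seq[1]
    if second in SMALL_RESIDUES:
        return {"imet_excised": True, "candidate_nats": ["NAT-A"]}
    if second in NAT_B_RESIDUES:
        nats = ["NAT-B"]
    elif second in HYDROPHOBIC_RESIDUES:
        nats = ["NAT-C", "NAT-E", "NAT-F"]
    elif second in NAT_F_EXTRA_RESIDUES:
        nats = ["NAT-F"]
    else:
        nats = []  # not covered by the rules summarized in this section
    return {"imet_excised": False, "candidate_nats": nats}


for name, n_term in {"cystatin A": "MI", "cystatin B": "MM", "thymosin beta-4": "MS"}.items():
    print(name, predict_nt_processing(n_term))
```

The sketch does not model the additional ability of NAT-F to acetylate NAT-A-type N-termini when iMet has not been cleaved, which would require knowing whether excision actually occurred.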
A peculiar example concerns the acetylation of S100A9, which has two Met among its first five N-terminal residues, one at position 1 and the other at position 5 (MTCKM…). Because of this N-terminal sequence, MAP catalyzes cleavage not only at the level of the first Met residue, generating the long proteoform (1-113 residues), but also at the level of the fifth Met, giving rise to the short proteoform (1-109 residues); the percentages of the two proteoforms are similar in human saliva [
25,
87]. The Cys3 residue, present only in the long form, is responsible for the formation of the S-cysteinyl and S-glutathionyl derivatives of S100A9, described above in Section 5. Since these four proteoforms of S100A9 can also be phosphorylated on the penultimate threonine residue of the sequence, the number of possible proteoforms of S100A9 detectable in human saliva is eight [
188]. As far as we are aware, no information exists about the specific functions of these different proteoforms of S100A9.
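The count of eight proteoforms follows from combining the N-terminal length variants, the Cys3 derivatives (possible only in the long form) and the optional phosphorylation. A minimal Python sketch of this enumeration is given below; the labels and the function name are hypothetical and serve only to make the combinatorics explicit.

```python
from itertools import product

# Cys3 is present only in the long form, so only the long form can carry
# the S-cysteinyl or S-glutathionyl modification described in the text.
CYS3_STATES = {
    "long": ["unmodified Cys3", "S-cysteinyl", "S-glutathionyl"],
    "short": ["no Cys3"],
}
# Phosphorylation on the penultimate threonine is optional for every form.
PHOSPHO_STATES = [False, True]


def enumerate_s100a9_proteoforms():
    """List every (length, Cys3 state, phosphorylated) combination."""
    proteoforms = []
    for length, cys_states in CYS3_STATES.items():
        for cys_state, phosphorylated in product(cys_states, PHOSPHO_STATES):
            proteoforms.append((length, cys_state, phosphorylated))
    return proteoforms


forms = enumerate_s100a9_proteoforms()
for form in forms:
    print(form)
print("total:", len(forms))
```

The four non-phosphorylated entries correspond to the four proteoforms mentioned in the text (short, long, long S-cysteinyl, long S-glutathionyl), and doubling them for the phosphorylation state gives eight.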
9.2. Pyroglutamic acid modification
As reported above, several human salivary proteins undergo N-terminal cyclization of glutamine and glutamic acid, as a protection for the N-terminal amino group. This PTM has been characterized for all the proteoforms of aPRPs (PRP-1 and PRP-3 types) [
90,
189] and bPRPs (II-2, IB-1), all the gPRPs [
23], as well as for the P-B peptide [
41]. It is worth remarking that, although this PTM preserves the integrity of the protein structure by protecting it from the action of aminopeptidases [
190], it also significantly alters its hydrophobicity and solubility [
183], facilitating protein aggregation. For this reason, many authors have hypothesized that it plays a relevant role in several amyloidopathies [
191]. Indeed, experiments carried out in a mouse model using glutaminyl cyclase inhibitors demonstrated a reduced level of pGlu-modified Aβ peptides that appeared to attenuate Alzheimer’s disease [
192]. Nonetheless, the enzyme does not seem essential for the cyclization, which according to Bersin and colleagues [
193], is spontaneous and is also observed in the absence of the enzyme.
11. Conclusions
This review, which provides many answers regarding the complexity of the PTMs that human salivary proteins undergo, nevertheless prompts even more questions regarding the roles of many of them.
The first question concerns the possible functional role played by the products generated by the multiple proteolytic cleavages to which the components of almost all protein families are subjected. It is relevant to underline that they could be evidenced only by a top-down strategy. Since evolution tends to select molecular events relevant to the functions of cells, organs and tissues, the functional meaning of the cleavages undergone by acidic and basic PRPs is particularly puzzling, as is the role of the P-B peptide. Particularly challenging is defining the role of the myriad of recurrent fragments generated from larger salivary proteins by exogenous proteinases. Therefore, the development of new strategies to study the biological role of these fragments is much needed.
The role of almost all the other PTMs described is obscure too. Proteomic searches devoted to the discovery of new disease biomarkers are observational studies, supported by powerful statistical tools, that are able to suggest potential biomarkers but not their specific physio-pathological role. Therefore, new experimental designs are needed to help researchers connect the information obtained on structural modifications with the role of these changes in health and disease. This knowledge is fundamental for the biotechnological utilization of biologically active salivary proteins [
194]. The recent pandemic highlighted the utility of saliva as a human biofluid for painless, non-invasive, free at-home diagnostic purposes [
1,
195,
196]. Indeed, saliva is now also used for rapid roadside testing for illicit drug use [
196]. We hope that this review could be a stimulus for further investigations and clinical applications.
Author Contributions
Conceptualization, MC, IM, GF, TC, AF, BM; methodology, All authors; software, AO, TC, CC, GG, CD, DR; validation, All authors; formal analysis, All; investigation; data curation, All authors; writing—original draft preparation, All the authors; writing—review and editing, All the authors. All authors have read and agreed to the published version of the manuscript.