Introduction
The Notch pathway plays a major role in embryological development and in the ongoing life processes of many animals. Its roles include neuronal function and development [1], fate determination in angiogenesis [2] and in cardiac development [3], control of the differentiation of both the endocrine and exocrine pancreas [4], cell fate decisions between the secretory and absorptive lineages in the gut [5], and the development of mammary glands [6] and of alveoli in the lung [7].
In many of these cases, the Notch pathway acts as an arbiter between two alternative cell-fate decisions: “Notch acts in a permissive manner: It does not determine the fate of a cell but rather whether a given fate is adopted” (Ehebauer et al. [8]).
The scheme in Figure 1 shows two cells: the one on the left is the cell that we will call the SENDER cell; the one on the right is the RECEIVER cell. Embedded in the membrane of the sender cell are the DLL and JAG proteins, which are the ligands for the NOTCH proteins, themselves embedded in the membrane of the receiver cell. On binding of the ligands to the NOTCH proteins, these receptors are hydrolytically cleaved and the intracellular portion of the NOTCH proteins moves to the cell nucleus (depicted as the circle within the receiver cell), where it interacts with the RBP proteins that regulate the HES transcription factors. This transcription determines the expression or repression of targeted proteins in the receiver cell and hence its fate. The central core of the system is thus the DLL/JAG – NOTCH – RBPJ – HES chain that enables the sender cell to control the expression of proteins in the adjacent receiver cell. Additional members of the pathway activate or repress other components of the chain. An example is the co-repressor complex depicted in Figure 1 as being within the nucleus of the receiver cell. Another is the cytoplasmic γ-secretase complex, also depicted in the figure. The DVL and NUMB proteins interact with, and repress, the NOTCH proteins, as do the proteins of the DTX family (although these can also activate NOTCH proteins). The extent of these activating or repressing processes is, of course, determined by the epigenetic status of the receiver cell. Finally, the Fringe proteins (LFNG, MFNG, and RFNG) are glucosaminyltransferases that have been shown to modulate Notch activity.
The evolution of such an important system is an intriguing question and has been tackled before. Gazave et al. [9] presented an impressive analysis of the question, but their study comprised fewer than half of the Notch system proteins depicted in Figure 1. The comprehensive work of Lv et al. [10] is confined to Notch signaling in the invertebrates, while the study of Babonis and Martindale [11], which considers a broad range of signaling systems, deals with only twenty components of the Notch system. The present work considers all 49 of the Notch pathway components (the 46 that appear in Figure 1 and three that were identified after Figure 1 was built) and covers both vertebrate and invertebrate animals. It is based on the use of orthologs, which provides a firm foundation for tracing the appearance of the proteins of the Notch signaling system, evolutionary level by evolutionary level, from the first protozoa to modern humans. Only with our own appearance does recruitment to the system reach completion.
Methods
Searches for orthologs were performed using the protein BLAST (Basic Local Alignment Search Tool) program of the NCBI (National Center for Biotechnology Information of the National Library of Medicine), https://blast.ncbi.nlm.nih.gov/Blast.cgi, with the following search parameters: Max target sequences 1000; Expect threshold 200000; Word size 2; Max matches in a query range 0; Matrix BLOSUM62; Gap costs Existence 11, Extension 1; no compositional adjustments; no low-complexity region filter.
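For readers who prefer a scripted search, the web-form settings listed above can be approximated on the command line. The following is a minimal sketch only, assuming a local installation of the NCBI BLAST+ `blastp` program; the searches reported here were run through the web interface, and the function name, file name and database name are hypothetical. The "Max matches in a query range" setting is left out because no command-line equivalent is assumed here.

```python
import shlex


def blastp_args(query_fasta: str, database: str) -> list[str]:
    """Build a blastp argument list mirroring the web-form settings.

    Assumption: NCBI BLAST+ is installed locally; the paper itself used
    the web interface, so this mapping is illustrative only.
    """
    return [
        "blastp",
        "-query", query_fasta,
        "-db", database,
        "-max_target_seqs", "1000",  # Max target sequences 1000
        "-evalue", "200000",         # Expect threshold 200000
        "-word_size", "2",           # Word size 2
        "-matrix", "BLOSUM62",       # Matrix BLOSUM62
        "-gapopen", "11",            # Gap cost: existence 11
        "-gapextend", "1",           # Gap cost: extension 1
        "-comp_based_stats", "0",    # No compositional adjustments
        "-seg", "no",                # No low-complexity region filter
    ]


if __name__ == "__main__":
    # Hypothetical file and database names, shown only to print the command.
    print(shlex.join(blastp_args("human_probe.fasta", "nr")))
```

The function only assembles the command; passing the list to `subprocess.run` would execute it on a machine where BLAST+ and the chosen database are available.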
To be recognised as an ortholog in the present study, a sequence found in the databases had to be annotated with the same name as the probe sequence from the human genome. A sequence annotated as “-like” was rejected. Of course, by the definition of an ortholog, the probe sequence had to be absent when earlier-appearing clades were searched.
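The acceptance rule just described can be written down as a small filter. This is a sketch under stated assumptions, not the procedure actually used: the function names are hypothetical, hit titles are assumed to end with an organism tag in square brackets, and "same name" is interpreted as an exact match after normalising case and spacing. A title is rejected when it carries a trailing "-like", so a probe whose own name contains "-like" (for example "delta-like protein 1") is still accepted when the hit title matches it exactly.

```python
import re


def normalise_title(title: str) -> str:
    """Drop a trailing '[Organism name]' tag and normalise case and spacing."""
    without_organism = re.sub(r"\s*\[[^\]]*\]\s*$", "", title)
    return " ".join(without_organism.lower().split())


def is_named_ortholog(hit_title: str, probe_name: str) -> bool:
    """Accept a hit only if it is annotated with the probe's own name."""
    name = normalise_title(hit_title)
    if name.endswith("-like"):
        return False  # a trailing "-like" annotation is rejected
    return name == normalise_title(probe_name)


def absent_from_earlier_clades(found_by_level: dict[int, bool], level: int) -> bool:
    """True if no named ortholog was found at any level below `level`."""
    return not any(found for lvl, found in found_by_level.items() if lvl < level)


if __name__ == "__main__":
    # Hypothetical hit titles, for illustration only.
    print(is_named_ortholog("presenilin-1 [Amphimedon queenslandica]", "presenilin-1"))
    print(is_named_ortholog("presenilin-1-like [Amphimedon queenslandica]", "presenilin-1"))
```

Real database titles often carry extra qualifiers (isoform labels, "PREDICTED:" prefixes), which this sketch deliberately does not handle.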
Sequence comparisons between two proteins and dot plots of the comparisons were made using the BLAST 2-sequences tool of the BLAST program, using the same search parameters as above. All one-on-one Expect values recorded in this paper were results of BLAST 2-sequence comparisons between the two proteins.
Properties of the orthologs listed in Supplementary Table S1 were taken from the listings in GeneCards (https://genealacart.genecards.org/Result). GeneCards also proved useful when the aliases of genes discussed in the literature had to be interpreted to provide HGNC symbols.
The clades leading up to the emergence of the primates are numbered as Phylostratum Levels in the following scheme, which follows the formulation by Domazet-Loso and Tautz [12].
Figure 2.
The clades leading to Homo sapiens. The numbering of the 19 Phylostratum Levels follows that of [12].
Results
Figure 1 above depicts 46 proteins of the NOTCH signaling pathway. These are listed in Table 1a,b together with the orthologs found by BLAST searches (see Methods). Included in the tables are the three recently found paralogs of the NOTCH2NL gene, which appeared in human evolution some three million years ago [13]. Also listed in the tables are the HGNC symbol for each protein (in cases where the name listed in Figure 1 is not an HGNC symbol), the UniProt symbol for the protein, the ortholog level, the Phylostratum number, and the Expect value obtained by a BLAST 2-sequence comparison between this ortholog and the corresponding protein in H. sapiens.
Figure 3 depicts the contribution of the orthologs at each phylostratum level (see the definitions of the phylostratum levels in Methods), while Figure 4 shows the data as the accumulated total of orthologs as a function of phylostratum level:
Figure 3.
Contribution of orthologs of the proteins of the Notch signaling system at each phylostratum level. Data from Tables 1a and 1b.
Contributions begin already at the level of the protozoa with major additions occurring with the appearance of the sponges, the sea anemones and the jawed fish. Interestingly, it would appear that the sea urchins and tunicates did not contribute new proteins to the Notch signaling pathway.
From Figure 4, it would appear that by the time the sea anemones arrived, fully half of the Notch signaling proteins had already been added to the evolving pathway.
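The relation between the per-level contributions and the accumulated totals is a running sum, and the level at which half of the final total is reached can be read off directly from it. The sketch below illustrates the arithmetic only; the function names are hypothetical and the counts are toy values chosen for the example, not the values of Tables 1a and 1b.

```python
from itertools import accumulate


def cumulative_totals(new_per_level):
    """Running total of orthologs after each phylostratum level."""
    return list(accumulate(new_per_level))


def first_level_reaching(new_per_level, fraction=0.5):
    """Return the 1-based level at which the running total first reaches
    `fraction` of the final total."""
    totals = cumulative_totals(new_per_level)
    if not totals or totals[-1] == 0:
        raise ValueError("no orthologs to accumulate")
    target = fraction * totals[-1]
    for level, total in enumerate(totals, start=1):
        if total >= target:
            return level


if __name__ == "__main__":
    # Toy counts, NOT data from this study: new orthologs at levels 1..5.
    toy_counts = [2, 0, 3, 1, 4]
    print(cumulative_totals(toy_counts))
    print(first_level_reaching(toy_counts))
```

Feeding in the per-level counts from the tables in place of the toy list would give the accumulated curve and the half-way level for the real data.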
Figure 5 below shows the proteins that were to form the Notch signaling system as they appeared, clade by clade, in the early evolutionary history of the animals. These proteins accumulate, so that at each clade the Notch signaling proteins that appeared in earlier clades are, in general, still retained in the newly evolved organism.
(The genome of the primitive metazoan Trichoplax adhaerens, a placozoan, possesses no protein designated as Notch, but a sequence designated as “uncharacterized protein TRIADDRAFT_57304”, when BLASTed against H. sapiens, yields NOTCH1 and NOTCH2 as the two top hits; both show Expect values of 0.0 in BLAST 2-sequence analyses against this Trichoplax sequence.)
The protein that appears as [DLL] among the Porifera components in Figure 5 is designated in this fashion because the five DLL proteins of the sponges do not fit the definition of an ortholog as given in Methods: “To be recognised as an ortholog in the present study, a sequence found in the databases had to be annotated with the same name as the probe sequence from the human genome”. Moreover, Figure 1 depicts only three DLL proteins, not five. To accommodate this complexity, the “ortholog” in the sponges is designated here as a single protein, [DLL].
The sequences of two of the five DLL proteins of the sponge Amphimedon queenslandica are compared with the sequence of DLL1 of H. sapiens in Figure 6 below.
Exploring the Deep Evolutionary Origins of the Notch, DLL and JAG Proteins
The Porifera are classified in the Metazoa, the earliest multicellular animals. Figure 5 shows that both a DLL protein and NOTCH2 had already appeared in these animals, giving them the possibility of cell-cell interactions, with the DLL in the Sender cell and the NOTCH in the Receiver cell both being present. It is of interest to ask what might have been the ancestors of these fundamental proteins of the Notch signaling system. To answer this question, we performed a BLAST search, using the NOTCH protein of the glass sponge Oopsacas minuta as bait and querying all the animals below the Metazoa. The top hit was BAM76481.1, a sequence annotated as a protein tyrosine kinase (PTK) of the bacterivorous amoeba Ministeria vibrans. A BLAST 2-sequence comparison between this sequence and that of the sponge’s NOTCH returned a convincing Expect value of 2E-103.
A comparison between the protein sequences of the NOTCH protein of the sponge O. minuta, the four NOTCH proteins of H. sapiens and the protein tyrosine kinase of M. vibrans is shown in Figure 7 that follows:
Using now as bait the Delta protein of the same glass sponge, O. minuta (the glass sponge possesses a single DLL protein, named Delta), and again querying all the animals below the Metazoa, the same sequence, BAM76481.1 of M. vibrans, was returned. A BLAST 2-sequence comparison between this sequence and that of the sponge’s DLL returned an Expect value of 4E-80.
A comparison between the protein sequences of the single Delta protein of the sponge O. minuta, three DLL proteins of H. sapiens, and the protein tyrosine kinase of M. vibrans is shown in Figure 8 that follows:
Finally, the two JAG proteins of the Notch signaling pathway, present together with the DLL proteins on the membrane of the Sender cell (Figure 1), also have strong sequence similarity with the protein tyrosine kinase of the amoeba M. vibrans, the BLAST 2-sequence Expect values being 2E-120 for JAG1 and 3E-122 for JAG2. A comparison between the protein sequences of the JAG proteins of H. sapiens and the protein tyrosine kinase of M. vibrans is shown in Figure 9:
A phylogenetic tree of the relationship between the protein tyrosine kinase (PTK) of M. vibrans and the JAG and DLL proteins of H. sapiens is shown in Figure 10:
Figure 11 that follows depicts the phylogenetic tree of the DLL and JAG proteins of H. sapiens, the Delta protein of O. minuta and the protein tyrosine kinase of M. vibrans (the same input as was used in Figure 10), after dialling down the maximum sequence difference to 0.7. At a maximum sequence difference of 0.8, all 7 proteins were still present (data not shown).
Dialling down the maximum sequence difference in the COBALT program until only the sponge’s DLL and the PTK remained showed that the amoeba and sponge sequences were more closely related to each other than either was to the sequences of the H. sapiens proteins.
All of the sequences depicted in Figures 6 through 9 show a series of six or more repeating cysteine residues, characteristic of proteins of the EGF family. This will be considered further in the DISCUSSION section.
From Figure 5 it can be seen that by the time the Cnidaria appeared, the entire central core of the pathway was in place: two of the membrane-bound ligands of the Sender cell (DLL1 and JAG1), two NOTCH proteins (the receptors on the Receiver cell), and the nuclear-active terminal protein of the Receiver cell were all present. In addition, some proteins serving as activators and repressors of the central core proteins had appeared: members of the co-repressor complex (Figure 5) and of the γ-secretase complex, as well as the Fringe proteins (LFNG, MFNG, and RFNG). The further evolution of the Notch signaling pathway from the cephalochordates [14] through to the sharks and rays (the Elasmobranchii) and bony fish [15] consisted of the addition of proteins that contribute further to the control of the central core (listed in Table 1). NOTCH4, with a role in angiogenesis [2], and PTCRA (Pre T-Cell Antigen Receptor Alpha), which contributes to the regulation of early T-cell development [16], appeared with the amphibia and amniota, respectively. The three NOTCH2NL paralogs appeared during the recent evolution of the human species.
Discussion
In the evolutionary trajectory from the first living organisms to the emergence of the humans, genes were added to the accumulating genome at each Phylostratum Level. At many levels, these included genes that were later to form the Notch signaling system. It is of much interest to ask what was the function of such a “Notch signaling” gene in an animal that long preceded the appearance of the complete Notch system.
Figure 5 above shows the contribution of orthologs of the proteins of the Notch signaling system at each Phylostratum Level for the lower half of the evolutionary trajectory leading to the humans. At the level of the Eukaryota, we see four genes appearing: RBPJ, SNW1, HDAC1 and HDAC2. These four are found in the nucleus of the Receiver cell, as can be seen in the scheme of Figure 1. The two HDAC genes are shown in the figure as being in the co-repressor complex, while RBPJ is depicted as directly interacting with DNA.
Of these four genes, the HDAC genes are expressed in the cell nucleus of the protozoan Trypanosoma brucei, where they participate in the expression of the VSG (Variant Surface Glycoprotein) genes that enable the parasite to evade the host’s immune response [17]. Indeed, that paper [17] reported on attempts to use inhibitors of HDACs to combat infection by the parasite. For RBPJ, Prevorovsky et al. [18] found orthologs of RBPJ in several fungal species, while Oravcova et al. [19] showed that one such expressed protein (named Cbf11 in Schizosaccharomyces pombe) could act as a sequence-specific DNA-binding activator of transcription. Finally, for SNW1, Halova et al. [20], working with the fungus Saccharomyces cerevisiae, showed that the splicing factor Prp45 (a homolog of SNW1) is expressed in the yeast, where it is present in the spliceosome and takes part in co-transcriptional splicing.
Thus, all four genes have been shown to be expressed in fungi or protozoa and suggestions made as to their roles in these organisms. Yet hundreds of millions of years would pass before a complete sender-to-receiver Notch signaling system would appear in the Cnidaria (the sea anemones, corals, and jellyfish).
Figure 5 shows that, meanwhile, thirteen Notch signaling system orthologs were recruited with the appearance of the Porifera, the sponges. Schenkelaars et al. [21], noting that the sponges were the first animals to be truly multicellular (the fungi forming mere syncytia), investigated whether the cell-to-cell Notch signaling system might already be fully present in a sponge. Investigating the Notch and Wnt signaling pathways in the glass sponge O. minuta, they showed that, while many components of the Notch pathway were already present in the sponge, including the eponymous NOTCH, key components were not. The nuclear-active HES1 and HES5 of the Receiver cell were not identified, nor were JAG1 or JAG2 of the Sender cell. Schenkelaars et al. [21] did find a DLL of the Sender cell, shown as [DLL] in Figure 5. The DLL proteins in the sponges are not named as any of DLL1, DLL3, or DLL4 (see above), although they may play the role of a Notch ligand in the sponge [21]. The extracellular portion of the protein has a strong sequence similarity to that of the human DLL proteins (Figure 6), and it is this portion of these molecules that binds to the NOTCH proteins. Richards and Degnan [22, 23], studying the sponge Amphimedon queenslandica, followed the expression of Notch pathway components during the embryological development of this organism, with results suggesting that classic roles for the pathway, such as mediating the choice between cell fates (here between the path to an anterior pole cell or, alternatively, to a flask cell), are already operative at this stage of evolution.
The Cnidaria have a full Notch pathway, extending from DLL1 and JAG1, the ligands in the Sender cell (Figure 1), to the nuclear-active HES1 in the Receiver. In Hydra, pharmacological inhibition of Notch pathway components, either inhibiting presenilin by treatment with the PSEN inhibitor DAPT (N-[N-(3,5-difluorophenacetyl)-L-alanyl]-S-phenylglycine t-butyl ester) or blocking the intracellular domain of NOTCH itself by treatment with the Notch inhibitor SAHM1 (a stapled α-helical peptide derived from MAML), disturbed head regeneration and tentacle patterning. These effects demonstrate a functional role for the pathway in the animal’s development [24]. Pan et al. [25] extended these findings, showing that genetic modification of Notch pathway components (over-expressing the intracellular domain of NOTCH, or knockdown of NOTCH) recapitulated the pharmacological effects. Note that PSEN1 and PSEN2, inhibited in the experiments of Munder et al. [24], were added to the evolving genome already in the Porifera, thus showing that earlier-appearing components of the Notch pathway were integrated into the signaling system by the time that the Cnidaria appeared.
Our exploration of a possible pre-Metazoan origin of the Notch proteins suggested (see Results) a protein tyrosine kinase (PTK) of the amoeboid Ministeria vibrans as a possible candidate. To our surprise, our subsequent explorations of such an origin for the DLL and JAG proteins came up with the same protein tyrosine kinase as a possible candidate for these proteins also.
The PTK, DLL, JAG, and Notch proteins show extensive sequence similarity (Figures 6 through 9) and, in particular, a sequence of about forty residues containing sets of six-fold repeats of cysteine residues. This is a feature characteristic of the large family of EGF proteins [26], where these sequences form part of the extracellular ligand-binding domain of these proteins. The 42-long sequence GDKCQTDMNECLSEPCKNGGTCSDYVNSYTCKCQAGFDGVHC in NOTCH2 includes the calcium-binding domain of this protein (GeneCards; see Methods).
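The cysteine pattern in the NOTCH2 sequence quoted above can be inspected with a few lines of code. The sketch below takes the sequence exactly as printed in the text (written here in the same segments, each beginning at a cysteine) and reports its length, the 1-based positions of the cysteine residues, and the spacing between successive cysteines. The function names are hypothetical, and nothing beyond the quoted sequence is assumed.

```python
# NOTCH2 stretch as quoted in the text, segmented so each piece after the
# first begins with a cysteine (C).
SEGMENTS = ["GDK", "CQTDMNE", "CLSEP", "CKNGGT", "CSDYVNSYT", "CK", "CQAGFDGVH", "C"]
NOTCH2_STRETCH = "".join(SEGMENTS)


def cysteine_positions(sequence: str) -> list[int]:
    """Return the 1-based positions of every cysteine in the sequence."""
    return [i for i, residue in enumerate(sequence.upper(), start=1) if residue == "C"]


def cysteine_spacings(sequence: str) -> list[int]:
    """Return the distances between successive cysteines."""
    positions = cysteine_positions(sequence)
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


if __name__ == "__main__":
    print("length:", len(NOTCH2_STRETCH))
    print("cysteine positions:", cysteine_positions(NOTCH2_STRETCH))
    print("spacings:", cysteine_spacings(NOTCH2_STRETCH))
```

The same two functions can be applied to the corresponding stretches of the PTK, DLL and JAG sequences to compare their cysteine spacing patterns side by side.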
Suga and his colleagues [27] performed an extensive survey of the PTK proteins of all pre-Metazoan animals. In particular, they analysed the Filasterea clade to which Ministeria vibrans belongs. (“Filasterea is a proposed basal Filozoan clade of single-celled amoeboid eukaryotes that includes Ministeria and Capsaspora.” – Wikipedia). The Filasterea are a sister group to the Metazoa and Choanoflagellata and are thus central to a discussion of the evolution of the Metazoa. Suga et al. [27] concluded that the “divergence pattern of [PTKs] is consistent with the hypothesis that they act to detect changes in the extracellular environment or to recognize and catch prey, because each organism has to be adapted to its own environment and nutrient conditions”, a statement that provides a plausible answer to the question of what role the PTKs played in these pre-Metazoan animals.
The Filasterean PTKs have transmembrane sequences, as do other members of the EGF protein family. It has not been shown that dimerization of the transmembrane portions of the molecule plays an important role in the function of these pre-Metazoan PTKs as tyrosine kinases. This has been found, however, in Metazoan PTKs [28].
It is thus a possible scenario that, as the Metazoans diverged from the common Filasterean/Metazoan precursor, their protein tyrosine kinases evolved into forms that would recognize and bind to one another in the typical ligand/receptor mode. Thus, evolving from a pre-Metazoan protein tyrosine kinase, the DLL and JAG proteins became ligands on the extracellular surface of a Sender cell while the Notch proteins became the extracellular receptors on the Receiver cell.