3.2. Transcription factors among DEGs
TFs were identified among the sex-specific DEG sets from RNA-seq analyses [
11]. The results of the RNA-seq experiment, supplemented with annotation of TFs, TRs and PKs, can be found in Supplement S3. Among the DEGs from the tested comparisons (LvS, FvM, FvH and MvH), TFs accounted for 8.70%, 10.77%, 5.88% and 10.91% of the DEGs, respectively (
Table 1).
The largest number of TFs was detected for the LvS comparison (8.70% of DEGs). This reflects the large number of DEGs detected between the leaf and the shoot apex, which include genes responsible for the transition from the vegetative to the flowering phase. The number of DEGs in the shoot apex among male, female, and hermaphrodite lines was significantly lower, while the percentage of TFs detected for these comparisons remained similar. In the MvH comparison, where six TFs were detected, half belonged to the MADS-MIKC family; the other assigned families were HB-WOX, NAC and C2C2-YABBY. In the FvH comparison, in which only two TFs were detected, the bHLH and C3H families were represented.
Figure 3 and
Figure 4 show the TF families with the highest numbers of TFs detected in the FvM and LvS comparisons, respectively. For the FvM and MvH comparisons, the largest number of differentially expressed TFs belonged to the MADS-MIKC family. This represents a significant increase in the proportion of these TFs relative to their share of the whole genome. In the LvS comparison, the most abundant TF families (bHLH, MYB, NAC and C2H2) occur in proportions that match the distribution of TFs across the genome. TFs of the MADS-MIKC family, with 16 differentially expressed members, are also over-represented in this comparison relative to their presence in the reference genome. MADS TFs are a family of DNA-binding proteins that play an essential role in various plant developmental processes, especially floral organ identity and differentiation [
16,
17], and additionally control the expression of genes that determine the identity and morphology of sepals, petals, stamens, and carpels. The MADS TFs in cucumber are similar to those in other plants, as they are also involved in the regulation of flowering time and the development of floral organs [
18]. The detection of the MIKC-MADS family as the most abundant among the detected TFs directs us towards linking MADS to the ABC model of flower development [
19].
In addition, the presence of eight differentially expressed TFs of the AP2 family is noteworthy, given the link between sex determination processes and ethylene metabolism [
20].
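The statement that a TF family such as MADS-MIKC is over-represented among differentially expressed TFs relative to its share of the genome can be checked with a one-sided hypergeometric test. The sketch below is one possible way to do this and is not necessarily the procedure used in this study; the function name and all counts in the example are hypothetical placeholders, not values taken from Table 1 or the supplements.

```python
from math import comb


def family_enrichment_pvalue(k, n_drawn, family_in_genome, tfs_in_genome):
    """One-sided hypergeometric p-value P(X >= k).

    k                -- differentially expressed TFs in the family of interest
    n_drawn          -- all differentially expressed TFs in the comparison
    family_in_genome -- TFs of that family annotated in the genome
    tfs_in_genome    -- all TFs annotated in the genome
    """
    upper = min(family_in_genome, n_drawn)
    total = comb(tfs_in_genome, n_drawn)
    # Sum the probabilities of observing k, k+1, ..., upper family members.
    tail = sum(
        comb(family_in_genome, i)
        * comb(tfs_in_genome - family_in_genome, n_drawn - i)
        for i in range(k, upper + 1)
    )
    return tail / total


# Hypothetical counts for illustration only (not data from this study):
# 3 of 6 differentially expressed TFs belong to a family that makes up
# 40 of 1500 TFs annotated in the genome.
p = family_enrichment_pvalue(k=3, n_drawn=6, family_in_genome=40, tfs_in_genome=1500)
print(f"P(X >= 3) = {p:.2e}")
```

A small p-value would support the reading that the family contributes more differentially expressed TFs than expected from its genomic frequency; with several families tested, a multiple-testing correction would also be needed.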
3.3. Ontology analysis among differentially expressed genes
For DEGs, an overrepresentation of GO terms was found in all comparisons (LvS, FvH, FvM and MvH), as presented in
Figure 5. For FvH, most of the GO terms were related to processes such as pollen sperm differentiation, male gametogenesis, or microgametogenesis differentiation, indicating significant relationships to processes involved in sex determination and male organ formation. In this comparison, the female organs are formed in both female and hermaphrodite flowers, while the male organs are present only in hermaphrodite flowers.
In the FvM comparison, the most significant enrichment concerned genes involved in flower formation, overall organ development and carbohydrate metabolism.
Processes related to enzyme activity (monooxygenase, oxidoreductase, hydrolase and pectinesterase) were the most enriched in the MvH comparison. Genes involved in the development of the carpels and gynoecium were also enriched. In this comparison, the female organs are formed in hermaphrodite flowers, whereas in male flowers their growth is inhibited, so a difference in this group of genes is expected.
In the LvS comparison, the enriched processes differed markedly from those in the other comparisons. The highest enrichment was found for processes related to photosynthesis, light response, and plastid or chloroplast organization. Comparing a vegetative organ such as the leaf with a generative structure such as the whole shoot apex with small floral buds indicates which genes and processes most strongly differentiate these two organs. The number of DEGs is highest in the LvS comparison and far exceeds the number of DEGs in the other comparisons.
The ontology networks (Supplements S4-S7) show the significantly enriched processes, which are marked in red and yellow. For the FvH and MvH comparisons, the ontology networks are considerably simpler than those for the FvM and LvS comparisons, owing to the smaller number of significant DEGs identified. In the FvH comparison, the most enriched processes converge on a single final term, pollen germ cell differentiation. Similarly, for the MvH comparison, the ontology converges on final terms describing gynoecium development and carpel development, although a separate branch indicating metabolic processes, with a final term describing oxidation-reduction processes, is also present. By comparison, the FvM ontology network is much more extensive, and three main branches can be observed.
The first corresponds to floral organ formation processes, the second to metabolic processes, and the third to enriched processes related to transport. Among the ontology terms describing organ development, floral organ development and floral organ formation show the greatest enrichment. In addition, we can distinguish terms describing gynoecium development and carpel development. Within metabolic processes, oxidation-reduction and carbohydrate metabolism show the greatest enrichment. The different branches of the ontology converge on final terms describing the metabolism of pectin, salicylic acid, inositol and fatty acids. A separate branch of the network describes processes related to the transport of carbohydrates, saccharides and sucrose.
The next step of the analysis was to check whether any DEGs in the enriched GO terms were TFs. For this purpose, we checked all differentially expressed TFs in the sex-specific comparisons. In
Figure 6, the frequency of TFs among DEGs for each ontology term is shown for the FvM comparison. TFs are associated with floral developmental processes, carpel development, and organ formation. This indicates the actual involvement of TFs in processes linked to the plant's sex development. No TFs were detected for the enriched ontology terms in the FvH comparison. For the MvH comparison, two TFs involved in gynoecium and carpel development processes were detected among the enriched ontology terms. These are also directly related to plant sex development, which underlines the relevance of TFs in this process. The LvS comparison yielded the most DEGs and therefore contains the largest number of TFs. The ontology terms to which these TFs were assigned related to metabolic processes and biosynthesis.
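The frequency of TFs among DEGs for each enriched ontology term can be tallied as in the minimal sketch below. It assumes that a mapping from enriched GO terms to their DEGs and a list of TF gene identifiers are already available; the function name, the term labels and the gene identifiers are hypothetical and serve only to illustrate the calculation.

```python
def tf_frequency_per_term(term_to_degs, tf_ids):
    """Return {term: (n_tf, n_deg, fraction)} for each enriched GO term."""
    tf_ids = set(tf_ids)
    result = {}
    for term, degs in term_to_degs.items():
        degs = set(degs)  # ignore duplicate gene identifiers
        n_tf = len(degs & tf_ids)
        fraction = n_tf / len(degs) if degs else 0.0
        result[term] = (n_tf, len(degs), fraction)
    return result


# Hypothetical identifiers for illustration only (not data from this study).
example = tf_frequency_per_term(
    {
        "carpel development": ["gene1", "gene2", "gene3", "gene4"],
        "pectin metabolic process": ["gene5", "gene6"],
    },
    tf_ids=["gene1", "gene3", "gene9"],
)
for term, (n_tf, n_deg, fraction) in example.items():
    print(f"{term}: {n_tf}/{n_deg} DEGs are TFs ({fraction:.0%})")
```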
3.6. Regulatory transcription factors influencing DEGs
The study of flowering in cucumber is crucial due to its significant economic importance and its vulnerability to both endogenous and exogenous factors. Together, these factors determine the expression of genes, which in turn is influenced by the activity of various TFs. The regulatory TFs were assigned to families and functionally curated.
Our study shows the interactions of regulatory TFs and their influence on DEGs; taken together, these results allow us to address the question of which TFs influence flower morphogenesis at early stages of growth.
Using the TF enrichment tool in the PlantRegMap database, a list of enriched regulatory TFs was obtained for each of the FvM, FvH, MvH and LvS comparisons. The retrieved regulatory TFs were annotated according to the B10v3 data sets (Supplement S9). Analysis of the detected TF families revealed that the NAC family was the most abundant family detected for the FvH and MvH comparisons. Furthermore, genes encoding TF families such as bHLH and MYB showed a higher frequency of detection. Notably, a greater number of enriched TFs were identified in the FvH and MvH comparisons than in the FvM comparison. For each of the comparisons considered, enriched TFs were detected for a significant number of TF families. The LvS comparison showed the highest number of enriched TFs, with MYB, bZIP, bHLH and DOF being the most frequently detected families.
The interaction maps created between regulatory TFs and their target DEGs are highly complex and are therefore presented as interactive networks in HTML files (Supplements S10-S13). The advantage of this presentation is that a gene of interest can be viewed together with the genes with which it interacts (DEGs or regulatory TFs), along with their annotation. A static image of the networks for the FvM, FvH and MvH comparisons is presented in
Figure 8.
In the FvM comparison, the number of DEGs is significantly higher than in the FvH and MvH comparisons. The FvM interaction network thus contains many more connections, showing that the male and female lines differ substantially in expressed genes. The FvH and MvH comparisons thus serve as a simplified model, as the number of differentially expressed genes is much smaller but relates to processes associated with sex variation.
The resulting interaction networks between regulatory TFs and DEGs varied in complexity, depending on the number of DEGs used to construct each network. The largest network, created for the LvS comparison, consisted of 3029 nodes. The network for the FvM comparison consisted of 468 nodes, that for the FvH comparison of 177 nodes, and that for the MvH comparison of 191 nodes. The number of individual regulatory TFs and DEGs used to construct the networks is shown in
Table 2. The smallest networks were created for the FvH and MvH comparisons, because the flowers compared in each case share a common element of flower architecture. The FvM network is more developed because male and female flowers differ in their generative organs.
In the next step, we grouped the regulatory TFs that influence DEGs into 34 families and performed functional characterization (Supplement S14). This allowed us to elucidate which regulatory TFs influence DEGs, and how, and thus which affect proteins involved in metabolic processes in cucumber lines varying in sex. To assess the strength of influence of the regulatory TF families, the links between TFs and their targets were counted (
Figure 9).
TFs are proteins that help to 'turn on' or 'turn off' certain genes by binding to the promoter, thereby regulating the functioning of the organism. The present study identified several TF families that strongly influence the expression of DEGs in male, female and hermaphrodite flowers. The most numerous families of regulatory TFs across all three networks were AP2/ERF (101 in total), MYB (73), NAC (44), bZIP (39) and bHLH (35). Other families were less numerous in the sum of the three comparisons (
Figure 9). However, some TF families are specific to a single comparison, namely FvM (YABBY, LFY, SRS and EIL), or to two comparisons only: FvH and FvM (WRKY, CPP and FAR1), and FvM and MvH (HSF). In terms of edge numbers, that is, family interactivity, which can be interpreted as the power to influence DEGs, the families with the most connections across the three networks were AP2/ERF (1069), DOF (465), MYB (296), MIKC-MADS (233) and BBR-BPC (219). The other families have fewer than 200 connections.
Table 3 presents the top 10 TFs that have the most connections in each interaction network.
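The node, edge and hub counts described above can be derived from a list of TF-to-DEG links, as in the following minimal sketch. It assumes the network is available as pairs of (regulatory TF, target DEG) together with a TF-to-family mapping; the function name, the identifiers and the family assignments are hypothetical, and this is not the code used to produce Table 2, Table 3 or Figure 9.

```python
from collections import Counter


def summarize_network(edges, tf_family, top_n=10):
    """Count nodes, edges, TFs and links per family, and the best-connected TFs.

    edges     -- iterable of (tf_id, deg_id) pairs
    tf_family -- dict mapping tf_id to its TF family name
    """
    edges = set(edges)  # drop duplicate links
    links_per_tf = Counter(tf for tf, _ in edges)
    tfs_per_family = Counter()
    links_per_family = Counter()
    for tf, n_links in links_per_tf.items():
        family = tf_family.get(tf, "unassigned")
        tfs_per_family[family] += 1
        links_per_family[family] += n_links
    nodes = {gene for edge in edges for gene in edge}
    # Sort by descending link count, then by identifier for a stable order.
    top_hubs = sorted(links_per_tf.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
    return {
        "n_nodes": len(nodes),
        "n_edges": len(edges),
        "tfs_per_family": dict(tfs_per_family),
        "links_per_family": dict(links_per_family),
        "top_hubs": top_hubs,
    }


# Hypothetical links for illustration only (not data from this study).
toy_edges = [("tfA", "deg1"), ("tfA", "deg2"), ("tfB", "deg2"), ("tfC", "deg3")]
toy_families = {"tfA": "AP2/ERF", "tfB": "AP2/ERF", "tfC": "MYB"}
print(summarize_network(toy_edges, toy_families, top_n=2))
```

Counting links per family separately from TFs per family keeps the two rankings distinct, which matters because a family with few members can still account for many connections.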
The question arises as to how the identified TF families affect sex determination, which processes they are involved in, and through which interactions. Hormonal regulation plays a crucial role in sex determination, as the genes primarily involved in this process are associated with ethylene synthesis, such as
CsACS1,
CsACS2,
and CsACS11, which are linked with the genetic loci
F,
M, and
A, respectively [
21,
22,
23,
24]. The expression of and interaction among all three genes contribute to the development of female flowers in cucumber. TFs are important factors that can either activate or repress gene expression. Ethylene is the principal hormone responsible for the formation of specific organs, and genes responsible for ethylene biosynthesis are directly associated with the development of female flowers [
25,
26]. Ethylene production in shoot apex primordia can readily modify the ratio of male to female flowers on the plant. Of the identified TF families, ten are associated with the ethylene response in other plants: AP2/ERF [
27], MYB [
28], NAC [
29], bZIP [
30], bHLH [
31], WRKY [
32], TCP [
33], C2H2 [
34], TALE [
35] and MIKC_MADS [
36]. Additionally, other hormones such as auxin and cytokinin exert a positive effect on female sex determination through interaction with ethylene biosynthesis and signaling pathways [
37,
38]. Families connected with other hormones were also identified in this study: for auxins, NAC [
39], bHLH [
40], LFY [
41]; for cytokinin, bHLH [
40] and BBR-BPC [
42,
43]. Other studies have demonstrated that gibberellins (GA) can have dual effects on sex expression in cucumber, inhibiting femaleness and inducing maleness, and expression analysis has shown that CsACS1G transcription is promoted by auxins and inhibited by gibberellic acid [
44,
45]. According to the literature, there were five families: LFY [
46], YABBY [
47,
48], MYB [
49], BBR-BPC [
42,
43] and TALE [
50], which were correlated with this hormone. AP2/ERF was identified as one of the largest groups of TFs in all three comparisons in this study. AP2/ERF family members induce ethylene signalling and flowering [
51,
52,
53]. The
CsACS11 is one of the ethylene biosynthetic genes [
54] and is also thought to be a sex gene (
a) in cucumber [
22]. So far, it is not clear how the hormonal signalling pathways influence sex at the molecular level, so further detailed characterization of the links between the regulatory TFs is needed. The formation of the complex flower architecture involves the MADS family described above. For the families identified, we found numerous TF-DEG links to flower development. Several studies have reported a role of MIKC-MADS [
36,
55,
56], bHLH [
57,
58], bZIP [
59], and NAC [
60,
61,
62] in promoting or delaying flowering. The TALE family interacts with ethylene and cytokinin signalling [
35]. Together with floral development, the timing of flowering is also crucial for fruit and seed production in plants. MYB [
63], bZIP [
64], bHLH [
57,
58] and WRKY [
65] families are involved in the regulation of flowering time, as described previously. A study in tomato shows that bHLH acts with SFT or LFY to control flowering time. It also influences ethylene biosynthesis, as the expression of ethylene biosynthesis genes is upregulated in the bHLH overexpression line [
66]. In the forming gynoecium of the Arabidopsis flower, other hormones such as auxin and cytokinin interact with bHLH [
40]. BBR-BPC/GAGA has been described in Arabidopsis to regulate the phytohormonal signalling of cytokinins, brassinosteroids and ethylene [
42,
43]. The DOF TF has been reported to be involved in tissue differentiation, cell expansion, seed development, anther or pollen development and flowering in plants [
67,
68,
69]. DOF has also been implicated in the formation of vascular tissue in reproductive organs [
70]. The interaction between the bZIP member and the C2H2 member in melon inhibits the development of the carpel in male flowers [
71,
72]. It also represses the transcription of ethylene biosynthetic genes [
73]. The present study shows that the WRKY family is present solely in the FvH and FvM comparisons and is absent in MvH. WRKY TFs interact with various flowering genes to regulate flowering time in plants [
65,
74]. Another TF family, YABBY, is found exclusively in the FvM comparison. YABBY TFs have been described as playing a crucial role in the development of anthers and pollen sacs in cucumber, Arabidopsis and rice [
75,
76,
77,
78]. TFs from the YABBY family interact with MADS-box genes to control their expression during carpel development [
79]. In addition, the LFY TF, which is known to respond to auxin and regulate flowering initiation in Arabidopsis, was identified only in the FvM comparison [
41,
80].
A second, parallel approach to network analysis is to examine the proteins encoded by DEGs and determine their interactions with regulatory TFs. In the sex comparison networks, such nodes were: for FvH, DNA/RNA-binding proteins and oxidoreductase proteins; for FvM, pectinase, MADS-box TFs and lipase proteins; and for MvH, monooxygenase and triphosphate hydrolases.
Our analyses demonstrated the link between regulatory TFs and various developmental processes, including flower morphogenesis, flowering timing, and interactions with phytohormones. Gene expression governing specific functions involves the action of TFs. Through a detailed examination of the interaction networks, we identified regulatory TFs that have the potential to regulate a significant number of DEGs. These regulatory TFs act as central hubs in the network and can influence a large portion of the nodes, which characterizes them as master regulators (hot links). The identification of these master regulators can serve as a valuable starting point for future investigations. By focusing selectively on these factors, their regulation or knockout may be used to observe changes in DEGs within the context of sex comparison in cucumber. This approach holds considerable potential for expanding our understanding of the complex regulatory networks underlying sex development in cucumber.