6.2. Proteomics, Transcriptomics and Genomics
The generation and processing of huge biological data sets (omics data), made possible by technological and informatics advances, is driving a fundamental change in biomedical research. Although the fields of proteomics, transcriptomics, genomics, bioinformatics, and biostatistics are each gaining ground, they are still largely evaluated separately using different methodologies, producing monothematic rather than integrated information. Combining and applying (multi)omics data improves knowledge of the molecular pathways, mechanisms, and processes that distinguish health from disease [73]. Within this field, proteomics, transcriptomics, and genomics are dynamic partners that provide distinct insights into the complex regulation of biological functions.
Proteomics is the scientific study of the proteome, the whole set of proteins expressed and modified by a biological system. Proteomes are extremely dynamic, constantly changing both within and between biological systems. The word "proteomics" was coined by Marc Wilkins in 1996 to emphasise how much more complex and dynamic studying proteins is than studying genomes. Using techniques such as mass spectrometry (MS), protein microarrays, X-ray crystallography, chromatography-based methods, and Western blotting, proteomics analyses a range of factors related to protein content, function, regulation, post-translational modifications, expression levels, mobility within cells, and interactions. Mass spectrometry has become an essential high-throughput proteomics technique, especially when paired with liquid chromatography (LC-MS/MS). The way protein structure is predicted has fundamentally changed as a result of DL advances such as the AlphaFold algorithm [74].
The use of AI technology has led to notable advances in proteomics. The exponential growth of biomedical data, particularly multiomics and genome sequencing datasets, has ushered in a new era of data processing and interpretation. AI-driven, mass spectrometry-based proteomics research has progressed thanks to data sharing and open-access policies. Initially, AI was restricted to data analysis and interpretation, but recent advances in DL have transformed the field and improved the accuracy and calibre of data. DL may be able to surpass the current best-in-class biomarker identification processes in predicting experimental peptide values from amino acid sequences. The convergence of proteomics and AI presents a transformative paradigm for biomedical research, offering fresh perspectives on biological systems and ground-breaking approaches to disease diagnosis and treatment [75].
Even where proteomics is not expressly discussed, its consequences are implied. Understanding protein-level manifestations becomes crucial at the intersection of genetics and proteomics, as highlighted by the focus on PPAR proteins and their therapeutic potential in colonic disorders. Additionally, the integration of flow cytometry and genomics in haematological malignancies suggests a proteomic component, highlighting the importance of assessing protein expression for accurate diagnosis [76].
Transcriptomics studies the transcriptome, the collection of all RNA transcripts of an organism. An organism's DNA encodes its information, which is then expressed through transcription. Since its first applications in the early 1990s, transcriptomics has seen substantial change. RNA sequencing (RNA-Seq) and microarrays are two key methods in the field. Measurements of gene expression across tissues, conditions, or time points shed light on the biology and regulation of genes. Such analyses have greatly benefited both the understanding of human disease and the identification of wide-ranging coordinated trends in gene expression [77].
Since the late 1990s, technological advances have transformed the field. Transcriptomics has come a long way since its inception thanks to techniques like SAGE, the emergence of microarrays, and NGS technologies in the 2000s. Because the transcriptome is dynamic, it is challenging to define and analyse, which calls for AI methods such as ML techniques like Random Forests and Support Vector Machines. Neural networks and other DL technologies have proven crucial in improving transcript categorisation by uncovering intricate biological principles. Understanding the molecular mechanisms underlying differential gene expression requires thorough analysis of gene expression data. AI tools such as Random Forests and Deep Neural Networks (DNNs) analyse massive datasets to distinguish between clinical groups and detect diseases. Gene expression becomes more complex due to polyadenylation and alternative splicing (AS); AI aids in predicting AS patterns and understanding the splicing code [78].
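The use of Random Forests to distinguish clinical groups from expression profiles can be sketched as follows. The data here are synthetic stand-ins for a real expression matrix (samples × genes), and all names and parameters are illustrative, not taken from any specific study:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples, n_genes = 200, 500

# Simulate log-expression values; make the first 20 genes differentially
# expressed between two clinical groups (e.g. disease vs. healthy).
X = rng.normal(size=(n_samples, n_genes))
y = rng.integers(0, 2, size=n_samples)
X[y == 1, :20] += 1.5  # the "disease" group up-regulates 20 genes

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))

# Feature importances point back to candidate marker genes.
top_genes = np.argsort(clf.feature_importances_)[::-1][:5]
print("top genes:", top_genes)
```

In practice the importances of the top-ranked genes would be cross-checked against independent cohorts before being treated as candidate biomarkers.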
Single-cell RNA sequencing (scRNA-seq) introduces specific challenges, including a high proportion of zero-valued observations, or "dropouts." The visualisation and interpretation of high-dimensional scRNA-seq data have benefited from enhanced dimensionality reduction through ML and DL techniques such as Uniform Manifold Approximation and Projection (UMAP) and t-distributed Stochastic Neighbour Embedding (t-SNE) [78].
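A minimal sketch of this kind of dimensionality reduction, using scikit-learn's t-SNE on a synthetic stand-in for a dropout-heavy count matrix (the cell populations and dropout rate are illustrative assumptions):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Three simulated cell populations, 100 cells each, 50 genes.
centers = rng.normal(scale=5, size=(3, 50))
cells = np.vstack([c + rng.normal(size=(100, 50)) for c in centers])

# Mimic dropout: zero out roughly 60% of entries at random.
mask = rng.random(cells.shape) < 0.6
counts = np.where(mask, 0.0, np.abs(cells))

# Embed the 50-dimensional profiles into 2-D for visualisation.
embedding = TSNE(n_components=2, perplexity=30,
                 random_state=0).fit_transform(counts)
print(embedding.shape)  # (300, 2)
```

The resulting 2-D coordinates are what is typically plotted to inspect whether cells separate into biologically meaningful clusters.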
Transcriptomics research has been greatly impacted by AI, especially in cancer research. RNA-seq technology produces substantial transcriptome data that can be mined with AI-based algorithms [79].
The transcriptome is largely composed of non-coding RNAs (ncRNAs), which have become important actors with roles ranging from complex mRNA regulatory mechanisms to catalytic functions. Transcriptome analysis has continued to advance, as seen in the evolution of techniques from conventional Northern blotting to sophisticated RNA sequencing (RNA-Seq) [76]. By improving the accuracy of identifying cancer states and stages, AI in transcriptome analysis contributes to the development of precision medicine. AI techniques such as denoising and dropout imputation tackle problems in scRNA-seq research such as high noise levels and missing data. AI algorithms are essential for separating biological signals from noise and integrating multi-omics data as profiles become more complex.
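The dropout-imputation idea can be illustrated with a deliberately simplified neighbour-based scheme: zero entries in a cell's profile are filled in from the k most similar cells (published methods such as MAGIC use related but far more sophisticated smoothing). All data and parameters here are illustrative assumptions:

```python
import numpy as np

def knn_impute(counts, k=5):
    """Impute zeros in a cells x genes matrix from the k nearest cells."""
    # Euclidean distances between cell profiles.
    d = np.linalg.norm(counts[:, None, :] - counts[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)       # a cell is not its own neighbour
    imputed = counts.astype(float).copy()
    for i in range(counts.shape[0]):
        nn = np.argsort(d[i])[:k]     # indices of the k nearest cells
        zeros = counts[i] == 0
        imputed[i, zeros] = counts[nn][:, zeros].mean(axis=0)
    return imputed

rng = np.random.default_rng(0)
truth = rng.poisson(5, size=(50, 20)).astype(float)
observed = np.where(rng.random(truth.shape) < 0.3, 0.0, truth)  # 30% dropout
filled = knn_impute(observed)
print("zeros before:", int((observed == 0).sum()),
      "after:", int((filled == 0).sum()))
```

The key design question, which real methods address with learned models rather than raw averages, is distinguishing technical dropouts from genuine biological zeros.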
AI-assisted transcriptome analysis addresses challenges in immunotherapy by analysing tumour heterogeneity, predicting responses, and identifying different cell types. As the era of precision medicine dawns, such technologies will continue to be developed and applied in immunotherapy research; combining them could boost the effectiveness of immunotherapies and alter the course of cancer research [79]. The integration of AI into transcriptomics has significantly enhanced our comprehension of the transcriptome, particularly in light of emerging technologies and new challenges in single-cell research [78].
Genomics reveals the complexities encoded within an organism's entire set of genes. It emphasises the importance of single-nucleotide polymorphisms (SNPs) in defining genetic loci that contribute to complex disorders. It must, however, contend with difficulties such as false-positive associations, highlighting the importance of precision in experimental design. Combined global efforts in genomics, particularly the rapid identification of the SARS-associated coronavirus, demonstrate the discipline's real-world impact in tackling new health concerns [76].
ML has made immense progress in genomics since the 1980s, particularly with the integration of DL techniques in the 2000s. ML has been instrumental in predicting DNA regulatory regions, annotating sequence elements, and characterising chromatin effects. To handle the vast number of sequences and diverse data sources effectively, DL methods such as Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs) have been utilised. The success of unsupervised learning, specifically GANs and Autoencoders (AEs), in identifying intricate patterns in biological sequences by extracting representative features from genomics data further demonstrates the potential of these techniques in genomics research.
The integration of ML and CRISPR-Cas9 technology is a pivotal meeting of experimental and computational biology, expediting research on large-scale genetic interactions. ML and DL approaches have shown potential in identifying connections between diseases and genes, as well as forecasting genetic susceptibility to complex disorders. The range of methods is broad, from SVM-based classification to comprehensive frameworks employing CNNs, as demonstrated by CADD, DANN, and ExPecto [74].
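The core operation behind CNN-based sequence models of this kind is sliding learned filters over one-hot-encoded DNA. The sketch below uses a hand-set filter for a TATA-like motif purely for illustration; real frameworks learn thousands of such filters from data:

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (len, 4) one-hot matrix."""
    return np.array([[float(b == base) for base in BASES] for b in seq])

def conv1d(x, filt):
    """Valid 1-D convolution of a (L, 4) sequence with a (w, 4) filter."""
    w = filt.shape[0]
    return np.array([(x[i:i + w] * filt).sum()
                     for i in range(x.shape[0] - w + 1)])

# A filter that responds maximally to the subsequence "TATA" (illustrative).
motif = one_hot("TATA")

seq = "GGCGTATAAGGC"
scores = conv1d(one_hot(seq), motif)
print("best match at position", int(scores.argmax()))  # prints 4
```

Stacking many such convolutions with nonlinearities and training their weights end-to-end is what lets models like DANN score the functional impact of sequence variants.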
The emergence of AI has greatly impacted big-data fields such as functional genomics. Large amounts of data have been organised and abstracted using deep architectures, which has improved interpretability. Conversely, the lack of explainability in deep AI architectures casts doubt on the applicability and transparency of findings. The wider use of functional genomics depends on the free and open exchange of AI tools, resources, and knowledge. In biology and functional genomics, AI technologies are selected to provide mechanistic understanding of biological processes, enabling systems biology to assess biological systems or develop theoretical models that forecast their behaviour. The use of AI in systems biology will see competition or cooperation between data-driven and model-driven methods. DeepMind's AlphaFold approach highlights the power of AI, particularly transformer-based models. Functional genomics involves complex considerations of individual and communal rights, and the application of AI requires navigating heterogeneous data, interpreting pertinent questions, and addressing legal, ethical, and moral issues. In the developing landscape of AI, a cautious approach is required to ensure that the advantages outweigh the potential negative repercussions [74].
These diverse interactions offer insight into the vast synergy present in molecular biology. Proteomics reveals expression at the protein level, transcriptomics reveals the functional RNA environment, and genomics lays the groundwork by interpreting genetic data. The integration of these disciplines offers a thorough understanding of diseases, emphasising the value of a multimodal approach in diagnosis and therapy.
Ultimately, the dynamic interplay of transcriptomics, proteomics, and genomics holds the key to understanding the complexity of illness. This discussion highlights the dynamic nature of molecular biology, where each specialty contributes differently to the overarching narrative of health and disease [76].
In summary, while proteomics, transcriptomics, and genomics are each making individual strides, they are often evaluated in isolation, yielding monothematic information. To overcome this limitation, efforts are under way to integrate (multi)omics data, aiming to enhance our understanding of the molecular pathways, mechanisms, and processes related to health and disease. Synergistic partnerships within and across these disciplines provide unique insights into the intricate regulation of biological functions, and their integration offers a comprehensive understanding of disease, underscoring the significance of a multimodal approach in diagnosis and therapy.