Preprint
Review

Current Applications of Artificial Intelligence in the Neonatal Intensive Care Unit

Submitted: 02 April 2024

Posted: 02 April 2024

A peer-reviewed article of this preprint also exists.

Abstract
Artificial intelligence (AI) refers to computer algorithms that replicate the cognitive function of humans. Machine learning is widely applicable using structured and unstructured data, while deep learning is derived from the neural networks of the human brain that process and interpret information. During the last decades, AI has been introduced in several aspects of healthcare. In this review, we aim to present the current application of AI in the neonatal intensive care unit. AI-based models have been applied to neurocritical care, including automated seizure detection algorithms and electroencephalogram-based hypoxic-ischemic encephalopathy severity grading systems. Moreover, AI models evaluating magnetic resonance imaging have contributed to progress in the evaluation of the developing neonatal brain and the understanding of how prenatal events affect both structural and functional network topologies. Furthermore, AI algorithms have been applied to predict the development of bronchopulmonary dysplasia and assess the extubation readiness of preterm neonates. Automated models have also been used for the detection of retinopathy of prematurity and the need for treatment. Among others, AI algorithms have been utilized for the detection of sepsis, the need for patent ductus arteriosus treatment, the evaluation of jaundice, and the detection of gastrointestinal morbidities. Finally, AI prediction models have been constructed for the evaluation of the neurodevelopmental outcome and the overall mortality of neonates. Although the application of AI in neonatology is encouraging, further research in AI models is warranted in the future including retraining clinical trials, validating the outcomes, and addressing serious ethics issues.
Subject: Biology and Life Sciences - Other

1. Introduction

During the last decades, artificial intelligence (AI) has been introduced in several aspects of human life, including the healthcare industry [1]. AI refers to computer algorithms that replicate the cognitive function of humans, using specified operational models produced from the statistical assessments of big data sets [2].
Machine learning (ML) and deep learning (DL) are subsets of AI that have been widely applied to the healthcare industry [3]. ML uses both unstructured data that are difficult to arrange using predetermined structures (e.g., clinical notes) and structured data that are easily organized into predefined structures. Furthermore, ML models generate software algorithms to develop AI decision-support systems [3]. The majority of these systems are created using standard algorithms, which consistently produce the same outcome for a given input, and thus, decision-support systems help healthcare professionals analyze enormous amounts of information [4,5,6]. Unlike this broader definition of ML, the fundamental idea behind DL is derived from the neural networks of the human brain that process and interpret information. To simulate this process, DL uses artificial neurons in a computer neural network, and when the number of layers is large (i.e., deep), the network can model more intricate links between input and output [7,8]. Through training, the network gradually learns increasingly complex data representations. Finally, natural language processing is an AI technology that aids computers in comprehending and interpreting human language and in organizing clinical notes and unstructured data, thus enabling better decision-making [9,10].
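To make the idea of stacked layers of artificial neurons concrete, the following is a minimal sketch of a forward pass through a small feedforward network. It is illustrative only: the function names, the layer sizes, and the hand-picked weights in `LAYERS` are assumptions made for this example and do not come from any of the cited studies, and a real network would learn its weights from data.

```python
import math

def dense(inputs, weights, biases):
    """One fully connected layer: each neuron computes a weighted sum plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Rectified linear activation applied to every neuron in a layer."""
    return [max(0.0, v) for v in values]

def sigmoid(v):
    """Squash a single value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def forward(inputs, layers):
    """Pass inputs through the hidden layers (ReLU) and one sigmoid output neuron."""
    activation = inputs
    for weights, biases in layers[:-1]:
        activation = relu(dense(activation, weights, biases))
    weights, biases = layers[-1]
    return sigmoid(dense(activation, weights, biases)[0])

# Hypothetical weights chosen by hand for illustration only.
LAYERS = [
    ([[0.5, -0.2], [0.1, 0.4]], [0.0, 0.1]),   # hidden layer 1: 2 inputs -> 2 neurons
    ([[0.3, 0.7], [-0.6, 0.2]], [0.05, 0.0]),  # hidden layer 2: 2 neurons -> 2 neurons
    ([[1.0, -1.0]], [0.0]),                    # output layer: 2 neurons -> 1 score
]

score = forward([1.0, 2.0], LAYERS)
```

Adding further entries to `LAYERS` deepens the network. Because the computation is deterministic, the same input always yields the same score.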
In this narrative review, we aim to present the current application of AI in the neonatal intensive care unit (NICU) and explore its future perspectives.

2. Application of Artificial Intelligence in Neonatology

2.1. Neurocritical Care

The previous decades have seen increased research on the neuromonitoring of critically ill neonates, thanks to the advancements in AI. AI, and especially ML, has made it possible for computer systems to examine and analyze massive amounts of data, including medical patterns, mainly applied to the electroencephalogram (EEG) and magnetic resonance imaging (MRI) [11].

2.1.1. Electroencephalography

Seizures are the most common neurological emergency in the neonatal population and are most likely to occur during the first days of life [12]. Seizures are more common in neonates born at less than 30 or more than 36 weeks of gestation, with the frequency of seizures in neonates estimated to be around 8% [12]. Additionally, evidence suggests that treating seizures early enhances the patient's response to medication [13], while it is well known that recurrent seizures are linked to worse long-term neurodevelopmental outcomes, regardless of the underlying cause [14,15]. Seizures are particularly difficult to diagnose in the neonatal population because they can be hard to distinguish from normal infant movements, or they can be limited to electrographic episodes [16]. Although neonatal seizures need to be treated right away, they can be extremely challenging to recognize, since up to 85% of neonatal seizures may not have any clear clinical symptoms.
In the NICU, EEG has emerged as a crucial component of neurocritical care, as it is essential for identifying neonatal seizures and allows the distinction between epileptic seizures and nonepileptic episodes [17]. Additionally, EEG monitoring helps reveal the uncoupling of clinical and EEG seizures after antiseizure treatment [18], detecting the electrical discharge that may persist after therapy even when the clinical manifestation of the seizure that existed before treatment has disappeared. EEG non-invasively records the electrical activity of the cerebral cortex, allowing for the real-time evaluation of cortical background function; however, real-time review and implementation of EEG can be challenging. Moreover, continuous EEG (cEEG) increases the diagnostic and prognostic potential, since it allows the evaluation of the background activity over time [19]. Thus, cEEG monitoring is the recommended standard of care for identifying and treating all seizures quickly [20,21]. Due to the challenges in acquiring traditional EEG, NICUs have adopted a less precise but more straightforward method of EEG monitoring, the amplitude-integrated EEG (aEEG). As opposed to cEEG monitoring, aEEG is a bedside device that shows one or two channels of filtered, smoothed, and quantitatively converted EEG data, with the cortical electrical activity compressed in time and displayed on a semi-logarithmic scale [22,23]. However, aEEG does not have ideal sensitivity, specificity, or interobserver agreement for identifying seizures [24], and thus, it is recommended to serve as an adjunct to cEEG monitoring [19,25].
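The processing steps that turn raw EEG into an aEEG-like trace (rectification, smoothing, time compression, and semi-logarithmic display) can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm of any commercial monitor: the band-pass filtering stage is omitted, the window and epoch lengths are arbitrary, and the axis mapping assumes the commonly used convention of a linear scale up to 10 µV and a logarithmic scale from 10 to 100 µV. All function names are hypothetical.

```python
import math

def rectify(signal_uv):
    """Take the absolute value of each sample (amplitude in microvolts)."""
    return [abs(v) for v in signal_uv]

def smooth(values, window):
    """Trailing moving average over at most `window` samples."""
    out, running = [], 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]
        out.append(running / min(i + 1, window))
    return out

def compress(values, epoch_len):
    """Time compression: keep only the (lower, upper) margin of each epoch."""
    return [(min(values[i:i + epoch_len]), max(values[i:i + epoch_len]))
            for i in range(0, len(values), epoch_len)]

def semilog_axis(amplitude_uv):
    """Display position: linear for 0-10 uV, logarithmic for 10-100 uV (assumed)."""
    if amplitude_uv <= 10.0:
        return amplitude_uv
    return 10.0 + 10.0 * math.log10(min(amplitude_uv, 100.0) / 10.0)

# Synthetic 5 Hz, 20 uV test signal sampled at 100 Hz for 10 s (not patient data).
raw = [20.0 * math.sin(2 * math.pi * 5 * n / 100) for n in range(1000)]
trace = compress(smooth(rectify(raw), window=50), epoch_len=100)
display = [(semilog_axis(lo), semilog_axis(hi)) for lo, hi in trace]
```

Each entry of `display` corresponds to one compressed epoch, which is why a bedside aEEG trace appears as a band with a lower and an upper margin rather than as individual waveforms.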
During the past few decades, research in AI, and particularly DL, has advanced the development of automatic seizure detection algorithms [26]. These algorithms exhibit remarkable seizure detection accuracy, comparable to that of human specialists [27]. In 1992, Liu et al. proposed a computerized detection system for neonatal seizures, and thereafter, numerous methods have been documented, refined, and verified [28]. The performance of the initial automatic seizure detection algorithms was suboptimal for therapeutic use as they had been created by modifying algorithms intended for adults [28,29]; however, to date, many seizure detection algorithms have been developed, mainly for full-term but also for preterm neonates [30]. The development of these algorithms requires the labeling of seizures by several specialists as well as obtaining enough data for training, validation, and testing.
In 2020, a randomized clinical trial assessing the effect of ML on the real-time identification of neonatal seizures in a NICU was published [31]. According to that report, more seizures were recognized in real time when AI algorithms were applied in the NICU [31]. Following extensive training and offline analysis, the accuracy of the recognition of electrographic seizures both with and without the automatic seizure detection algorithms was tested in a multicenter clinical trial, suggesting that the algorithm could serve as a bedside tool in clinical practice [32,33]. The model greatly enhanced the recognition of seizure hours, even though the set aim of improving the detection of specific neonates with seizures was not fulfilled.
In addition to monitoring and treating newborn seizures, EEG is also a valuable diagnostic tool for neonatal encephalopathy, namely hypoxic-ischemic encephalopathy (HIE). AI research is being conducted to create algorithms, many of which use DL techniques, that can evaluate brain maturation, estimate sleep stages [34], and grade background EEG patterns in HIE [35]. Automated EEG interpretation based on ML technology has recently shown good performance in detecting HIE severity and can be helpful in the early severity grading of neonatal HIE [36,37]. One example of such advanced signal processing is the use of convolutional neural network architectures, which can extract convolutional features directly from raw EEGs [35]. In addition, the possible application of AI in predictive modeling for electrographic seizures in newborns with HIE was examined by Pavel et al., with the goal of early detection of infants most at risk of recurrent seizures [38]. ML algorithms were created for clinical and both qualitative and quantitative EEG characteristics. Notably, both the automated quantitative EEG analysis and the analysis carried out by a skilled neurophysiologist (qualitative) increased the predictive value of these models by incorporating clinical data. These studies highlight the possibility of using ML in evaluating the EEG background of neonates with HIE.

2.1.2. Magnetic Resonance Imaging

The application of AI to enhance the utility and inference from brain MRI has advanced significantly during the last few years. Technical advancements in AI techniques include methods to reduce movement artifact effects and boost information yield, as well as advancements in tissue classification [39]. These have made it possible for a deeper evaluation of the developing brain, and a new understanding of the effects of prenatal events on structural and functional network topologies [40].
One of the regions in the neonatal brain where myelination starts is the posterior limb of the internal capsule (PLIC). Crucially, both term and preterm newborns' neurological outcomes depend on the proper and timely maturation of the PLIC. Abnormalities in the PLIC detected on MRI have been linked to hemiplegia, and worse neurodevelopmental outcomes [41]. Over the past few decades, there has been a noticeable rise in the prevalence of cerebral palsy to over 2.0 per 1000 live births, which is inversely proportional to the gestational age and carries significant lifetime burdens [42,43]. An ML algorithm for the automated segmentation and quantification of the PLIC in preterm newborns undergoing MRI was proposed in a recent work [44], where authors demonstrated good accuracy for the ML model when compared to expert analysis, indicating the successful application of their algorithm to a large dataset. Although promising, it is necessary to evaluate how well this approach will work in clinical settings.
Identifying neuroanatomic phenotypes and predicting the outcome are the major areas in the clinical domain where AI is facilitating innovation. Preterm neonates are characterized by a specific phenotype including abnormal brain development, cerebral palsy, autism spectrum disorder, attention deficit hyperactivity disorder, psychiatric illness, and issues with language, behavior, and socioemotional functions [45]. Abnormalities of structural and functional networks are frequent in preterm neonates, as shown by structural, diffusion, and functional MRI [46]. Models that combine data from two or more imaging modalities into a single framework can reveal previously unknown patterns of neuroanatomic variants in preterm neonates that are related to cognitive and motor outcomes [47]. Diffusion tensor metrics, neurite orientation dispersion, regional volumes, and density imaging measurements are among the several forms of MRI data that are integrated into a single model to compute morphometric similarity networks [48]. This kind of research helps identify the neural roots of cognition and behavior, identify the networks that most contribute to atypical brain development, and examine the drivers of brain dysmaturation and resilience.
Current research also aims to compare traditional computer vision approaches with efficient networks that generate reliable and accurate segmentation. To evaluate methods for segmenting newborn tissue, T1-weighted and T2-weighted images were provided with manually segmented structures; segmenting myelinated from unmyelinated white matter is, nevertheless, still challenging [49]. The limited amount of high-quality labeled data must also be acknowledged as a key limitation when comparing earlier attempts at newborn brain segmentation [50].

2.2. Respiratory System

One of the main causes of infant mortality and morbidity in preterm deliveries is bronchopulmonary dysplasia (BPD). Although several biomarkers have been associated with the emergence of respiratory distress syndrome (RDS), there are currently no meaningful prenatal diagnostic tests for BPD [51]. In a previous study, Ahmed et al. evaluated an ML technique also suitable for the analysis of other biological materials and created a helpful bedside point-of-care test approach for neonatal RDS [52]. According to the authors’ findings, following clinical validation, the use of ML-guided devices that can measure RDS biomarkers in real time may be used to direct therapies for preterm infants exhibiting respiratory symptoms. Moreover, Raimondi et al. concentrated on AI-assisted analysis of lung ultrasonography and its capacity to correlate with respiratory status in critically ill neonates with RDS [53]. The authors constructed a dataset of scans for texturing and a correlation between the oxygenation status, the ultrasound findings, and the mean grayscale intensity was established by an ML model. They enrolled a cohort of neonates of different origins and varying degrees of respiratory distress, and they demonstrated a significant correlation between blood gas indices and the grayscale ML analysis [53]; however, the relatively small sample size, the heterogeneous etiology of the respiratory distress, and the variable postnatal age suggested that further research on this topic with larger datasets is warranted.
Regarding BPD, Dai et al. investigated the combination of genetic and clinical factors, where exome sequencing was carried out for preterm neonates and integrated with clinical aspects [54]. The authors demonstrated that by using ML for the genomic analysis they could predict the development of BPD with an accuracy of 90% [54]. Also, the combined analysis of gastric aspirate after birth and clinical information could predict BPD development with a sensitivity of 88% [51]. In addition, Leigh et al., in a retrospective analysis of the perinatal and respiratory factors in a sample of preterm neonates, created an ML algorithm that, after training and testing, could predict BPD-free survival with good accuracy [55]. An AI approach using DL and image segmentation has also been proposed that can predict the severity of BPD by analyzing the segmentation of the lungs in chest X-rays taken on the 28th day of oxygen delivery [56]. The benefits of the aforementioned algorithm included non-invasiveness, speed, and independence from the experience of neonatologists, while it demonstrated strong predictive performance.
Moreover, research on BPD with ML predictive models has shown that long-term invasive ventilation is one of the most significant risk factors for BPD and longer hospital stays. ML models using long-term invasive ventilation data could predict extubation failure with significant accuracy [57,58,59]. The risk stratification for BPD is a specific area of interest, aiming to identify infants who may benefit from preventive measures like corticosteroids or treatment for specific morbidities such as patent ductus arteriosus (PDA). The BPD Outcome Estimator is a predictive tool approved by the US National Institute of Child Health and Human Development that is useful in directing steroid treatment and family counseling [60]. The estimator was initially limited to White, Black, or Hispanic neonates; however, Patel et al. recently created a web application based on an ML system for extremely preterm neonates of Asian descent [61]. Nonetheless, the study's conclusions were limited because the method was tested on a small dataset, requiring further comprehensive and prospective validation before being used in clinical practice.
Apnea of prematurity, another common morbidity in preterm neonates, is either obstructive (caused by airway obstruction), central (caused by cessation of respiratory drive), or mixed (a combination of both). Bedside monitors are programmed to sound an alarm when they detect a decreased respiratory effort due to a decrease in thoracic motion [62]. A substantial number of false positive episodes have been observed in clinical tests, indicating that this approach can identify central apneas with suboptimal accuracy [63]. Varisco et al. created an ML-based improved apnea detection model to automatically identify real apnea using data from the electrocardiographic monitoring of neonates [64]. The authors concluded that the AI algorithm resulted in better detection of apneas compared to traditional approaches with fewer false alarms, and they also showed that breathing patterns were altered more often in neonates with more frequent central apneas [64]. Although AI may drastically alter routine clinical practice, given that alarm fatigue is a growing problem in NICUs that puts neonates at risk of missed alarms, the lack of external validation and the small sample size represent serious flaws in the suggested methodology.
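The conventional threshold-based alarm logic described above can be illustrated with a short sketch. It is a hypothetical simplification, not the logic of any specific monitor or of the model by Varisco et al.: the function name `detect_pauses`, the amplitude threshold, and the 20-second minimum duration are assumptions chosen for the example.

```python
def detect_pauses(amplitudes, sample_rate_hz, threshold, min_duration_s):
    """Return (start_s, end_s) for each run in which the breathing amplitude
    stays below `threshold` for at least `min_duration_s` seconds."""
    min_samples = int(min_duration_s * sample_rate_hz)
    events, run_start = [], None
    # A sentinel equal to the threshold closes a run that reaches the end.
    for i, amplitude in enumerate(list(amplitudes) + [threshold]):
        if amplitude < threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_samples:
                events.append((run_start / sample_rate_hz, i / sample_rate_hz))
            run_start = None
    return events

# Synthetic thoracic-motion amplitudes at 1 Hz (arbitrary units, not patient data):
# normal breathing, a 25 s pause, normal breathing, an 8 s pause, normal breathing.
signal = [1.0] * 10 + [0.05] * 25 + [1.0] * 5 + [0.05] * 8 + [1.0] * 5
alarms = detect_pauses(signal, sample_rate_hz=1, threshold=0.2, min_duration_s=20)
```

A rule of this kind reacts to any sustained drop in measured thoracic motion, whether or not it reflects a true central apnea, which is one plausible reason for the false alarms reported with traditional monitoring.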

2.3. Ophthalmology

ML models have also been applied to retinopathy of prematurity (ROP), which is a severe complication of prematurity and a major cause of childhood blindness in high- and middle-income countries. ROP affects mainly extremely preterm (less than 28 weeks), very preterm (28-32 weeks), or very low-birthweight (<1500 g) neonates [65]. Telemedicine and AI are being considered as potential diagnostic tools for ROP, given the dearth of ophthalmologists who can treat neonates with ROP. Gaussian mixture models are among the several ML techniques used to diagnose and categorize ROP from retinal fundus images [65,66]. In a previous study, the i-ROP system was shown to have a 95% accuracy in classifying pre-plus and plus disease. This performance was significantly better than the performance achieved by nonexperts (81%) and comparable to that achieved by experts (92% to 96%) [66].
Furthermore, a DL automated score model was generated in a recent multicenter trial, to identify one of the features of the affected retina [67]. This study showed how a DL comprehensive screening platform may enhance screening accessibility and objective ROP diagnosis. In another large-scale multicenter trial, a different group of scientists created a DL method for predicting ROP and its severity [68]. Retinal images from the initial ROP screening and neonatal clinical risk variables were obtained to develop an AI predictive algorithm. When compared to the traditional ROP score, the DL-based system demonstrated comparable accuracy, while it was found more effective in identifying and interpreting abnormal signs than the classical ophthalmoscopy.
Moreover, in previous studies, telemedicine has been compared with Binocular Indirect Ophthalmoscope, demonstrating that both techniques are equally sensitive in detecting zone disease, plus disease, and ROP, although Binocular Indirect Ophthalmoscope was more accurate in recognizing zone III and stage 3 ROP [69,70]. Besides, using DL algorithms, the accuracy of ROP examination was 94% for normal diagnosis and 98% for illness and diagnosis, outperforming ROP experts [71]. Finally, in previous studies, DL algorithms were constructed to estimate the clinical progression of the ROP by assigning vascular severity scores [72] or detect disease requiring therapy with an accuracy of 98% [73]. Overall, introducing AI into ROP screening programs might improve access to care for secondary ROP prevention [73]; however, despite the encouraging results, more extensive external validation using additional multicenter datasets is necessary. Additionally, the development of more advanced ML algorithms may be able to provide more significant prognostic information regarding the accurate staging, zone, and disease.

2.4. Vital Signs

In previous studies, ML analysis has been developed to analyze physiologic data that are electronically captured as signal data to identify artifact patterns [74], predict neonatal morbidity [75], or identify late-onset sepsis [76]. An ML algorithm using electronically recorded vital signs within the first three hours of life, including heart rate and respiration rate of preterm neonates with a birth weight ≤2000 grams and gestational age ≤34 weeks predicted overall morbidity with an accuracy of 91% [75]. Furthermore, Lyra et al. developed DL-based techniques that could result in a reliable, real-time assessment of crucial indicators, such as changes in body temperature [77]. Although the analysis proved difficult for several factors during the recording, the authors demonstrated the viability of using inexpensive, embedded graphics processing units to monitor neonates' temperatures in real-time, although more research is warranted to broaden the application of this technique in clinical settings [77].

2.5. Gastrointestinal System

Recently, an AI algorithm was created based on a large dataset about the clinical characteristics of neonates who developed intestinal perforation [78]. The suggested algorithm evaluated various clinical data, including vital signs, radiologic findings, biomarkers, and laboratory results, and led to a more accurate and early prediction of intestinal perforation of preterm neonates compared to all other traditional ML methods [78]. Furthermore, regarding nutrition, a previous study in England demonstrated that ML techniques can be used to evaluate nutritional practices that were found to be associated with body weight on discharge and the development of BPD [79]. Finally, Han et al. recently examined the potential application of AI to predict postnatal growth failure. Using a large dataset of very low birth weight neonates from several NICUs, ML models were created using a variety of methodologies, showing a strong predictive performance [80]. Nevertheless, the study's findings were limited since it lacked crucial information about enteral and parenteral feeding.

2.6. Jaundice

The application of ML and DL models was explored in a previous study investigating the potential of using a dataset of photos taken with a smartphone camera for the identification of neonatal jaundice in term and late preterm neonates. The authors used data from pictures of the skin and eyes to train a neural network to identify jaundice [81]. Furthermore, Guardalia et al. used an ML approach to analyze clinical data for a large neonatal population in order to develop a risk assessment tool for neonatal jaundice that did not rely on bilirubin readings and that performed well in the risk categorization of newborn jaundice [82].

2.7. Sepsis

Early- and late-onset neonatal sepsis are major causes of infant mortality and morbidity [83]. Diagnosing neonatal sepsis and starting antibiotics is challenging in clinical practice, which emphasizes the need for a comprehensive approach. Previous studies have explored the role of heart rate variability in predicting early-onset sepsis with an accuracy of 64–94% [84]. Also, regarding the detection of late-onset sepsis, ML decision algorithms have utilized clinical and laboratory biomarkers, obtaining an optimal accuracy and a mean precision rate of 0.82 three hours before the onset of sepsis [76].
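As an illustration of the kind of input that heart rate variability models may use, the sketch below computes two standard time-domain variability measures, SDNN and RMSSD, from a series of beat-to-beat (RR) intervals. The cited studies do not necessarily use these exact features; the choice of measures, the function names, and the example intervals are assumptions made here for illustration.

```python
import math
import statistics

def sdnn(rr_ms):
    """Sample standard deviation of the RR intervals (milliseconds)."""
    return statistics.stdev(rr_ms)

def rmssd(rr_ms):
    """Root mean square of successive differences between RR intervals."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Synthetic RR intervals in milliseconds (not patient data).
rr_intervals = [400, 410, 400, 410, 400]
features = {"sdnn": sdnn(rr_intervals), "rmssd": rmssd(rr_intervals)}
```

In a predictive model, features of this kind would be computed over sliding time windows and passed to a classifier together with other clinical variables.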

2.8. Patent Ductus Arteriosus

The ductus arteriosus which is patent during the intrauterine life may have significant hemodynamic consequences in preterm neonates and is associated with higher rates of morbidity and mortality. Therefore, it should be assessed whether closing the PDA could increase survival chances relative to the risk of side effects [85]. ML techniques have been developed for the detection of PDA from electronic health records [86] and auscultation records [87]. This resulted in an accuracy of 76% for the prediction of PDA in very low birth weight infants based on the analysis of 47 perinatal factors using 5 different ML techniques [86] and 74% for the analysis of 250 auscultation records [87].

2.9. Neurodevelopmental Outcome

ML techniques have been widely used for the neurodevelopmental evaluation and follow-up of preterm neonates. Numerous studies used ML techniques to examine brain connections [88,89,90,91], brain structure analysis, and brain segmentation in preterm neonates [92,93]. Evidence suggests an association between lower brain volume, cortical folding, axonal integrity, and microstructural connectivity with preterm birth [94,95]. Additional effects of prematurity on the developing connectome have been found in studies examining functional markers of brain maturation [91,96].
Neurocognitive assessments are among the most significant domains of neurodevelopment outcomes at two years of age. Previous studies assessed how the brain's morphological alterations relate to neurocognitive outcomes [97,98,99] and the prediction of brain age [100]. It has been demonstrated that multivariate models combining near-term structural MRI findings and white matter microstructure on diffusion tensor imaging may help identify preterm neonates at risk for language impairment and guide early intervention [97,99]. Moreover, to predict neurodevelopmental impairment at two years of age, a self-training deep neural network model has been suggested, using MRI data obtained in very preterm neonates at term-equivalent age [101]. Besides, according to a study that used ML techniques to assess the impact of PPAR gene activity on brain development, a significant correlation was found between aberrant brain connectivity and PPAR gene signaling's role in aberrant white matter development [102].
In previous studies, ML models have been used to evaluate the association between language skills and near-term MRI findings. By examining MRI characteristics and perinatal clinical data, Valavani et al. employed ML to predict language skills at two years of corrected age in preterm neonates [103]. Language delay could be accurately predicted by delayed myelination patterns and specific clinical characteristics. The authors concluded that ML models could be useful for healthcare services and enhance the long-term outcomes of preterm neonates. Furthermore, in a recent study, Balta et al. proposed AI-based automated monitoring of newborns' general movements, a crucial screening test for detecting neuromotor problems in children [104]. The authors created an automated model to analyze infants' general movements by processing videos taken with a simple camera at home. Certain patterns of spontaneous movements, such as the absence of fidgety movements or the presence of predominantly contracted coordinated movements, were particularly indicative in predicting cerebral palsy in infants between 3 and 5 months of age [104].

2.10. Mortality

Even with the recent advances in neonatal care, preterm neonates are still very vulnerable to death because of their immature organ systems [105]. ML models have been developed for the prediction of neonatal mortality by exploring causative factors [106,107]. A recent review including term and preterm neonates between the gestational ages of 22 and 40 weeks reported that neural networks, random forests, and logistic regression were common models developed by the investigators [108]. Among the included studies, only two completed external validation, five published calibration plots, five reported sensitivity and specificity of their models that ranged from 63 to 80% and 78 to 98%, respectively, and eight reported accuracy that ranged from 58.3 to 97.0% [108]. Despite having 17 features, the best model overall was linear regression analysis [108]. Recent studies exploring the application of AI models in severely low birthweight and preterm neonatal populations reported an accuracy of 68.9-93.3% [109,110]. Among the several limitations of these studies was the lack of inclusion of vital parameters to depict dynamic changes, while gestational age, birth weight, and Apgar scores were the most significant variables in the models [111,112]. These limitations suggest that further implementation, calibration, and external validation of AI healthcare applications are warranted in future studies.
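Because the studies above report sensitivity, specificity, and accuracy side by side, it may help to recall how the three measures are derived from the same confusion matrix. The sketch below uses the standard definitions; the function name and the example labels are hypothetical and do not reproduce data from any cited study.

```python
def classification_metrics(y_true, y_pred):
    """Compute sensitivity, specificity, and accuracy for binary labels (1 = event)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),  # share of true events that were detected
        "specificity": tn / (tn + fp),  # share of non-events correctly ruled out
        "accuracy": (tp + tn) / len(y_true),
    }

# Hypothetical outcomes (1 = died, 0 = survived) and model predictions.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = classification_metrics(observed, predicted)
```

When the outcome is rare, as mortality usually is, accuracy can remain high even if sensitivity is poor, which is one reason to report all three measures together with calibration.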

3. Limitations and Future Perspectives

AI has currently been established as a useful component in several areas of neonatal care, helping physicians provide improved, more effective, and safer care (Table 1). However, specific issues need to be addressed before AI models can be widely applied. First, healthcare providers need to improve their digital literacy, so that they can comprehend the fundamental principles and limitations of AI. That would help healthcare providers evaluate recently created AI tools and focus on their appropriate and safe application in clinical settings. Also, to develop and implement AI tools, cross-disciplinary, worldwide collaborations involving data scientists, computer scientists, healthcare providers, attorneys, and legislators are required. Additional drawbacks of AI include the lack of larger datasets to train the models, the heterogeneity of the data, generalizability problems, the lack of evidence-based guidelines for some diseases affecting neonates, and the cost. Applying AI to newborn care also involves addressing critical challenges such as the model's interpretability, the necessity of external validation to improve generalizability, and the necessity of appropriate evaluation of performance (Table 2).
Finally, there are serious ethical issues to be considered. Important decisions in neonatology are often accompanied by a complex and difficult ethical component, and multidisciplinary methods are necessary for advancement [113]. Informed consent, bias, safety, privacy of the patients, and allocation are among the ethical issues with AI applications in healthcare [114]. The use of AI in neonatology has become more challenging due to the necessary transparency, viability limitations, life-sustaining therapies, and various international restrictions [115]. To date, no report has described how an ethics framework would be applied in neonatology.

4. Conclusions

AI is becoming increasingly important in healthcare services as contemporary culture moves toward automated decision support systems. The main advantage of using AI in healthcare is its ability to evaluate large volumes of medical data from multidisciplinary studies. This type of data is too complex for medical professionals to study quickly enough to find the diagnosis and determine a treatment plan. When trained with the right data, AI models function like human neurons and can quickly and accurately solve problems. Finding the appropriate treatment strategy requires accuracy and time, especially in intensive care units. When integrating AI models into NICU clinical practices, including treatment and transport, trust is a crucial component. AI-based solutions can be used in NICUs mainly to confirm the current treatment plans rather than implement their recommendations. The current evidence regarding the application of AI in neonatology is encouraging; however, further research is warranted, including retraining clinical trials and validating the outcomes, to make AI algorithms more useful in the future.

Author Contributions

Conceptualization, D.R. and V.G.; methodology, D.R.; investigation, D.R.; resources, D.R.; data curation, D.R.; writing—original draft preparation, D.R.; writing—review and editing, M.B., K.K., and V.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Price, W.N., 2nd; Gerke, S.; Cohen, I.G. Potential Liability for Physicians Using Artificial Intelligence. JAMA 2019, 322, 1765–1766.
  2. Helm, J.M.; Swiergosz, A.M.; Haeberle, H.S.; Karnuta, J.M.; Schaffer, J.L.; Krebs, V.E.; Spitzer, A.I.; Ramkumar, P.N. Machine Learning and Artificial Intelligence: Definitions, Applications, and Future Directions. Curr Rev Musculoskelet Med 2020, 13, 69–76.
  3. Adegboro, C.O.; Choudhury, A.; Asan, O.; Kelly, M.M. Artificial Intelligence to Improve Health Outcomes in the NICU and PICU: A Systematic Review. Hosp Pediatr 2022, 12, 93–110.
  4. Choudhury, A.; Asan, O. Role of Artificial Intelligence in Patient Safety Outcomes: Systematic Literature Review. JMIR Med Inform 2020, 8, e18599.
  5. Choudhury, A.; Renjilian, E.; Asan, O. Use of machine learning in geriatric clinical care for chronic diseases: a systematic literature review. JAMIA Open 2020, 3, 459–471.
  6. Olive, M.K.; Owens, G.E. Current monitoring and innovative predictive modeling to improve care in the pediatric cardiac intensive care unit. Transl Pediatr 2018, 7, 120–128.
  7. Piccialli, F.; Somma, V.D.; Giampaolo, F.; Cuomo, S.; Fortino, G. A survey on deep learning in medicine: Why, how and when? Information Fusion 2021, 66, 111–137.
  8. Burt, J.R.; Torosdagli, N.; Khosravan, N.; RaviPrakash, H.; Mortazi, A.; Tissavirasingham, F.; Hussein, S.; Bagci, U. Deep learning beyond cats and dogs: recent advances in diagnosing breast cancer with deep neural networks. Br J Radiol 2018, 91, 20170545.
  9. Kreimeyer, K.; Foster, M.; Pandey, A.; Arya, N.; Halford, G.; Jones, S.F.; Forshee, R.; Walderhaug, M.; Botsis, T. Natural language processing systems for capturing and standardizing unstructured clinical information: A systematic review. J Biomed Inform 2017, 73, 14–29.
  10. Nadkarni, P.M.; Ohno-Machado, L.; Chapman, W.W. Natural language processing: an introduction. J Am Med Inform Assoc 2011, 18, 544–551.
  11. Brinkmann, B.H.; Bower, M.R.; Stengel, K.A.; Worrell, G.A.; Stead, M. Large-scale electrophysiology: acquisition, compression, encryption, and storage of big data. J Neurosci Methods 2009, 180, 185–192.
  12. Sheth, R.D.; Hobbs, G.R.; Mullett, M. Neonatal seizures: incidence, onset, and etiology by gestational age. J Perinatol 1999, 19, 40–43.
  13. Williams, R.P.; Banwell, B.; Berg, R.A.; Dlugos, D.J.; Donnelly, M.; Ichord, R.; Kessler, S.K.; Lavelle, J.; Massey, S.L.; Hewlett, J.; et al. Impact of an ICU EEG monitoring pathway on timeliness of therapeutic intervention and electrographic seizure termination. Epilepsia 2016, 57, 786–795.
  14. Payne, E.T.; Zhao, X.Y.; Frndova, H.; McBain, K.; Sharma, R.; Hutchison, J.S.; Hahn, C.D. Seizure burden is independently associated with short term outcome in critically ill children. Brain 2014, 137, 1429–1438.
  15. Chapman, K.E.; Specchio, N.; Shinnar, S.; Holmes, G.L. Seizing control of epileptic activity can improve outcome. Epilepsia 2015, 56, 1482–1485.
  16. Murray, D.M.; Boylan, G.B.; Ali, I.; Ryan, C.A.; Murphy, B.P.; Connolly, S. Defining the gap between electrographic seizure burden, clinical expression and staff recognition of neonatal seizures. Arch Dis Child Fetal Neonatal Ed 2008, 93, F187–F191.
  17. Shellhaas, R.A.; Clancy, R.R. Characterization of neonatal seizures by conventional EEG and single-channel EEG. Clin Neurophysiol 2007, 118, 2156–2161.
  18. Scher, M.S.; Alvin, J.; Gaus, L.; Minnigh, B.; Painter, M.J. Uncoupling of EEG-clinical neonatal seizures after antiepileptic drug use. Pediatr Neurol 2003, 28, 277–280.
  19. McCoy, B.; Hahn, C.D. Continuous EEG monitoring in the neonatal intensive care unit. J Clin Neurophysiol 2013, 30, 106–114.
  20. Shellhaas, R.A. Continuous long-term electroencephalography: the gold standard for neonatal seizure diagnosis. Semin Fetal Neonatal Med 2015, 20, 149–153.
  21. Shellhaas, R.A.; Chang, T.; Tsuchida, T.; Scher, M.S.; Riviello, J.J.; Abend, N.S.; Nguyen, S.; Wusthoff, C.J.; Clancy, R.R. The American Clinical Neurophysiology Society's Guideline on Continuous Electroencephalography Monitoring in Neonates. J Clin Neurophysiol 2011, 28, 611–617.
  22. de Vries, L.S.; Toet, M.C. Amplitude integrated electroencephalography in the full-term newborn. Clin Perinatol 2006, 33, 619–632.
  23. de Vries, L.S.; Hellstrom-Westas, L. Role of cerebral function monitoring in the newborn. Arch Dis Child Fetal Neonatal Ed 2005, 90, F201–F207.
  24. Rakshasbhuvankar, A.; Rao, S.; Palumbo, L.; Ghosh, S.; Nagarajan, L. Amplitude Integrated Electroencephalography Compared With Conventional Video EEG for Neonatal Seizure Detection: A Diagnostic Accuracy Study. J Child Neurol 2017, 32, 815–822.
  25. Appendino, J.P.; McNamara, P.J.; Keyzers, M.; Stephens, D.; Hahn, C.D. The impact of amplitude-integrated electroencephalography on NICU practice. Can J Neurol Sci 2012, 39, 355–360.
  26. Temko, A.; Lightbody, G. Detecting Neonatal Seizures With Computer Algorithms. J Clin Neurophysiol 2016, 33, 394–402.
  27. O'Shea, A.; Lightbody, G.; Boylan, G.; Temko, A. Neonatal seizure detection from raw multi-channel EEG using a fully convolutional architecture. Neural Netw 2020, 123, 12–25.
  28. Liu, A.; Hahn, J.S.; Heldt, G.P.; Coen, R.W. Detection of neonatal seizures through computerized EEG analysis. Electroencephalogr Clin Neurophysiol 1992, 82, 30–37.
  29. Gotman, J.; Flanagan, D.; Zhang, J.; Rosenblatt, B. Automatic seizure detection in the newborn: methods and initial evaluation. Electroencephalogr Clin Neurophysiol 1997, 103, 356–362.
  30. O'Shea, A.; Ahmed, R.; Lightbody, G.; Pavlidis, E.; Lloyd, R.; Pisani, F.; Marnane, W.; Mathieson, S.; Boylan, G.; Temko, A. Deep Learning for EEG Seizure Detection in Preterm Infants. Int J Neural Syst 2021, 31, 2150008.
  31. Pavel, A.M.; Rennie, J.M.; de Vries, L.S.; Blennow, M.; Foran, A.; Shah, D.K.; Pressler, R.M.; Kapellou, O.; Dempsey, E.M.; Mathieson, S.R.; et al. A machine-learning algorithm for neonatal seizure recognition: a multicentre, randomised, controlled trial. Lancet Child Adolesc Health 2020, 4, 740–749.
  32. Stevenson, N.J.; Korotchikova, I.; Temko, A.; Lightbody, G.; Marnane, W.P.; Boylan, G.B. An automated system for grading EEG abnormality in term neonates with hypoxic-ischaemic encephalopathy. Ann Biomed Eng 2013, 41, 775–785.
  33. Mathieson, S.R.; Stevenson, N.J.; Low, E.; Marnane, W.P.; Rennie, J.M.; Temko, A.; Lightbody, G.; Boylan, G.B. Validation of an automated seizure detection algorithm for term neonates. Clin Neurophysiol 2016, 127, 156–168.
  34. Ansari, A.H.; Pillay, K.; Dereymaeker, A.; Jansen, K.; Van Huffel, S.; Naulaers, G.; De Vos, M. A Deep Shared Multi-Scale Inception Network Enables Accurate Neonatal Quiet Sleep Detection With Limited EEG Channels. IEEE J Biomed Health Inform 2022, 26, 1023–1033.
  35. Raurale, S.A.; Boylan, G.B.; Mathieson, S.R.; Marnane, W.P.; Lightbody, G.; O'Toole, J.M. Grading hypoxic-ischemic encephalopathy in neonatal EEG with convolutional neural networks and quadratic time-frequency distributions. J Neural Eng 2021, 18.
  36. Moghadam, S.M.; Pinchefsky, E.; Tse, I.; Marchi, V.; Kohonen, J.; Kauppila, M.; Airaksinen, M.; Tapani, K.; Nevalainen, P.; Hahn, C.; et al. Building an Open Source Classifier for the Neonatal EEG Background: A Systematic Feature-Based Approach From Expert Scoring to Clinical Visualization. Front Hum Neurosci 2021, 15, 675154.
  37. Matic, V.; Cherian, P.J.; Koolen, N.; Naulaers, G.; Swarte, R.M.; Govaert, P.; Van Huffel, S.; De Vos, M. Holistic approach for automated background EEG assessment in asphyxiated full-term infants. J Neural Eng 2014, 11, 066007.
  38. Pavel, A.M.; O'Toole, J.M.; Proietti, J.; Livingstone, V.; Mitra, S.; Marnane, W.P.; Finder, M.; Dempsey, E.M.; Murray, D.M.; Boylan, G.B.; et al. Machine learning for the early prediction of infants with electrographic seizures in neonatal hypoxic-ischemic encephalopathy. Epilepsia 2023, 64, 456–468.
  39. Serag, A.; Blesa, M.; Moore, E.J.; Pataky, R.; Sparrow, S.A.; Wilkinson, A.G.; Macnaught, G.; Semple, S.I.; Boardman, J.P. Accurate Learning with Few Atlases (ALFA): an algorithm for MRI neonatal brain extraction and comparison with 11 publicly available methods. Sci Rep 2016, 6, 23470.
  40. Blesa, M.; Galdi, P.; Cox, S.R.; Sullivan, G.; Stoye, D.Q.; Lamb, G.J.; Quigley, A.J.; Thrippleton, M.J.; Escudero, J.; Bastin, M.E.; et al. Hierarchical Complexity of the Macro-Scale Neonatal Brain. Cereb Cortex 2021, 31, 2071–2084.
  41. De Vries, L.S.; Groenendaal, F.; van Haastert, I.C.; Eken, P.; Rademaker, K.J.; Meiners, L.C. Asymmetrical myelination of the posterior limb of the internal capsule in infants with periventricular haemorrhagic infarction: an early predictor of hemiplegia. Neuropediatrics 1999, 30, 314–319.
  42. Odding, E.; Roebroeck, M.E.; Stam, H.J. The epidemiology of cerebral palsy: incidence, impairments and risk factors. Disabil Rehabil 2006, 28, 183–191.
  43. Drougia, A.; Giapros, V.; Krallis, N.; Theocharis, P.; Nikaki, A.; Tzoufi, M.; Andronikou, S. Incidence and risk factors for cerebral palsy in infants with perinatal problems: a 15-year review. Early Hum Dev 2007, 83, 541–547.
  44. Gruber, N.; Galijasevic, M.; Regodic, M.; Grams, A.E.; Siedentopf, C.; Steiger, R.; Hammerl, M.; Haltmeier, M.; Gizewski, E.R.; Janjic, T. A deep learning pipeline for the automated segmentation of posterior limb of internal capsule in preterm neonates. Artif Intell Med 2022, 132, 102384.
  45. Dean, B.; Ginnell, L.; Boardman, J.P.; Fletcher-Watson, S. Social cognition following preterm birth: A systematic review. Neurosci Biobehav Rev 2021, 124, 151–167.
  46. Batalle, D.; Edwards, A.D.; O'Muircheartaigh, J. Annual Research Review: Not just a small adult brain: understanding later neurodevelopment through imaging the neonatal brain. J Child Psychol Psychiatry 2018, 59, 350–371.
  47. Ball, G.; Aljabar, P.; Nongena, P.; Kennea, N.; Gonzalez-Cinca, N.; Falconer, S.; Chew, A.T.M.; Harper, N.; Wurie, J.; Rutherford, M.A.; et al. Multimodal image analysis of clinical influences on preterm brain development. Ann Neurol 2017, 82, 233–246.
  48. Galdi, P.; Blesa, M.; Stoye, D.Q.; Sullivan, G.; Lamb, G.J.; Quigley, A.J.; Thrippleton, M.J.; Bastin, M.E.; Boardman, J.P. Neonatal morphometric similarity mapping for predicting brain age and characterizing neuroanatomic variation associated with preterm birth. Neuroimage Clin 2020, 25, 102195.
  49. Makropoulos, A.; Gousias, I.S.; Ledig, C.; Aljabar, P.; Serag, A.; Hajnal, J.V.; Edwards, A.D.; Counsell, S.J.; Rueckert, D. Automatic whole brain MRI segmentation of the developing neonatal brain. IEEE Trans Med Imaging 2014, 33, 1818–1831.
  50. Ding, Y.; Acosta, R.; Enguix, V.; Suffren, S.; Ortmann, J.; Luck, D.; Dolz, J.; Lodygensky, G.A. Using Deep Convolutional Neural Networks for Neonatal Brain Image Segmentation. Front Neurosci 2020, 14, 207.
  51. Verder, H.; Heiring, C.; Ramanathan, R.; Scoutaris, N.; Verder, P.; Jessen, T.E.; Hoskuldsson, A.; Bender, L.; Dahl, M.; Eschen, C.; et al. Bronchopulmonary dysplasia predicted at birth by artificial intelligence. Acta Paediatr 2021, 110, 503–509.
  52. Ahmed, W.; Veluthandath, A.V.; Rowe, D.J.; Madsen, J.; Clark, H.W.; Postle, A.D.; Wilkinson, J.S.; Murugan, G.S. Prediction of Neonatal Respiratory Distress Biomarker Concentration by Application of Machine Learning to Mid-Infrared Spectra. Sensors (Basel) 2022, 22.
  53. Raimondi, F.; Migliaro, F.; Verdoliva, L.; Gragnaniello, D.; Poggi, G.; Kosova, R.; Sansone, C.; Vallone, G.; Capasso, L. Visual assessment versus computer-assisted gray scale analysis in the ultrasound evaluation of neonatal respiratory status. PLoS One 2018, 13, e0202397.
  54. Dai, D.; Chen, H.; Dong, X.; Chen, J.; Mei, M.; Lu, Y.; Yang, L.; Wu, B.; Cao, Y.; Wang, J.; et al. Bronchopulmonary Dysplasia Predicted by Developing a Machine Learning Model of Genetic and Clinical Information. Front Genet 2021, 12, 689071.
  55. Leigh, R.M.; Pham, A.; Rao, S.S.; Vora, F.M.; Hou, G.; Kent, C.; Rodriguez, A.; Narang, A.; Tan, J.B.C.; Chou, F.S. Machine learning for prediction of bronchopulmonary dysplasia-free survival among very preterm infants. BMC Pediatr 2022, 22, 542.
  56. Xing, W.; He, W.; Li, X.; Chen, J.; Cao, Y.; Zhou, W.; Shen, Q.; Zhang, X.; Ta, D. Early severity prediction of BPD for premature infants from chest X-ray images using deep learning: A study at the 28th day of oxygen inhalation. Comput Methods Programs Biomed 2022, 221, 106869.
  57. Mueller, M.; Wagner, C.L.; Annibale, D.J.; Hulsey, T.C.; Knapp, R.G.; Almeida, J.S. Predicting extubation outcome in preterm newborns: a comparison of neural networks with clinical expertise and statistical modeling. Pediatr Res 2004, 56, 11–18.
  58. Precup, D.; Robles-Rubio, C.A.; Brown, K.A.; Kanbar, L.; Kaczmarek, J.; Chawla, S.; Sant'Anna, G.M.; Kearney, R.E. Prediction of extubation readiness in extreme preterm infants based on measures of cardiorespiratory variability. Annu Int Conf IEEE Eng Med Biol Soc 2012, 2012, 5630–5633.
  59. Mikhno, A.; Ennett, C.M. Prediction of extubation failure for neonates with respiratory distress syndrome using the MIMIC-II clinical database. Annu Int Conf IEEE Eng Med Biol Soc 2012, 2012, 5094–5097.
  60. Laughon, M.M.; Langer, J.C.; Bose, C.L.; Smith, P.B.; Ambalavanan, N.; Kennedy, K.A.; Stoll, B.J.; Buchter, S.; Laptook, A.R.; Ehrenkranz, R.A.; et al. Prediction of bronchopulmonary dysplasia by postnatal age in extremely premature infants. Am J Respir Crit Care Med 2011, 183, 1715–1722.
  61. Patel, M.; Sandhu, J.; Chou, F.S. Developing a machine learning-based tool to extend the usability of the NICHD BPD Outcome Estimator to the Asian population. PLoS One 2022, 17, e0272709.
  62. Eichenwald, E.C.; Committee on Fetus and Newborn, American Academy of Pediatrics. Apnea of Prematurity. Pediatrics 2016, 137.
  63. Amin, S.B.; Burnell, E. Monitoring apnea of prematurity: validity of nursing documentation and bedside cardiorespiratory monitor. Am J Perinatol 2013, 30, 643–648.
  64. Varisco, G.; Peng, Z.; Kommers, D.; Zhan, Z.; Cottaar, W.; Andriessen, P.; Long, X.; van Pul, C. Central apnea detection in premature infants using machine learning. Comput Methods Programs Biomed 2022, 226, 107155.
  65. Barrero-Castillero, A.; Corwin, B.K.; VanderVeen, D.K.; Wang, J.C. Workforce Shortage for Retinopathy of Prematurity Care and Emerging Role of Telehealth and Artificial Intelligence. Pediatr Clin North Am 2020, 67, 725–733.
  66. Ataer-Cansizoglu, E.; Bolon-Canedo, V.; Campbell, J.P.; Bozkurt, A.; Erdogmus, D.; Kalpathy-Cramer, J.; Patel, S.; Jonas, K.; Chan, R.V.; Ostmo, S.; et al. Computer-Based Image Analysis for Plus Disease Diagnosis in Retinopathy of Prematurity: Performance of the "i-ROP" System and Image Features Associated With Expert Diagnosis. Transl Vis Sci Technol 2015, 4, 5.
  67. Redd, T.K.; Campbell, J.P.; Brown, J.M.; Kim, S.J.; Ostmo, S.; Chan, R.V.P.; Dy, J.; Erdogmus, D.; Ioannidis, S.; Kalpathy-Cramer, J.; et al. Evaluation of a deep learning image assessment system for detecting severe retinopathy of prematurity. Br J Ophthalmol 2018.
  68. Wu, Q.; Hu, Y.; Mo, Z.; Wu, R.; Zhang, X.; Yang, Y.; Liu, B.; Xiao, Y.; Zeng, X.; Lin, Z.; et al. Development and Validation of a Deep Learning Model to Predict the Occurrence and Severity of Retinopathy of Prematurity. JAMA Netw Open 2022, 5, e2217447.
  69. Chiang, M.F.; Melia, M.; Buffenn, A.N.; Lambert, S.R.; Recchia, F.M.; Simpson, J.L.; Yang, M.B. Detection of clinically significant retinopathy of prematurity using wide-angle digital retinal photography: a report by the American Academy of Ophthalmology. Ophthalmology 2012, 119, 1272–1280.
  70. Biten, H.; Redd, T.K.; Moleta, C.; Campbell, J.P.; Ostmo, S.; Jonas, K.; Chan, R.V.P.; Chiang, M.F.; Imaging and Informatics in Retinopathy of Prematurity Research Consortium. Diagnostic Accuracy of Ophthalmoscopy vs Telemedicine in Examinations for Retinopathy of Prematurity. JAMA Ophthalmol 2018, 136, 498–504.
  71. Brown, J.M.; Campbell, J.P.; Beers, A.; Chang, K.; Ostmo, S.; Chan, R.V.P.; Dy, J.; Erdogmus, D.; Ioannidis, S.; Kalpathy-Cramer, J.; et al. Automated Diagnosis of Plus Disease in Retinopathy of Prematurity Using Deep Convolutional Neural Networks. JAMA Ophthalmol 2018, 136, 803–810.
  72. Taylor, S.; Brown, J.M.; Gupta, K.; Campbell, J.P.; Ostmo, S.; Chan, R.V.P.; Dy, J.; Erdogmus, D.; Ioannidis, S.; Kim, S.J.; et al. Monitoring Disease Progression With a Quantitative Severity Scale for Retinopathy of Prematurity Using Deep Learning. JAMA Ophthalmol 2019, 137, 1022–1028.
  73. Campbell, J.P.; Singh, P.; Redd, T.K.; Brown, J.M.; Shah, P.K.; Subramanian, P.; Rajan, R.; Valikodath, N.; Cole, E.; Ostmo, S.; et al. Applications of Artificial Intelligence for Retinopathy of Prematurity Screening. Pediatrics 2021, 147.
  74. Tsien, C.L.; Kohane, I.S.; McIntosh, N. Multiple signal integration by decision tree induction to detect artifacts in the neonatal intensive care unit. Artif Intell Med 2000, 19, 189–202.
  75. Saria, S.; Rajani, A.K.; Gould, J.; Koller, D.; Penn, A.A. Integration of early physiological responses predicts later illness severity in preterm infants. Sci Transl Med 2010, 2, 48ra65.
  76. Cabrera-Quiros, L.; Kommers, D.; Wolvers, M.K.; Oosterwijk, L.; Arents, N.; van der Sluijs-Bens, J.; Cottaar, E.J.E.; Andriessen, P.; van Pul, C. Prediction of Late-Onset Sepsis in Preterm Infants Using Monitoring Signals and Machine Learning. Crit Care Explor 2021, 3, e0302.
  77. Lyra, S.; Rixen, J.; Heimann, K.; Karthik, S.; Joseph, J.; Jayaraman, K.; Orlikowsky, T.; Sivaprakasam, M.; Leonhardt, S.; Hoog Antink, C. Camera fusion for real-time temperature monitoring of neonates using deep learning. Med Biol Eng Comput 2022, 60, 1787–1800.
  78. Son, J.; Kim, D.; Na, J.Y.; Jung, D.; Ahn, J.H.; Kim, T.H.; Park, H.K. Development of artificial neural networks for early prediction of intestinal perforation in preterm infants. Sci Rep 2022, 12, 12112.
  79. Greenbury, S.F.; Ougham, K.; Wu, J.; Battersby, C.; Gale, C.; Modi, N.; Angelini, E.D. Identification of variation in nutritional practice in neonatal units in England and association with clinical outcomes using agnostic machine learning. Sci Rep 2021, 11, 7178.
  80. Han, J.H.; Yoon, S.J.; Lee, H.S.; Park, G.; Lim, J.; Shin, J.E.; Eun, H.S.; Park, M.S.; Lee, S.M. Application of Machine Learning Approaches to Predict Postnatal Growth Failure in Very Low Birth Weight Infants. Yonsei Med J 2022, 63, 640–647.
  81. Althnian, A.; Almanea, N.; Aloboud, N. Neonatal Jaundice Diagnosis Using a Smartphone Camera Based on Eye, Skin, and Fused Features with Transfer Learning. Sensors (Basel) 2021, 21.
  82. Guedalia, J.; Farkash, R.; Wasserteil, N.; Kasirer, Y.; Rottenstreich, M.; Unger, R.; Grisaru Granovsky, S. Primary risk stratification for neonatal jaundice among term neonates using machine learning algorithm. Early Hum Dev 2022, 165, 105538.
  83. Shane, A.L.; Sanchez, P.J.; Stoll, B.J. Neonatal sepsis. Lancet 2017, 390, 1770–1780.
  84. Adam, J.; Rupprecht, S.; Kunstler, E.C.S.; Hoyer, D. Heart rate variability as a marker and predictor of inflammation, nosocomial infection, and sepsis - A systematic review. Auton Neurosci 2023, 249, 103116.
  85. El-Khuffash, A.; Bussmann, N.; Breatnach, C.R.; Smith, A.; Tully, E.; Griffin, J.; McCallion, N.; Corcoran, J.D.; Fernandez, E.; Looi, C.; et al. A Pilot Randomized Controlled Trial of Early Targeted Patent Ductus Arteriosus Treatment Using a Risk Based Severity Score (The PDA RCT). J Pediatr 2021, 229, 127–133.
  86. Na, J.Y.; Kim, D.; Kwon, A.M.; Jeon, J.Y.; Kim, H.; Kim, C.R.; Lee, H.J.; Lee, J.; Park, H.K. Artificial intelligence model comparison for risk factor analysis of patent ductus arteriosus in nationwide very low birth weight infants cohort. Sci Rep 2021, 11, 22353.
  87. Gomez-Quintana, S.; Schwarz, C.E.; Shelevytsky, I.; Shelevytska, V.; Semenova, O.; Factor, A.; Popovici, E.; Temko, A. A Framework for AI-Assisted Detection of Patent Ductus Arteriosus from Neonatal Phonocardiogram. Healthcare (Basel) 2021, 9.
  88. Shang, J.; Fisher, P.; Bauml, J.G.; Daamen, M.; Baumann, N.; Zimmer, C.; Bartmann, P.; Boecker, H.; Wolke, D.; Sorg, C.; et al. A machine learning investigation of volumetric and functional MRI abnormalities in adults born preterm. Hum Brain Mapp 2019, 40, 4239–4252.
  89. Chiarelli, A.M.; Sestieri, C.; Navarra, R.; Wise, R.G.; Caulo, M. Distinct effects of prematurity on MRI metrics of brain functional connectivity, activity, and structure: Univariate and multivariate analyses. Hum Brain Mapp 2021, 42, 3593–3607.
  90. Ball, G.; Aljabar, P.; Arichi, T.; Tusor, N.; Cox, D.; Merchant, N.; Nongena, P.; Hajnal, J.V.; Edwards, A.D.; Counsell, S.J. Machine-learning to characterise neonatal functional connectivity in the preterm brain. Neuroimage 2016, 124, 267–275.
  91. Smyser, C.D.; Dosenbach, N.U.; Smyser, T.A.; Snyder, A.Z.; Rogers, C.E.; Inder, T.E.; Schlaggar, B.L.; Neil, J.J. Prediction of brain maturity in infants using machine-learning algorithms. Neuroimage 2016, 136, 1–9.
  92. Song, Z.; Awate, S.P.; Licht, D.J.; Gee, J.C. Clinical neonatal brain MRI segmentation using adaptive nonparametric data models and intensity-based Markov priors. Med Image Comput Comput Assist Interv 2007, 10, 883–890.
  93. Zimmer, V.A.; Glocker, B.; Hahner, N.; Eixarch, E.; Sanroma, G.; Gratacos, E.; Rueckert, D.; Gonzalez Ballester, M.A.; Piella, G. Learning and combining image neighborhoods using random forests for neonatal brain disease classification. Med Image Anal 2017, 42, 189–199.
  94. Sripada, K.; Bjuland, K.J.; Solsnes, A.E.; Haberg, A.K.; Grunewaldt, K.H.; Lohaugen, G.C.; Rimol, L.M.; Skranes, J. Trajectories of brain development in school-age children born preterm with very low birth weight. Sci Rep 2018, 8, 15553.
  95. Keunen, K.; Counsell, S.J.; Benders, M. The emergence of functional architecture during early brain development. Neuroimage 2017, 160, 2–14.
  96. Gao, W.; Lin, W.; Grewen, K.; Gilmore, J.H. Functional Connectivity of the Infant Human Brain: Plastic and Modifiable. Neuroscientist 2017, 23, 169–184.
  97. Wee, C.Y.; Tuan, T.A.; Broekman, B.F.; Ong, M.Y.; Chong, Y.S.; Kwek, K.; Shek, L.P.; Saw, S.M.; Gluckman, P.D.; Fortier, M.V.; et al. Neonatal neural networks predict children behavioral profiles later in life. Hum Brain Mapp 2017, 38, 1362–1373.
  98. Schadl, K.; Vassar, R.; Cahill-Rowley, K.; Yeom, K.W.; Stevenson, D.K.; Rose, J. Prediction of cognitive and motor development in preterm children using exhaustive feature selection and cross-validation of near-term white matter microstructure. Neuroimage Clin 2018, 17, 667–679.
  99. Vassar, R.; Schadl, K.; Cahill-Rowley, K.; Yeom, K.; Stevenson, D.; Rose, J. Neonatal Brain Microstructure and Machine-Learning-Based Prediction of Early Language Development in Children Born Very Preterm. Pediatr Neurol 2020, 108, 86–92.
  100. Li, Y.; Zhang, X.; Nie, J.; Zhang, G.; Fang, R.; Xu, X.; Wu, Z.; Hu, D.; Wang, L.; Zhang, H.; et al. Brain Connectivity Based Graph Convolutional Networks and Its Application to Infant Age Prediction. IEEE Trans Med Imaging 2022, 41, 2764–2776.
  101. Ali, R.; Li, H.; Dillman, J.R.; Altaye, M.; Wang, H.; Parikh, N.A.; He, L. A self-training deep neural network for early prediction of cognitive deficits in very preterm infants using brain functional connectome data. Pediatr Radiol 2022, 52, 2227–2240.
  102. Krishnan, M.L.; Wang, Z.; Aljabar, P.; Ball, G.; Mirza, G.; Saxena, A.; Counsell, S.J.; Hajnal, J.V.; Montana, G.; Edwards, A.D. Machine learning shows association between genetic variability in PPARG and cerebral connectivity in preterm infants. Proc Natl Acad Sci U S A 2017, 114, 13744–13749.
  103. Valavani, E.; Blesa, M.; Galdi, P.; Sullivan, G.; Dean, B.; Cruickshank, H.; Sitko-Rudnicka, M.; Bastin, M.E.; Chin, R.F.M.; MacIntyre, D.J.; et al. Language function following preterm birth: prediction using machine learning. Pediatr Res 2022, 92, 480–489.
  104. Balta, D.; Kuo, H.; Wang, J.; Porco, I.G.; Morozova, O.; Schladen, M.M.; Cereatti, A.; Lum, P.S.; Della Croce, U. Characterization of Infants' General Movements Using a Commercial RGB-Depth Sensor and a Deep Neural Network Tracking Processing Tool: An Exploratory Study. Sensors (Basel) 2022, 22.
  105. Pearlman, S.A. Advancements in neonatology through quality improvement. J Perinatol 2022, 42, 1277–1282.
  106. Podda, M.; Bacciu, D.; Micheli, A.; Bellu, R.; Placidi, G.; Gagliardi, L. A machine learning approach to estimating preterm infants survival: development of the Preterm Infants Survival Assessment (PISA) predictor. Sci Rep 2018, 8, 13743.
  107. Ambalavanan, N.; Carlo, W.A.; Bobashev, G.; Mathias, E.; Liu, B.; Poole, K.; Fanaroff, A.A.; Stoll, B.J.; Ehrenkranz, R.; Wright, L.L.; et al. Prediction of death for extremely low birth weight neonates. Pediatrics 2005, 116, 1367–1373.
  108. Mangold, C.; Zoretic, S.; Thallapureddy, K.; Moreira, A.; Chorath, K.; Moreira, A. Machine Learning Models for Predicting Neonatal Mortality: A Systematic Review. Neonatology 2021, 118, 394–405.
  109. Hsu, J.F.; Yang, C.; Lin, C.Y.; Chu, S.M.; Huang, H.R.; Chiang, M.C.; Wang, H.C.; Liao, W.C.; Fu, R.H.; Tsai, M.H. Machine Learning Algorithms to Predict Mortality of Neonates on Mechanical Intubation for Respiratory Failure. Biomedicines 2021, 9.
  110. Do, H.J.; Moon, K.M.; Jin, H.S. Machine Learning Models for Predicting Mortality in 7472 Very Low Birth Weight Infants Using Data from a Nationwide Neonatal Network. Diagnostics (Basel) 2022, 12.
  111. Moreira, A.; Benvenuto, D.; Fox-Good, C.; Alayli, Y.; Evans, M.; Jonsson, B.; Hakansson, S.; Harper, N.; Kim, J.; Norman, M.; et al. Development and Validation of a Mortality Prediction Model in Extremely Low Gestational Age Neonates. Neonatology 2022, 119, 418–427.
  112. Nascimento, L.F.; Ortega, N.R. Fuzzy linguistic model for evaluating the risk of neonatal death. Rev Saude Publica 2002, 36, 686–692.
  113. Mercurio, M.R.; Cummings, C.L. Critical decision-making in neonatology and pediatrics: the I-P-O framework. J Perinatol 2021, 41, 173–178.
  114. Katznelson, G.; Gerke, S. The need for health AI ethics in medical school education. Adv Health Sci Educ Theory Pract 2021, 26, 1447–1458.
  115. Lin, M.; Vitcov, G.G.; Cummings, C.L. Moral equivalence theory in neonatology. Semin Perinatol 2022, 46, 151525.
Table 1. Examples of the current evidence of artificial intelligence application in neonatology.
| System | Aim | References | Artificial intelligence approach | Type of data analyzed | Outcome |
| --- | --- | --- | --- | --- | --- |
| Neurocritical Care | | | | | |
| Electroencephalography | Automated seizure detection | O'Shea et al. [27], Liu et al. [28], Gotman et al. [29], O'Shea et al. [30], Pavel et al. [31], Mathieson et al. [33] | DL detection models based on SVM system [27,30], scored autocorrelation moment analysis [28], spectral analysis [29], and algorithm for automated neonatal seizure recognition [31], utilizing the AUROC | Continuous EEG data | The developed seizure detection algorithms achieved a 56% relative improvement, sensitivity 81.3%-84%, specificity 84.4%-98%, overall accuracy of 93.3%-98.5%, and a false detection rate of 0.04-1.7/h |
| | Severity grading of neonatal HIE | Stevenson et al. [32], Raurale et al. [35], Moghadam et al. [36], Matic et al. [37], Pavel et al. [38] | ML classifier models of automated grading system based on a multi-class linear analysis [32], quadratic time-frequency distribution with a CNN [35], SVM, multilayer feedforward neural network or recurrent neural network [36], utilizing the Matthews correlation coefficient [38] and the AUROC | Continuous EEG and clinical data | ML models of automated grading system had an accuracy of 83%-97%. The clinical and qualitative-EEG model significantly had an MCC of 0.470. The performance for quantitative aEEG was MCC 0.381, AUROC 0.696, and for clinical and quantitative aEEG was MCC 0.384, AUROC 0.720 |
| | Sleep stage classification | Ansari et al. [34] | Convolutional neural network inception block (SINC) | EEG data | The SINC-based model significantly outperformed neonatal quiet sleep detection algorithms, with mean Kappa 0.75-0.77 |
| Magnetic Resonance Imaging | Automated segmentation and quantification of the PLIC | Grubel et al. [44] | CNN-based algorithm comprised of slice-selection modules and a multi-view segmentation model | MRI data | The method could identify a specific desired slice from the MRI volume data |
| | Combination of structural and functional networks | Ball et al. [47], Galdi et al. [48] | A multivariate analysis combining multiple imaging modalities [47], or utilizing morphometric similarity networks [48] | MRI and clinical data | The model confirmed the association between imaging markers of neuroanatomical abnormality and poor cognitive and motor outcomes. The regression model predicted postmenstrual age at scan with an absolute error of 0.70 weeks, and an accuracy of 92% |
| | Generation of reliable and accurate segmentation | Makropoulos et al. [49], Ding et al. [50] | AI model using a framework for accurate intensity-based segmentation [49], or DSC for each tissue type [50] | MRI data | The model achieved highly accurate results across a wide range of gestational ages. The dual-modality HyperDense-Net achieved the best DSC values. The single-modality LiviaNET processed T2W images better than T1W images. Both neural networks achieved previously reported performance |
| Respiratory System | RDS severity | Ahmed et al. [52], Raimondi et al. [53] | ML model using attenuated total reflectance Fourier transform infrared spectroscopy, calibration of principal component, and PLSR [52], or SVM regressor [53] | RDS biomarkers, lecithin and sphingomyelin (L/S ratio), and lung ultrasound both by visual and computer-assisted gray scale analysis | A three-factor PLSR model of second derivative spectra predicted L/S ratios with significant accuracy (R² 0.967). Visual assessment correlated with PaO2/FiO2 (r -0.55; p<0.0001) and the A-a gradient (r 0.59; p<0.0001). Oxygenation indices were associated with the gray scale analysis of lung ultrasound scans |
| | Prediction of BPD | Verder et al. [51], Dai et al. [54], Leigh et al. [55], Xing et al. [56], Laughon et al. [60], Patel et al. [61] | Model using SVM [51], logistic regression [55], XSEG-Net model combining digital image processing and human-computer interaction [56], C statistic [60], RF algorithm [61], and the AUROC | Perinatal, clinical, genetic, laboratory, X-ray imaging, and demographic data | An algorithm combining perinatal data and gastric aspirate analysis achieved a sensitivity of 88% and a specificity of 91%. The predictive model combining BPD risk gene sets and basic clinical risk factors showed discrimination with an AUROC of 0.915. The AI models showed AUROCs of 0.757-0.934. The deep CNN model had accuracy, precision, sensitivity, and specificity of 95.58%, 95.61%, 95.67%, and 96.98%, respectively. The C statistic for prediction was 0.793-0.854 |
| | Extubation readiness | Mueller et al. [57], Precup et al. [58], Mikhno et al. [59] | ML approach of ANN [57], SVM [58], using multivariate logistic regression and the AUROC | Multiple clinical and laboratory data, and measures of cardiorespiratory variability | The optimal models achieved an AUROC of 0.87-0.871, sensitivity of 70.1%, and specificity of 90%. AI predictive models compared well with the clinician's expertise and accurately classified infants who would fail extubation |
| | Automated detection of apneas | Varisco et al. [64] | Optimized algorithm for automated detection using logistic regression and the AUROC | ECG, chest impedance, and oxygen saturation signal features | The apnea detection model returned an AUROC of 0.88-0.90. Feature relevance was highest for features derived from the chest impedance |
| Ophthalmology | Automated diagnosis of ROP | Ataer-Cansizoglu et al. [66], Redd et al. [67], Wu et al. [68], Biten et al. [70], Brown et al. [71], Taylor et al. [72], Campbell et al. [73] | DL computer-based image analysis system (i-ROP) [66,67], occurrence network and severity network of ROP [68], telemedicine diagnoses [70], deep CNN algorithm [71], quantitative severity scale for ROP [73], calculating the AUROC, accuracy, sensitivity, and specificity | Retinal images | The i-ROP system had 95% accuracy for detecting preplus and plus disease. i-ROP had an AUROC of 0.960, 94% sensitivity, 79% specificity, 13% positive predictive value, and 99.7% negative predictive value for detecting type 1 ROP. OC-Net had AUROC, accuracy, sensitivity, and specificity of 0.90, 52.8%, 100%, and 37.8%, respectively, while SE-Net had 0.87, 68.0%, 100%, and 46.6%, respectively. Telemedicine had 78% sensitivity for zone I disease, 79% for plus disease, and 79% for type 2 ROP. The deep CNN algorithm had an AUROC of 0.94 for the diagnosis of normal and 0.98 for the diagnosis of plus disease, a sensitivity of 93%, and a specificity of 94%. The AI-based quantitative severity scale for ROP had an AUROC of 0.98, with 100% sensitivity and 78% specificity |
| Vital Signs | Detect artifacts | Tsien et al. [74] | Decision tree induction model | Multiple physiologic data signals | The classification system evaluating physiologic data may be a viable approach to detecting artifacts |
| | Predict overall mortality | Saria et al. [75] | Prediction algorithm (PhysiScore) based on a physiological assessment score | Apgar score and standard signals recorded noninvasively on admission | PhysiScore had 86% sensitivity and 96% specificity in predicting overall morbidity. PhysiScore had an accuracy of 90%-100% in predicting morbidity related to infection, and 96%-100% for morbidity related to cardiopulmonary events |
| | Temperature detection | Lyra et al. [77] | A combination of DL-based algorithms and camera modalities | Thermographic recordings | The detector showed a precision of 0.82. The evaluation of the temperature extraction revealed an absolute error of 0.55 °C |
| Gastrointestinal System | Prediction of spontaneous intestinal perforation | Son et al. [78] | AI model of ANN using receiver operating characteristic analysis | Clinical data | The ANN models showed an AUROC of 0.8797-0.8832 for predicting intestinal perforation |
| | Prediction of postnatal growth failure | Han et al. [80] | ML models of extreme gradient boosting, random forest, support vector machine, and convolutional neural network, using multiple logistic regression | Clinical data | The model showed an AUROC of 0.74 and an accuracy of 0.68 |
| Jaundice | Detection of jaundice | Althnian et al. [81], Guedalia et al. [82] | DL and ML model using a combined data analysis approach with the AUROC | Eye, skin, and fused images | DL models performed best with skin images. The ML diagnostic ability to evaluate the risk for jaundice was 0.748 |
| Sepsis | Prediction of EOS | Stocker et al. [85] | ML in the form of a random forest classifier | Risk factors, clinical signs, and biomarkers | The ML model achieved an AUROC of 83.41% and an area under the precision-recall curve of 28.42% |
| | Prediction of LOS | Cabrera-Quiros et al. [76] | ML approaches of logistic regressor, naive Bayes, and nearest mean classifier | Heart rate variability, respiration, and body motion data | Using a combination of all features, classification of LOS showed a mean accuracy of 0.79 and a mean precision rate of 0.82 three hours before the onset of sepsis |
| Patent Ductus Arteriosus | Detection of PDA | Na et al. [86], Gomez-Quintana et al. [87] | ML algorithms of an RF, a decision tree-based theory, an L-GBM, a low-bias model formed by combining sequential weak models with a light computational algorithm, a multilayer perceptron, a feedforward ANN, an SVM, using multiple logistic regression | Database of risk factors and heart sounds data | L-GBM achieved an accuracy of 0.77, an AUROC of 0.82, and a specificity of 0.84 in predicting PDA. The RF model achieved an accuracy of 0.85, an AUROC of 0.82, and a sensitivity of 0.97 in determining sPDA therapy. The ML system based on heart sounds reached an AUROC of 77% in detecting PDA |
| Neurodevelopmental outcome | Detection of neonates with cognitive impairment | Wee et al. [97], Ali et al. [101], Krishnan et al. [102] | Clustering coefficients of individual structures using SVM and canonical correlation analysis [97], self-training DNN [101], and ML using sparse reduced rank regression [102] | DTI tractography, brain functional connectome and cognitive assessment data, genome-wide SNP-based genotypes, and neurodevelopmental scales | The clustering coefficients of the DTI tractography were associated with internalizing and externalizing behaviors at 24 and 48 months of age. The self-training DNN model achieved an accuracy of 71.0%, a specificity of 71.5%, a sensitivity of 70.4%, and an AUROC of 0.75. SNPs in PPARG were significantly overrepresented, in introns or regulatory regions with predicted effects including protein coding and nonsense-mediated decay |
| | Detection of neonates at risk of language impairment | Vassar et al. [99], Valavani et al. [103] | Multivariate models with leave-one-out cross-validation and exhaustive feature selection [99], and RF classifier [103] | MRI diffusion tensor imaging and neurodevelopmental scales | The model based on DTI had 89% and 86% sensitivity and specificity for composite, 100% and 90% for expressive, and 100% and 90% for receptive language, respectively. The RF classifier model achieved an accuracy of 91%, a sensitivity of 86%, and a specificity of 96% |
| | Detection of neuromotor problems and risk of cerebral palsy | Balta et al. [104] | Tracking software of DeepLabCut using a k-means algorithm | Single videos of six points of interest (PoIs) on the infant's upper body | The results suggested that models may potentially be used for early identification of movement disorders |
| Mortality | Prediction of mortality | Podda et al. [105], Ambalavanan et al. [107], Hsu et al. [109], Do et al. [110], Moreira et al. [111], Nascimento et al. [112] | ML algorithms including ANN [105,107], RF, bagged classification, and regression tree model [109], methods including ANN, RF, and SVM [110], and a linguistic fuzzy model with minimum of Mamdani inference method [112], using logistic regression models and AUROC | Maternal, perinatal, clinical, and laboratory data | The ANN model had an AUROC of 0.85 for regression and 0.84 for neural networks, respectively. The RF model showed an AUROC of 0.939 for the prediction of neonates with respiratory failure, and the bagged classification and regression tree model demonstrated an AUROC of 0.915. The AUROCs of the models were 0.845 for ANN, 0.826 for RF, and 0.631 for SVM. The fuzzy model was able to capture the expert knowledge with strong correlation (r 0.96) |
1 DL, deep learning; SVM, support vector machine; AUROC, area under the receiver operating characteristic curve; EEG, electroencephalogram; HIE, hypoxic-ischemic encephalopathy; ML, machine learning; CNN, convolutional neural network; MCC, Matthews correlation coefficient; PLIC, posterior limb of internal capsule; MRI, magnetic resonance imaging; DSC, Dice similarity coefficient; RDS, respiratory distress syndrome; PLSR, partial least squares regression; A-a gradient, alveolar-arterial gradient; RF, random forest; ECG, electrocardiography; ROP, retinopathy of prematurity; OC-Net, occurrence network; SE-Net, severity network; ANN, artificial neural networks; EOS, early-onset sepsis; LOS, late-onset sepsis; PDA, patent ductus arteriosus; L-GBM, light gradient boosting machine; DNN, deep neural network; DTI, diffusion tensor imaging; SNP, single-nucleotide polymorphism.
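Many outcomes in Table 1 are reported as sensitivity, specificity, predictive values, and MCC. As a reading aid, the following minimal Python sketch shows how these quantities relate to confusion-matrix counts, and why a test with high sensitivity and specificity can still have a low positive predictive value when the condition is rare. The function names, the counts, and the 3.2% prevalence are illustrative assumptions, not values taken from any cited study; only the sensitivity (94%) and specificity (79%) are borrowed from the i-ROP row.

```python
import math


def _ratio(num, den):
    """Divide, returning NaN when the denominator is zero."""
    return num / den if den else float("nan")


def confusion_metrics(tp, fp, tn, fn):
    """Diagnostic metrics from confusion-matrix counts (hypothetical helper)."""
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        # MCC is conventionally set to 0 when any marginal total is zero.
        "mcc": (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0,
    }


def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV implied by sensitivity, specificity, and prevalence."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Made-up counts for 1000 screened infants, 32 of whom have the condition.
metrics = confusion_metrics(tp=30, fp=203, tn=765, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# Sensitivity and specificity as in the i-ROP row; the 3.2% prevalence is assumed.
ppv, npv = predictive_values(0.94, 0.79, prevalence=0.032)
print(f"PPV {ppv:.3f}, NPV {npv:.3f}")
```

Under the assumed prevalence, the implied predictive values come out close to the 13% and 99.7% listed for i-ROP, which suggests that the low positive predictive value there reflects the rarity of the condition rather than a weak classifier. MCC is included because, unlike accuracy, it uses all four cells of the confusion matrix and is less flattering on imbalanced data.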
Table 2. Challenges of artificial intelligence in neonatology.
| Challenges of artificial intelligence | Areas of improvement |
| --- | --- |
| Quality of the dataset | AI tools require high-quality data to be trained. Studies should address limitations including small sample sizes, improper management of missing information, and heterogeneity evaluation in various demographic subsets |
| Model performance evaluation | Model performance should be continually evaluated on the entire dataset. Apart from the area under the receiver operating characteristic curve, additional performance metrics, such as the precision-recall curve, specificity/sensitivity, and calibration metrics, should be assessed |
| Clinical impact and external validation | External validation is crucial because the tool's performance may degrade on a different dataset or in clinical practice owing to overfitting of the training data. Also, the effectiveness of AI should be evaluated in terms of calibration and discrimination quality as well as patient outcomes and the clinical workflow |
| Comprehending | Bedside models should enhance intelligence, interpretability, and transparency |
| Guidelines for critical evaluation, regulation, and oversight | Methodological and critical appraisal, handling of medicolegal problems, and necessary monitoring are required to guarantee the model's safe and effective usage |
| Ethics | Informed consent, bias, patient privacy, and allocation are among the ethical issues with health AI, and negotiating their solutions can be challenging. Important decisions in neonatology are often accompanied by a complex and difficult ethical component, and multidisciplinary methods are necessary for advancement |
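Table 2 recommends reporting the precision-recall curve and calibration metrics alongside the AUROC. The sketch below, in plain Python, illustrates one simple way each of these could be computed; it is not the procedure of any cited study. The labels and predicted probabilities are made-up toy values with two positives among ten cases, and the function names are hypothetical.

```python
def auroc(y_true, y_prob):
    """AUROC as the chance a random positive outranks a random negative."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Ties between a positive and a negative count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_prob):
    """Step-wise area under the precision-recall curve (assumes untied scores)."""
    ranked = sorted(zip(y_prob, y_true), reverse=True)
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive case")
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / n_pos


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def calibration_in_the_large(y_true, y_prob):
    """Mean predicted risk minus observed event rate (0 is ideal)."""
    return sum(y_prob) / len(y_prob) - sum(y_true) / len(y_true)


# Toy data: two events among ten cases (illustrative only).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_prob = [0.05, 0.10, 0.10, 0.15, 0.20, 0.30, 0.35, 0.60, 0.40, 0.80]

print(f"AUROC: {auroc(y_true, y_prob):.3f}")
print(f"Average precision: {average_precision(y_true, y_prob):.3f}")
print(f"Brier score: {brier_score(y_true, y_prob):.3f}")
print(f"Calibration-in-the-large: {calibration_in_the_large(y_true, y_prob):+.3f}")
```

Discrimination (AUROC, average precision) and calibration (Brier score, calibration-in-the-large) answer different questions: the first asks whether cases are ranked correctly, the second whether predicted risks match observed frequencies. With rare outcomes, the precision-recall summary can fall well below the AUROC, which is one reason for reporting both.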
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.