3. Artificial Intelligence and its subfields
Artificial Intelligence (AI) is the field of study that focuses on how computers learn from data and on the development of algorithms that enable this learning[3].
AI involves numerous applications capable of processing information in non-conventional ways[9]. To achieve the best performance, AI requires the management of large datasets known as “Big data”[10]. “Big data” is a term introduced in the 1990s to describe datasets too large to be managed by common software[11]. The vast amount of information about patients' health in massive digital archives is the source of big data in healthcare; in fact, recent years have seen a progressive shift from paper-based to digitized data[12,13].
Big data in healthcare can be characterized by up to six main features, the so-called “6 Vs”, according to different authors:
Volume: the continuous and exponentially growing flow of data, spanning from personal medical records to 3D imaging, genomics, and biometric sensor readings, must be carefully managed[12]. Innovations in data management, such as virtualization and cloud computing, are enabling healthcare organizations to store and manipulate large amounts of data more efficiently and cost-effectively[14];
Velocity: the prompt and rapid transmission of data is pivotal nowadays, especially in scenarios like trauma monitoring, anesthesia in operating rooms, and bedside heart monitoring, where timely data analysis can be life-saving[12]. Moreover, future applications, such as early infection detection and targeted treatments based on real-time data, have the potential to notably decrease morbidity and mortality and ultimately improve outcomes[14,15];
Variety: the ability to analyze large datasets, including multimedia and unstructured formats, represents an innovation in healthcare[12]. The wide range of structured, unstructured, and semi-structured data analyzed stands as a revolutionary change that adds complexity to healthcare data management[16]. Structured data can be easily stored, recalled, and manipulated by machines. It comes from a variety of sources, including diagnoses, medications, instrument readings, and lab values, and can be sorted into numeric or categorical fields for easy analysis[12,17]. Unstructured data is commonly generated at the point of care and includes free-form text, such as medical notes or discharge summaries, and multimedia content, such as imaging[12,17]. The main challenge is to transform this data to make it suitable for AI analysis, but this process faces some obstacles. First, adding structure to unstructured data requires healthcare providers to manually review charts or images, sort the information out, and enter it into the system[18]. This makes the process slow, inefficient, and prone to bias. New powerful tools such as Natural Language Processing can speed up and streamline the information extraction process[17] (a minimal extraction sketch follows this list). Secondly, healthcare professionals' preference for the natural-language simplicity of handwritten notes remains a major barrier to the widespread adoption of electronic health records, which require field coding at the point of care to provide structured inputs[12].
Variability: the consistency of data over time[16]. Data flows are unpredictable; they change often and vary widely. It is essential to know how to manage daily, seasonal, and event-driven data spikes[19].
Veracity: ensuring that big data is accurate and trustworthy is critical in healthcare, where accurate information can mean the difference between life and death[12]. Nevertheless, achieving veracity faces challenges, including variable quality and difficulties in ensuring accuracy, especially with handwritten prescriptions.
Value: the worth of information to various stakeholders or decision makers[20].
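As a toy illustration of how structure can be added to free-text clinical data, the following Python sketch uses simple regular expressions as a crude stand-in for the Natural Language Processing tools mentioned above; the note text, patterns, and field names are all hypothetical.

```python
import re

# A toy clinical note (unstructured free text); entirely invented.
note = "Pt febrile, T 39.2 C. Given paracetamol 250 mg PO. Discharge planned."

structured = {}

# Extract a temperature reading into a numeric field.
temp = re.search(r"T\s*(\d+(?:\.\d+)?)\s*C", note)
if temp:
    structured["temperature_c"] = float(temp.group(1))

# Extract a drug name and dose into categorical/numeric fields.
dose = re.search(r"([a-z]+)\s+(\d+)\s*mg", note, flags=re.IGNORECASE)
if dose:
    structured["drug"] = dose.group(1)
    structured["dose_mg"] = int(dose.group(2))

print(structured)  # {'temperature_c': 39.2, 'drug': 'paracetamol', 'dose_mg': 250}
```

Production NLP pipelines replace such hand-written rules with trained language models, but the goal is the same: turning free text into the numeric or categorical fields that structured analysis requires.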
Big data includes clinical data sourced from Computerized Physician Order Entry (CPOE) and Clinical Decision Support (CDS) systems, patient information stored in electronic patient records (EPRs), and machine-generated/sensor data, including vital sign monitoring[12]. Big data analytics may improve care and reduce costs by identifying connections and understanding patterns and trends among different items[21]. Indeed, it could potentially enhance healthcare outcomes through information elaboration, healthcare provider guidance, identification of candidates for preventive care, and disease profiling[12].
In conventional healthcare analytics, analysis is typically performed using easy-to-use business intelligence tools on stand-alone systems; in big data analytics, by contrast, the processing of large datasets is distributed across multiple nodes, requiring a shift in user interfaces[22]. While traditional analytics tools are easy to use and transparent, the new tools are complex, programming-intensive, and require different skill sets to be used effectively.
To guarantee adequate output, this huge amount of data has to be verified by reliable tools.
Blockchain is a technology characterized by the decentralization of entries: inputs are agreed upon by a peer-to-peer network through various consensus protocols, rather than a central authority controlling the content[23]. Furthermore, many blockchains offer anonymity or pseudo-anonymity[23,24]. For healthcare data management specifically, these features ensure data security and privacy through a network of secure blocks linked by cryptographic protocols[23,25]. Another key feature of blockchain is persistency: once data is inserted into a block and added to the chain, it cannot be deleted[23]. This implies that if inaccurate data is added to the blockchain, it becomes a permanent part of the ledger. Thus, it is important to ensure the accuracy of data before adding it to the blockchain[26].
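The persistency described above follows from the cryptographic linking of blocks. The following minimal Python sketch models hash-linking only (consensus protocols, networking, and anonymity are not modeled) and shows how altering any stored record breaks the chain and is therefore detectable:

```python
import hashlib
import json

def block_hash(block):
    """Hash a block's contents (previous hash included) with SHA-256."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def add_block(chain, record):
    """Append a new block linked to the previous one by its hash."""
    prev = chain[-1]["hash"] if chain else "0" * 64
    block = {"record": record, "prev_hash": prev}
    block["hash"] = block_hash(block)
    chain.append(block)

def verify(chain):
    """Recompute every hash; any tampered block breaks the chain."""
    for i, block in enumerate(chain):
        body = {"record": block["record"], "prev_hash": block["prev_hash"]}
        if block["hash"] != block_hash(body):
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True

chain = []
add_block(chain, {"patient": "anon-001", "event": "consultation"})
add_block(chain, {"patient": "anon-001", "event": "lab result"})
print(verify(chain))                     # True
chain[0]["record"]["event"] = "altered"  # tampering with a stored record...
print(verify(chain))                     # False: the alteration is detected
```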
The integration of AI and blockchain is promising: AI tools could leverage a secure, immutable, and decentralized system for storing the sensitive data required by AI-driven techniques[25]. An integration of AI and blockchain in the metaverse has been proposed in order to provide digital healthcare through realistic interactions[27]. With blockchain guaranteeing data security and privacy, healthcare providers and patients engage in consultations in a virtual environment: participants are represented by avatars, and consultation data is securely recorded and stored on the blockchain[27]. This data is then used by explainable AI models to predict and diagnose diseases, ensuring logical reasoning, trust, transparency, and interpretability in the diagnostic process[27].
Machine learning (ML), a subset of computer science and artificial intelligence, seeks to identify patterns in data to boost the effectiveness of various tasks[28]. In healthcare, ML uses automated, adaptive, and computationally advanced techniques to recognize patterns within complex data structures[12]. ML models improve their performance by means of a continuous auto-training process[13,14]. This approach differs from “traditional” methods and explicit human programming, which rely on certain statistical assumptions and require a predefined set of dimensions, functional relationships, and interactions[12,14], constraints that ML largely avoids.
To develop a reliable ML model, accurate training datasets are required; therefore, a preprocessing phase is usually needed[3]. Most of the data is used to train the model, with preliminary analyses performed to identify the strongest relationships between variables and study outcomes. The remaining data can be used for internal validation. At this stage, the model can be tested on different datasets[3] (a minimal preprocessing sketch is shown below).
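To make the preprocessing phase concrete, here is a minimal sketch in Python with scikit-learn; the features and values are invented, and the key point is that imputation and scaling are fitted only on the training split so that validation data does not leak into the model.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical tabular dataset: rows are patients, columns are features
# (e.g., age, heart rate, a lab value); np.nan marks missing entries.
X = np.array([[2.0, 120.0, np.nan],
              [5.0, np.nan, 7.1],
              [9.0, 95.0, 6.4],
              [12.0, 88.0, 5.9]])
y = np.array([1, 0, 0, 1])  # toy outcome labels

# Hold out data for internal validation before any fitting.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Impute missing values and standardize, fitting ONLY on the training split.
imputer = SimpleImputer(strategy="mean").fit(X_train)
scaler = StandardScaler().fit(imputer.transform(X_train))

X_train_ready = scaler.transform(imputer.transform(X_train))
X_val_ready = scaler.transform(imputer.transform(X_val))
```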
ML-aided tasks have already been incorporated into clinical practice, especially in imaging interpretation[29,30,31]. Although still imperfect and in need of a skilled supervisor, they are considered acceptable when rapid image feedback is needed and local expertise is lacking[9]. A growing number of applications have been developed; some of them, combining clinical, genetic, and laboratory items, are able to detect rare or common conditions that would otherwise be missed[9]. ML is divided into three branches, selected according to the research task at hand[32,33]: supervised ML for prediction, unsupervised ML for description, and reinforcement learning for causal inference.
Supervised ML (SML) is a predictive model, designed to estimate the likelihood of an event occurring[3]. The predictive analytics applied span from basic computations, such as correlation coefficients or risk differences, to advanced pattern recognition techniques and supervised learning algorithms like random forests and neural networks, which serve as classifiers or predict the joint distribution of multiple variables[33]. The supervised ML development process involves three subsets of data. First, a training set of labeled data (e.g., histological specimens that have already been labeled as normal or diseased by a human expert) is provided for the algorithm to learn from by adjusting weights to minimize the loss function, which calculates the distance between the predicted and true outcome for a given data point[32,34]. Next, the model parameters are optimized using a second, validation set[32]. The validation set can also detect overfitting, which is observed when model performance is significantly better on the training set. Finally, a third set is used to evaluate the model's ability to generalize to new datasets[32]. Once training on labeled data is completed, the system can be applied to unlabeled data. The trained models then predict outcomes through either classification or regression, for categorical or continuous data respectively[34] (see the sketch below).
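The three-subset workflow can be sketched in a few lines of Python with scikit-learn; the synthetic data stands in for expert-labeled specimens, and the hyperparameter grid is arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic labeled data standing in for, e.g., expert-labeled specimens.
X, y = make_classification(n_samples=600, n_features=10, random_state=0)

# Three subsets: training, validation (tuning / overfitting check), and test.
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

# Choose the tree depth that performs best on the validation set.
best_depth, best_acc = None, 0.0
for depth in (2, 5, 10):
    model = RandomForestClassifier(max_depth=depth, random_state=0).fit(X_train, y_train)
    acc = accuracy_score(y_val, model.predict(X_val))
    if acc > best_acc:
        best_depth, best_acc = depth, acc

# A large gap between training and validation accuracy would signal overfitting.
final = RandomForestClassifier(max_depth=best_depth, random_state=0).fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, final.predict(X_test)))
```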
Decision trees (DTs) are non-parametric supervised learning algorithms used for classification[35,36]. They map attribute values to classes and have a tree-like structure, including a root node, branches, internal nodes (or decision nodes), and leaf nodes (or terminal nodes)[36,37]. Nodes conduct evaluations based on the available information, and each node is linked to two or more subtrees or to leaf nodes labeled with a class, representing a possible outcome[36,37]. Various types of DTs can be used for both classification and regression tasks[36]. DTs are often preferred over other methods in fields such as healthcare because of their interpretability, despite being less accurate: they are easier to understand and explain than more complex methods that might be more accurate but remain largely uninterpretable[35] (a minimal example follows).
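This interpretability can be made tangible in a few lines of Python; in this sketch a public scikit-learn dataset stands in for clinical records, and the fitted tree is printed as human-readable if/else rules.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

# A public tabular dataset stands in for clinical records.
data = load_breast_cancer()

# A shallow tree: root node, internal decision nodes, class-labeled leaves.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(data.data, data.target)

# Print the tree as plain if/else rules a clinician can read and audit.
print(export_text(tree, feature_names=list(data.feature_names)))
```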
Unsupervised ML (UML) is used for descriptive tasks, with the goal of clustering data and revealing relationships within a data structure[33]. Descriptive tasks provide quantitative summaries of specific features in a certain scenario and require analytics ranging from simple calculations to complex techniques[33]. The main goal of unsupervised learning is to identify inherent groupings or clusters within a data structure, in order to uncover differences, similarities, and distributions in feature space[3,28,32]. In unsupervised ML systems, training is data-driven rather than human-driven and uses unlabeled data (in contrast to supervised ML, whose training relies on labeled data and is guided by human experts)[33]. This category lacks a guiding response variable during analysis[28].
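As an illustration, the following Python sketch feeds unlabeled synthetic data to k-means, one common clustering algorithm (chosen here for brevity, not because any cited study uses it), which recovers the two underlying groups without ever seeing a label:

```python
import numpy as np
from sklearn.cluster import KMeans

# Unlabeled data: two synthetic "patient groups" in a 2-D feature space.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
group_b = rng.normal(loc=[3.0, 3.0], scale=0.5, size=(50, 2))
X = np.vstack([group_a, group_b])

# k-means receives no labels; it discovers the groupings from the data alone.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.cluster_centers_)              # centers near [0, 0] and [3, 3]
print(kmeans.labels_[:5], kmeans.labels_[-5:])  # the two groups get distinct labels
```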
Reinforcement Learning (RL) is a computational approach in which an agent learns to achieve a goal through a trial-and-error cycle in an interactive environment[32]. The agent's decision-making strategy is improved through its interaction with the environment[38]. The goal of RL is the selection of actions that will maximize future rewards[38]. This is achieved through iterative learning cycles resulting in a reward or a penalty in relation to a pre-defined target[34]. For instance, since blood glucose concentration must be monitored and the ideal timing and amount of insulin delivery must be determined in diabetic patients, RL algorithms are potentially capable of learning the individual glucose pattern of a diabetic patient in order to provide adaptive drug supply after a learning process[39]. Changes in the glucose level lead to an action of the agent, in terms of insulin injection or no treatment. Subsequently the agent receives a numerical reward, which, along with the next glucose level, will influence the next action[39].
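A deliberately crude Python sketch of this loop uses tabular Q-learning over three discretized glucose states; the dynamics and rewards are invented for illustration and have no clinical validity.

```python
import random

random.seed(0)

# Toy states: discretized glucose levels; toy actions: insulin or no treatment.
STATES = ["low", "normal", "high"]
ACTIONS = ["no_treatment", "insulin"]

def step(state, action):
    """Crude stand-in for a patient's glucose dynamics (NOT a clinical model)."""
    if state == "high":
        next_state = "normal" if action == "insulin" else "high"
    elif state == "low":
        next_state = "low" if action == "insulin" else "normal"
    else:  # normal
        next_state = "low" if action == "insulin" else random.choice(["normal", "high"])
    reward = 1.0 if next_state == "normal" else -1.0  # target: stay in range
    return next_state, reward

# Tabular Q-learning: estimate action values from trial and error.
Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
alpha, gamma, epsilon = 0.1, 0.9, 0.2
state = "normal"
for _ in range(5000):
    # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
    action = (random.choice(ACTIONS) if random.random() < epsilon
              else max(ACTIONS, key=lambda a: Q[(state, a)]))
    next_state, reward = step(state, action)
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    state = next_state

# The learned policy should inject insulin when glucose is high, not otherwise.
for s in STATES:
    print(s, "->", max(ACTIONS, key=lambda a: Q[(s, a)]))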
AI’s subsets are represented in Figure 1. ML’s subtypes and algorithms are summarized in Figure 2.
Deep Learning (DL) refers to an even more complex subgroup of ML based on numerous processing layers, which may use supervised, unsupervised, and reinforcement ML applications[34,40]. In a certain way, it mimics how the human brain builds its own model of the world by learning from large amounts of sensory-motor data acquired through interactions with the environment[38]. DL differs from ML in a number of characteristics. ML requires “manual” feature extraction and processing[41]; it reaches a “plateau” where the quality of performance no longer increases with the volume of data; and its training time is comparatively limited[32]. DL, on the other hand, is capable of automatically learning feature hierarchies; it requires a significant amount of data to make predictions, and, because it is more computationally intensive than ML, it may require longer training times and state-of-the-art machines to run[32,34]. The complex architecture of DL, consisting of several processing layers that are mostly inaccessible to human users (the so-called “black box” of AI), may pose an issue for the model’s accountability in healthcare[34].
The application of these models is potentially limitless, though not without risk.
Generative AI (GAI) is a type of DL technique that generates realistic facsimiles by evaluating training examples and learning their patterns and distribution[42]. GAI can produce various types of content from existing sources such as text, audio, images, and video[42]. One well-known example of its application is ChatGPT, an AI-driven chatbot whose potential for supporting medical research and clinical practice is currently being assessed[43]. ChatGPT is based on Generative Pre-trained Transformers (GPT, a type of DL model that enables natural language processing), which generate human-like text based on large amounts of data[9,44]. A 2023 systematic review evaluated its utility in several fields, from data analysis, literature review, scientific writing, and medical record storage and management up to generating diagnoses and management plans[45]. Frequently raised issues concerning generative AI in academic writing include bias, plagiarism, privacy, and legal concerns, up to scientific fraud (e.g., fake image synthesis or convincing fraudulent articles resembling genuine scientific papers)[46,47,48]. Therefore, the World Association of Medical Editors advises authors and editors to disclose chatbot use in their work and to equip themselves with tools for detecting AI-generated content[34,42]. Furthermore, GAI can also generate non-textual items (images, videos, and audio).
Finally, when applied to the complex analysis of high-dimensional data, including clinical applications, DL can achieve remarkable outcomes[40], e.g., computer-assisted diagnosis of melanoma: a deep convolutional neural network trained on images has achieved performance comparable to dermatology experts in identifying keratinocyte carcinomas and malignant melanomas[49].
Neural Networks (NNs) are the baseline architecture of DL models[50]. They are structured in multiple layers consisting of neuron-like interconnected nodes[32] (Figure 3a and 3b). Once inserted, data flows from the first layer through the structure of interconnected nodes in a “forward propagation” mechanism[32]. The signal received by each node is a weighted linear combination of the node outputs from the prior layer: each output is multiplied by the weight assigned to its connection, and the products are summed. A nonlinear transformation is then applied to the node’s output[32,50]. The final result from the output layer is compared to the true value, and a “back propagation” algorithm optimizes results by using the prediction error to adjust the weights[32]. Because NNs are highly parametrized, they may “over-fit” models to data; thus, a series of regularization strategies have been implemented to prevent this[40]. To name one, dropout is a regularization technique in which random neurons are dropped, along with their connections, during training[51]. This prevents units from co-adapting too much and helps the network generalize better to unseen data (see the sketch below).
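These mechanics can be sketched minimally in Python with PyTorch: one hidden layer, a nonlinear activation, dropout, and a training loop on invented data (layer sizes and hyperparameters are arbitrary choices, not taken from any cited study).

```python
import torch
from torch import nn

torch.manual_seed(0)

# A small fully connected network: input layer, one hidden layer with a
# nonlinear activation, dropout for regularization, and an output layer.
model = nn.Sequential(
    nn.Linear(8, 16),   # weighted linear combination of the inputs
    nn.ReLU(),          # nonlinear transformation of each node's output
    nn.Dropout(p=0.5),  # randomly silence half the hidden units during training
    nn.Linear(16, 1),
)

X = torch.randn(64, 8)  # toy inputs
y = torch.randn(64, 1)  # toy targets
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(100):
    optimizer.zero_grad()
    prediction = model(X)          # forward propagation
    loss = loss_fn(prediction, y)  # distance between predicted and true values
    loss.backward()                # back propagation of the prediction error
    optimizer.step()               # adjust the connection weights

model.eval()  # disables dropout at inference time
```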
All AI models used in clinics and research are summarized in Table 1.
5. Discussion
Although further validation is required, AI might represent a useful supporting tool for PED decision-making and has the potential to improve the timely allocation of resources and interventions[72]. Recognizing and addressing a series of barriers is crucial for the safe development of efficient AI tools[107,108]. The first key challenge is ensuring accurate input for AI. Expert clinicians ought to select the features to be included in datasets for ML training, because inaccurate training datasets lead to suboptimal diagnostic accuracy[109]. Subsequent ML iterations may amplify these errors, reinforcing biases introduced in the early phases[32]. In most cases, monocentric datasets fail to adequately capture the heterogeneity of pediatric conditions, particularly in the Emergency Department. Pediatric populations accessing different PEDs may differ substantially, particularly when comparing rural and urban PEDs, leading to significant biases in the training phase; failure to account for this can lead to misdiagnosis[110]. Additionally, the peculiarities of the emergency-urgency network of a given region may make some centers the reference hubs for a subset of pathologies (e.g., neurological disorders), leading to an overrepresentation of certain conditions in the dataset.
These datasets should be the product of a multidisciplinary team of professionals working towards a predefined goal. As the well-known adage goes, “garbage in, garbage out”: input datasets are crucial in determining the final outcome[111]. Ensuring data quality requires a dedicated infrastructure for patient health records, diagnostic images, and real-time data monitoring. Relying excessively on paper-based systems and operator-built databases introduces several potential sources of error, potentially reducing the final accuracy and reproducibility of the algorithms[112]. Confirmation bias is another item to be considered: AI recognizes the patterns it has been trained on, but is unlikely to identify what it has not been taught[113].
For this reason, ensuring external validation is vital for quality control: when applied to an external dataset, a given clinical support tool may prove neither sensitive nor specific[91]. Besides, an algorithm's performance can quickly degrade as clinical practices evolve; hence, a continuous influx of data is pivotal to refine the model and keep it current[32].
Some of these limitations are particularly true for pediatrics. Children account for only a marginal portion of healthcare resources, and their datasets are frequently small[3]. Granular data (i.e., data structured with a high level of detail), ideal for ML applications, is rarely available in pediatrics. Striking the right balance between granularity and simplicity is a key factor in optimizing AI performance and ensuring meaningful outcomes from complex datasets[114]. Furthermore, in children data is not homogeneously distributed: some data sharply prevail over others, owing to significant variations in features according to the patient’s age. AI models ought to account for the variations and changes in disease risk that occur with age[115]. Vital and auxological parameters have to be interpreted according to age as well, and even radiologic images need to be read taking into account age and the resulting changes in anatomy, physiopathology, and possible differential diagnoses. All the previously mentioned issues can make the validation of an automated analysis challenging[116].
The absence of evidence-based guidelines and the variability in care are other factors limiting the application of AI in pediatrics. In fact, a wide range of pediatric conditions has no universally shared gold standard of care, and treatment varies significantly among institutions[117,118].
The integration of AI and ML into pediatric emergency wards demands a workforce that is proficient in these technologies. Healthcare professionals must be upskilled or reskilled to understand the capabilities and limitations of AI and ML, interpret the outputs of these systems, and integrate this information into clinical decision-making processes[119]. Training programs and continuous education initiatives are essential to equip healthcare professionals with the knowledge and skills needed to work alongside AI and ML technologies effectively[120]. This not only enhances the quality of patient care but also ensures that healthcare professionals can remain competitive and adapt to the evolving landscape of healthcare technology.
Most of the studies retrieved by our review were conducted outside the European Union. This can be partly linked to the scrutiny that AI-assisted software and medical devices receive in the EU, e.g., compared to the United States. The EU’s recently approved AI Act[121] categorizes AI as “high risk” when it is implemented in health and care; this wary approach is consistent with the General Data Protection Regulation and similar norms, which impose stricter regulations on the use of health data for research and training purposes[122]. Most of the studies we reviewed were conducted in the U.S.A., whose Food and Drug Administration has issued several documents defining AI and ML software as “medical devices”[123] and providing additional guidance on good practices for developing them. A recent scoping review reported similar findings for the broader field of clinical trials[124].
Finally, AI poses ethical questions, especially regarding liability. Will pediatricians be held responsible for outcomes that do not fit predictions? At present, clinicians may be liable for harm to patients if they follow AI indications to use nonstandard care methods, since current law only protects doctors from liability when they adhere to the standard of care. Nevertheless, as AI becomes integrated into gold standards, we may speculate that physicians would avoid liability when following AI indications[125].