Artificial Intelligence-Based Algorithms in Medical Image Scan Segmentation and Intelligent Visual-Content Generation

Preprint

Review

Zofia Rudnicka,

Janusz Szczepanski

Agnieszka Pregowska^*

This version is not peer-reviewed

Submitted:

15 November 2023

Posted:

16 November 2023

Abstract

Recently, Artificial Intelligence (AI)-based algorithms have revolutionized medical image segmentation. The precise segmentation of organs and their lesions may contribute to a more efficient diagnostic process and a more effective selection of targeted therapies, as well as increase the effectiveness of the training process. In this context, AI may contribute to the automation of the image scan segmentation process and increase the quality of the resulting 3D objects, which may lead to the generation of more realistic virtual objects. In this paper, we focus on AI-based solutions applied in medical image scan segmentation and intelligent visual-content generation, i.e. computer-generated three-dimensional (3D) images in the context of Extended Reality (XR). We consider the different types of neural networks used, with a special emphasis on the learning rules applied, taking into account algorithm accuracy and performance, as well as open data availability. This paper attempts to summarize the current development of AI-based segmentation methods in medical imaging and intelligent visual content generation that are applied in XR. It concludes with open challenges and future lines of research and development of Artificial Intelligence applications, both in medical image segmentation and in Extended Reality-based medical solutions.

Keywords:

Subject: Computer Science and Mathematics - Artificial Intelligence and Machine Learning

1. Introduction

The human brain, a paramount example of evolutionary biological sophistication, transcends its anatomical categorization. Constituted by an estimated 86 billion neurons linked through an intricate web of synapses (numbering in the trillions), it is the epicenter of our cognitive, emotional, and consciousness-related functions [1]. This structure of the central nervous system represents a nexus of myriad neurobiological processes, overseeing sensory input conversion, motor responses, and advanced cognitive functions. As a product of evolutionary adaptations spanning millions of years, the brain epitomizes neurobiological optimization, combining complex neural circuitry with higher-order cognitive functions such as reasoning, emotional homeostasis, and the processes of memory encoding, storage, and retrieval [1,2,3]. Thus, the human brain is a super-complex system whose functioning and intelligence depend on the types of neurons (according to their role in the brain), their connections, and the way energy is supplied to them, rather than on the number of neurons alone [2]. It is an ideal reference model for the foundations of Artificial Intelligence (AI) [3,4].

The processing and analysis of biomedical data for diagnostic purposes is a multidisciplinary field that combines AI, Machine Learning (ML), biostatistics, time series analysis, as well as statistical physics and algebra (e.g. graph theory) [3]. Variables derived from biomedical phenomena can be described in several ways and in different domains (time, frequency, spectral values, spaces of states describing the biological system), depending on the characteristics and type of signal. Effective diagnosis of the early stages of a disease, as well as the determination of disease development trends, is a very difficult issue that requires taking into account many factors and parameters. As a result, the state spaces of biomedical signals are huge and impossible to fully search, analyze, and classify, even with the use of powerful computational resources. Therefore, it is necessary to use Artificial Intelligence, in particular bio-inspired AI methods, to limit research to a smaller but significant part of the state space.

Recently, computer-generated three-dimensional (3D) images have become increasingly important in medical diagnostics [5,6]. In particular, Extended Reality (XR), the so-called Metaverse, is increasingly used in health care and medical education, since it enables a deeper experience of the virtual world, especially through the development of depth perception, including the rendering of several modalities such as vision, touch, and hearing [7]. Medical images come in different modalities, and their accurate classification at the pixel level enables the accurate identification of disorders and abnormalities [7,8]. However, creating a 3D model of organs and/or their abnormalities is time-consuming and is often done manually or semi-automatically [10]. AI can automate this process and also contribute to increasing the quality of the resulting 3D objects [11,12] as well as of the visual content in the Metaverse [4,13]. To give users a real sense of visual immersion, developers should create virtual objects of high quality [14]. In the context of medicine, this requires good-quality medical data and high-accuracy classification/segmentation algorithms to faithfully reproduce the content in three virtual dimensions.

For that reason, this paper focuses on the overview of Artificial Intelligence-based algorithms in medical image scan segmentation and intelligent visual content generation in Extended Reality, including different types of neural networks used and learning rules, taking into account mathematical/theoretical foundations, algorithm accuracy, and performance, as well as open data availability.

2. Materials and Methods

The review methodology was based on the PRISMA Statement [15] and its extension PRISMA-S [16]. We considered recent publications, reports, protocols, and review papers from the Scopus and Web of Science databases. The following keywords and their variations were used: Artificial Intelligence, Machine Learning, Extended Reality, Mixed Reality (MR), Virtual Reality (VR), Metaverse, learning algorithms, learning rules, signal classification, signal segmentation, medical image scan segmentation, segmentation algorithms, and classification algorithms. The selected sources were analyzed in terms of compliance with the analyzed topic, and then in terms of their contribution to medical image scan segmentation. First, the titles and abstracts obtained were independently evaluated by the authors. Duplicated records were removed. Moreover, we adopted inclusion criteria such as publication in the form of journal papers, books, and proceedings, as well as technical reports. The search was limited to full-text articles in English, including electronic publications ahead of print. Exclusion criteria, such as Ph.D. theses and materials not related to medical image scan segmentation and Artificial Intelligence-based algorithms, were also adopted. Subsequently, articles meeting the criteria were retrieved and analyzed. The documents used in this study were selected based on the procedure presented in Figure 1. Finally, 162 documents were taken into account.

3. Neural communication

Neurons, which are basic brain building blocks, function as the core computational units of the brain, underpinning the vast expanse of conscious and subconscious processes, and defining our neural identity with each electrochemical interaction [11]. The quintessence of neural communication is synaptic transmission. At these specialized junctions, the presynaptic neuron releases neurotransmitters, a diverse group of chemicals, into the synaptic cleft. Following the release, these compounds traverse the synaptic gap, interacting with receptors on the postsynaptic membrane, eliciting a series of intracellular events, potentially leading to the generation of an action potential, a transient depolarizing event propagated along the neuronal membrane. The multifunctionality of neurons is evident across physiological domains. While some mediate rudimentary autonomic functions, such as cardiac rhythm regulation, others participate in higher-order cognitive tasks, encompassing analytical reasoning and conceptual abstraction. Based on their anatomical localization and associated circuits, neurons can modulate affective states, dictate cognitive strategies, and contribute to individual behavioral phenotypes. Moreover, the synaptic connections between neurons exhibit plasticity, an inherent ability to modify their strength or form novel connections, representing experiential and learning-based adaptations. Such neural plasticity underscores the capacity for cognitive and behavioral adaptability, ensuring the brain's functional flexibility across an individual's lifespan.

Since the famous experiments of Adrian [17], it has been assumed that in nervous systems (including the brain) information is transmitted through weak electrical signals (on the order of 100 mV), in particular by means of action potentials (spikes), which are transient, sudden (1-2 millisecond) changes in the membrane potential of the cell/neuron associated with the transmission of information [18]. The stimulus for the creation of an action potential is a change in the electric potential in the cell's external environment. A propagating action potential is called a nerve impulse. In the literature [19,20], it is assumed that sequences of such action potentials, called spike trains, play a key role in the transmission of information, and that the times at which these action potentials appear play a significant role. Mathematically, such time sequences can be, and are, modeled, in particular after digitalization, as trajectories (or their variants) of certain stochastic processes (Bernoulli, Markov, Poisson, ...) [19,21,22,23,24,25,26,27].

4. Taxonomy of neural network applied in the medical image segmentation process

Artificial Neural Networks (ANNs) are constructed with the perceptron neuron model [28], which is based on a binary decision rule. If the sum of the input signals x_i (the components of the input vector), linearly weighted by w_i, reaches the threshold t_hr, the neuron fires (i.e. the output is equal to 1); otherwise, the output is equal to 0.

The basic input function is described as follows

f (x) = \begin{cases} 1, & \text{if } w_{1} x_{1} + w_{2} x_{2} + \dots + w_{n} x_{n} \geq t_{hr} \\ 0, & \text{otherwise} \end{cases}

(1)
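The decision rule of Equation (1) can be illustrated with a minimal Python sketch. The function name `perceptron` and the weights and threshold used for the logical AND example are illustrative assumptions, not values taken from the reviewed literature.

```python
def perceptron(inputs, weights, threshold):
    """Binary decision rule of Equation (1): output 1 if the weighted sum
    of the inputs reaches the threshold, otherwise 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else 0


# Hypothetical weights and threshold that realise a logical AND of two binary inputs.
and_weights = [1.0, 1.0]
and_threshold = 1.5
outputs = [perceptron([a, b], and_weights, and_threshold)
           for a in (0, 1) for b in (0, 1)]
```

With these assumed parameters the neuron fires only when both inputs are active, which shows how the choice of weights and threshold fixes the decision boundary.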

The output vector of all neurons in the l-th layer can be expressed as a combination of a linear transformation and a non-linear mapping (i.e. the ANN activation values) [29]

a^{l} = h (W^{l} a^{l - 1}), \quad l = 1, \dots, M

(2)

where W^l is the weight matrix between layers l - 1 and l, h(·) denotes the activation function, in this case the Rectified Linear Unit (ReLU), f(x) = x^+ = max(0, x), and the vector a^l denotes the output of all neurons in the l-th layer. Formula (2) follows the notation of publication [29].

Neuron models from the Integrate-and-Fire family are among the simplest, yet also the most frequently used. They are classified as spiking models. From a biophysical point of view, action potentials are the result of currents flowing through ion channels in the membrane of nerve cells. The Integrate-and-Fire neuron model [30,31] focuses on the dynamics of these currents and the resulting changes in membrane potential. Therefore, despite numerous simplifications, these models can capture the essence of neuronal behavior in terms of dynamical systems.

The concept of Integrate-and-Fire neurons is the following: the input ion current depolarizes the neuron's cell membrane, increasing its electrical potential. An increase in potential above a certain threshold value U_thr produces an action potential (i.e. an impulse in the form of a Dirac delta), and then the membrane potential is reset to the resting level. The leaky Integrate-and-Fire (LIF) neuron model [30,31] is an extension of the Integrate-and-Fire neuron in which the issue of time-independent memory is solved by equipping the cell membrane with a so-called leak. This mechanism causes ions to diffuse in the direction that lowers the potential to the resting level or to another level U_0 → U_leak < U_thr. The third generation of neural networks, i.e. the Spiking Neural Networks (SNNs) [32], is mostly based on the LIF model, where the membrane potential U(t) is determined by the equation

τ_{m} \frac{d U}{d t} = - [U (t) - U_{rest}] + R_{m} I (t),

(3)

where τ_m is the membrane time constant of the neuron, R_m is the total membrane resistance, and I(t) is the electric current passing through the electrode. The spiking events are not explicitly modeled in the LIF model. Instead, when the membrane potential U(t) reaches a certain threshold U_thr (spiking threshold), it is instantaneously reset to a lower value U_r (reset potential), and the leaky integration process starts anew with the initial value U_r. To add a little realism to the dynamics of the LIF model, it is possible to add an absolute refractory period Δ_abs immediately after U(t) hits U_thr. During the absolute refractory period, U(t) may be clamped to U_r, and the leaky integration process is re-initiated after a delay of Δ_abs following the spike. More generally, the membrane potential (3) can be presented as

U (t) = \sum_{i = 1}^{N} ω_{i} \sum_{t_{i} < t} u (t - t_{i})

(4)

where u(t) is a fixed causal temporal kernel, i.e. an operation that allows scale covariance and scale invariance in a causal-temporal and recursive system over time [33], and ω_i, i = 1, …, N, denotes the strength of the neuron's synapses. Following Equation (2), the membrane potential m^l(t) before the neuron fires can be described as follows [29]

m^{l} (t) = v^{l} (t - 1) + W^{l} x^{l - 1} (t), \quad l = 1, \dots, N

(5)

where v^l(t - 1) denotes the membrane potential after firing at the previous time step, W^l is the weight matrix of the l-th layer (l denotes the layer index), and x^{l-1}(t) is the input from the previous layer. To avoid the loss of information, the "reset-by-subtraction" mechanism was introduced [34]

v^{l} (t) - v^{l} (t - 1) = W^{l} x^{l - 1} (t) - H (m^{l} (t) - θ^{l}) θ^{l}

(6)

where v^l(t) is the membrane potential after firing, m^l(t) is the membrane potential before firing, H(m^l(t) - θ^l) refers to the output spikes of all neurons (H being the Heaviside step function), and θ^l is the vector of firing thresholds. There are also some applications of the meta-neuron model in SNNs [35]. The main difference between the LIF neuron and meta neurons lies in the integration process, where meta neurons use a second-order ordinary differential equation and an additional hidden variable. The basic differences between ANN and SNN (taking into account the type of neuron models) are presented in Figure 1.
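The leaky Integrate-and-Fire dynamics of Equation (3), together with the threshold, reset, and absolute refractory period described above, can be sketched with a simple forward-Euler integration. This is only an illustrative sketch: the function name `simulate_lif`, the time step, and all parameter values (in arbitrary units) are assumptions chosen for readability, not values from the cited works.

```python
def simulate_lif(current, dt=0.1, tau_m=10.0, r_m=1.0,
                 u_rest=0.0, u_reset=0.0, u_thr=1.0, t_ref=2.0):
    """Forward-Euler integration of tau_m * dU/dt = -(U - u_rest) + r_m * I(t).

    current: list of input current samples, one per time step of length dt (ms).
    Returns the membrane-potential trace and the list of spike times (ms).
    """
    u = u_rest
    refractory_left = 0.0
    trace, spike_times = [], []
    for step, i_t in enumerate(current):
        if refractory_left > 0.0:
            # Absolute refractory period: clamp the potential to the reset value.
            u = u_reset
            refractory_left -= dt
        else:
            # Leaky integration of the input current.
            u += dt / tau_m * (-(u - u_rest) + r_m * i_t)
            if u >= u_thr:
                # Threshold crossing: register a spike and reset the potential.
                spike_times.append(step * dt)
                u = u_reset
                refractory_left = t_ref
        trace.append(u)
    return trace, spike_times


# Constant supra-threshold drive (r_m * I > u_thr) versus sub-threshold drive.
_, spikes_strong = simulate_lif([1.5] * 1000)
_, spikes_weak = simulate_lif([0.8] * 1000)
```

Under these assumptions the supra-threshold input produces a regular spike train, whereas the sub-threshold input only charges the membrane towards r_m * I without ever firing, because the leak balances the input below the threshold.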

Figure 1. Scheme of the basic differences between ANN and SNN, taking into account the type of neuron models.

4.1. Convolutional Neural Network

The most commonly used deep neural network (DNN) in medical image classification is the two-dimensional (2D) Convolutional Neural Network (CNN) [36,37]. The basic scheme of the CNN is presented in Figure 2. Its principle of operation is based on linear algebra, in particular matrix multiplication. CNNs consist of three types of layers: convolutional layers, pooling layers, and fully connected layers. Most computations are performed in the convolutional layer or layers. The image (pixels) is converted into binary values and searched for patterns. Every convolutional layer computes a dot product between two matrices: one matrix is a set of learnable parameters (the kernel), and the second is a limited part of the receptive field. Each subsequent layer contains a filter/kernel that allows features to be classified with greater efficiency. A pooling layer reduces the number of parameters in the input, which causes the loss of part of the information calculated in the convolutional layer or layers; however, it improves the efficiency of the CNN. This operation is performed by sliding windows [38]. Next, the output of these two layers is transformed into a one-dimensional vector, i.e. the input to the fully connected layer. In this last type of layer, image classification based on the features extracted in the previous layers is performed, i.e. the object in the image is recognized. The output y_{i,j}^{(k)} from the CNN can be described as follows

y_{i, j}^{(k)} = σ \left( \sum_{l = 1}^{L} \sum_{m = 1}^{M} x_{i + l - 1, j + m - 1}^{(l)} w_{l, m}^{(k)} + b^{(k)} \right)

(7)

where x_{i,j}^{(l)} denotes the input to the network at the spatial location (i, j), σ is the activation function, w_{l,m}^{(k)} is the weight of the m-th kernel at the l-th channel producing the k-th feature map, and b^{(k)} is the bias for the k-th feature map.
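The sliding dot product behind Equation (7) and the subsequent pooling step can be sketched in plain Python. The sketch makes simplifying assumptions that are not part of the source: a single input channel, the indices l and m treated as row and column offsets inside the kernel, no padding, stride 1, and ReLU as the activation σ. The function names, the toy image, and the kernel values are hypothetical.

```python
def relu(value):
    return max(0.0, value)


def conv2d_single_channel(image, kernel, bias=0.0, activation=relu):
    """Valid (no padding, stride 1) 2D convolution for one input channel and
    one feature map, written in the cross-correlation form of Equation (7)."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    feature_map = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            # Dot product between the kernel and the image patch anchored at (i, j).
            total = sum(image[i + l][j + m] * kernel[l][m]
                        for l in range(k_rows) for m in range(k_cols))
            row.append(activation(total + bias))
        feature_map.append(row)
    return feature_map


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling with a size x size sliding window."""
    return [[max(feature_map[i + di][j + dj]
                 for di in range(size) for dj in range(size))
             for j in range(0, len(feature_map[0]) - size + 1, size)]
            for i in range(0, len(feature_map) - size + 1, size)]


# Toy 5x5 image containing a vertical edge and a hypothetical 2x2 edge-detecting kernel.
image = [[0, 0, 1, 1, 1]] * 5
kernel = [[-1.0, 1.0], [-1.0, 1.0]]
features = conv2d_single_channel(image, kernel)
pooled = max_pool2d(features)
```

In this toy case the feature map responds only where the intensity changes from 0 to 1, and pooling halves each spatial dimension while keeping the strongest response in every window, which illustrates the trade-off between information loss and efficiency mentioned above.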

In the case of large datasets, CNN achieves high efficiency and is resistant to noise [39]. The crucial disadvantages of CNNs in image processing are high computational requirements and difficulties in achieving high efficiency in the case of small datasets (i.e. if the dataset is too small the network may overfit to training data, and poorly recognize new data).

Figure 2. The basic scheme of the simple Convolutional Neural Network.

4.2. Recurrent Neural Network

Another neural network commonly applied in medical data analysis is the Recurrent Neural Network (RNN) [40]. The basic scheme of the RNN is presented in Figure 3. This type of network contains at least one feedback connection. The output of the RNN can be expressed as [41]

h_{i} = H (W_{h h} h_{i - 1} + W_{x h} x_{i} + b_{h}), \quad y_{i} = W_{h y} h_{i} + b_{y}

(8)

where x_i, i = 1, …, N, is the input sequence, h_i is the hidden state, W_{xh}, W_{hy}, and W_{hh} denote weight matrices, b_h and b_y are bias vectors, and H is the non-linear activation function, for example ReLU, the sigmoid f (x) = \frac{1}{1 + e^{- x}}, or the hyperbolic tangent (Tanh) f (x) = \frac{e^{x} - e^{- x}}{e^{x} + e^{- x}}. The network operation is recursive, since the hidden layer state depends on the current input and the previous state of the network. Thus, the hidden state h_{i-1} is the memory of past inputs.

Thus, the RNN can operate on the sequential dataset and has an internal memory. It may have many inputs. However, RNNs exhibit learning-related problems, namely vanishing gradients (i.e. in the case of small gradients the updates of parameters are irrelevant) or exploding gradients (i.e. superposition of large error gradients leading to large parameter updates). These contribute to the long training process, low level of accuracy, and low network performance.
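The recurrence of Equation (8) can be sketched as follows, assuming the common reading in which the hidden state is h_i = H(W_hh h_{i-1} + W_xh x_i + b_h) and the output is y_i = W_hy h_i + b_y, with Tanh as the activation. The helper names and all weight values are hypothetical and serve only to show how the hidden state carries information about earlier inputs.

```python
import math


def matvec(matrix, vector):
    """Matrix-vector product for plain nested lists."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rnn_forward(inputs, w_xh, w_hh, w_hy, b_h, b_y):
    """Simple recurrent pass: h_i = tanh(W_hh h_{i-1} + W_xh x_i + b_h),
    y_i = W_hy h_i + b_y. Returns all outputs and the final hidden state."""
    hidden = [0.0] * len(b_h)  # h_0: no memory before the first input
    outputs = []
    for x in inputs:
        pre_activation = [a + b + c for a, b, c in
                          zip(matvec(w_hh, hidden), matvec(w_xh, x), b_h)]
        hidden = [math.tanh(p) for p in pre_activation]
        outputs.append([o + b for o, b in zip(matvec(w_hy, hidden), b_y)])
    return outputs, hidden


# Hypothetical parameters: 1 input feature, 2 hidden units, 1 output.
w_xh = [[0.5], [-0.3]]
w_hh = [[0.1, 0.2], [0.0, 0.4]]
w_hy = [[1.0, -1.0]]
b_h, b_y = [0.0, 0.0], [0.0]

# Two sequences that differ only in their first element.
outputs_a, _ = rnn_forward([[1.0], [0.5]], w_xh, w_hh, w_hy, b_h, b_y)
outputs_b, _ = rnn_forward([[-1.0], [0.5]], w_xh, w_hh, w_hy, b_h, b_y)
```

Although both sequences end with the same input, their final outputs differ, because the feedback weights W_hh pass the earlier input forward through the hidden state; setting W_hh to zero would remove this memory.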

Figure 3. The basic scheme of the simple Recurrent Neural Network.

4.3. Spiking Neural Networks

Besides Artificial Neural Networks such as CNNs and RNNs, bio-inspired neural networks such as Spiking Neural Networks can also be applied to medical signals [41,42]. The basic scheme of the SNN is presented in Figure 4. SNNs encode information in spike signals and are promising for carrying out more complicated tasks, since more spatiotemporal information is encoded in spike patterns [43]. They are mostly based on the LIF neuron model. SNNs were formulated to mimic biological neurons, i.e. the appearance of a presynaptic spike at a synapse triggers the input signal i(t) (the value of the current), which in simplified cases can be written as follows

i (t) = \int_{0}^{\infty} S_{j} (s - t) \exp (\frac{- s}{τ_{s}}) d s

(9)

where τ_s denotes the synaptic time constant, S_j is the presynaptic spike train, and t is time [44]. In contrast, the majority of DNNs do not take temporal dynamics into account [45]. SNNs show a promising capability of achieving performance similar to that of living brains. Moreover, the binary activation in SNNs enables the development of dedicated hardware for neuromorphic computing [46]. The potential benefits are low energy usage and greater parallelizability due to the local interactions.

Figure 4. The basic scheme of the simple Spiking Neural Network.

5. Learning algorithms

The heart of Artificial Intelligence is its learning algorithms. At their core, they strive to automate the learning process, enabling machines to recognize patterns, make decisions, and predict outcomes based on data. Their design is often a balance between theoretical rigor and practical applicability. While mathematics and statistics provide the foundation, translating these into algorithms that can operate on vast and diverse datasets requires creative programming skills [22]. One can distinguish many types of network training algorithms [47]. Below we briefly discuss the most important of them.

5.1. Back Propagation Algorithm

The most commonly used learning algorithm is the back propagation (BP) algorithm. It iteratively optimizes the weights via error propagation in the neural network. BP plays a pivotal role in enabling neural networks to recognize complex and non-linear patterns from large datasets [23,48,49]. From the mathematical point of view, it minimizes a cost function that quantifies the output error, using gradient descent or the delta rule [50]. It can be split into three stages: forward calculation, backward calculation, and computing the updated biases and weights. The input to the hidden layer H_j is the weighted sum of the outputs of the input neurons and can be described as [51]

H_{j} = b_{in} + \sum_{i = 1}^{n} x_{i} w_{i j}

(10)

where x_i is the input to the network (input layer), n is the number of neurons in the input layer, b_in is the bias of the input layer, and w_ij denotes the weight associated with the i-th input neuron and the j-th hidden neuron. The output y_k is as follows

y_{k} = b_{h} + \sum_{j = 1}^{m} w_{j k} F (H_{j})

(11)

where F(H_j) is a transfer function, m is the number of neurons in the hidden layer, and b_h is the bias of the hidden layer. The most commonly used transfer function is the sigmoid transfer function F (H_{j}) = \frac{1}{1 + e^{- H_{j}}}. The back propagation algorithm is especially effective when used in multi-layered neural architectures such as feed-forward neural networks, convolutional neural networks, and recurrent neural networks [26]. In image recognition, CNNs trained with BP can independently identify hierarchical features, from basic edges to detailed structures. Similarly, RNNs trained with BP are adept at sequence-driven tasks like machine translation or speech recognition, as they incorporate previous data to influence present outputs. It is one of the most effective deep learning methods. However, BP requires large amounts of data and enormous computational effort.
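The three stages of back propagation (forward calculation, backward calculation, and parameter update) can be sketched for a network with one sigmoid hidden layer and a single linear output, following Equations (10) and (11). The sketch assumes a squared-error cost, full-batch gradient descent, one bias per hidden neuron, and the XOR problem as toy data; the function names, learning rate, and number of epochs are illustrative choices rather than recommendations from the reviewed literature.

```python
import math
import random


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def forward(x, p):
    # Equation (10): weighted input of each hidden neuron (per-neuron bias assumed).
    hidden_in = [p["b_in"][j] + sum(x[i] * p["w_in"][i][j] for i in range(len(x)))
                 for j in range(len(p["b_in"]))]
    hidden_out = [sigmoid(h) for h in hidden_in]
    # Equation (11) for a single output neuron.
    y = p["b_h"] + sum(w * h for w, h in zip(p["w_out"], hidden_out))
    return hidden_out, y


def train(samples, n_hidden=3, lr=0.1, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    p = {"w_in": [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_in)],
         "b_in": [0.0] * n_hidden,
         "w_out": [rng.uniform(-1, 1) for _ in range(n_hidden)],
         "b_h": 0.0}
    losses = []
    for _ in range(epochs):
        g_w_in = [[0.0] * n_hidden for _ in range(n_in)]
        g_b_in = [0.0] * n_hidden
        g_w_out = [0.0] * n_hidden
        g_b_h = 0.0
        loss = 0.0
        for x, target in samples:
            hidden_out, y = forward(x, p)          # 1) forward calculation
            error = y - target
            loss += 0.5 * error ** 2
            g_b_h += error                          # 2) backward calculation
            for j in range(n_hidden):
                g_w_out[j] += error * hidden_out[j]
                # Error propagated back through the sigmoid derivative h * (1 - h).
                delta = error * p["w_out"][j] * hidden_out[j] * (1.0 - hidden_out[j])
                g_b_in[j] += delta
                for i in range(n_in):
                    g_w_in[i][j] += delta * x[i]
        losses.append(loss / len(samples))
        step = lr / len(samples)                    # 3) update of biases and weights
        p["b_h"] -= step * g_b_h
        for j in range(n_hidden):
            p["w_out"][j] -= step * g_w_out[j]
            p["b_in"][j] -= step * g_b_in[j]
            for i in range(n_in):
                p["w_in"][i][j] -= step * g_w_in[i][j]
    return p, losses


xor_samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
               ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
params, losses = train(xor_samples)
```

The list `losses` records the mean cost before each update, so comparing its first and last entries shows the effect of gradient descent; how closely the network fits XOR depends on the assumed hyperparameters.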

5.2. ANN-SNN Conversion

Artificial Neural Networks and Spiking Neural Networks are both computational models inspired by biological neural networks. While ANNs have been the mainstream for most deep learning applications due to their simplicity and effectiveness, SNNs are gaining traction because they mimic the behavior of real neurons more closely by using spikes, or binary events, for communication. Training an SNN-based algorithm directly, for example with a BP-type rule, to reach an accuracy similar to that of an ANN-based algorithm consumes a lot of hardware resources, and the existing platforms offer limited optimization possibilities. Thus, the conversion of ANNs to SNNs seeks to harness the energy efficiency and bio-realism of SNNs without reinventing the training methodologies [28]; it is based on the ReLU activation function and the LIF neuron model [52]. The basic principle of the conversion is to map the activation value of an ANN neuron to the average postsynaptic potential (in fact, the firing rate) of the SNN neurons, and the change of the membrane potential (i.e. the basic function of spiking neurons) can be expressed by combining Equation (2) and Equation (6) [29]

v^{l} (t) - v^{l} (t - 1) = W^{l} x^{l - 1} (t) - s^{l} (t) θ^{l}

(12)

Here s^l(t) refers to the output spikes of all neurons in layer l at time t. Tuning the thresholds correctly is paramount for the SNN to represent information effectively and accurately. Incorrectly set thresholds could lead to either too frequent or too rare spiking, potentially affecting the accuracy of the SNN post-conversion [35]. On the other hand, neuromorphic hardware platforms that natively support SNNs can offer energy efficiency benefits when ANNs are converted to SNNs. Due to their event-driven nature, SNNs can be more computationally efficient [36]. However, the challenge lies in maintaining accuracy post-conversion. Some information might be lost during the transition, and not all ANN architectures and layers convert neatly to their SNN equivalents. The conversion from ANNs to SNNs is a promising direction, merging the advanced training methodologies of ANNs with the energy efficiency of SNNs. As we delve deeper into the realm of neuromorphic computing, this conversion process will play a pivotal role in bridging traditional deep learning with biologically-inspired neural models [37,38].
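The mapping between ANN activations and SNN firing rates can be illustrated with a single integrate-and-fire neuron that uses the reset-by-subtraction update of Equation (12). The sketch assumes a constant weighted input z at every time step, no leak, and a scalar threshold; the function name, threshold, and number of steps are hypothetical choices.

```python
def if_neuron_rate(weighted_input, threshold=1.0, steps=200):
    """Integrate-and-fire neuron with reset by subtraction under constant input.

    Implements v(t) = v(t-1) + z - s(t) * threshold, where s(t) = 1 when the
    pre-reset potential reaches the threshold. Returns the time-averaged
    output, i.e. spike count * threshold / steps.
    """
    v = 0.0
    spikes = 0
    for _ in range(steps):
        m = v + weighted_input           # membrane potential before firing
        s = 1 if m >= threshold else 0   # Heaviside step: emit a spike or not
        v = m - s * threshold            # subtract the threshold, keep the residual
        spikes += s
    return spikes * threshold / steps


def relu(z):
    return max(0.0, z)


inputs = [-0.5, 0.0, 0.25, 0.5, 0.9]
rates = [if_neuron_rate(z) for z in inputs]
```

For inputs between zero and the threshold, the averaged output approaches ReLU(z) with an error bounded by threshold / steps, which is why the conversion trades simulation length against accuracy; inputs above the threshold saturate at one spike per step, which illustrates why the threshold has to be tuned to the range of activations.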

5.3. Supervised Hebbian Learning (SHL)

In the context of Artificial Intelligence, Supervised Hebbian Learning (SHL) can be described as a general methodology for weight changes [53]. The weight increases when two neurons fire at the same time, while it decreases when the two neurons fire independently. According to this rule, the change in weight can be written as

∆w = η (t^{out} - t^{d})

(13)

where η is the learning rate (in fact, a small scalar that may vary with time), η > 0, t^{out} is the actual time of the postsynaptic spike, while t^{d} is the time of firing of the second presynaptic spike [54,55]. The crucial disadvantage of Hebbian learning is that its efficiency decreases as the number of hidden layers increases, although with 4 layers it is still competitive [56].

5.4. Reinforcement Learning with Supervised Models

Given the additional constraints in the SHL rule, Reinforcement Learning with Supervised Models (ReSuMe) was proposed [54]. ReSuMe is a dynamic hybrid learning paradigm. It effectively combines the resilience of Reinforcement Learning (RL) with the precision of Supervised Learning (SL). This fusion empowers ReSuMe to leverage the feedback-driven mechanisms inherent in RL and to take advantage of the labeled guidance typical of SL [37,38,39]. The difference from SHL is that the learning signal is expected to have no, or only a marginal, direct effect on the value of the postsynaptic somatic membrane potential [57]; thus, the synaptic weights are modified as follows

\frac{d}{d t} w_{j i} (t) = a [S_{d} (t) - S_{j} (t)] {\bar{S}}_{i} (t)

(14)

where a denotes the learning rate, S_d(t) is the desired/target spike train, S_j(t) is the output of the network (spike train), and \bar{S}_i(t) is the low-pass filtered input spike train. One of ReSuMe's most salient features is guided exploration. By leveraging labeled data via SL, ReSuMe can effectively steer RL exploration, ensuring that agents avoid falling into the trap of suboptimal policies. The hybrid nature of ReSuMe also grants it a unique resilience, especially in the face of noisy data or in reward-scarce environments. Moreover, its adaptability is noteworthy, making it an ideal choice for tasks that combine immediate feedback (through SL) with long-term strategic maneuvers (through RL). However, ReSuMe is not without challenges. A potential bottleneck is computational complexity, as managing both RL and SL can strain computational resources. Another challenge is the precise tuning of the λ coefficient. The key is to find a balance where neither RL nor SL overly dominates the learning process. By melding immediate feedback from supervised learning with a deep reinforcement learning strategy, ReSuMe establishes itself as a formidable tool in Machine Learning [49,50,52].
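The weight update of Equation (14) can be sketched in discrete time. The sketch assumes binary spike trains sampled in unit time bins (each spike treated as a unit-area impulse) and an exponential low-pass filter for the input trace; the function names, the filter time constant, and the spike times are hypothetical.

```python
import math


def low_pass_filter(spike_train, dt=1.0, tau=5.0):
    """Exponentially filtered copy of a binary spike train, i.e. a decaying
    trace of recent input activity (the filtered input of Equation (14))."""
    decay = math.exp(-dt / tau)
    trace, value = [], 0.0
    for spike in spike_train:
        value = value * decay + spike
        trace.append(value)
    return trace


def resume_weight_change(desired, output, input_spikes, learning_rate=0.1,
                         dt=1.0, tau=5.0):
    """Accumulate a * [S_d(t) - S_j(t)] * filtered input over one trial."""
    filtered_input = low_pass_filter(input_spikes, dt, tau)
    return sum(learning_rate * (d - o) * s_bar
               for d, o, s_bar in zip(desired, output, filtered_input))


def spike_train(length, spike_bins):
    """Binary spike train with spikes in the given bins."""
    return [1 if t in spike_bins else 0 for t in range(length)]


n_bins = 10
input_active = spike_train(n_bins, {2})   # presynaptic spike in bin 2
input_silent = spike_train(n_bins, set())
desired = spike_train(n_bins, {4})        # target output spike in bin 4
actual = spike_train(n_bins, {8})         # the neuron currently fires too late

dw_active = resume_weight_change(desired, actual, input_active)
dw_silent = resume_weight_change(desired, actual, input_silent)
dw_matched = resume_weight_change(desired, desired, input_active)
```

In this toy setting the synapse whose input preceded the desired spike is strengthened, a silent synapse is left unchanged, and no change occurs once the output spike train matches the desired one, which is the fixed point of the update.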

5.5. Chronotron

The Chronotron, by its essence, challenges and reshapes our understanding of how information can be encoded and processed in neural structures [50,55]. Traditional neural models have predominantly focused on the spatial domain, emphasizing the architecture and interconnections between neurons. While this spatial component is undeniably critical, it offers only a part of the full informational symphony that the brain plays. Just as the rhythm and cadence of a song contribute as much to its essence as its melody, in the brain timing is not just a factor; it is a storyteller in its own right. The strength of the Chronotron lies in its ability to discern and respond to this temporal narrative. Unlike its counterparts, which often treat time as a secondary parameter, the Chronotron places it center stage. As a consequence, it acknowledges and leverages the interplay of spatial and temporal dynamics in neural computation. This means that it considers not only which neurons are firing, but also when they fire relative to one another. Thus, the membrane potential is

u (t) = η (t) + \sum_{j} w_{j} \sum_{t_{j}^{f} \leq t} ε_{j} (t, t_{j}^{f})

(15)

where η(t) models the refractoriness caused by the past presynaptic spikes, w_j is the synaptic efficacy, t_j^f is the time of appearance of the f-th presynaptic spike on the j-th synapse, and ε_j(t, t_j^f) denotes the normalized kernel [58]. When u(t) reaches the threshold level, a spike is fired, and u(t) is reset to the value of the reset potential. In this approach, it is crucial to find an appropriate error function, i.e. one that can be minimized with a gradient descent method [59]. The advantage of this learning rule is that it uses the same coding for inputs and outputs. However, the Chronotron's hallmark, its granularity, can sometimes increase computational demands, especially during intense training. And like many cutting-edge neural frameworks, harnessing the Chronotron's full potential can be intricate, necessitating fine-tuned parameters and rich, well-timed data.

5.6. Bio-inspired Learning Algorithms

Brain-inspired Artificial Intelligence approaches, in particular spiking neural networks, are becoming a promising energy-efficient alternative to traditional artificial neural networks [60]. However, the performance gap between SNNs and ANNs has been a significant obstacle to the wide application of SNNs. To fully use the potential of SNNs, including the detection of irregularities in biomedical signals and the design of more specific networks, the mechanisms of their training should be improved. One of the possible directions of development is bio-inspired learning algorithms. Below we briefly discuss the most important of them.

5.6.1. Spike Timing Dependent Plasticity

Spike Timing Dependent Plasticity (STDP) is rooted in the idea that the precise timing of neural spikes critically affects changes in synaptic strength [61]. This principle highlights the intricate dance between time and neural activity, showcasing the dynamics of our neural circuits. This biologically plausible learning rule is a timing-dependent specialization of Hebbian learning (13) [62]. STDP shed light on the intricate interplay between timing and synaptic modification. It is based on the change in synaptic weight function

∆W = η (1 + ζ) H(W; t_{pre} - t_{post})

(16)

where η denotes the learning rate, ζ is Gaussian white noise with zero mean, and H(W; t_{pre} - t_{post}) is the function that determines long-term potentiation (LTP, i.e. presynaptic and postsynaptic neurons emit a high rate) and long-term depression (LTD, i.e. presynaptic neurons emit a high rate) in the time window t_{pre} - t_{post} [63]

H(W; t_{pre} - t_{post}) = \begin{cases} a_{+}(W) \exp\left(-\frac{|t_{pre} - t_{post}|}{τ_{+}}\right) & \text{for } t_{pre} - t_{post} < 0 \\ -a_{-}(W) \exp\left(-\frac{|t_{pre} - t_{post}|}{τ_{-}}\right) & \text{for } t_{pre} - t_{post} > 0 \end{cases}

(17)

where a_{±}(W) are scaling functions that determine the weight dependence, while τ_{±} denote the time constants for potentiation and depression [61,62,63]. STDP's significance is underpinned by several advantages. Chiefly, it offers a biologically plausible model by mimicking the temporal dynamics observed in real neural systems. Furthermore, its event-driven nature supports unsupervised learning, enabling networks to adjust autonomously to the temporal patterns present in the input data. This sensitivity to timing equips STDP to process data with spatiotemporal attributes and to detect intricate temporal relationships within neuronal signals [64,65]. However, STDP is not without its complexities. A prominent challenge is the fine-tuning of parameters: the exact values assigned to a_{±}(W) and τ_{±} can substantially affect the behavior and efficacy of STDP-trained networks, so balancing them requires care. Moreover, the temporal precision that STDP demands often raises the computational cost, especially in simulations. By emphasizing the role of spike timing, STDP offers a vivid depiction of how synaptic interactions evolve [66,67].
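The update defined by Eqs. (16) and (17) can be illustrated with a minimal Python sketch. The constant amplitudes used for a_{+}(W) and a_{-}(W), the time constants of 20 ms, and the function names are illustrative assumptions rather than values taken from the cited works. Eq. (17) does not define the case t_{pre} = t_{post}, so the sketch returns zero there, which is also an assumption.

```python
import math
import random


def stdp_window(w, dt, a_plus=lambda w: 0.01, a_minus=lambda w: 0.012,
                tau_plus=20.0, tau_minus=20.0):
    """H(W; t_pre - t_post) as in Eq. (17), with dt = t_pre - t_post (ms).

    a_plus, a_minus, tau_plus and tau_minus are illustrative assumptions.
    """
    if dt < 0:
        # Presynaptic spike precedes the postsynaptic one: potentiation (LTP).
        return a_plus(w) * math.exp(-abs(dt) / tau_plus)
    if dt > 0:
        # Postsynaptic spike precedes the presynaptic one: depression (LTD).
        return -a_minus(w) * math.exp(-abs(dt) / tau_minus)
    # Eq. (17) leaves dt == 0 undefined; returning zero is an assumption.
    return 0.0


def stdp_update(w, t_pre, t_post, eta=1.0, noise_std=0.0, rng=random):
    """Weight change as in Eq. (16): eta * (1 + zeta) * H(W; t_pre - t_post)."""
    # zeta is zero-mean Gaussian noise; it is switched off when noise_std == 0.
    zeta = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
    return eta * (1.0 + zeta) * stdp_window(w, t_pre - t_post)


# Usage: a pre-before-post pair strengthens the synapse, the reverse weakens it.
delta_ltp = stdp_update(0.5, t_pre=10.0, t_post=15.0)
delta_ltd = stdp_update(0.5, t_pre=15.0, t_post=10.0)
```

Passing a_{+} and a_{-} as functions of the weight keeps the weight dependence of Eq. (17) explicit, so a soft-bound variant could be tried without changing the rest of the sketch.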

5.6.2. Spike-Driven Synaptic Plasticity

Spike-Driven Synaptic Plasticity (SDSP) offers the ability to elucidate the causality in neural communication. It operates on a fundamental principle: the sequence and timing of spikes determine whether a synapse strengthens or weakens. If a neuron consistently fires just before its downstream counterpart, it's a strong indication of its influential role in the latter's activity. This "pre-before-post" firing often leads to synaptic strengthening, cementing the relationship between the two neurons. Conversely, if the sequence is reversed, with the downstream neuron firing before its predecessor, the connection may weaken, reflecting a lack of causal influence [68,69]. This causative aspect of SDSP provides valuable insights into the learning mechanisms of the brain. It suggests that our neural circuits are continually evolving, adjusting their connections based on the flow of spike-based information. Such adaptability ensures that our brains remain receptive to new information, enabling us to learn and adjust to ever-changing environments. Moreover, SDSP emphasizes the significance of precise spike timing. In the realm of neural computation, milliseconds matter. Small shifts in spike timing can change a synapse's fate, showcasing the brain's precision and sensitivity. This meticulousness in spike-driven modifications underscores the importance of timing in neural computations, hinting at the brain's capacity to encode and process temporal patterns with remarkable accuracy [70]. In this learning rule the changes in synaptic weights can be expressed as [64]

∆w = \begin{cases} η^{+} + e^{-\frac{|∆t|}{τ^{+}}} & \text{if } ∆t > 0 \\ η^{-} + e^{-\frac{|∆t|}{τ^{-}}} & \text{otherwise} \end{cases}

(18)

where η^{+} > 0 and η^{-} < 0 denote the learning parameters, τ^{+} and τ^{-} are time constants, and ∆t is the difference between the post- and pre-synaptic spike times. This representation, while streamlined, encapsulates the principle that the mere presence of a spike can induce modifications in the synaptic weight, either strengthening or weakening the connection based on the specific neural context and the directionality of the spike's influence [71,72,73].

The appeal of Spike-Driven Synaptic Plasticity is manifold. Its primary virtue is its biological relevance: focusing on individual spike occurrences mirrors the granular events that take place in real neural systems. Such an approach facilitates the modeling of neural networks in scenarios where individual spike occurrences are of paramount importance. Furthermore, by anchoring plasticity on singular events, this model is inherently suitable for real-time learning and rapid adaptability in dynamic environments [74].

A crucial challenge lies in the accurate capture and interpretation of individual spikes, especially in densely firing neural environments. Moreover, the plasticity model's sensitivity to single events means that it can be susceptible to noise, requiring sophisticated filtering mechanisms to discern genuine learning events from spurious spikes. Overall, SDSP elucidates the influence of individual neuronal events on neural learning and adaptation [73].

5.6.3. Tempotron Learning Rule

One of the most interesting biologically inspired learning algorithms is the tempotron principle [65,76,77]. It is designed to adapt synaptic weights based on the precise temporal patterns of incoming spikes, rather than only on their frequency. While traditional neural models might emphasize synaptic weights or connection topologies, the tempotron underscores that the 'when' of a neural event can be as informative as, if not more so than, the 'where' or 'how often' [78,79,80]. The tempotron learning rule is based on the LIF neuron model, which fires when (4) exceeds the threshold (binary decision). Thus, one can define the neuron’s membrane potential as a weighted sum of the postsynaptic potentials (PSPs) from all incoming spikes [77]

v(t) = \sum_{i} ω_{i} \sum_{t_{i}} K(t - t_{i}) + V_{rest}

(19)

where ω_{i} denotes the synaptic efficacy, t_{i} is the firing time of the i-th afferent, V_{rest} is the resting potential, and K is the normalized PSP kernel

K(t - t_{i}) = V_{0} \left( \exp\left(-\frac{t - t_{i}}{τ_{m}}\right) - \exp\left(-\frac{t - t_{i}}{τ_{s}}\right) \right)

(20)

where τ_{m} is the decay time constant of membrane integration, τ_{s} denotes the decay time constant of synaptic currents, and V_{0} normalizes the PSP kernel so that its maximum value equals 1. The neuron fires when the membrane potential (19) exceeds the firing threshold. Next, the membrane potential (19) smoothly decreases to V_{rest}. In a segmentation/classification task, the input to the neuron may belong to one of two classes: P^{+}, for which the neuron should fire when the pattern is presented, and P^{-}, for which the neuron should not fire when the pattern is presented. Each input consists of N spike trains. The tempotron learning rule is as follows

∆ω_{i} = λ \sum_{t_{i} < t_{max}} K(t_{max} - t_{i})

(21)

where t_{max} is the time at which the membrane potential (19) reaches its maximum value, and λ is a constant that is greater than zero for P^{+} patterns and smaller than zero for P^{-} patterns. In this way, the tempotron implements gradient-descent dynamics, i.e. it minimizes a cost function that measures the amount by which the maximum voltage generated by erroneous patterns deviates from the firing threshold. In comparison to the STDP learning rule, the tempotron can make the appropriate decision under a supervisory signal while tuning fewer parameters than STDP. Like STDP, the tempotron uses LTP and LTD mechanisms. The advantage of the tempotron learning rule is the speed of learning.
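Eqs. (19)-(21) can be put together in a short Python sketch of a single tempotron update. Several choices here are assumptions made only for illustration: the time constants (15 ms and 3.75 ms), the threshold of 1, the learning constant, the search for t_{max} on a fixed time grid, and the decision to change weights only for misclassified patterns. The post-spike decay towards V_{rest} is not modeled.

```python
import math


def psp_kernel(s, tau_m=15.0, tau_s=3.75):
    """Normalized PSP kernel K(s) of Eq. (20) for s = t - t_i (ms)."""
    if s <= 0:
        return 0.0  # a spike cannot influence the potential before it occurs
    # V_0 is chosen so that the kernel peaks at 1 (peak time found analytically).
    t_peak = (tau_m * tau_s / (tau_m - tau_s)) * math.log(tau_m / tau_s)
    v0 = 1.0 / (math.exp(-t_peak / tau_m) - math.exp(-t_peak / tau_s))
    return v0 * (math.exp(-s / tau_m) - math.exp(-s / tau_s))


def membrane_potential(t, weights, spike_trains, v_rest=0.0):
    """v(t) of Eq. (19): weighted sum of PSPs plus the resting potential."""
    return v_rest + sum(
        w * sum(psp_kernel(t - t_i) for t_i in train)
        for w, train in zip(weights, spike_trains)
    )


def tempotron_step(weights, spike_trains, is_positive, threshold=1.0,
                   lr=0.1, t_end=100.0, dt=0.1):
    """One update following Eq. (21); returns a new list of weights."""
    # Locate t_max numerically on a grid (an assumption of this sketch).
    times = [k * dt for k in range(int(t_end / dt) + 1)]
    t_max = max(times, key=lambda t: membrane_potential(t, weights, spike_trains))
    fired = membrane_potential(t_max, weights, spike_trains) > threshold
    if fired == is_positive:
        return list(weights)  # correct decision, weights stay unchanged
    lam = lr if is_positive else -lr  # lambda > 0 for P+, lambda < 0 for P-
    return [
        w + lam * sum(psp_kernel(t_max - t_i) for t_i in train if t_i < t_max)
        for w, train in zip(weights, spike_trains)
    ]


# Usage: a P+ pattern that fails to elicit a spike pushes both weights upwards.
new_weights = tempotron_step([0.1, 0.1], [[10.0], [12.0]], is_positive=True)
```

Because only afferents that spiked before t_{max} contribute to the sum, the update credits exactly those inputs that shaped the peak of the membrane potential.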

6. Neural networks and learning algorithms in the medical image segmentation process

Image segmentation plays a crucial role both in medical diagnosis supported by image analysis and in the creation of virtual objects such as medical digital twins (DTs) of organs [66,67], holograms of human organs [81,82], and virtual medical simulators [68,83]. The image segmentation process can be split into semantic segmentation (i.e. assigning a label or category to each pixel), instance segmentation (i.e. identifying and separating individual objects in an image and assigning a label to each of them), and panoptic segmentation (i.e. a more complex task that combines the two above) [77,78]. The application of AI increases the efficiency and speed of these processes [84]. Table 1 compares the AI-based algorithms applied in medical image scan segmentation, taking into account the neuron model, the type of neural network, the learning rule, and biological plausibility. The networks most commonly used in image segmentation are CNNs, in particular the U-Net architecture and its variations [71,72,74,75,85]. In [73], the authors modified this network structure by adding dense and nested skip connections (UNet++), while [Yao et al., 2020] added residual blocks and attention modules to enable the network to learn deeper features and increase the effectiveness of segmentation. To combine efficient segmentation with access to global semantic information, CNNs are often combined with transformer blocks [85,86,87]. Another CNN-based algorithm commonly used in medical image segmentation is You Only Look Once (YOLO), which is open-source software released under the GNU General Public License v3.0 [88]. It uses one fully connected layer, a version-dependent number of convolution layers that are pre-trained with a CNN (YOLO v1 ImageNet, YOLO v2 Darknet-19, YOLO v3 Darknet-53, YOLO v4 CSPNet, YOLO v5 EfficientNet, YOLO v6 EfficientNet-L2, YOLO v7 ResNet, YOLO v8 ResNet), and a pooling layer.
The algorithm divides the input image into segments and then uses a CNN to generate bounding boxes and class predictions. Recently, SNNs have become more popular in image classification [78,79] due to their low power consumption. However, SNN training rules require refinement to achieve the accuracy of ANNs. Another interesting algorithm for natural image segmentation, released by Meta in April 2023, is the Segment Anything Model (SAM) [89,90]. This AI-based algorithm makes it possible to cut out any object from an image with a single click. It uses CNN and transformer-based architectures for image processing; in particular, transformer-based architectures are applied to extract the features, compute the embedding, and prompt the encoder. First attempts have been made to apply it in medical imaging; however, in medical segmentation it is still less accurate than in other application fields [91,92]. The shortcomings of SAM in medical image segmentation are mainly connected to an insufficient amount of training data. In [93], the authors proposed the Med SAM Adapter to overcome these limitations, applying pre-training methods such as Masked Autoencoder (MAE), Contrastive Embedding-Mixup (e-Mix), and Shuffled Embedding Prediction (ShED). There is a large body of work on medical image segmentation using machine learning, but relatively little of it addresses the network learning process itself, which, along with the data, is a key element in achieving high accuracy [94]; see Table 1. Thus, the learning algorithms most commonly used in medical image segmentation still have a low level of biological plausibility. On the other hand, biologically plausible learning algorithms are applied in other image segmentation tasks, for example on images of handwritten digits [77].
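The difference between semantic and instance segmentation described above can be made concrete with a toy example in Python. The sketch below is not one of the reviewed methods: it takes a small, hand-written semantic map (0 for background, 1 for a hypothetical "lesion" class) and derives instance labels by grouping 4-connected pixels, which stands in for the learned instance prediction of a real network.

```python
from collections import deque


def instance_labels(semantic):
    """Give each 4-connected foreground region of a semantic map its own id."""
    rows, cols = len(semantic), len(semantic[0])
    labels = [[0] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if semantic[r][c] == 0 or labels[r][c] != 0:
                continue  # background pixel or pixel already assigned
            next_id += 1
            labels[r][c] = next_id
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill over same-class neighbours
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and semantic[ny][nx] == semantic[y][x]
                            and labels[ny][nx] == 0):
                        labels[ny][nx] = next_id
                        queue.append((ny, nx))
    return labels, next_id


# Semantic map: every foreground pixel carries the same class label (1).
semantic_map = [
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
# Instance map: the two separate regions receive different object ids.
instance_map, n_objects = instance_labels(semantic_map)
```

A panoptic result would pair both pieces of information, i.e. each pixel would carry its class from semantic_map together with its object id from instance_map.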

The segmented structures (in this case organs and their disorders) may next be used to develop a 3D virtual environment [105]. These 3D objects may be presented, for example, as holograms displayed in head-mounted displays (HMDs) such as Mixed Reality glasses, with applications in medical diagnostics [113], pre-operative imaging [114], surgical assistance [115,116], robotic surgery [117], and medical education [81,82]. However, the crucial issue is the quality of the obtained segmented structures, and this process can be significantly accelerated and improved by the use of Artificial Intelligence.

7. Data availability

One of the key issues in the development of AI algorithms in medicine is the availability and quality of data, i.e. access to electronic health records (EHRs) [118,119]. Such medical data should be anonymized. Table 2 presents a summary of publicly available retrospective medical image scan databases. Some authors also provide anonymized data upon request. It is worth stressing that data, including medical image scans, are subject to various types of bias [120].

8. Discussion and conclusions

The effectiveness of learning algorithms is compared, among others, in terms of the number of learning cycles, the number of objective function calculations, the number of floating-point multiplications, computation time, and sensitivity to local minima. In addition to the selection of appropriate parameters and network structure, the selection of an appropriate (effective) network learning algorithm is of key importance. The most commonly applied learning algorithm in ANNs is backpropagation; however, it has a rather slow convergence rate and, as a consequence, the resulting ANN has more redundancy [146]. On the other hand, training SNNs remains a challenge due to their quite complicated dynamics and the non-differentiable nature of spike activity [147]. Three types of ANN and SNN learning rules can be distinguished: unsupervised learning, indirect supervised learning, and direct supervised learning. A commonly used learning algorithm in SNNs is the arithmetic rule SpikeProp, which is similar in concept to the backpropagation (BP) algorithm: network parameters are iteratively updated in a direction that minimizes the difference between the final outputs of the network and the target labels [148,149]. The main difference between SNNs and ANNs is the output dynamics. However, arithmetic-based learning rules are not a good choice for building biologically efficient networks. Other learning methods have been proposed for this purpose, including bio-inspired algorithms such as spike-timing-dependent plasticity [150], spike-driven synaptic plasticity [151], and the tempotron learning rule [65,76,77]. STDP is an unsupervised learning rule that characterizes synaptic changes solely in terms of the temporal contiguity of presynaptic spikes and postsynaptic potentials or spikes [152], while spike-driven synaptic plasticity is a supervised learning rule and uses rate coding. Nevertheless, ANNs with BP learning still achieve better classification performance than SNNs trained with STDP. 
To obtain better performance, a combination of layer-wise STDP-based unsupervised learning and supervised spike-based BP was proposed [153,154]. Other commonly used learning algorithms are ReSuMe [57] and Chronotron [58]. The tempotron learning rule implements gradient-descent dynamics, which minimize a cost function that measures the amount by which the maximum voltage generated by erroneous patterns deviates from the firing threshold. Tempotron learning is efficient for spiking patterns in which information is embedded in the precise timing of spikes (temporal coding). In turn, [155] proposed a neuron normalization technique and an explicitly iterative neuron model, which resulted in a significant increase in the SNNs' learning rate. However, training the network still requires many labeled samples (input data). Another approach is indirect learning. It first trains an ANN (built from perceptrons) and then transforms it into an SNN version with the same network structure (i.e. ANN-SNN conversion) [156]. The disadvantage of this approach is that reliably estimating firing frequencies requires a nontrivial amount of time, and the rule fails to capture the temporal dynamics of a spiking system. The most popular direct supervised learning method is gradient descent, which uses the first-spike time to encode input signals and minimizes the difference between the network output and the desired signals; the whole process is similar to traditional BP [157]. Thus, applying a learning rule based on temporal coding, which could potentially carry the same information using fewer spikes than rate coding, can help to increase the speed of calculations. 
On the other hand, active learning methods, including bio-inspired active learning (BAL), bio-inspired active learning on Firing Rate (BAL-FR), and bio-inspired active learning on membrane potential (BAL-M) have been proposed to reduce the size of the input data [158]. During the learning procedure, the labeled data sets are used to train the empirical behaviors of patterns, while the generalization behavior of patterns is extracted from unlabeled data sets. It leverages the difference between empirical and generalization behavior patterns to select the samples unmatched by the known patterns. This approach is based on the behavioral pattern differences of neurons in SNNs for active sample selection, and can effectively reduce the sample size required for SNNs training.
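The remark made earlier in this section, that temporal coding can carry the same information with fewer spikes than rate coding, can be illustrated with a minimal Python sketch. Both encoders are simplified, deterministic assumptions introduced here for illustration: an input intensity x in [0, 1] is encoded over a window of n_steps time steps either as a spike count proportional to x (rate coding) or as a single spike whose latency decreases with x (first-spike coding).

```python
def rate_code(x, n_steps=100):
    """Rate coding: the number of spikes in the window is proportional to x."""
    n_spikes = round(x * n_steps)
    if n_spikes == 0:
        return []
    # Spread the spikes evenly over the window (deterministic for clarity).
    return [int(k * n_steps / n_spikes) for k in range(n_spikes)]


def first_spike_code(x, n_steps=100):
    """First-spike coding: a stronger input fires earlier, and only once."""
    if x <= 0:
        return []  # a zero input produces no spike at all
    return [round((1.0 - x) * (n_steps - 1))]


# Usage: the same intensity needs many spikes under rate coding, one otherwise.
intensity = 0.8
n_rate_spikes = len(rate_code(intensity))
n_latency_spikes = len(first_spike_code(intensity))
```

In this toy setting the rate code needs the whole window to be read out reliably, whereas the latency code is available as soon as the single spike arrives, which mirrors the trade-off discussed for ANN-SNN conversion above.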

The integration of AI and the Metaverse is already a fact and suggests that AI may become the dominant approach for image scan segmentation and intelligent visual-content generation in the whole virtual world, not just in medical applications [6,159]. Recently, the AI-based Segment Anything Model (SAM) was introduced for natural images [89], and in [160] SAM was proposed for medical images with a high level of accuracy. Better image segmentation contributes to higher-quality virtual objects. AI applications in the context of the Metaverse are also connected with the identification and categorization of Metaverse virtual items [161]. Moreover, AI may lead to more efficient cybersecurity solutions in the virtual world [162]. However, this is closely related to the accuracy of AI-based algorithms and, consequently, the accuracy of their training.

Author Contributions

Conceptualization, A.P. and J.S.; methodology, A.P., J.S., and Z.R.; software, A.P. and Z.R.; formal analysis, A.P., J.S., and Z.R.; investigation, A.P., J.S., and Z.R.; resources, A.P. and Z.R.; data curation, A.P. and Z.R.; writing – original draft preparation, A.P. and Z.R.; writing – review and editing, A.P., J.S., and Z.R.; visualization, A.P. and Z.R.; supervision, A.P.; project administration, A.P.; funding acquisition, J.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Acknowledgments

This study was partially supported by the National Centre for Research and Development (research grant Infostrateg I/0042/2021-00).

Conflicts of Interest

The authors declare no conflict of interest.

References

Herculano-Houzel, S. The remarkable, yet not extraordinary, human brain as a scaled-up primate brain and its associated cost. Proc. Natl. Acad. Sci. U S A 2012, 109, pp. 10661-10668. [CrossRef]
Shao, F.; Shen, Z. How can artificial neural networks approximate the brain? Front. Psychol. 2023, 13, 970214. PMID: 36698593; PMCID: PMC9868316. [CrossRef]
Moscato, V.; Napolano, G.; Postiglione, M. et al. Multi-task learning for few-shot biomedical relation extraction. Artif. Intell. Rev. 2023. [Online ahead of print]. Available online. (accessed 21 October 2023). [CrossRef]
Wang, J.; Chen, S.; Liu, Y.; Lau, R. Intelligent Metaverse Scene Content Construction. IEEE Access 2023, 11, pp. 76222-76241. [CrossRef]
López-Ojeda, W.; Hurley, R.A. Digital Innovation in Neuroanatomy: Three-Dimensional (3D) Image Processing and Printing for Medical Curricula and Health Care. J. Neuropsychiatry Clin. Neurosci. 2023, 35, pp. 206-209. [CrossRef]
Kim, E.J.; Kim, J.Y. The Metaverse for Healthcare: Trends, Applications, and Future Directions of Digital Therapeutics for Urology. Int. Neurourol. J. 2023, 27, pp. S3-S12. [CrossRef]
Pregowska, A.; Osial, M.; Dolega-Dolegowski, D.; Kolecki, R.; Proniewska, K. Information and Communication Technologies Combined with Mixed Reality as Supporting Tools in Medical Education. Electronics 2023, 11, 3778. [CrossRef]
Fang, X.; Yan, P. Multi-organ segmentation over partially labeled datasets with multi-scale feature abstraction. IEEE Trans. Med. Imaging 2020, 39, pp. 3619-3629. [CrossRef]
Yuan, F.; Zhang, Z.; Fang, Z. An Effective CNN and Transformer Complementary Network for Medical Image Segmentation. Pattern Recognit. 2023, 136, 109228. [CrossRef]
Mazurowski, M.A.; Dong, H.; Gu, H.; Yang, J.; Konz, N.; Zhang, Y. Segment anything model for medical image analysis: An experimental study. Med. Image Anal. 2023, 89, 102918. [CrossRef]
Sakshi, S.; Kukreja, V. Image Segmentation Techniques: Statistical, Comprehensive, Semi-Automated Analysis and an Application Perspective Analysis of Mathematical Expressions. Arch. Computat. Methods Eng. 2023, 30, pp. 457–495. [CrossRef]
Moztarzadeh, O.; Jamshidi, M.; Sargolzaei, S.; Keikhaee, F.; Jamshidi, A.; Shadroo, S.; Hauer, L. Metaverse and Medical Diagnosis: A Blockchain-Based Digital Twinning Approach Based on MobileNetV2 Algorithm for Cervical Vertebral Maturation. Diagnostics. 2023, 13, 1485. [CrossRef]
Huynh-The, T.; Pham, Q.-V.; Pham, M.-T.; Banh, T.-N.; Nguyen, G.-P.; Kim, D.-S. Efficient Real-Time Object Tracking in the Metaverse Using Edge Computing with Temporal and Spatial Consistency. Comput. Mater. Contin. 2023, 71, pp. 341-356. [CrossRef]
Huang, H.; Zhang, C.; Zhao, L.; Ding, S.; Wang, H.; Wu, H. Self-Supervised Medical Image Denoising Based on WISTA-Net for Human Healthcare in Metaverse. IEEE J. Biomed. Health Inform. 2023, pp. 1-9. [CrossRef]
Liberati, A.; Altman, D.G.; Tetzlaff, J.; Mulrow, C.; Gøtzsche, P.C.; Ioannidis, J.P.A.; Clarke, M.; Devereaux, P.J.; Kleijnen, J.; Moher, D. The PRISMA Statement for Reporting Systematic Reviews and Meta-Analyses of Studies That Evaluate Health Care Interventions: Explanation and Elaboration. PLoS Med. 2009, 6, e1000100. [CrossRef]
Rethlefsen, M.L.; Kirtley, S.; Waffenschmidt, S.; Ayala, A.P.; Moher, D.; Page, M.J.; Koffel, J.B. PRISMA-S: An Extension to the PRISMA Statement for Reporting Literature Searches in Systematic Reviews. Syst. Rev. 2021, 10, 39. [CrossRef]
Adrian, E.D.; Zotterman, Y. The Impulses Produced by Sensory Nerve Endings. J. Physiol. 1926, 61, pp. 465–483. [CrossRef]
Gerstner, W.; Kistler, W.M.; Naud, R.; Paninski, L. Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition. Cambridge University Press, Cambridge, UK, 2014.
Rieke, F.; Warland, D.; de Ruyter van Steveninck, R.; Bialek, W. Spikes: Exploring the Neural Code. The MIT Press: Cambridge, MA, USA, 1997.
van Hemmen, J.L.; Sejnowski, T.J. 23 Problems in Systems Neuroscience. Oxford University Press: Oxford, UK, 2006.
Teich, M.C.; Khanna, S.M. Pulse-Number distribution for the neural spike train in the cat's auditory nerve. J. Acoust. Soc. Am. 1985, 77, pp. 1110–1128. [CrossRef]
Werner, G.; Mountcastle, V.B. Neural activity in mechanoreceptive cutaneous afferents: stimulus-response relations, Weber Functions, and Information Transmission. J. Neurophysiol. 1965, 28, pp. 359–397. [CrossRef]
Tolhurst, D.J.; Movshon, J.A.; Thompson, I.D. The dependence of Response amplitude and variance of cat visual cortical neurons on stimulus contrast. Exp. Brain Res. 1981, 41, pp. 414–419. [CrossRef]
Radons, G.; Becker, J.D.; Dülfer, B.; Krüger, J. Analysis, classification, and coding of multielectrode spike trains with hidden Markov models. Biol. Cybern. 1994, 71, pp. 359–373. [CrossRef]
de Ruyter van Steveninck, R.R.; Lewen, G.D.; Strong, S.P.; Koberle, R.; Bialek, W. Reproducibility and variability in neural spike trains. Science 1997, 275, pp. 1805–1808. [CrossRef]
Kass, R.E.; Ventura, V. A spike-train probability model. Neural Comput. 2001, 13, pp. 1713–1720. [CrossRef]
Wójcik, D. The kinematics of the spike trains. Acta Phys. Pol. B. 2018, 49, pp. 2127–2138. [CrossRef]
Rosenblatt, F. Principles of neurodynamics. Perceptrons and the theory of brain mechanisms. Tech. Rep., Cornell Aeronautical Lab Inc, Buffalo, NY, USA, 1961.
Bu, T.; Fang, W.; Ding, J.; Dai, P.L.; Yu, Z.; Huang, T. Optimal ANN-SNN Conversion for High-Accuracy and Ultra-Low-Latency Spiking Neural Networks. arXiv preprint arXiv:2303.04347, 2023. Available online. (accessed on 13 November 2023). [CrossRef]
Abbott, L.F.; Dayan, P. Theoretical Neuroscience Computational and Mathematical Modeling of Neural Systems, The MIT Press, 2000.
Yuan, Y.; Gao, R.; Wu, Q.; Fang, S.; Bu, X.; Cui, Y.; Han, C.; Hu, L.; Li, X.; Wang, X.; Geng, L.; Liu, W. ACS Sensors 2023, 8 (7), pp. 2646-2655. [CrossRef]
Ghosh-Dastidar, S.; Adeli, H. Third Generation Neural Networks: Spiking Neural Networks. In Advances in Computational Intelligence; Yu, W.; Sanchez, E.N., Eds.; Springer: Berlin, Heidelberg, 2009; Volume 116. [CrossRef]
Lindeberg, T. A time-causal and time-recursive scale-covariant scale-space representation of temporal signals and past time. Biol. Cybern. 2023, 117, pp. 21–59. PMID: 36689001 PMCID: PMC10160219. [CrossRef]
Rueckauer, B.; Lungu, I.A.; Hu, Y.; Pfeiffer, M.; Liu, S.C. Conversion of Continuous-Valued Deep Networks To Efficient Event-Driven Neuromorphic Hardware. Front. Neurosci. 2017, 11. [CrossRef]
Cheng, X.; Zhang, T.; Jia, S.; Xu, B. Meta neurons improve spiking neural networks for efficient spatio-temporal learning. Neurocomputing 2023, 531, pp. 217-225. [CrossRef]
LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE, 1998, 86, pp. 2278-2324.
Mehrish, A.; Majumder, N.; Bharadwaj, R.; Mihalcea, R.; Poria, S. A review of deep learning techniques for speech processing. Inf. Fusion 2023, 99, 101869. [CrossRef]
Nielsen, M.A. Neural Networks and Deep Learning. Determination Press, 2015; eBook. Available online: NeuralNetworksAndDeepLearning.com (accessed on 13 November 2023).
Yamashita, R.; Nishio, M.; Do, R.K.G.; et al. Convolutional neural networks: an overview and application in radiology. Insights Imaging 2018, 9, pp. 611–629. [CrossRef]
Sherstinsky, A. Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network. Physica D: Nonlinear Phenomena, 2020, 404, 132306. [CrossRef]
Ghosh-Dastidar, S.; Adeli, H. Spiking neural networks. Int. J. Neural Syst. 2009, 19, pp. 295-308.
Yamazaki, K.; Vo-Ho, V.K.; Bulsara, D.; Le, N. Spiking Neural Networks and Their Applications: A Review. Brain Sci. 2022, 12, 863. PMID: 35884670. [CrossRef]
Dampfhoffer, A.; et al. Backpropagation-Based Learning Techniques for Deep Spiking Neural Networks: A Survey. IEEE Trans. Neural Netw. Learn. Syst. 2023, pp. 1-16. [CrossRef]
Ponulak, F.; Kasinski, A. Introduction to spiking neural networks: Information processing, learning and applications. Acta Neurobiol Exp (Wars). 2011, 71, pp. 409-33. PMID: 22237491.
Wu, Y.; et al. Spatio-Temporal Backpropagation for Training High-Performance Spiking Neural Networks. Front Neurosci. 2018, 12, 331. [CrossRef]
Pei, J.; Deng, L.; Song, S.; Zhao, M.; Zhang, Y.; Wu, S.; Wang, G.; Zou, Z.; Wu, Z.; He, W.; et al. Towards artificial general intelligence with hybrid Tianjic chip architecture. Nature 2019, 572, pp. 106–111. [CrossRef]
Rathi, N.; Chakraborty, I.; Kosta, A.; Sengupta, A.; Ankit, A.; Panda, P.; Roy, K. Exploring Neuromorphic Computing Based on Spiking Neural Networks: Algorithms to Hardware. ACM Comput. Surv. 2023, 55, 243. [CrossRef]
Rojas, R. The Backpropagation Algorithm. In Neural Networks; Springer: Berlin, Heidelberg, 1996; pp. 1-50. [CrossRef]
Singh, A.; Kushwaha, S.; Alarfaj, M.; Singh, M. Comprehensive Overview of Backpropagation Algorithm for Digital Image Denoising. Electronics 2022, 11, 1590. [CrossRef]
Kaur, J.; Khehra, B.S.; Singh, A. Back propagation artificial neural network for diagnosis of heart disease. J. Reliable Intell. Environ. 2023, 9, pp. 57–85. [CrossRef]
Hameed, A.A.; Karlik, B.; Salman, M. S. Back-propagation algorithm with variable adaptive momentum. Knowledge-Based Systems 2016, 114, 79-87. [CrossRef]
Cao, Y.; Chen, Y.; Khosla, D. Spiking Deep Convolutional Neural Networks for Energy-Efficient Object Recognition. Int. J. Comput. Vis. 2015, 113, pp. 54–66. [CrossRef]
Alemanno, F.; Aquaro, M.; Kanter, I.; Barra, A.; Agliari, E. Supervised Hebbian Learning. Europhys. Lett. 2023, 141, 11001. [CrossRef]
Ponulak, F. ReSuMe – New Supervised Learning Method for Spiking Neural Networks. Technical Report, Poznań University of Technology, Poznań, Poland, 2005. https://www.semanticscholar.org/paper/ReSuMe-New-Supervised-Learning-Method-for-Spiking-Ponulak/b04f2391b8c9539edff41065c39fc2d27cc3d95a.
Shrestha, A.; Ahmed, K.; Wang, Y.; Qiu, Q. Stable Spike-Timing Dependent Plasticity Rule for Multilayer Unsupervised and Supervised Learning. In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, USA, 14-19 May 2017, pp. 1999–2006. [CrossRef]
Amato, G.; Carrara, F.; Falchi, F.; Gennaro, C.; Lagani, G. Hebbian Learning Meets Deep Convolutional Neural Networks. In Image Analysis and Processing – ICIAP 2019; Ricci, E., Rota Bulò, S., Snoek, C., Lanz, O., Messelodi, S., Sebe, N., Eds.; Lecture Notes in Computer Science, vol. 11751; Springer: Cham, 2019; pp. 1-14. [CrossRef]
Ponulak, F.; Kasinski, A. Supervised learning in spiking neural networks with ReSuMe: sequence learning, classification, and spike shifting. Neural Comput. 2010, 22, pp. 467–510. [CrossRef]
Florian, R.V. The Chronotron: A Neuron That Learns to Fire Temporally Precise Spike Patterns. PLoS One 2012, 7, e40233. [CrossRef]
Victor, J.D.; Purpura, K.P. Metric-space analysis of spike trains: theory, algorithms, and applications. Network 1997, 8, pp. 127–164. [CrossRef]
Huang, C.; Wang, J.; Wang, S.-H.; Zhang, Y.-D. Applicable artificial intelligence for brain disease: A survey. Neurocomputing 2022, 504, pp. 223–239. [CrossRef]
Markram, H.; Gerstner, W.; Sjöström, P.J. A history of spike-timing-dependent plasticity. Front. Synaptic Neurosci. 2011, 3, 4. [CrossRef]
Merolla, P.A.; et al. A million spiking-neuron integrated circuit with a scalable communication network and interface. Science 2014, 345, pp. 668–673. [CrossRef]
Chakraborty, B.; Mukhopadhyay, S. Characterization of Generalizability of Spike Timing Dependent Plasticity Trained Spiking Neural Networks. Front. Neurosci. 2021, 15, 695357. [CrossRef]
Lagani, G.; Falchi, F.; Gennaro, C.; Amato, G. Spiking Neural Networks and Bio-Inspired Supervised Deep Learning: A Survey. arXiv preprint arXiv:2307.16235, 2023. Available online. (accessed on 22 October 2023). [CrossRef]
Gütig, R.; Sompolinsky, H. The tempotron: a neuron that learns spike timing-based decisions. Nat. Neurosci. 2006, 9, pp. 420–428. [CrossRef]
Cellina, M.; Cè, M.; Alì, M.; Irmici, G.; Ibba, S.; Caloro, E.; Fazzini, D.; Oliva, G.; Papa, S. Digital Twins: The New Frontier for Personalized Medicine? Appl. Sci. 2023, 13, 7940. [CrossRef]
Sun, T.; He, X.; Li, Z. Digital twin in healthcare: Recent updates and challenges. Digital Health 2023, 9. [CrossRef]
Uhl, J.C.; Schrom-Feiertag, H.; Regal, G.; Gallhuber, K.; Tscheligi, M. Tangible Immersive Trauma Simulation: Is Mixed Reality the Next Level of Medical Skills Training? In: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI '23), New York, NY, USA, 2023; Association for Computing Machinery: New York, NY, USA, 2023; Article 513. [CrossRef]
Kshatri, S.S.; Singh, D. Convolutional Neural Network in Medical Image Analysis: A Review. Arch. Computat. Methods Eng. 2023, 30, pp. 2793–2810. [CrossRef]
Li, X.; Guo, Y.; Jiang, F.; Xu, L.; Shen, F.; Jin, Z.; Wang, Y. Multi-Task Refined Boundary-Supervision U-Net (MRBSU-Net) for Gastrointestinal Stromal Tumor Segmentation in Endoscopic Ultrasound (EUS) Images. IEEE Access 2020, 8, pp. 5805-5816. [CrossRef]
Oktay, O.; Schlemper, J.; Le Folgoc, L.; Lee, M.; Heinrich, M.; Misawa, K.; Mori, K.; McDonagh, S.; Hammerla, N.Y.; Kainz, B.; et al. Attention U-Net: Learning Where to Look for the Pancreas. arXiv preprint arXiv:1804.03999, 2018. Available online (accessed on 13 November 2023). [CrossRef]
Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015; Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F., eds.; Springer International Publishing: Cham, Switzerland, 2015; pp. 234-241.
Zhou, Z.; Rahman Siddiquee, M.M.; Tajbakhsh, N.; Liang, J. UNet++: A Nested U-Net Architecture for Medical Image Segmentation. In Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support; Crimi, A., Bakas, S., Kuijf, H., Menze, B., Reyes, M., eds.; Springer International Publishing: Cham, Switzerland, 2018; pp. 3-11.
Alom, M.Z.; Yakopcic, C.; Hasan, M.; Taha, T.M.; Asari, V.K. Recurrent residual U-Net for medical image segmentation. J Med Imaging (Bellingham) 2019, 6(1), 014006. [CrossRef]
Ren, Y.; Zou, D.; Xu, W.; Zhao, X.; Lu, W.; He, X. Bimodal segmentation and classification of endoscopic ultrasonography images for solid pancreatic tumor. Biomed. Signal Process. Control. 2023, 83, 104591. [CrossRef]
Urbanczik, R.; Senn, W. Reinforcement learning in populations of spiking neurons. Nat. Neurosci. 2009, 12, 250–252. [CrossRef]
Yu, Q.; Tang, H.; Tan, K. C.; Yu, H. A brain-inspired spiking neural network model with temporal encoding and learning. Neurocomputing 2014, 138, 3-13. [CrossRef]
Kumarasinghe, K.; Kasabov, N.; Taylor, D. Brain-inspired spiking neural networks for decoding and understanding muscle activity and kinematics from electroencephalography signals during hand movements. Sci. Rep. 2021, 11, pp. 1–15. [CrossRef]
Niu, L.Y.; Wei, Y.; Liu, W.B.; et al. Research Progress of spiking neural network in image classification: A Review. Appl. Intell. 2023, 53, pp. 19466–19490. [CrossRef]
Yuan, F.; Zhang, Z.; Fang, Z. An Effective CNN and Transformer Complementary Network for Medical Image Segmentation. Pattern Recognition 2023, 136, p. 109228; ISSN 0031-3203. [CrossRef]
Pregowska, A.; Osial, M.; Dolega-Dolegowski, D.; Kolecki, R.; Proniewska, K. Information and Communication Technologies Combined with Mixed Reality as Supporting Tools in Medical Education. Electronics 2022, 11(22), 3778. [CrossRef]
Proniewska, K.; Dolega-Dolegowski, D.; Kolecki, R.; Osial, M.; Pregowska, A. The 3D Operating Room with Unlimited Perspective Change and Remote Support. In Applications of Augmented Reality - Current State of the Art; InTech: Rijeka, Croatia, 2023; pp. 1-23.
Suh, I.; McKinney, T.; Siu, K.-C. Current Perspective of Metaverse Application in Medical Education, Research and Patient Care. Virtual Worlds 2023, 2, pp. 115-128. [CrossRef]
Liu, X.; Song, L.; Liu, S.; Zhang, Y. A Review of Deep-Learning-Based Medical Image Segmentation Methods. Sustainability 2021, 13, p. 1224. [CrossRef]
Li, Y.; Zhang, Y.; Liu, J.-Y.; Wang, K.; Zhang, K.; Zhang, G.-S.; Liao, X.-F.; Yang, G. Global Transformer and Dual Local Attention Network via Deep-Shallow Hierarchical Feature Fusion for Retinal Vessel Segmentation. IEEE Trans. Cybern. 2022. PMID: 35984806. [CrossRef]
Chen, J.; Lu, Y.; Yu, Q.; Luo, X.; Adeli, E.; Wang, Y.; Lu, L.; Yuille, A.L.; Zhou, Y. TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation. arXiv preprint arXiv:2102.04306, 2021. Available online (accessed on 13 November 2023). [CrossRef]
Xiao, H.; Li, L.; Liu, Q.; Zhu, X.; Zhang, Q. Transformers in Medical Image Segmentation: A Review. Biomed. Signal Process. Control. 2023, 84(12), 104791. [CrossRef]
Yu, H.; Yang, L.T.; Zhang, Q.; Armstrong, D.; Deen, M.J. Convolutional Neural Networks for Medical Image Analysis: State-of-the-Art, Comparisons, Improvement, and Perspectives. Neurocomputing 2021, 444, pp. 92-110. [CrossRef]
Meta. https://segment-anything.com/ (accessed on 13 November 2023).
Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.Y.; et al. Segment Anything. arXiv preprint arXiv:2304.02643, 2023. Available online: https://segment-anything (accessed on 08 November 2023).
He, S.; Bao, R.; Li, J.; Stout, J.; Bjornerud, A.; Grant, P.E.; Ou, Y. Computer-Vision Benchmark Segment-Anything Model (SAM) in Medical Images: Accuracy in 12 Datasets. arXiv preprint arXiv:2304.09324, 2023; Available online: https://arxiv.org/abs/2304.09324 (accessed on 13 November 2023).
Zhang, Y.; Jiao, R. Towards Segment Anything Model (SAM) for Medical Image Segmentation: A Survey. arXiv preprint arXiv:2305.03678, 2023; Available online: https://arxiv.org/abs/2305.03678 (accessed on 13 November 2023).
Wu, J.; Zhang, Y.; Fu, R.; Fang, H.; Liu, Y.; Wang, Z.; Xu, Y.; Jin, Y. Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation. arXiv preprint arXiv:2304.12620, 2023; Available online: https://arxiv.org/abs/2304.12620 (accessed on 13 November 2023).
Yi, Z.; Lian, J.; Liu, Q.; Zhu, H.; Liang, D.; Liu, J. Learning Rules in Spiking Neural Networks: A Survey. Neurocomputing 2023, 531, pp. 163-179; ISSN 0925-2312. [CrossRef]
Avcı, H.; Karakaya, J. A Novel Medical Image Enhancement Algorithm for Breast Cancer Detection on Mammography Images Using Machine Learning. Diagnostics 2023, 13, p. 348. [CrossRef]
Ghahramani, M.; Shiri, N. Brain tumour detection in magnetic resonance Imaging using Levenberg–Marquardt backpropagation neural network. IET Image Process. 2023, 17, pp. 88–103. [CrossRef]
Zhang, J.; Gajjala, S.; Agrawal, P.; Tison, G.H.; Hallock, L.H.; Beussink-Nelson, L.; Lassen, M.H.; Fan, E.; Aras, M.A.; Jordan, C.; Fleischmann, K.E.; Melisko, M.; Qasim, A.; Shah, S.J.; Bajcsy, R.; Deo, R.C. Fully automated echocardiogram interpretation in clinical practice. Circulation 2018, 138, 1623–1635.
Sajjad, M.; Khan, S.; Khan, M.; Wu, W.; Ullah, A.; Baik, S.W. Multi-grade brain tumor classification using deep CNN with extensive data augmentation. Journal of Computational Science 2021, 30, 174-182. [CrossRef]
Jun, T.J.; Kang, S.J.; Lee, J.G.; Kweon, J.; Na, W.; Kang, D.; Kim, D.; Kim, D.; Kim, Y.H. Automated detection of vulnerable plaque in intravascular ultrasound images. Medical & biological engineering & computing 2019, 57(4), 863-876. [CrossRef]
Ostvik, A.; Smistad, E.; Aase, S.A.; Haugen, B.O.; Lovstakken, L. Real-time standard view classification in transthoracic echocardiography using convolutional neural networks. Ultrasound in medicine & biology 2019, 45, 374–384. [CrossRef]
Lossau, T.; Nickisch, H.; Wisse, T.; Bippus, R.; Schmitt, H.; Morlock, M.; Grass, M. Motion artifact recognition and quantification in coronary CT angiography using convolutional neural networks. Medical image analysis 2019, 52, 68–79. [CrossRef]
Emad, O.; Yassine, I.A.; Fahmy, A.S. Automatic localization of the left ventricle in cardiac MRI images using deep learning. In Conf Proc IEEE Eng Med Biol Soc, 2015, 683–686. [CrossRef]
Moradi, M.; Guo, Y.; Gur, Y.; Negahdar, M.; Syeda-Mahmood, T. A Cross-Modality Neural Network Transform for Semi-automatic Medical Image Annotation. In: Ourselin, S.; Joskowicz, L.; Sabuncu, M.; Unal, G.; Wells, W. (eds) Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016. MICCAI 2016. Lecture Notes in Computer Science, vol 9901. Springer, Cham. [CrossRef]
Liskowski, P.; Krawiec, K. Segmenting retinal blood vessels with deep neural networks. IEEE Trans Med Imaging 2016, 35, 2369–2380. [CrossRef]
Yuan, J.; Hassan, S.S.; Wu, J.; et al. Extended reality for biomedicine. Nat Rev Methods Primers. 2023, 3, 14. [CrossRef]
Kakhandaki, N.; Kulkarni, S.B. Classification of Brain MR Images Based on Bleed and Calcification Using ROI Cropped U-Net Segmentation and Ensemble RNN Classifier. Int. J. Inf. Technol. 2023, 15, pp. 3405–3420. [CrossRef]
Manimurugan, S. Hybrid High Performance Intelligent Computing Approach of CACNN and RNN for Skin Cancer Image Grading. Soft Comput. 2023, 27, pp. 579–589. [CrossRef]
Yue, Y.; Baltes, M.; Abuhajar, N.; Sun, T.; Karanth, A.; Smith, C.D.; Bihl, T.; Liu, J. Spiking Neural Networks Fine-Tuning for Brain Image Segmentation. Front. Neurosci. 2023, 17, p. 1267639. https://doi.org/10.3389/fnins.2023.1267639.
Liang, J.; Li, R.; Wang, C.; Zhang, R.; Yue, K.; Li, W.; Li, Y. A Spiking Neural Network Based on Retinal Ganglion Cells for Automatic Burn Image Segmentation. Entropy 2022, 24, 1526. [CrossRef]
Gilani, S.Q.; Syed, T.; Umair, M. et al. Skin Cancer Classification Using Deep Spiking Neural Network. J Digit Imaging. 2023, 36, pp. 1137–1147. [CrossRef]
Sahoo, A.K.; Parida, P.; Muralibabu, K.; Dash, S. Efficient Simultaneous Segmentation and Classification of Brain Tumors from MRI Scans Using Deep Learning. Biocybernetics and Biomedical Engineering 2023, 43(3), pp. 616-633; ISSN 0208-5216. [CrossRef]
Fu, Q.; Dong, H. Breast Cancer Recognition Using Saliency-Based Spiking Neural Network. Wireless Communications and Mobile Computing 2022, 2022, 8369368. [CrossRef]
Tan, P.; Chen, X.; Zhang, H.; Wei, Q.; Luo, K. Artificial intelligence aids in development of nanomedicines for cancer management. Semin. Cancer Biol. 2023, 89, pp. 61-75. [CrossRef]
Malhotra, S.; Halabi, O.; Dakua, S.P.; Padhan, J.; Paul, S.; Palliyali, W. Augmented Reality in Surgical Navigation: A Review of Evaluation and Validation Metrics. Appl. Sci. 2023, 13, 1629. [CrossRef]
Wisotzky, E.L.; Rosenthal, J-C.; Meij, S.; et al. Telepresence for surgical assistance and training using eXtended reality during and after pandemic periods. J. Telemed. Telecare. 2023. [CrossRef]
Martin-Gomez, A.; et al. STTAR: Surgical Tool Tracking Using Off-the-Shelf Augmented Reality Head-Mounted Displays. IEEE Trans. Vis. Comput. Graph. 2022. [CrossRef]
Minopoulos, G.M.; Memos, V.A.; Stergiou, K.D.; Stergiou, C.L.; Psannis, K.E. A Medical Image Visualization Technique Assisted with AI-Based Haptic Feedback for Robotic Surgery and Healthcare. Appl. Sci. 2023, 13, 3592. [CrossRef]
Li, J.; Cairns, B.J.; Li, J.; et al. Generating synthetic mixed-type longitudinal electronic health records for artificial intelligent applications. Digit. Med. 2023, 6, 98. [CrossRef]
Pammi, M.; Aghaeepour, N.; Neu, J. Multiomics, artificial intelligence, and precision medicine in perinatology. Pediatr. Res. 2023, 93, pp. 308–315. [CrossRef]
Vardi, G. On the Implicit Bias in Deep-Learning Algorithms. Commun. ACM. 2023, 66(6), pp. 86–93. [CrossRef]
PhysioNet. Available online: https://physionet.org/ (accessed on 13 November 2023).
National Sleep Research Resource. Available online: https://sleepdata.org/ (accessed on 13 November 2023).
Open Access Series of Imaging Studies - OASIS Brain. Available online: https://www.oasis-brains.org/ (accessed on 13 November 2023).
OpenNeuro. Available online: https://openneuro.org/ (accessed on 13 November 2023).
Brain Tumor Dataset. Available online: https://figshare.com/articles/dataset/brain_tumor_dataset/1512427?file=7953679 (accessed on 13 November 2023).
The Cancer Imaging Archive. Available online: https://www.cancerimagingarchive.net/ (accessed on 13 November 2023).
LUNA16. Available online: https://luna16.grand-challenge.org/ (accessed on 13 November 2023).
MICCAI 2012 Prostate Challenge. Available online: https://promise12.grand-challenge.org/ (accessed on 13 November 2023).
IEEE Dataport. Available online: https://ieee-dataport.org/ (accessed on 13 November 2023).
AIMI. Available online: https://aimi.stanford.edu/shared-datasets (accessed on 13 November 2023).
fastMRI. Available online: https://fastmri.med.nyu.edu/ (accessed on 13 November 2023).
Alzheimer's Disease Neuroimaging Initiative. Available online: http://adni.loni.usc.edu/ (accessed on 13 November 2023).
Pediatric Brain Imaging Dataset. Available online: http://fcon_1000.projects.nitrc.org/indi/retro/pediatric.html (accessed on 13 November 2023).
ChestX-ray8. Available online: https://nihcc.app.box.com/v/ChestXray-NIHCC (accessed on 13 November 2023).
Breast Cancer Digital Repository. Available online: https://bcdr.eu/ (accessed on 13 November 2023).
Brain-CODE. Available online: https://www.braincode.ca/ (accessed on 13 November 2023).
RadImageNet. Available online: https://www.radimagenet.com/ (accessed on 13 November 2023).
EyePACS. Available online: https://paperswithcode.com/dataset/kaggle-eyepacs (accessed on 13 November 2023).
Medical Segmentation Decathlon. Available online: http://medicaldecathlon.com/ (accessed on 13 November 2023).
DDSM. Available online: http://www.eng.usf.edu/cvprg/Mammography/Database.html (accessed on 13 November 2023).
LIDC-IDRI. Available online: https://wiki.cancerimagingarchive.net/display/Public/LIDC-IDRI (accessed on 13 November 2023).
synapse. Available online: https://www.synapse.org/#!Synapse:syn3193805/wiki/217789 (accessed on 13 November 2023).
Mini-MIAS. Available online: http://peipa.essex.ac.uk/info/mias.html (accessed on 13 November 2023).
Breast Cancer Histopathological Database (BreakHis). Available online: https://web.inf.ufpr.br/vri/databases/breast-cancer-histopathological-database-breakhis/ (accessed on 13 November 2023).
Messidor. Available online: https://www.adcis.net/en/third-party/messidor/ (accessed on 13 November 2023).
Li, J.; Cheng, Jh.; Shi, Jy.; Huang, F. Brief Introduction of Back Propagation (BP) Neural Network Algorithm and Its Improvement. In Advances in Computer Science and Information Engineering; Jin, D., Lin, S., eds.; Springer: Berlin, Germany, 2012, 169, pp. 1-10. [CrossRef]
Johnson, X.Y.; Venayagamoorthy, G.K. Encoding Real Values into Polychronous Spiking Networks. In Proceedings of the International Joint Conference on Neural Networks (IJCNN), Barcelona, Spain, 18-23 July 2010. pp. 1–7. [CrossRef]
Bohte, S.M.; Kok, J.N.; La Poutre, H. Error-back propagation in temporally encoded networks of spiking neurons. Neurocomputing 2002, 48, pp. 17–37. [CrossRef]
Rajagopal, S.; Chakraborty, S.; Gupta, M.D. Deep Convolutional Spiking Neural Network Optimized with Arithmetic Optimization Algorithm for Lung Disease Detection Using Chest X-Ray Images. Biomed. Signal Process. Control. 2023, 79, 104197. [CrossRef]
Kheradpisheh, S.R.; Ghodrati, M.; Ganjtabesh, M.; Masquelier, T. Bio-Inspired unsupervised learning of visual features leads to robust invariant object recognition. Neurocomputing 2016, 205, pp. 382–392. [CrossRef]
Brader, J.M.; Senn, W.; Fusi, S. Learning real-world stimuli in a neural network with spike-driven synaptic dynamics. Neural Comput. 2007, 19, pp. 2881–2912. [CrossRef]
Masquelier, T.; Guyonneau, R.; Thorpe, S.J. Competitive STDP-based spike pattern learning. Neural Comput. 2009, 21, pp. 1259–1276. [CrossRef]
Lee, J.H.; Delbruck, T.; Pfeiffer, M. Training deep spiking convolutional neural Networks with STDP-based unsupervised pre-training followed by supervised fine-tuning. Front. Neurosci. 2018, 12, p. 435. [CrossRef]
Lee, J.H.; Delbruck, T.; Pfeiffer, M. Enabling Spike-Based Backpropagation for Training Deep Neural Network Architectures. Front Neurosci. 2020, 14, 119. [CrossRef]
Wu, Y.; Deng, L.; Li, G.; Zhu, J.; Shi, L. Direct training for spiking neural networks: Faster, Larger, Better. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019, 33, pp. 1311–1318. [CrossRef]
Neil, D.; Pfeiffer, M.; Liu, S.-C. Learning to be efficient: algorithms for training low-latency, low-compute deep spiking neural networks. In Proceedings of the 31st Annual ACM Symposium on Applied Computing (SAC '16). Association for Computing Machinery, New York, NY, USA, 293–298. [CrossRef]
Lee, J.H.; Delbruck, T.; Pfeiffer, M. Training deep spiking neural networks using backpropagation. Front. Neurosci. 2016, 10, 508. [CrossRef]
Zhan, K.; Li, Y.; Li, Q.; Pan, G. Bio-Inspired Active Learning Method in spiking neural network. Know.-Based Syst. 2023, 261, C. [CrossRef]
Wang, J.; Chen, S.; Liu, Y.; Lau, R. Intelligent Metaverse Scene Content Construction. IEEE Access 2023, 11, pp. 76222-76241. [CrossRef]
He, S.; Bao, R.; Li, J.; Stout, J.; Bjornerud, A.; Grant, P.E.; Ou, Y. Computer-Vision Benchmark Segment-Anything Model (SAM) in Medical Images: Accuracy in 12 Datasets. arXiv preprint arXiv:2304.09324, 2023; Available online (accessed on 13 November 2023). [CrossRef]
Himangi; Singla, M. To Enhance Object Detection Speed in Meta-Verse Using Image Processing and Deep Learning. Int. J. Intell. Syst. Appl. Eng. 2023, 11, pp. 176–184. [CrossRef]
Pooyandeh, M.; Han, K.-J.; Sohn, I. Cybersecurity in the AI-Based Metaverse: A Survey. Appl. Sci. 2022, 12, p. 12993. [CrossRef]

Figure 1. Literature search flowchart.

Table 1. The comparison of the AI-based algorithms applied in medical image scan segmentation.

Network Type	Neuron model	Average Accuracy [%]	Data sets - training/testing/validation sets [%] or training/testing sets [%]	Input parameters	Learning rule	Biological plausibility	Ref.
ANN	Perceptron	99.10	mammography images; lack of information	mammography images – 33 features extracted by Region of Interest (ROI)	BP	low	[95]
CNN	Perceptron	98.70	Brain tumor, MRI color images; 70/15/15	MRI image scan, 12 features (mean, SD, entropy, energy, contrast, homogeneity, correlation, variance, covariance, RMS, skewness, kurtosis)	BP	low	[96]
CNN	Perceptron	93.00	Echocardiograms; 60/40	Disease classification, cardiac chamber segmentation, viewpoints classification in echocardiograms	lack of information	low	[97]
CNN	Perceptron	94.58	brain tumor images; 50/25/25	brain tumor images	lack of information	low	[98]
CNN	Perceptron	91.10	IVUS frames, EA after OCT/IVUS registration	IVUS frames, EA after OCT/IVUS registration	lack of information	low	[99]
CNN	Perceptron	98.00	2-D ultrasound; 49/49/2	Classification of the cardiac view into 7 classes	lack of information	low	[100]
CNN	Perceptron	99.30	coronary cross-sectional images; 80/20	Detection of motion artifacts in coronary CCTA, classification of coronary cross-sectional images	lack of information	low	[101]
CNN	Perceptron	99.00	MRI image scan; 60/40	Bounding box localization of LV in short-axis MRI slices	lack of information	low	[102]
CNN and doc2vec	Perceptron	96.00	Doppler US cardiac valve images; 94/4/2	Automatic generation of text for Doppler US cardiac valve images	lack of information	low	[103]
Deep CNN + complex data preparation	Perceptron	97.00	Vessel segmentation; lack of information	Supervised segmentation technique that uses a deep neural network with structured prediction	lack of information	low	[104]
CNN and Transformer encoders	Perceptron	90.70	Automated Cardiac Diagnosis Challenge (ACDC), CT image scans from Synapse; 60/40	CT image scans	BP	low	[105]
CNN and RNN	Perceptron	95.24 (ResNet50), 97.18 (InceptionV3), 98.03 (DenseNet)	MRI image scan of the brain; 80/20	MRI image scan of the brain, modality, mask images	BP	low	[106]
CNN and RNN	Perceptron	95.74 (ResNet50), 97.14 (DarkNet-53)	skin image; lack of information	skin image	BP	low	[107]
SNN	LIF	81.95	baseline T1-weighted whole brain MRI image scan; lack of information	The hippocampus section of the MRI image scan	ANN-SNN conversion	low	[108]
SNN	LIF	92.89	burn images; lack of information	256 × 256 burn image encoded into 24 × 256 × 256 feature maps	BP	low	[109]
SNN	LIF	89.57	skin images (melanoma and non-melanoma); lack of information	skin images converted into spikes using Poisson distribution	surrogate gradient descent	low	[110]
SNN	LIF	99.60	MRI scan of brain tumors; 80/10/10	2D MRI scan of brain tumors	YOLO-2-based transfer learning	low	[111]
SNN	LIF	95.17	microscopic images of breast tumor; lack of information	microscopic images of breast tumor	SpikeProp	low	[112]

Table 2. A summary of publicly available retrospective image scan medical databases.

Database	Data source	Data type	Amount of data	Availability
PhysioNet	[121]	EEG, X-ray images, polysomnography	Auditory evoked potential EEG-Biometric dataset – 240 measurements from 20 subjects; The Brno University of Technology Smartphone PPG Database (BUT PPG) – 12 polysomnographic recordings; CAP Sleep Database – 108 polysomnographic recordings; CheXmask Database: a large-scale dataset of anatomical segmentation masks for chest x-ray images – 676,803 chest radiographs; Electroencephalogram and eye-gaze datasets for robot-assisted surgery performance evaluation – EEG from 25 subjects; Siena Scalp EEG Database – EEG from 14 subjects	Public
PhysioNet	[121]	EEG, X-ray images, polysomnography	Computed Tomography Images for Intracranial Hemorrhage Detection and Segmentation – 82 CT scans after traumatic brain injury (TBI); A multimodal dental dataset facilitating machine learning research and clinic service – 574 CBCT images from 389 patients; KURIAS-ECG: a 12-lead electrocardiogram database with standardized diagnosis ontology – EEG 147 subjects; VinDr-PCXR: An open, large-scale pediatric chest X-ray dataset for interpretation of common thoracic diseases – adult chest radiography (CXR) 9125 subjects; VinDr-SpineXR: A large annotated medical image dataset for spinal lesions detection and classification from radiographs – 10466 spine X-ray images from 5000 studies	Restricted access
National Sleep Research Resource	[122]	Polysomnography	Apnea Positive Pressure Long-term Efficacy Study – 1516 subjects; Efficacy Assessment of NOP Agonists in Non-Human Primates – 5 subjects; Maternal Sleep in Pregnancy and the Fetus – 106 subjects; Apnea, Bariatric surgery, and CPAP study – 49 subjects; Best Apnea Interventions in Research – 169 subjects; Childhood Adenotonsillectomy Trial – 1243 subjects; Cleveland Children's Sleep and Health Study – 517 subjects; Cleveland Family Study – 735 subjects; Cox & Fell (2020) Sleep Medicine Reviews – 3 subjects; Heart Biomarker Evaluation in Apnea Treatment – 318 subjects; Hispanic Community Health Study / Study of Latinos – 16415 subjects; Home Positive Airway Pressure – 373 subjects; Honolulu-Asia Aging Study of Sleep Apnea – 718 subjects; Learn – 3 subjects; Mignot Nature Communications – 3000 subjects; MrOS Sleep Study – 2237 subjects; NCH Sleep DataBank – 3673 subjects; Nulliparous Pregnancy Outcomes Study Monitoring Mothers-to-be – 3012 subjects; Sleep Heart Health Study – 5804 subjects; Stanford Technology Analytics and Genomics in Sleep – 1881 subjects; Study of Osteoporotic Fractures – 461 subjects; Wisconsin Sleep Cohort – 1123 subjects	Public on request (no commercial use)
Open Access Series of Imaging Studies - OASIS Brain	[123]	MRI, Alzheimer’s disease	OASIS-1 – 416 subjects; OASIS-2 – 150 subjects; OASIS-3 – 1379 subjects; OASIS-4 – 663 subjects	Public on request (no commercial use)
OpenNeuro	[124]	MRI, PET, MEG, EEG, and iEEG data (various types of disorders, depending on the database)	595 MRI public datasets – 23,304 subjects; 8 PET public datasets – 19 subjects; 161 EEG public datasets – 6790 subjects; 23 iEEG public datasets – 550 subjects; 32 MEG public datasets – 590 subjects	Public
Brain Tumor Dataset	[125]	MRI, brain tumor	MRI – 233 subjects	Public
Cancer Imaging Archive (TCIA)	[126]	MR, CT, Positron Emission Tomography, Computed Radiography, Digital Radiography, Nuclear Medicine, Other (a category used in DICOM for images that do not fit into the standard modality categories), Structured Reporting, Pathology, Various	HNSCC-mIF-mIHC-comparison – 8 subjects; CT-Phantom4Radiomics – 1 subject; Breast-MRI-NACT-Pilot – 64 subjects; Adrenal-ACC-Ki67-Seg – 53 subjects; CT Lymph Nodes – 176 subjects; UCSF-PDGM – 495 subjects; UPENN-GBM – 630 subjects; Hungarian-Colorectal-Screening – 200 subjects; Duke-Breast-Cancer-MRI – 922 subjects; Pancreatic-CT-CBCT-SEG – 40 subjects; HCC-TACE-Seg – 105 subjects; Vestibular-Schwannoma-SEG – 242 subjects; ACRIN 6698/I-SPY2 Breast DWI – 385 subjects; I-SPY2 Trial – 719 subjects; HER2 tumor ROIs – 273 subjects; DLBCL-Morphology – 209 subjects; CDD-CESM – 326 subjects; COVID-19-NY-SBU – 1,384 subjects; Prostate-Diagnosis – 92 subjects; NSCLC-Radiogenomics – 211 subjects; CT Images in COVID-19 – 661 subjects; QIBA-CT-Liver-Phantom – 3 subjects; Lung-PET-CT-Dx – 363 subjects; QIN-PROSTATE-Repeatability – 15 subjects; NSCLC-Radiomics – 422 subjects; Prostate-MRI-US-Biopsy – 1151 subjects; CRC_FFPE-CODEX_CellNeighs – 35 subjects; TCGA-BRCA – 139 subjects; TCGA-LIHC – 97 subjects; TCGA-LUAD – 69 subjects; TCGA-OV – 143 subjects; TCGA-KIRC – 267 subjects; Lung-Fused-CT-Pathology – 6 subjects; AML-Cytomorphology_LMU – 200 subjects; Pelvic-Reference-Data – 58 subjects; CC-Radiomics-Phantom-3 – 95 subjects; MiMM_SBILab – 5 subjects; LCTSC – 60 subjects; QIN Breast DCE-MRI – 10 subjects; Osteosarcoma Tumor Assessment – 4 subjects; CBIS-DDSM – 1566 subjects; QIN LUNG CT – 47 subjects; CC-Radiomics-Phantom – 17 subjects; PROSTATEx – 346 subjects; Prostate Fused-MRI-Pathology – 28 subjects; SPIE-AAPM Lung CT Challenge – 70 subjects; ISPY1 (ACRIN 6657) – 222 subjects; Pancreas-CT – 82 subjects; 4D-Lung – 20 subjects; Soft-tissue-Sarcoma – 51 subjects; LungCT-Diagnosis – 61 subjects; Lung Phantom – 1 subject; Prostate-3T – 64 subjects; LIDC-IDRI – 1010 subjects; RIDER Phantom PET-CT – 20 subjects; RIDER Lung CT – 32 subjects; BREAST-DIAGNOSIS – 88 subjects; CT COLONOGRAPHY (ACRIN 6664) – 825 subjects	Public (Free access, registration required)
LUNA16	[127]	CT, Lung Nodules	LUNA16 – 888 CT scans	Public (Free access to all users)
MICCAI 2012 Prostate Challenge	[128]	MRI, Prostate Imaging	Prostate Segmentation in Transversal T2-weighted MR images – 50 training cases	Public (Free access to all users)
IEEE Dataport	[129]	Ultrasound Images, Brain MRI, Ultra-widefield fluorescein angiography images, Chest X-rays, Mammograms, CT, Lung Image Database Consortium and Image, Thermal Images	CNN-Based Image Reconstruction Method for Ultrafast Ultrasound Imaging: 31,000 images; OpenBHB: a Multi-Site Brain MRI Dataset for Age Prediction and Debiasing: >5,000 (brain MRI); Benign Breast Tumor Dataset: 83 patients (mammograms); X-ray Bone Shadow Suppression: 4,080 images; STROKE: CT series of patients with M1 thrombus before thrombectomy: 88 patients; Automatic lung segmentation results Nextmedproject: 718 of the 1012 LIDC-IDRI scans; PRIME-FP20: Ultra-Widefield Fundus Photography Vessel Segmentation Dataset: 15 images; Plantar Thermogram Database for the Study of Diabetic Foot Complications: 122 subjects (DM group) and 45 subjects (control group)	Partly public, partly restricted (subscription)
AIMI	[130]	Brain MRI studies, Chest X-rays, echocardiograms, CT	BrainMetShare: 156 subjects; CheXlocalize: 700 subjects; COCA - Coronary Calcium and Chest CTs: not specified; CT Pulmonary Angiography: not specified; CheXpert: 224,316 chest radiographs of 65,240 subjects; CheXphoto: 3,700 subjects; CheXplanation: not specified; DDI - Diverse Dermatology Images: not specified; EchoNet-Dynamic: 10,030 subjects; EchoNet-LVH: 12,000 subjects; EchoNet-Pediatric: 7,643 subjects; LERA - Lower Extremity Radiographs: 182 subjects; MRNet: 1,370 subjects; MURA: 14,863 studies; Multimodal Pulmonary Embolism Dataset: 1,794 subjects; SKM-TEA: not specified; Thyroid Ultrasound Cine-clip: 167 subjects	Public (Free access)
fastMRI	[131]	MRI	fastMRI Knee: 1,500+ subjects; fastMRI Brain: 6,970 subjects; fastMRI Prostate: 312 subjects	Public (Free access, registration required)
ADNI	[132]	MRI, PET	Scans related to Alzheimer's disease	Public (Free access, registration required)
Pediatric Brain Imaging Dataset	[133]	MRI	Pediatric Brain Imaging Dataset – over 500 pediatric brain MRI scans	Public (Free access to all users)
ChestX-ray8	[134]	Chest X-ray Images	NIH Clinical Center Chest X-ray Dataset – over 100,000 images from more than 30,000 subjects	Public (Free access to all users)
Breast Cancer Digital Repository	[135]	MLO and CC images	BCDR-FM (Film Mammography-based Repository) – 1010 subjects; BCDR-DM (Full Field Digital Mammography-based Repository) – 724 subjects	Public (Free access, registration required)
Brain-CODE	[136]	Neuroimaging	High-Resolution Magnetic Resonance Imaging of Mouse Model Related to Autism - 839 subjects	Restricted (Application for access is required and Open Data Releases)
RadImageNet	[137]	PET, CT, Ultrasound, MRI with DICOM tags	5 million images from over 1 million studies across 500,000 subjects	Public subset available; full dataset licensable; academic access with restrictions
EyePACS	[138]	Retinal fundus images for diabetic retinopathy screening	Training and validation set – 57,146 images; test set – 8,790 images	Available through the Kaggle competition
Medical Segmentation Decathlon	[139]	mp-MRI, MRI, CT	10 datasets; cases (train/test): Brain 484/266; Heart 20/10; Hippocampus 263/131; Liver 131/70; Lung 64/32; Pancreas 282/139; Prostate 32/16; Colon 126/64; Hepatic Vessels 303/140; Spleen 41/20	Open source license, available for research use
DDSM	[140]	Mammography images	2,500 studies with images, subjects info - 2620 cases in 43 volumes categorized by case type	Public (Free access)
LIDC-IDRI	[141]	CT Images with Annotations	1018 cases with XML and DICOM files - Images (DICOM, 125GB), DICOM Metadata Digest (CSV, 314 kB), Radiologist Annotations/Segmentations (XML format, 8.62 MB), Nodule Counts by Patient (XLS), Patient Diagnoses (XLS)	Images and annotations are available for download with NBIA Data Retriever, usage under CC BY 3.0
Synapse	[142]	CT scans, Zip files for raw data, registration data	CT scans – 50 scans with variable volume sizes and resolutions; Labeled organ data – 13 abdominal organs were manually labeled; Zip files for raw data – Raw Data: 30 training + 20 testing; Registration Data: 870 training-training + 600 training-testing pairs	Under IRB supervision, available for participants
Mini-MIAS	[143]	Mammographic images	322 digitized films on 2.3GB 8mm tape - Images derived from the UK National Breast Screening Programme and digitized with Joyce-Loebl scanning microdensitometer to 50 microns, reduced to 200 microns and standardized to 1024x1024 pixels for the database	free for scientific research under a license agreement
Breast Cancer Histopathological Database (BreakHis)	[144]	microscopic images of breast tumor	9,109 microscopic images of breast tumor tissue collected from 82 subjects	free for scientific research under a license agreement
Messidor	[145]	eye fundus color numerical images	1200 eye fundus color numerical images of the posterior pole	free for scientific research under a license agreement

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
