2.1. The concept of system resilience
The concept of resilience has become widespread in systems engineering and the respective property is actively studied in technical systems. Resilience in this context expands the concept of dependability of technical systems, emphasizing the need to create systems that are flexible and adaptive [15]. Cybersecurity experts define resilience as the ability to anticipate, withstand, recover from, and adapt to adverse conditions, external influences, attacks, or system disruptions [16].
Since 2005, many definitions of system resilience have been proposed. In [17], the resilience of a system was formulated as its ability to maintain its functions and structure in the face of internal and external changes and to degrade in a controlled manner when necessary. In [18], resilience is defined as the ability of a system to withstand significant disturbances within acceptable degradation parameters and to recover within an acceptable time with balanced costs and risks. In [19], the authors consider the resilience of a system to disturbing events as its ability to effectively reduce the magnitude and duration of deviations from target levels of system performance under the influence of such events. Other researchers [20] formulate resilience as the ability of a system to maintain functionality and recover from losses caused by extreme events.
In [21], the resilience of a system is understood as its internal ability to adjust its functioning before, during, and after changes or disturbances in order to maintain the necessary operations under both expected and unexpected conditions. In a more recent work [22], resilience is understood as the ability of an engineered system to autonomously sense and respond to adverse changes in its functional state, withstand failure events, and recover from the consequences of these unpredictable events. Some researchers [23] define resilience more succinctly: the ability of the system to withstand stressors. In [24], resilience was defined as the ability of a system to adapt to changing conditions and to withstand and recover from disturbances.
Therefore, there is a need to ensure the resilience of AI algorithms, given the requirement that they continue to function under changing system requirements, changing parameters of the physical and information environment, and the emergence of unspecified failures and malfunctions. The stages of disturbance handling by a resilient system are best described in the 2012 report of the US National Academy of Sciences using the example of resilience to natural disasters. Four main stages were highlighted (Figure 1) [25]:
- planning and preparation of the system;
- absorption of disturbance;
- system recovery;
- system adaptation.
At the stage of planning and preparation for destructive disturbances, a resilient system can perform the following actions:
- risk assessment through system analysis and simulation of destructive disturbances;
- implementation of methods for detecting destructive disturbances;
- elimination of known vulnerabilities and implementation of a set of defense methods against destructive disturbances;
- ensuring appropriate backup and recovery strategies.
Figure 1. Stages of resilience.
The absorption stage is designed to absorb unpredictable changes in the basic architecture or behavior of the system, depending on what exactly is subject to destructive influence. Absorption mechanisms can have a multi-layered structure that implements defense in depth: the system determines which mechanism should be used next if a threat cannot be absorbed at the current level. If degradation cannot be avoided, a mechanism of controlled (graceful) degradation is applied, in which the core operations of the system take priority over non-essential services for as long as possible. The system can be pre-configured with an ordered set of less functional states that represent acceptable trade-offs between functionality, performance, and cost effectiveness.
The recovery stage includes measures aimed at restoring lost functionality and performance as quickly and cost-effectively as possible. The adaptation phase focuses on the ability of the system to change in order to better cope with future threats.
The principles of Affordable Resilience are often used in the design and operation of resilient systems, taking into account resource constraints [26]. Affordable Resilience involves achieving an effective balance between the life-cycle cost and the technical characteristics of the system's resilience. When considering the life cycle for Affordable Resilience, it is necessary to take into account not only the risks and challenges associated with known and unknown perturbations over time, but also the opportunities to find gains in known and unknown future environments.
In [26,27], it is proposed to balance the cost and the benefits of the obtained resilience to achieve Affordable Resilience (Figure 2).
After determining the affordable levels of resilience for each key performance indicator, the priorities of these indicators can be determined on the basis of Multi-Attribute Utility Theory [28] or the Analytic Hierarchy Process [29]. As a rule, the priorities of the performance indicators of a resilient system depend on the application domain.
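As an illustration, the sketch below derives indicator priorities from a pairwise comparison matrix using the principal-eigenvector method of the Analytic Hierarchy Process; the comparison values and indicator names are hypothetical, not taken from [28,29].

```python
import numpy as np

# Hypothetical pairwise comparison matrix for three resilience indicators
# (e.g., robustness vs. recovery rapidity vs. resourcefulness) on Saaty's
# 1-9 scale; A[i, j] = importance of indicator i relative to indicator j.
A = np.array([
    [1.0, 3.0, 5.0],
    [1/3, 1.0, 2.0],
    [1/5, 1/2, 1.0],
])

# The principal eigenvector of A gives the priority weights (AHP).
eigvals, eigvecs = np.linalg.eig(A)
principal = np.argmax(eigvals.real)
weights = np.abs(eigvecs[:, principal].real)
weights /= weights.sum()

# Consistency ratio check: CR < 0.1 is conventionally acceptable.
n = A.shape[0]
lambda_max = eigvals.real[principal]
ci = (lambda_max - n) / (n - 1)
ri = 0.58  # random consistency index for n = 3
print("priorities:", weights, "CR:", ci / ri)
```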
In [30,31], to optimize the parameters and hyperparameters $g$ of the system under resource constraints, it is proposed to find a trade-off between the performance criterion $J$ under normal conditions and the integral indicator of system resilience $R$ under the influence of disturbances, that is,

$g^{*} = \arg\max_{g}\big[(1-\alpha)\,J(g) + \alpha\,R(g)\big],$

where $\alpha$ is a coefficient that regulates the trade-off between the performance criterion and the integral resilience index of the system within the control period.
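A minimal sketch of how such a trade-off objective might be evaluated during hyperparameter search is shown below; the functions `performance_J` and `resilience_R`, the candidate dictionaries, and the weighting coefficient are placeholders, not the implementation from [30,31].

```python
from typing import Callable, Dict

def tradeoff_objective(
    g: Dict[str, float],
    performance_J: Callable[[Dict[str, float]], float],
    resilience_R: Callable[[Dict[str, float]], float],
    alpha: float = 0.5,
) -> float:
    """Weighted trade-off between nominal performance J(g) and the integral
    resilience indicator R(g), both assumed normalized to [0, 1] over the
    control period."""
    return (1.0 - alpha) * performance_J(g) + alpha * resilience_R(g)

# Hypothetical usage: pick the hyperparameter set with the best trade-off.
candidates = [{"lr": 1e-3, "width": 128}, {"lr": 1e-4, "width": 256}]
# best = max(candidates, key=lambda g: tradeoff_objective(g, J_fn, R_fn, alpha=0.3))
```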
Researchers and engineers involved in the development of resilient systems have formulated a number of heuristics that should be relied on when designing resilient systems [11,15,18]:
- functional redundancy or diversity, which consists in the existence of alternative ways to perform a certain function;
- hardware redundancy, i.e., the reservation of hardware to protect against hardware failures;
- the possibility of self-restructuring in response to external changes;
- predictability of the automated system's behavior to guarantee trust and avoid frequent human intervention;
- avoiding excessive complexity caused by poor design practices;
- the ability of the system to function in the most probable and worst-case scenarios of natural and man-made origin;
- controlled (graceful) degradation, i.e., the ability of the system to continue operating under the influence of an unpredictable destructive factor by transitioning to a state of lower functionality or performance;
- implementation of a mechanism to control and correct the drift of the system toward a non-functional state by making appropriate compromises and timely preventive actions;
- ensuring the transition to a "neutral" state to prevent further damage under the influence of an unknown destructive disturbance until the problem is thoroughly diagnosed;
- learning and adaptation, i.e., reconfiguration, optimization and development of the system on the basis of new knowledge constantly obtained from the environment;
- inspectability of the system, which provides for the possibility of necessary human intervention without requiring unreasonable assumptions on the human's part;
- maintaining human situation awareness for cases when "quick comprehension" of the situation and the formation of creative solutions are needed;
- the possibility of replacing or backing up automation by people when the context changes in a way the automation is not prepared for, but there is enough time for human intervention;
- implementation of the principle of intent awareness, whereby the system and humans maintain a common model of intentions to support each other when necessary.
Thus, resilience is a system property and is based on certain principles and stages of processing disturbing influences.
2.2. Vulnerabilities and threats of AISs
In general, AIS operate in imperfect conditions and can be exposed to various disturbances. AI technology has numerous AI-specific vulnerabilities. In addition to these, there are physical environment vulnerabilities that can lead to hardware failures in the deployment environment, as well as vulnerabilities related to the safety and cybersecurity of information systems.
One of the AI-specific vulnerabilities is the dependence of the efficiency and security of AI on the quantity and quality of training data. An AIS can be highly effective only if the training data is unbiased. However, data collection or model building may be outsourced to an uncontrolled environment for economic reasons or to comply with local data protection laws, so the obtained result cannot always be trusted. Data collected in another environment (including synthetic data) may not be relevant to the application environment. In addition, AIS can only analyze correlations in data and cannot distinguish spurious correlations from true causal relationships. This creates additional opportunities for data poisoning and degradation of data quality.
The low interpretability of modern deep neural networks can also be seen as a vulnerability, since attacks on the AIS cannot be detected by analyzing model parameters and program code, but only by observing incorrect model behavior [32]. Without interpretation of AIS decisions, it is difficult even for an expert to understand the reason for AIS performance degradation.
Huge input and state spaces and approximate decision boundaries can also be considered vulnerabilities. It has been shown that, due to the high dimensionality of the feature space, it is possible to search in many directions for imperceptible modifications of the original data samples that mislead the AIS. The high dimensionality of the input feature space also facilitates the presence of so-called non-robust features in the data, which improves the transferability of adversarial examples to other AISs [32,33]. In addition, the high dimensionality of the state space complicates adaptation to rapid changes in the dependencies between the inputs and outputs of the AIS. It is difficult to simultaneously avoid catastrophic forgetting and ensure a high speed of adaptation of a large AIS to changes.
AIS vulnerabilities can be exploited by hackers, terrorists, and other criminals. In addition, employees who face dismissal as a result of being replaced by an AIS may try to discredit its effectiveness. As AISs are increasingly used in military vehicles, these machines can become targets for the opposing side of an armed conflict. Moreover, as AI technologies evolve, some AISs can be directed to attack other AISs.
AIS have various resource constraints, which can be a source of threats, as sufficient or excessive resources are needed to implement reliable redundancy, self-diagnosis and recovery, as well as optimization (improvement).
The physical environment is also a source of threats. Influences such as electromagnetic interference, laser injection, and heavy-ion radiation can damage neural network weights and cause AIS failures [34]. Variations in the supply voltage or direct interference with the clock circuit can also lead to glitches in the clock signal, which in turn lead to incorrect results of intermediate and final computations in the neural network. In addition, components of the deployment system may be damaged. If the software and artificial intelligence algorithms are designed without taking such system faults into account, this can lead to AIS failures.
The natural environment can also be a source of threats, as it can contain influences that were not taken into account during training and can be perceived as noise or novelty in the data. The high variability of the observed environment and limited resources for training data collection lead to insufficient generalization. In addition, the environment is in general non-stationary, and the patterns of the observed process can change unpredictably. As a result, at certain times, the model mapping inputs to outputs may become irrelevant. A compromised network, infected AIS software, and remote access to the AIS can also become threats, since they allow an adversary to acquire the AIS data, structure, and parameters, which facilitates the formation of attacks.
Among AIS threats, there are three main types of disturbances: drift, adversarial attacks, and faults. Each of these types has subtypes depending on their way of formation and specifics of impact.
The drift problem occurs when, at a certain point in time, the test data begins to differ significantly from the training data in certain characteristics, which indicates the need to update the model to avoid performance degradation. Drift in machine learning is divided into real concept drift, covariate shift, and prior-probability shift.
Real concept drift means a change in the posterior probability distribution $P_{t+1}(y \mid X)$ at time $t+1$ compared with the posterior distribution $P_t(y \mid X)$ at time $t$, which is connected with a principal change in the underlying target concept, that is, $P_t(y \mid X) \neq P_{t+1}(y \mid X)$, where $X$ is the set of input variables and $y$ is the target variable (Figure 3b). In the case of reinforcement learning, real concept drift occurs as a result of changes in the environment context (environment shift); in other words, the agent functions under conditions of non-stationary rewards and/or non-stationary transition probabilities between system states [35]. Fickle concept drift, which is a subcategory of real drift, occurs when some data samples belong to two different concepts or contexts at two different times and is considered separately. Subconcept drift, or intersect concept drift, is also a subcategory of real drift and occurs when only a subset of the dataset changes its target variable or reward feedback after the drift has occurred. Full concept drift, or severe concept drift, is a subcategory of real concept drift that occurs when the target variables of all data points change after the drift occurs. In the case of reinforcement learning, full concept drift can be associated with changes in the action-reward feedback and transition probabilities for all historical state-action pairs.
Covariate shift, or virtual concept drift as it is commonly called in the literature, occurs when the input data distribution changes without affecting the target concept (Figure 3c). In mathematical terms, $P_t(X) \neq P_{t+1}(X)$ while $P_t(y \mid X) = P_{t+1}(y \mid X)$. In practice, however, changes in the data distribution and the posterior probability distribution often happen simultaneously, so covariate shift can be a component of overall drift or the initial stage of real concept drift. Out-of-distribution data can be considered a subcategory of covariate shift. Out-of-distribution data may carry an element of novelty and does not ensure reliable analysis. This may be caused by a lack of training data in the corresponding region of the feature space, which increases epistemic uncertainty, or by data lying completely outside the training distribution, which the model cannot extrapolate to effectively; the latter is a case of aleatoric uncertainty. Another subcategory of covariate shift, also called feature evolution, is related to new attributes arising dynamically in the input space. Mathematically, this can be described as $X_t \neq X_{t+1}$ and, as a result, $P_t(X) \neq P_{t+1}(X)$. This can occur if the AIS is evolving and new data sources are added. When an AIS in inference mode encounters data that falls outside the distribution on which it was trained, its reaction can be unpredictable and have catastrophic consequences. Prior-probability shift is another type of drift and is associated with the appearance of data imbalance or a change in the set of concepts or contexts. In the case of data classification, prior-probability shift means a change in the probabilities $P(y)$ due to unbalanced class samples (Figure 3d), the emergence of novel classes (Figure 3e), the removal of existing classes (concept deletion), concept fusion (Figure 3f), or the splitting of certain classes into several subclasses (concept splitting).
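To make the covariate-shift case concrete, the sketch below flags a shift by comparing a reference (training) feature sample with a recent (inference-time) window using a two-sample Kolmogorov-Smirnov test per feature; the significance level, window sizes, and synthetic data are illustrative choices, not prescribed by the cited works.

```python
import numpy as np
from scipy.stats import ks_2samp

def detect_covariate_shift(reference: np.ndarray,
                           recent: np.ndarray,
                           alpha: float = 0.01) -> bool:
    """Return True if any feature's recent marginal distribution differs
    significantly from the reference (training) distribution.
    reference, recent: arrays of shape (n_samples, n_features)."""
    n_features = reference.shape[1]
    for j in range(n_features):
        # Two-sample KS test compares the marginal distributions of feature j.
        _, p_value = ks_2samp(reference[:, j], recent[:, j])
        # Bonferroni correction over features to limit false alarms.
        if p_value < alpha / n_features:
            return True
    return False

# Illustrative usage with synthetic data: shift the mean of one feature.
rng = np.random.default_rng(0)
ref = rng.normal(size=(1000, 5))
new = rng.normal(size=(300, 5))
new[:, 2] += 1.5
print(detect_covariate_shift(ref, new))  # expected: True
```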
Figure 3. Visual illustration of different types of concept drift for three-class classifier: a—original data distribution and decision boundary of classifier; b—real concept drift; c—virtual concept drift; d—imbalance of classes; e—the emergence of a new class; f—class merging.
In terms of their time characteristics, concept drifts can be classified into abrupt (sudden), gradual, incremental, re-occurring, and blip drift [36]. Abrupt concept drift is the rapid change of an old concept to a new one. In this case, the performance of the model suddenly decreases, and the new concept must be learned quickly to restore performance. Gradual drift involves overlapping concepts, and after some period of time the new concept becomes stable. In incremental concept drift, a certain concept vanishes from the observations at a certain time and never occurs again. In a re-occurring type of drift, a concept reappears after a long period of time; the recurring change of concept occurs within the data stream. Such drift can have cyclic or acyclic behavior. Cyclic drift occurs when there are seasonal fluctuations; for example, sales of summer clothes increase during the summer season. An acyclic phenomenon is observed, for example, when the price of electricity increases due to an increase in the price of gasoline and then returns to its previous level. Blip drift is a very rapid change of a concept or a rare event, so it is treated as an outlier in a stationary distribution; for this reason, blip drift is usually not even considered to be concept drift.
Significant destructive effects on AIS can be caused by various faults. Faults in a computer system can cause errors. An error is a manifestation of a fault that leads to a deviation of the actual state of a system element from the expected one [34]. If a fault does not cause an error, it is called a sleeping fault. Errors, in turn, can lead to failures, meaning that the system is unable to perform its intended functionality or behavior. In general, faults can be divided into four groups:
- physical faults, which lead to persistent failures or short-term failures of AIS hardware;
- design faults, which are the result of erroneous actions made during the creation of the AIS and lead to the appearance of defects in both hardware and software;
- interaction faults, which result from the impact of external factors on AIS hardware and software;
- software faults caused by the effect of software aging, which lead to persistent failures or short-term failures of AIS software.
Failures of the system and its components can be caused by a single fault, a group of faults, or a sequential manifestation of faults in the form of a "pathological" chain.
Figure 4 illustrates the causal relationship between a hardware fault, error, and failure. A failure causes a violation of the predictable behavior of a neural network that is deployed to perform its task in a computing environment.
Hardware faults can be classified according to their time characteristics into permanent and transient faults (Figure 5) [37].
Figure 5. Types of hardware faults in the computing environment.
Permanent fault models can simulate many defects in transistors and interconnect structures at the logic level with fairly high accuracy. The most common model of permanent defects is the so-called "stuck-at" fault, which consists in maintaining an exclusively high (stuck-at-1) or low (stuck-at-0) state on data or control lines. In addition, to assess fault tolerance in computing systems, researchers also consider faults such as the "stuck-open" and "stuck-short" states [34]. Faults of this type describe cases when a "floating" line has a high capacitance and retains its charge for a considerable time in modern semiconductor technologies.
Transient faults cover the vast majority of faults that occur in digital computing systems built on modern semiconductor technology. Future technologies are expected to be even more susceptible to transient faults due to greater sensitivity to environmental influences and high material stresses in highly miniaturized media. Transient faults that recur at a certain frequency are usually caused by extreme or unstable device operation and are more difficult to detect than permanent faults. Transient faults are associated with effects on the circuit parameters that determine timing characteristics rather than on the structure of the circuits. Transient faults include unpredictable delays in signal propagation, random bit flips in memory registers, and impulsive changes in logic circuits [37].
There are several physical methods of injecting faults with malicious intent. In practice, fault injection is realized through glitching the system clock (i.e., circuit synchronization), dropping the supply voltage to a certain level, electromagnetic effects on semiconductors, irradiation with heavy ions, laser beam effects on memory, and software Rowhammer attacks on memory bits [2].
A laser beam can inject an error into static random access memory (SRAM). When a laser beam is applied to silicon, a temporary conductive channel is formed in the dielectric, which causes the transistor to switch states in a precise and controlled manner [38]. By carefully adjusting the parameters of the laser beam, such as its diameter, emitted energy, and impact coordinate, an attacker can accurately change any bit in the SRAM. The laser beam has been widely and successfully used in conjunction with differential fault analysis to extract the private keys of encryption chips.
Rowhammer attacks can cause errors in DRAM memory. This type of attack takes advantage of the electrical interaction between neighboring memory cells [39]. By quickly and repeatedly accessing a certain region of physical memory, a bit in a neighboring region can be inverted. By profiling bit-flip patterns in the DRAM module and abusing memory management functions, a Rowhammer attack can reliably invert a single bit at any address in the software stack. There are known examples of using Rowhammer attacks to break memory isolation in virtualized systems and to obtain root privileges in the Android system.
Laser beam and Rowhammer attacks can inject errors into memory with extremely high accuracy. However, to inject multiple errors, the laser beam must be reconfigured, and a Rowhammer attack requires moving the target data in memory. Reconfiguring the laser beam, as well as moving data, incurs a certain overhead. Therefore, neural network algorithms should be designed to be resistant to a certain level of inverted bits in order to make these attacks impractical.
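To illustrate what a single bit-flip fault in a stored weight looks like at the software level, the sketch below flips a chosen bit of a float32 weight in a NumPy array; the targeted index and bit position are hypothetical and serve only to show why robustness to a few inverted bits matters.

```python
import numpy as np

def flip_bit(weights: np.ndarray, index: int, bit: int) -> np.ndarray:
    """Return a copy of a float32 weight array with one bit inverted,
    emulating a fault injected into parameter memory."""
    corrupted = weights.copy()
    as_int = corrupted.view(np.uint32)      # reinterpret the raw bits
    as_int[index] ^= np.uint32(1 << bit)    # invert the selected bit
    return corrupted

# Hypothetical usage: flipping a high exponent bit changes a weight by
# orders of magnitude, which is exactly what such attacks exploit.
w = np.array([0.05, -0.12, 0.33], dtype=np.float32)
w_faulty = flip_bit(w, index=1, bit=30)
print(w[1], "->", w_faulty[1])
```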
The injection of faults and errors into the digital system used to deploy AI can be carried out in an adaptive manner, taking into account feedback (Figure 6). In this case, the success of the attack is monitored at the output of the neural network, and a Single Bias Attack (SBA) or Gradient Descent Attack (GDA) can be used to adaptively influence the system [2].
The output of a neural network is highly dependent on the biases in the output layer, so an SBA is implemented by increasing only the bias of the neuron associated with the target category. SBA is intended for cases where stealthiness of the attack is not required. When stealthiness is important, GDA is used, in which gradient descent searches for the set of parameters that need to be changed for the attack to succeed. The authors of [2,38] proposed applying the Alternating Direction Method of Multipliers (ADMM) to optimize the attack while ensuring that data analysis other than for the specified inputs is unaffected and the modification of the parameters is minimal.
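The sketch below shows, under simplified assumptions, the core of a single-bias attack on a small PyTorch classifier: one output-layer bias is increased until the targeted class dominates the prediction. The toy model, step size, and stopping rule are illustrative and are not taken from [2,38].

```python
import torch
import torch.nn as nn

# Toy classifier standing in for the victim model (hypothetical architecture).
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
x = torch.randn(1, 16)          # a benign input sample
target_class = 2

with torch.no_grad():
    output_layer = model[-1]
    while model(x).argmax(dim=1).item() != target_class:
        # Single Bias Attack: increase only one bias of the output layer
        # so that the targeted class eventually wins regardless of the input.
        output_layer.bias[target_class] += 0.5

print(model(x).argmax(dim=1).item())   # now predicts the target class
```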
The importance of protecting against adaptive fault and error injection algorithms is related to the trend of moving real-time intelligent computing to edge devices. These devices are more likely to be physically accessible to an attacker, which increases the possibility of misleading them. Modern cyber-physical systems and the Internet of Things are typical platforms for the deployment of intelligent algorithms that require protection against fault injection.
Equally harmful destructive factors for machine learning systems are data corruption, missing values, and errors in training and test data. Missing values in features lead to loss of information, while errors in target labels lead to misinformation and reduced learning efficiency. Missing values in data are often caused by software or hardware faults related to data collection, transmission, and storage.
Researchers have found that neural network algorithms are sensitive to so-called adversarial attacks, which involve manipulating data or a model to reduce the effectiveness of the AIS [2]. Adversarial attacks have the following properties:
- imperceptibility, i.e., the existence of minimal (invisible to humans) modifications of data that lead to inadequate functioning of the AI;
- the possibility of targeted manipulation of the output of the neural network in order to manipulate the system for the attacker's own benefit and gain;
- transferability of adversarial examples obtained for one model to another model performing the same task, which allows attackers to use a surrogate model (oracle) to generate attacks against the target model;
- the lack of generally accepted theoretical models explaining the effectiveness of adversarial attacks, which makes none of the developed defense mechanisms universal.
The machine learning model, the deployment environment, and the data generation source can all be targets of attacks. According to their purpose, attacks can be divided into three main types:
- availability attacks, which make the AIS unusable for the end user;
- integrity attacks, which lead to incorrect AIS decisions;
- confidentiality attacks, in which the attacker's goal is to intercept communication between two parties and obtain private information.
According to strategy, adversarial attacks can be classified into evasion attacks, poisoning attacks, and oracle attacks [40].
Evasion is an attack on AI in inference mode that searches for a modification of the input sample that confuses the machine learning model. The main component of an adversarial attack is an adversarial data sample $x' = x + \delta$ with a small perturbation (added noise component) $\delta$ that leads to a significant change in the output of the network described by the function $f(\cdot)$, i.e., $f(x + \delta) \neq f(x)$. The neural network weights $\theta$ are treated as fixed and only the input test sample $x$ is subject to optimization in order to generate the adversarial sample $x'$. The attack thus involves an optimization process of finding a small perturbation $\delta$ that causes a wrong AIS decision. Evasion attacks are divided into gradient-based and gradient-free evasion. Gradient-based evasion uses one-step or iterative gradient optimization algorithms with imperceptibility constraints to improve the effectiveness of these attacks. The most well-known gradient-based attacks are the Fast Gradient Sign Method (FGSM), iterative FGSM (iFGSM), the Jacobian Saliency Map Attack (JSMA), the Carlini and Wagner (C&W) attack, and the training-dataset-unaware attack (TrISec) [41].
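As an illustration of a one-step gradient-based evasion attack, the sketch below implements the FGSM perturbation $x' = x + \epsilon\,\mathrm{sign}(\nabla_x L(f(x), y))$ for a PyTorch classifier; the toy model, epsilon, and random data are placeholders, not a benchmark setup.

```python
import torch
import torch.nn as nn

def fgsm_attack(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
                epsilon: float = 0.03) -> torch.Tensor:
    """One-step FGSM: perturb the input in the direction of the sign of the
    loss gradient while keeping the model weights fixed."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x_adv), y)
    loss.backward()
    with torch.no_grad():
        x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.detach()

# Hypothetical usage with a toy model and random data.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
x = torch.rand(8, 1, 28, 28)            # batch of 8 "images"
y = torch.randint(0, 10, (8,))          # their labels
x_adv = fgsm_attack(model, x, y)
# Fraction of samples whose prediction stayed the same after the attack:
print((model(x).argmax(1) == model(x_adv).argmax(1)).float().mean())
```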
Gradient-free evasion attacks are divided into score-based and decision-based evasion attacks. Score-based evasion attacks use the output scores/probabilities to predict the direction and strength of the next manipulation of the input data. Decision-based evasion attacks start by generating stronger input noise that causes incorrect model decisions, and then the noise is iteratively reduced until it becomes undetectable. The goal of decision-based evasion attacks is to explore different parts of the decision boundary and find the minimum amount of noise that misleads the model. However, the cost of these attacks in terms of the number of queries is very high. Therefore, FaDec attacks use adaptive step sizes to reduce the computational cost of decision-based evasion attacks, achieving the smallest perturbation with the minimal number of iterations [42].
Poisoning attacks involve corrupting the data or logic of a model to degrade the learning outcome [43]. Poisoning the data before it is pre-processed is considered indirect poisoning. Direct poisoning refers to data injection, data manipulation, or model modification by means of logic corruption. The injection of adversarial data leads to a change in the distribution and a shift of the decision boundary, based on linear programming or gradient ascent methods. Data manipulation can consist of modifying or replacing labels, feedback, or input data. Logic corruption is the intrusion into a machine learning algorithm to change the learning process or the model in an adverse way.
An oracle attack involves an attacker using access to the software interface to create a surrogate model that retains a significant portion of the original model's functionality. The surrogate model provides an efficient way to find an evasion attack that transfers to the original model. Oracle attacks are divided into extraction attacks, inversion attacks, and membership inference attacks [44]. The goal of an extraction attack is to extract the architectural details of the model from observations of the original model's predictions and class probabilities. Inversion attacks are attempts to recover the training data. A membership inference attack allows the adversary to identify specific data points from the distribution of the training dataset.
According to the attacker's knowledge of the data analysis model, attacks can be categorized into:
- white-box attacks, which are formed on the basis of full knowledge of the data, model, and training algorithm and are used by AIS developers to augment data or evaluate model robustness;
- gray-box attacks, which are based on partial information (model architecture, parameter values, loss function, or training data) that is nevertheless sufficient to attack the AIS;
- black-box attacks, which are formed by accessing the interface of a real AI model or oracle to send data and receive a response.
Various methods of creating a surrogate model, gradient estimation methods, and various heuristic algorithms are used to form black-box adversarial attacks. This type of attack poses the greatest threat in practice.
Figure 7 shows an ontological diagram of AIS threats that summarizes the above review.
Figure 7. Ontological diagram of AIS threats.
Thus, all of the following can be considered AIS disturbances: fault injection, adversarial attacks, and concept drift, including novelty (out-of-distribution data), missing values, and data errors. There are many types and subtypes of AIS disturbances, and each of them remains a relevant area of research, especially with regard to the combined impact of different types of disturbances.
2.3. Taxonomy and ontology of AIS resilience
The main elements of the taxonomic diagram of AIS resilience are: threats (drift, faults, and adversarial attacks); phases (plan, absorb, recover, adapt) of the AIS operational cycle; principles on which AIS resilience is based; properties that characterize a resilient AIS; indicators that can be used to assess AIS resilience; and tools for ensuring AIS resilience (Figure 8).
Certain resilience phases (stages) can be split and detailed. Based on the analysis of [11,12], the following phases of the operating cycle of a resilient AIS should be implemented: disturbance forecast, degradation prevention, disturbance detection, response, recovery, adaptation, and evolution. Disturbance forecast is a proactive mechanism that provides knowledge about early symptoms of a disturbance and readiness for certain types of known disturbances.
Degradation prevention is the application of available solutions and knowledge about the disturbing factor to absorb the disturbance (ensure robustness) in order to minimize the impact of the disturbance on the AIS performance. Not every disturbance can be completely absorbed, but in order to produce an optimal AIS response, the disturbance should be detected and identified by its type. The purpose of disturbance detection is to optimally reallocate resources or prioritize certain decisions in the face of inevitable performance degradation. Moreover, an important stage of resilient AIS is the restoration of the initial performance and adaptation to the impact of disturbances. The last phase of the operational cycle involves searching for and implementing opportunities for evolutionary changes that ensure the best fit of the system architecture and parameters to new conditions and tasks.
The implementation of a resilient AIS is based on the general principles of designing resilient systems, taking into account the specifics of modern neural network technologies. The systematization of [11,16] allows formulating the following principles for ensuring AIS resilience: proactivity, diversity, redundancy, fault tolerance, adversarial robustness, defense in depth, graceful degradation, adaptability, the plasticity-stability trade-off, and evolvability.
The principle of diversity implies the inclusion of randomization and multi-versioning in the implementation of system components [15]. Different versions of the components can implement different architectures, use different subsamples, feature subspaces, or data modalities, different data augmentations, and different initializations of neural network weights. In this case, applying voting over the diverse components of the AIS helps to reduce the variance of the complex AI model in inference mode. Diversity causes redundancy of the AIS and additional development overhead, but on the other hand it complicates attacks and mitigates any disturbing influence.
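A minimal sketch of the voting idea over diverse components, assuming each component outputs hard class labels; the predictions shown are made up.

```python
import numpy as np

def majority_vote(predictions: np.ndarray) -> np.ndarray:
    """Combine class predictions of several diverse models by majority vote.
    predictions: integer array of shape (n_models, n_samples)."""
    n_classes = int(predictions.max()) + 1
    # Count votes per class for each sample, then pick the most voted class.
    votes = np.apply_along_axis(
        lambda col: np.bincount(col, minlength=n_classes), 0, predictions)
    return votes.argmax(axis=0)

# Hypothetical usage: three diverse models disagree on some samples.
preds = np.array([[0, 1, 2, 1],
                  [0, 1, 1, 1],
                  [2, 1, 2, 0]])
print(majority_vote(preds))   # [0 1 2 1]
```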
The principles of Fault-tolerance and Adversarial robustness provide for absorption of faults and adversarial attacks on AIS by using special architectural solutions and training methods. The Defense in Depth principle suggests combining several mechanisms of AIS defense, which consistently counteract the destructive impact. If one mechanism fails to provide defense, another is activated to prevent destructive effects.
The principle of graceful degradation is the pre-configuration of the AIS with a set of progressively less functional states that represent acceptable trade-offs between functionality and safety. The transition to a less functional state can be smooth, providing a gradual decrease in performance without a complete breakdown of the AIS. Concepts such as granularity of predictions and model hierarchy, zero-shot learning, and decision rejection are common ways to allow an AIS to handle unexpected events while continuing to provide at least a minimum acceptable level of service.
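A minimal sketch of the decision-rejection idea mentioned above: predictions whose softmax confidence falls below a threshold are deferred instead of acted upon. The toy model and threshold are illustrative assumptions.

```python
import torch
import torch.nn as nn

def predict_with_rejection(model: nn.Module, x: torch.Tensor,
                           threshold: float = 0.8) -> torch.Tensor:
    """Return class predictions, using -1 to mark rejected (deferred) samples
    whose top softmax confidence is below the threshold."""
    with torch.no_grad():
        probs = torch.softmax(model(x), dim=1)
        confidence, labels = probs.max(dim=1)
        labels[confidence < threshold] = -1   # defer low-confidence decisions
    return labels

# Hypothetical usage: a toy classifier on random inputs.
model = nn.Sequential(nn.Linear(20, 3))
x = torch.randn(5, 20)
print(predict_with_rejection(model, x, threshold=0.8))
```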
The principles of adaptability, evolvability, and the plasticity-stability trade-off are interrelated and specify the ability of the AIS to learn and, more specifically, to engage in continual learning with regularization in order to avoid overfitting and catastrophic forgetting [4,6,14]. The principle of AIS evolvability implies the possibility of making necessary structural, architectural, or other changes to the AIS that reduce its vulnerability and increase its resilience to potential future destructive impacts. These principles also include improving the speed of adaptation to new disturbances through meta-learning techniques.
A resilient AIS can be characterized by a set of properties: ability to recover, rapidity, resourcefulness, reliability, robustness, security, high confidence, maintainability, survivability, and performance. These properties should be self-explanatory from their names or from the descriptions in the previous subsections. A set of indicators (metrics) can be proposed to assess AIS resilience: a rapidity indicator, a robustness indicator, a redundancy indicator, a resourcefulness indicator, and an integral resilience indicator. The integral resilience indicator deserves special attention, as it simultaneously takes into account the degree of robustness and the recovery rate. Resourcefulness can be conceptualized as the ability to apply resources to meet established priorities and achieve goals [11]. Organizational resourcefulness can be measured by the ratio of the increase in resilience to the increase in the amount of involved resources [11,45].
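Under this reading, the measure from [11,45] can be written compactly as $\mathrm{Resourcefulness} = \Delta R / \Delta C$, where $\Delta R$ is the achieved increase in the integral resilience indicator and $\Delta C$ is the corresponding increase in the amount of involved resources; the symbols $\Delta R$ and $\Delta C$ are introduced here only for notation.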
A certain set of tools should be used to ensure AIS resilience. The most important tools are resilience assessment methods, which make it possible to assess, compare, and optimize AIS resilience. To ensure robustness and performance recovery, it is necessary to use various disturbance countermeasures. Meta-learning tools and methods for reconfiguring the AI model with regard to performance or resources provide a certain level of evolvability. The implementation of AIS adaptation and evolution mechanisms requires the inclusion of constraints imposed by the principle of the plasticity-stability trade-off [6,14].
Figure 9 shows an ontological diagram of AIS resilience. The diagram also specifies the conditions under which AIS evolution arises. The AIS should be improved to quickly eliminate drift caused by environmental changes through domain adaptation and meta-learning. In addition, there is a need to initiate evolutionary improvement of the AIS in response to non-specified faults/failures whose effective handling was not provided for in the current configuration.
In addition, Figure 9 shows a list of the main disturbance countermeasures required for absorbing disturbances, graceful degradation, and performance recovery. This list includes the following countermeasures: data sanitization, data encryption and homomorphic encryption; gradient masking methods; robustness optimization methods; adversary detection methods; fault masking methods; methods for detecting errors caused by faults; methods of active recovery after faults; drift detection methods; continual learning techniques; few-shot learning techniques; and active learning techniques [46,47,48].
Data sanitization and data encryption methods prevent data poisoning. Data sanitization methods are based on the Reject on Negative Impact approach; they remove samples from the dataset that negatively affect the performance of the AIS [33,47]. Homomorphic encryption methods provide defense against cyber attacks on privacy. Homomorphic encryption encrypts data in a form that a neural network can process without decrypting the data. An encryption scheme is homomorphic with respect to an operation $*$ if, without access to the secret key, the following holds [33]:

$\mathrm{Enc}(x_1) * \mathrm{Enc}(x_2) = \mathrm{Enc}(x_1 * x_2),$

where $\mathrm{Enc}(\cdot)$ denotes the encryption function.
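As a concrete illustration of this property (not of the schemes used for neural network inference in [33]), the sketch below uses textbook RSA, which is multiplicatively homomorphic: the product of two ciphertexts decrypts to the product of the plaintexts. The tiny key is for demonstration only and offers no real security.

```python
# Textbook RSA with toy parameters, shown only to illustrate the
# homomorphic property Enc(x1) * Enc(x2) = Enc(x1 * x2) (mod n).
p, q = 61, 53
n = p * q                      # modulus, 3233
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent, gcd(e, phi) = 1
d = pow(e, -1, phi)            # private exponent

def enc(m: int) -> int:
    return pow(m, e, n)

def dec(c: int) -> int:
    return pow(c, d, n)

x1, x2 = 7, 11
c_product = (enc(x1) * enc(x2)) % n   # combine ciphertexts without the key
assert dec(c_product) == (x1 * x2) % n
print(dec(c_product))                  # 77
```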
Gradient masking methods, robustness optimization methods, and adversary detection methods are used to defend the AIS against adversarial attacks in inference mode. Fault masking methods, methods for detecting errors caused by faults, and methods of active recovery after faults are used to mitigate the effect of faults on AIS performance [34,49]. Drift detection methods, continual learning techniques, few-shot learning techniques, and active learning techniques are used for the effective functioning of AI under drift conditions [4,5,6].