1. Introduction
With the advent of Industry 4.0, wireless networked control systems (WNCSs) have been applied to industrial networks for various services such as industrial automation, smart manufacturing, and unmanned robot control [1, 2]. WNCSs are considered a prominent solution for providing real-time and reliable actuation in industrial networks [3]. A WNCS generally consists of sensors, actuators, and a controller. Sensors collect the latest samples of environment states and deliver them to the controller. After the controller computes control decisions for the actuators, it sends the control commands to them. In addition, as mobile actuators such as mobile robots and automated guided vehicles have recently been deployed, wireless control for mobile actuators has been actively adopted. The general WNCS process involves two principal updates: 1) status updates from sensors to the controller and 2) actuation updates from the controller to actuators. Both need to be delivered in a timely manner because WNCS applications are time-critical.
Since timeliness is an important metric in WNCSs, age of information (AoI) has been introduced as a novel metric to quantify the freshness of information updates [4, 5]. AoI is defined as the time elapsed since the most recently delivered information (i.e., an update in a WNCS) was generated. It is measured from the perspective of the destination and therefore increases linearly with time until a new update is received at the destination. Since a WNCS requires timely and fresh updates to improve control performance, AoI has been applied to WNCSs as a key performance metric [6, 7].
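The AoI definition above can be illustrated with a minimal Python sketch. It assumes discrete time slots and a hypothetical `deliveries` mapping from the slot in which an update arrives to the slot in which that update was generated; neither the function name nor the slotted model is taken from the paper.

```python
def aoi_trace(deliveries, horizon, initial_age=0):
    """Return the AoI observed at the destination in each slot 0..horizon-1.

    deliveries: dict {delivery_slot: generation_slot} (hypothetical format).
    AoI grows by one per slot and resets to (delivery_slot - generation_slot)
    whenever an update is received.
    """
    trace = []
    age = initial_age
    for t in range(horizon):
        if t > 0:
            age += 1  # linear growth while no update arrives
        if t in deliveries:
            age = t - deliveries[t]  # time elapsed since generation
        trace.append(age)
    return trace


# An update generated in slot 2 arrives in slot 3; another generated in
# slot 6 arrives in the same slot.
print(aoi_trace({3: 2, 6: 6}, horizon=8))  # [0, 1, 2, 1, 2, 3, 0, 1]
```

The resulting sawtooth shows why a delayed delivery raises AoI even when the delivered update itself is fresh.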
Since AoI was introduced, AoI for status updates in industrial networks and WNCSs has been studied extensively [2, 7, 8]. However, AoI for actuation updates has received little attention even though it is critical for control performance. For example, delayed actuation updates can result in production inefficiency, plant destruction, and casualties [2, 8]. In other words, the timeliness of the actuation update should be managed by the controller in a WNCS. In addition, since AoI requirements differ according to the control priorities of the actuators (i.e., robustness of AoI of the actuation update) [9, 10], the priority needs to be considered when delivering the actuation update. For example, the priority can be defined to classify purposes according to the criticality level at a particular moment [10].
Meanwhile, heterogeneous wireless networks such as cellular (e.g., 5G new radio (NR)) and WiFi networks [11, 12] have been deployed in industrial environments. Accordingly, the type of network available to mobile actuators varies with their location. In this scenario, actuation updates via the cellular network incur a monetary cost, while updates via the WiFi network are usually free. To reduce the monetary cost, the control system prefers to use the WiFi network for actuation updates. However, since the WiFi network is not always available, waiting for it can increase AoI, which can result in a critical situation, especially for high-priority control commands. Consequently, it is important to determine an appropriate actuation update policy that considers both the monetary cost and AoI with priority.
Several works have addressed the AoI control problem in heterogeneous networks [13, 14, 15, 16, 17, 18]. Pan et al. [13] determined the scheduling policy over an unreliable but fast channel or a slow but reliable channel to minimize AoI. Altman et al. [14] and Raiss-el-fenni et al. [15] introduced a receiver’s policy that decides whether to receive updates from cellular or WiFi networks to minimize the costs. Bhati et al. [16] provided the optimal average AoI considering heterogeneous multiple servers with different capabilities. Fidler et al. [17] showed the effect of independent parallel channels on AoI based on queuing models. Xie et al. [18] formulated a generalized scheduling problem in multi-sensor multi-server systems to minimize AoI. However, no previous work jointly considers the monetary cost and the priority from the system operator’s perspective.
To address these challenges, this paper proposes a priority-aware actuation update scheme (PAUS) that jointly considers the cost and AoI with priority. In PAUS, the control system determines whether to deliver or delay the actuation update to the actuator based on AoI with priority and cost. We formulate a Markov decision process (MDP) model and determine the optimal policy based on Q-learning (QL). Simulation results demonstrate that PAUS reduces the cost while satisfying the required AoI.
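As a rough illustration of how a deliver-or-delay policy can be learned with tabular Q-learning, the following Python sketch shows the standard Q-learning update rule. The state encoding `(age, wap_available)`, the action encoding, and the default values of `alpha` and `gamma` are assumptions made for this example and are not taken from the MDP formulated in the paper.

```python
from collections import defaultdict

ACTIONS = (0, 1)  # hypothetical encoding: 0 = delay, 1 = deliver


def make_q_table():
    """Q-table that lazily creates a zero-initialized row per state."""
    return defaultdict(lambda: [0.0 for _ in ACTIONS])


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard Q-learning update and return the new Q-value."""
    best_next = max(q[next_state])          # greedy value of the next state
    td_target = reward + gamma * best_next  # bootstrapped target
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


def greedy_action(q, state):
    """Return the action with the highest Q-value in the given state."""
    values = q[state]
    return max(ACTIONS, key=lambda a: values[a])


q = make_q_table()
# State (age=3, wap_available=True); delivering yields reward -1.0 here.
q_update(q, (3, True), 1, -1.0, (0, True), alpha=0.5)
print(q[(3, True)])  # [0.0, -0.5]
```

In practice the update is applied repeatedly while the agent explores (for example with an epsilon-greedy rule), and the learned policy is read off with `greedy_action`.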
The main contributions of this paper are as follows: 1) to the best of our knowledge, this is the first work to jointly consider AoI with priority and cost; and 2) extensive simulation results present the performance of PAUS under various settings, which can serve as guidelines for the control system operator.
The remainder of this paper is organized as follows. The system model and problem formulation are provided in Section 2 and Section 3, respectively. The QL-based algorithm is presented in Section 4. Simulation results are provided in Section 5, and Section 6 concludes the paper with future work.
5. Performance Analysis Results
To evaluate the performance, we conduct extensive simulations using a Python-based event-driven simulator where each simulation includes decision epochs, and the average values of 10 simulations are used for the average reward. We compare the proposed scheme (i.e., PAUS) with the following four schemes: 1) SEND, where the control system delivers the actuation update immediately when a new actuation update occurs to minimize AoI; 2) TARGET, where the control system delays the actuation update and then delivers it right before exceeding the target AoI requirement; 3) PERIOD, where the control system periodically delivers the actuation update; and 4) WAIT, where the control system waits for WiFi to make the best use of it.
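The four comparison schemes can be summarized as simple decision rules. The Python sketch below is one possible reading of them, not the simulator used in the paper: the function name, its arguments, and the exact threshold used for TARGET (deliver once waiting one more epoch would exceed the target) are assumptions.

```python
def baseline_decision(scheme, pending, age, target_aoi,
                      wap_available, epoch, period):
    """Return True if the baseline scheme delivers in this decision epoch.

    All argument names are hypothetical. `pending` indicates whether an
    actuation update is waiting at the control system.
    """
    if not pending:
        return False  # nothing to deliver
    if scheme == "SEND":
        return True  # deliver as soon as an update exists
    if scheme == "TARGET":
        # Assumed rule: waiting one more epoch would exceed the target.
        return age >= target_aoi
    if scheme == "PERIOD":
        return epoch % period == 0  # fixed delivery period
    if scheme == "WAIT":
        return wap_available  # deliver only over WiFi
    raise ValueError(f"unknown scheme: {scheme}")


print(baseline_decision("TARGET", True, age=10, target_aoi=10,
                        wap_available=False, epoch=7, period=10))  # True
```

None of these rules looks at the monetary cost and AoI together, which is the gap PAUS is designed to fill.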
The default parameter settings are as follows. The average probability of disconnection and connection between WAP and actuator is set to and , respectively. The default values of and w are set to 20 and , respectively. In addition, is assumed to be a linear function with a static coefficient (i.e., 1) according to i. Furthermore, we assume that there are 5 priorities where 1 is the lowest (i.e., less critical) and 5 is the highest (i.e., more critical). Moreover, , , and the period of PERIOD are set to 10, , and 10 decision epochs, respectively. It is assumed that and to use CBS are set to 4 and 1, respectively, while those to use WAP are set to 0 and 1, respectively.
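One common way to model per-epoch disconnection and connection probabilities is a two-state Markov chain. The Python sketch below assumes that model; the function names and the numeric values in the usage line are placeholders and are not the paper's default settings.

```python
import random


def next_wap_state(connected, p_disconnect, p_connect, rng):
    """Sample the WAP connectivity of the next decision epoch."""
    if connected:
        return not (rng.random() < p_disconnect)  # stay unless disconnected
    return rng.random() < p_connect               # reconnect with p_connect


def stationary_availability(p_disconnect, p_connect):
    """Long-run fraction of epochs with WAP connectivity.

    Valid for the two-state chain when p_disconnect + p_connect > 0.
    """
    return p_connect / (p_connect + p_disconnect)


def simulate_wap(epochs, p_disconnect, p_connect, seed=0, connected=True):
    """Return a list of booleans, one connectivity flag per epoch."""
    rng = random.Random(seed)
    trace = []
    for _ in range(epochs):
        trace.append(connected)
        connected = next_wap_state(connected, p_disconnect, p_connect, rng)
    return trace


# Placeholder probabilities, chosen only for illustration.
trace = simulate_wap(10_000, p_disconnect=0.2, p_connect=0.8)
print(sum(trace) / len(trace), stationary_availability(0.2, 0.8))
```

For a long run, the empirical availability should be close to the stationary value, which gives a quick sanity check on the connectivity model.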
Figure 3 shows the overall performance in terms of the accumulated reward, AoI satisfaction ratio, and total monetary cost according to the simulation time. In Figure 3a, as the simulation time increases, the accumulated rewards of all schemes decrease because AoI and the monetary cost accumulate. Among them, PAUS achieves the highest accumulated reward because it jointly considers AoI and the monetary cost. On the other hand, WAIT has the lowest accumulated reward because it waits for WiFi, which leads to increased AoI. Meanwhile, Figure 3b shows that PAUS, SEND, and TARGET can guarantee the AoI requirement in terms of the satisfaction ratio, while PERIOD and WAIT cannot. This is because PERIOD and WAIT deliver the actuation update periodically and only over WiFi, respectively, without considering AoI. In addition, Figure 3c shows the accumulated cost of each scheme. Among PAUS, SEND, and TARGET, which all guarantee the AoI requirement, PAUS has the lowest accumulated cost. This means that PAUS can minimize the monetary cost while maintaining AoI within the required value.
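To make the AoI satisfaction ratio concrete, the following Python sketch computes it under the assumption that it is the fraction of decision epochs in which AoI stays within the requirement. This per-epoch definition and the helper names are assumptions for illustration; the paper may define the ratio differently (for example, per delivered update).

```python
def aoi_satisfaction_ratio(aoi_values, aoi_requirement):
    """Fraction of epochs whose AoI does not exceed the requirement."""
    if not aoi_values:
        raise ValueError("aoi_values must not be empty")
    satisfied = sum(1 for age in aoi_values if age <= aoi_requirement)
    return satisfied / len(aoi_values)


def mean_over_runs(values):
    """Average a metric over independent simulation runs."""
    return sum(values) / len(values)


# Two hypothetical runs with an AoI requirement of 10 epochs.
runs = [[1, 5, 12, 3], [2, 4, 6, 8]]
ratios = [aoi_satisfaction_ratio(r, 10) for r in runs]
print(ratios, mean_over_runs(ratios))  # [0.75, 1.0] 0.875
```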
Figure 4 shows the average reward and AoI satisfaction ratio according to the weight factor w. In Figure 4a, as w increases, the average rewards of PERIOD and WAIT decrease because of the increasing AoI. Between them, the average reward of WAIT is higher than that of PERIOD because it tries to reduce AoI whenever WiFi is available. On the other hand, as w increases, the average rewards of SEND and TARGET increase due to the reduced AoI. Between them, the rate of increase of SEND is higher than that of TARGET because SEND can minimize AoI as w increases. Meanwhile, PAUS achieves the highest average reward. This is because PAUS can reduce the monetary cost at lower w and AoI at higher w. In Figure 4b, it can be noted that PAUS cannot guarantee the AoI requirement at lower w. This is because PAUS focuses on reducing the monetary cost at lower w, which can increase AoI. Consequently, w needs to be set above a certain threshold to keep AoI below the required value.
Figure 5 shows the average reward and AoI satisfaction ratio according to the actuation update arrival rate. In Figure 5a, as the arrival rate increases, the average rewards of all schemes decrease because a higher arrival rate increases the number of deliveries, which can lead to monetary costs or delayed updates. Among them, the rate of decrease of SEND and PERIOD is higher than that of the others. In the case of SEND, this is because as the arrival rate increases, the number of updates via CBS becomes higher, which increases the monetary cost. On the other hand, in the case of PERIOD, the periodic actuation update is still used even when the arrival rate increases, which results in delayed updates. Overall, PAUS achieves the highest average reward because it jointly considers the monetary cost and AoI. In addition, Figure 5b shows that even when the arrival rate increases, PAUS, SEND, and TARGET can guarantee the AoI requirement. On the other hand, PERIOD and WAIT cannot guarantee the AoI requirement because PERIOD still uses the periodic actuation update and WAIT delays the actuation update and waits for WiFi irrespective of changes in the arrival rate.
Figure 6 shows the average reward and AoI satisfaction ratio according to the monetary cost. In Figure 6a, as the monetary cost increases, the average rewards of all schemes decrease because the total monetary cost becomes higher. Among them, the rate of decrease of SEND is higher than that of the others because SEND immediately tries to deliver the actuation update even when only CBS is available. On the other hand, WAIT has the lowest rate of decrease because WAIT always prefers to use WAP. Overall, PAUS achieves the highest average reward. This is because PAUS can fully utilize either CBS at lower monetary cost or WAP at higher monetary cost. In Figure 6b, it can be noted that PAUS cannot guarantee the AoI requirement at higher monetary cost. This is because PAUS reduces the monetary cost in that regime, which can increase AoI, to maximize the total reward function defined in (17). Note that if the system operator needs to enhance the AoI satisfaction ratio even at higher monetary cost, the weight factor w in the total reward function can be adjusted.
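The role of the weight factor w can be illustrated with a small Python sketch of a weighted per-epoch reward. The form below (a convex combination of a priority-scaled AoI penalty and the monetary cost, negated) is only an assumed stand-in chosen for illustration; it is not the total reward function defined in (17), and all names are hypothetical.

```python
def step_reward(age, priority, cost, w, coeff=1.0):
    """Assumed per-epoch reward trading off AoI against monetary cost.

    age:      current AoI in decision epochs
    priority: control priority (larger means more critical)
    cost:     monetary cost paid in this epoch
    w:        weight in [0, 1]; larger w emphasizes AoI
    coeff:    assumed linear coefficient of the priority weighting
    """
    aoi_penalty = coeff * priority * age
    return -(w * aoi_penalty + (1.0 - w) * cost)


# The same situation evaluated under three weights.
for w in (0.0, 0.5, 1.0):
    print(w, step_reward(age=2, priority=3, cost=4, w=w))
# 0.0 -4.0
# 0.5 -5.0
# 1.0 -6.0
```

Under this assumed form, a small w makes the cost term dominate, so a reward-maximizing policy tends to delay updates until WAP is available, while a large w pushes it toward prompt delivery.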
Figure 7 shows the average reward and AoI satisfaction ratio according to the WAP connection probability. In Figure 7a, as the connection probability increases, the average rewards of all schemes increase because a higher connection probability leads to lower monetary cost. Among them, the rate of increase of WAIT is higher than that of the others because a higher connection probability results in more opportunities to deliver updates via WAP, which can reduce AoI as also shown in Figure 7b. Overall, as presented in Figure 7a and Figure 7b, PAUS achieves the highest average reward while guaranteeing the AoI requirement. This is because PAUS can fully utilize either CBS at lower connection probability or WAP at higher connection probability.
Figure 8 shows the average reward and AoI satisfaction ratio according to the AoI requirement. In Figure 8a, as the AoI requirement increases, the average rewards of all schemes except SEND (i.e., PAUS, WAIT, PERIOD, and TARGET) increase because there is enough time to wait for WiFi, which can reduce the monetary cost. However, because SEND delivers actuation updates irrespective of the AoI requirement, its average reward does not change with the AoI requirement. From Figure 8b, although the AoI satisfaction ratios of WAIT and PERIOD increase, they still cannot guarantee the AoI requirement.