IoT devices are typically constrained in terms of power, memory, computation, or combinations thereof to keep device cost down, to extend operating lifetime by maximizing battery life, or both. In this section, we summarize SEI works that consider these constraints.
7.1. SEI on Resource-Constrained Devices
The authors of [
162] present one of the earliest SEI approaches designed to reduce the computation, energy, and communications overheads associated with performing SEI-based security on resource-constrained IoT devices. The authors accomplish this by offloading the SEI task to cloud and edge devices or resources. Initial CNN training is performed in the cloud; once trained, the CNN is re-trained and made sparse through the use of a progressive weight pruning algorithm [
163]; thus, the authors focus on reducing computation and energy requirements during the testing (inference) stage rather than during CNN training. The re-trained and pruned CNN is then deployed at the edge to run on a gateway or another IoT device responsible for relaying information. The authors assess the pruned model’s computation and energy reduction using a Samsung Galaxy S10, an NVIDIA Jetson TX2 module, and a Xilinx ZCU104 Field Programmable Gate Array (FPGA). The authors use two data sets: one comprised of 500 IEEE 802.11 b/g/n Wi-Fi emitters with 273 signals per emitter and one of 50 ADS-B emitters with 273 signals per emitter. The authors perform SEI using a derivative of the ResNet architecture [
164] and assess it and its pruned versions in terms of average percent correct performance, pruning rate, and the number of Floating-Point Operations (FLOPs). Using a pruning rate of 5.4× results in a 0.96% drop in average percent correct classification performance (61.4% for the full ResNet versus 60.44% for the pruned ResNet) while reducing the number of FLOPs by 20% when identifying the 500 Wi-Fi emitters. For the ADS-B data set, a pruning rate of 5.4× results in a 0.28% drop in average percent correct classification performance (88.53% for the full ResNet versus 88.25% for the pruned ResNet) while reducing the number of FLOPs by 19.3%. Across the three hardware platforms, the presented approach increases classification speeds by as much as three times on the Samsung Galaxy S10 and eleven and a half times on the FPGA. The authors only present average classification performance results, so it is difficult to determine how evenly the performance is distributed across individual emitters. They also do not assess their approach under degrading SNR conditions, although they propose techniques to address this in future efforts. However, the biggest concern surrounding the approach in [
162] is their use of CFO. CFO is estimated and removed before channel equalization but reinserted once equalization concludes. CFO’s presence is a vulnerability that SEI adversaries can exploit; see
Section 4 for details. So, it would be interesting to see how the results presented in [
162] would change if CFO is not reinserted. Despite this, the work in [
162] does provide a viable means of reducing SEI’s burden on edge IoT devices and shows that SEI can be successfully employed on smartphones.
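The progressive weight pruning algorithm of [163] interleaves re-training with pruning steps; as a minimal sketch of just the core idea, the following magnitude-based pruning routine (our own illustration, not the authors’ implementation) zeroes out the smallest weights of a layer to hit a target pruning rate such as the 5.4× used above.

```python
import numpy as np

def magnitude_prune(weights, pruning_rate=5.4):
    """Zero out the smallest-magnitude weights so that only
    1/pruning_rate of them remain (e.g., 5.4x compression)."""
    flat = np.abs(weights).ravel()
    keep = max(1, int(flat.size / pruning_rate))
    # Threshold = magnitude of the keep-th largest weight.
    thresh = np.partition(flat, -keep)[-keep]
    mask = np.abs(weights) >= thresh
    return weights * mask, mask

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64))            # one hypothetical layer
w_pruned, mask = magnitude_prune(w, 5.4)
print(f"kept {mask.mean():.1%} of weights")  # ~18.5% for a 5.4x rate
```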
In [
165], the authors present an IoT resource allocation approach that leverages SEI-based security. In particular, the authors focus on IoBT deployments, but as previously stated, IoBT is a form of IoT, so we use only IoT in our article. The authors’ approach aims to improve IoT network Quality of Service (QoS) through the use of SEI and by optimizing network performance. The former is of particular interest here: the authors propose SEI as a means of controlling user access in lieu of traditional cryptographic approaches because cryptographic systems have higher computational requirements, which limit their use in IoT deployments. Specifically, SEI is used to improve IoT network QoS by identifying and removing malicious users/devices that launch Distributed Denial-of-Service (DDoS) attacks aimed at reducing available network resources (i.e., power and channel allocations). Such resources are often limited in IoT deployments, and their reduction negatively impacts network performance optimization. The authors optimize network performance by calculating the IoT network’s utility. The utility is calculated using many parameters, including but not limited to the number of sensing devices, the power consumption of RF circuits, the transmit power of the considered devices, data rate(s), and per-device channel allocation. However, the authors do not seem to consider how performing SEI impacts IoT network utility. It may be that performing SEI is “lumped” into another parameter such as RF circuit power consumption; if so, it is unclear whether SEI is to be performed on individual IoT devices or on the proposed centralized server. Either way, a parameter or parameters associated with the performance of SEI should be integrated into the IoT utility optimization calculation and contrasted against the use of cryptography instead of SEI to highlight the benefit of SEI-based security in IoT deployments. Lastly, the authors do not present any SEI-related results, which, from a purely SEI viewpoint, limits the contributions of the approach. However, this can be remedied by integrating SEI into the IoT network utility optimization calculation.
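As a purely hypothetical illustration of the remedy suggested above, the sketch below adds an explicit SEI power term to a per-device utility; every parameter name and the rate-minus-power form are our assumptions, not the authors’ formulation.

```python
import numpy as np

def device_utility(bandwidth_hz, snr, tx_power_w, rf_circuit_power_w,
                   sei_power_w, price_per_watt=1.0):
    """Hypothetical per-device utility: Shannon-rate benefit minus a
    priced power cost that explicitly includes the energy spent on SEI."""
    rate_bps = bandwidth_hz * np.log2(1.0 + snr)       # achievable rate
    power_cost = price_per_watt * (tx_power_w + rf_circuit_power_w
                                   + sei_power_w)      # total power draw
    return rate_bps - power_cost

# Network utility = sum over sensing devices; SEI's cost is now visible
# and can be contrasted with a cryptographic-processing cost instead.
utilities = [device_utility(1e6, 10 ** (15 / 10), 0.1, 0.05, sei)
             for sei in (0.0, 0.02, 0.05)]
```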
The authors of [
166] note that recent advancements in SEI have come through the use of DL at the cost of large numbers of parameters updated via time-consuming backpropagation, and that DL structures are not scalable, making them computationally expensive and thus limiting the practicality of DL-based SEI in IoT devices and infrastructure. The authors attempt to address these DL-related issues by proposing an SEI approach built on a Broad Learning System (BLS) called Adaptive Broad Learning (ABL). ABL trades the depth of a DNN for width by increasing the number of nodes that comprise the node layer, replaces time-consuming backpropagation with a pseudo-inverse calculated from the node layer to the output layer, and updates only the new nodes, rather than all nodes, when the network needs to be modified. ABL consists of a node layer, comprised of feature and enhancement nodes, and an output layer. The feature nodes work directly on the signal’s raw IQ samples, while the enhancement nodes’ inputs are the outputs of the feature nodes. The authors assess ABL’s SEI effectiveness using two publicly available data sets from [
167,
168]. The authors’ results show that ABL’s average SEI performance is on par with the best DL approaches using the first data set [
167] and slightly above average when using the second data set [
168]. Regarding the second data set, the authors attribute the poorer performance to the higher sampling rate, which creates feature redundancy that gives an edge to the DNN architectures. The real benefit of ABL is the reduction in training time, which is a fraction of the time needed to train the DNNs. The authors’ approach is novel, but a few considerations must be made. First, the authors note that ABL requires massive, labeled data sets for training, thus limiting its usefulness in environments in which previously unseen emitters are present. In today’s environment of increasing and ubiquitous IoT device deployments, the presence of previously unseen emitters seems inevitable. Second, the authors state that the presence of redundant information within the signals causes ABL to overfit. Lastly, one must keep in mind that training does not have to be performed on the IoT device but can instead be performed at a central location, both initially and for subsequent updates, so the SEI-performing device would only need the latest trained model. Despite these caveats, ABL is an interesting approach, and further research is warranted due to its novelty within the SEI space.
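For readers unfamiliar with broad learning, the following is a minimal sketch of the BLS principle that ABL builds on: random, fixed feature and enhancement mappings with output weights solved in closed form via a ridge-regularized pseudo-inverse. It omits ABL’s adaptive node-update mechanism, and the layer sizes and tanh nonlinearity are assumptions.

```python
import numpy as np

def train_bls(X, Y, n_feature=64, n_enhance=256, reg=1e-3, seed=0):
    """Train: the random feature/enhancement mappings stay fixed; only
    the output weights are solved, with no backpropagation."""
    rng = np.random.default_rng(seed)
    Wf = rng.standard_normal((X.shape[1], n_feature)) / np.sqrt(X.shape[1])
    Z = np.tanh(X @ Wf)                              # feature nodes
    We = rng.standard_normal((n_feature, n_enhance)) / np.sqrt(n_feature)
    H = np.tanh(Z @ We)                              # enhancement nodes
    A = np.hstack([Z, H])                            # node layer
    # Ridge-regularized pseudo-inverse from node layer to output layer.
    W_out = np.linalg.solve(A.T @ A + reg * np.eye(A.shape[1]), A.T @ Y)
    return Wf, We, W_out

def predict_bls(X, Wf, We, W_out):
    Z = np.tanh(X @ Wf)
    A = np.hstack([Z, np.tanh(Z @ We)])
    return A @ W_out                                 # class scores per signal
```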
Similar to the work in [
166], the authors of [
169] approach SEI using a broad learning network to lower computational load on the end device to address resource constraints associated with IoT integration. However, the work in [
169] differs through the use of signal feature embedding instead of the node expansion approach in [
166]. The authors [
169] call their broad learning-based SEI approach the Signal Feature Embedded Broad Learning Network (SFEBLN). Signal feature embedding approximates the SEI features through a nonlinear transformation with the aim of improving SEI performance. Signal features are generated by performing signal processing both before and within the broad learning network: signal convolution, windowed pooling, and signal shifting are performed before the broad learning network, while the internal signal processing consists of calculating the Discrete Fourier Transform (DFT), Discrete Cosine Transform (DCT), and Short-Time Fourier Transform (STFT). Additionally, the authors perform broad learning-based SEI using a Central Processing Unit (CPU) to show the feasibility of performing SEI without a Graphical Processing Unit (GPU). Assessment of SFEBLN is conducted using a set of ADS-B signals and compared with SEI performed using three DL-based approaches, the ABL approach from [
166], and two traditional, handcrafted SEI approaches. The DL-based SEI approaches are a real-valued CNN, complex-valued CNN, and the multi-scale CNN from [
170]. Traditional, handcrafted SEI is performed using random forest and Support Vector Machine (SVM) classifiers. The authors consider six scenarios in which the SEI processes identify 10, 20, 30, 50, 100, or 200 individual ADS-B emitters. SFEBLN achieves superior average percent correct classification over the handcrafted, real-valued CNN, and multi-scale CNN approaches for all six identification scenarios. SFEBLN is also superior to the complex-valued CNN, in terms of average percent correct classification performance, when identifying 40 or fewer ADS-B emitters. The true benefit of SFEBLN is its time advantage over the six alternate SEI approaches: SFEBLN can be trained in less than 10 seconds when 100 or fewer ADS-B emitters are represented in the training set and in less than 13 seconds when 200 emitters are represented. For SNRs of -10, -5, 0, 5, and 10 dB, the average percent correct classification performance of SFEBLN is superior to all alternate SEI approaches except for the case of 200 ADS-B emitters at an SNR of -10 dB. For this exceptional case, the complex-valued CNN proves roughly 10% better but at the expense of a huge computing overhead (roughly 2,000 times that of SFEBLN). The authors do identify a drawback of SFEBLN: it is susceptible to instability in its results due to the random initialization of the single-layer weights, and this instability is exacerbated by such things as temperature, multi-threading, and other unspecified factors. To address this, the authors assess SFEBLN’s stability using Monte Carlo simulation while considering impacts on accuracy, training time, and testing time. The authors show that SFEBLN’s stability remains within acceptable limits, but they do not perform the stability assessment under degrading/low SNR conditions, so it is difficult to determine whether SFEBLN will remain stable as SNR decreases. Also, the authors of [
169] do not address any of the concerns raised by the authors of [
166] and highlighted in the previous paragraph.
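As a rough sketch of the internal transforms named above, the routine below computes DFT, DCT, and STFT features from a complex baseband signal; the surrounding SFEBLN pre-processing chain (signal convolution, windowed pooling, and shifting) is omitted, and all parameter choices here are simplified assumptions.

```python
import numpy as np
from scipy.fft import fft, dct
from scipy.signal import stft

def embed_signal_features(iq, fs=1.0, nperseg=64):
    """Concatenate DFT, DCT, and STFT-derived features of a complex
    baseband signal (a simplification of SFEBLN's internal processing)."""
    mag = np.abs(iq)
    dft_feat = np.abs(fft(iq))                 # DFT magnitude
    dct_feat = dct(mag, norm='ortho')          # DCT of the envelope
    _, _, Z = stft(iq, fs=fs, nperseg=nperseg, return_onesided=False)
    stft_feat = np.abs(Z).ravel()              # flattened spectrogram
    return np.concatenate([dft_feat, dct_feat, stft_feat])
```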
The authors of [
171] design their SEI approach using a systems view to make it better suited for real-world operations. The authors consider training data availability, robustness to unknown operational conditions and uncertainty, channel conditions, and computation limitations. The authors address limitations associated with training data availability by acknowledging that data distributions will change between the training and testing phases, using simulated data for training and fine-tuning with real-world data, integrating detection of unknown emitters, and using a limited number of training examples per emitter. Robustness to unknown operational conditions and uncertainty is addressed during training by using a large data set that is then tuned or adapted for the deployed version; a data set comprised of multiple, distinct signal types (e.g., Wi-Fi and ZigBee); and a data set containing signals that represent spoofers or other signals in the area of deployment. Channel condition limitations are addressed by including corrupted signals in the training data. In particular, the authors include signals that overlap in time and spectrum with other signals of the same type and SNR, with only the amount of overlap changing, and they leverage other receiver capabilities, such as direction of arrival, to improve performance. The authors address the final limitation, computation, by lowering the bit precision of the network weights, reducing the network’s depth and the number of filters per layer, and pruning the network. The work in [
171] is not without its limitations. Primarily, the data they used is not publicly available [
149], and the DL network used was developed by a private company, BAE Systems Inc., which could limit the SEI research community’s access to it.
The authors of [
172] present a data reduction approach that leverages entropy to select the most informative portions of a signal’s Time-Frequency (TF) representation. The signal’s TF representation is a normalized, grayscale image generated from its GT’s complex-valued coefficients. A GT image’s most informative portions are selected by comparing each portion’s (a.k.a., patch’s) entropy value to the entropy value of the entire image. If a patch’s entropy is equal to or greater than the image’s entropy, that patch is retained for subsequent SEI; otherwise, the patch is discarded. The presented entropy-informed SEI approach outperforms SEI processes that use the signal’s raw IQ samples and is comparable to those that use the full GT image at SNRs of 15 dB or greater. Compared to the full GT image-based SEI approach, memory usage and CNN training times are reduced by 93% and 81%, respectively.
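The patch-selection rule lends itself to a direct sketch. Assuming a normalized grayscale GT image and a hypothetical 16×16 patch size, the following keeps exactly those patches whose Shannon entropy meets or exceeds that of the whole image.

```python
import numpy as np

def shannon_entropy(img, bins=256):
    """Shannon entropy (bits) of a normalized grayscale image or patch."""
    hist, _ = np.histogram(img, bins=bins, range=(0.0, 1.0))
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def select_patches(tf_image, patch=16):
    """Keep only patches whose entropy >= the whole image's entropy."""
    h_img = shannon_entropy(tf_image)
    kept = []
    rows, cols = tf_image.shape
    for r in range(0, rows - patch + 1, patch):
        for c in range(0, cols - patch + 1, patch):
            p = tf_image[r:r + patch, c:c + patch]
            if shannon_entropy(p) >= h_img:
                kept.append(p)
    return kept
```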
The authors of [
173] look to lower the SEI burden on IoT devices and the supporting network by investigating the impact of DL-based upsampling on SEI performance. In particular, the authors investigate using a Conditional Generative Adversarial Network (CGAN) to upsample signals that IoT devices collected at a lower sampling rate. Allowing the IoT device to collect signals at a lower sampling rate aligns with current IoT design practices [
174]. The authors use the CGAN to upsample IEEE 802.11a Wi-Fi preambles collected at sampling frequencies of 2.5 MHz, 5 MHz, or 10 MHz to a sampling frequency of 20 MHz and compare results generated from the upsampled signals to those generated using signals sampled at 20 MHz during collection. They also compare SEI results generated using signals upsampled by two conventional interpolation methods: piece-wise Linear Approximation Interpolation (LAI) and Cubic-Spline Interpolation (CuSI). Additionally, the CGAN-upsampled signals’ SEI results are compared to those generated using a CNN and the natively sampled signals (i.e., signals that are not upsampled before conducting SEI). The greatest improvement in average percent correct classification performance is achieved when the signals collected at a sampling rate of 5 MHz are upsampled to 20 MHz. The 5 MHz sampled signals result in an average percent correct classification performance between 84% and 95% for SNR values ranging from 9 dB to 30 dB, while the CGAN-upsampled versions of the 5 MHz signals result in an average percent correct classification performance between 92% and 98% over the same range of SNR values. However, the use of signals collected at a sampling frequency of 20 MHz results in better performance than those upsampled by the CGAN from 5 MHz to 20 MHz, especially at SNR values below 21 dB. Despite this, the work in [
173] provides a potential approach for lowering the resource demands (e.g., memory, power, computation) placed on individual IoT devices; however, further SEI performance improvements are needed, and the number of devices assessed should be increased.
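For reference, the CuSI baseline that [173] compares against can be sketched with an off-the-shelf cubic spline; the sampling rates below mirror the 5 MHz-to-20 MHz case discussed above, while the function name and the handling of the complex signal are our own.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def cubic_spline_upsample(iq, fs_in, fs_out):
    """CuSI baseline: upsample a complex signal from fs_in to fs_out.
    (The CGAN upsampler in [173] is learned; this is the conventional
    interpolation it is compared against.)"""
    t_in = np.arange(iq.size) / fs_in
    t_out = np.arange(int(iq.size * fs_out / fs_in)) / fs_out
    t_out = t_out[t_out <= t_in[-1]]           # stay inside the support
    cs_re = CubicSpline(t_in, iq.real)
    cs_im = CubicSpline(t_in, iq.imag)
    return cs_re(t_out) + 1j * cs_im(t_out)

# e.g., a 5 MHz-sampled preamble upsampled to 20 MHz:
# y20 = cubic_spline_upsample(y5, 5e6, 20e6)
```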
In [
175], the authors present an active Distinct Native Attribute (DNA) fingerprinting process capable of identifying legitimate and counterfeit Wireless Highway Addressable Remote Transducer (HART) adapters using sub-Nyquist sampled signals. The work in [
175] differs from that in [
173] in that the signals are never upsampled or interpolated. It also differs from the other papers cited in this survey in that the emitters under test are stimulated and DNA fingerprints are generated or learned from the resulting response(s). In other words, the collected responses are not necessarily produced during normal, unstimulated operations, and they are collected using a wired setup, although assessment is conducted under simulated, degrading SNR conditions. It is also worth noting that the active DNA fingerprinting process in [
175] serves a different purpose than passive SEI processes–including those cited in this survey–in that the work in [
175] focuses on identifying counterfeit Wireless HART emitters within the pre-deployment portion of their life cycle to ensure or maintain supply chain integrity. In contrast, passive SEI approaches are primarily focused on securing communications networks during the deployed/operating period of the emitters’ life cycles. The authors of [
175] perform DNA fingerprinting using a traditional Multiple Discriminant Analysis (MDA) classifier and a CNN-based classifier. For MDA-based DNA fingerprinting, the sub-Nyquist signals’ time domain representations of magnitude, phase, and frequency are calculated; each representation is subdivided into eighteen equal-length sub-regions; the statistics of variance, skewness, and kurtosis are calculated for each sub-region; the statistics of each sub-region are sequentially concatenated along with the statistics calculated across the entirety of each time domain representation; and, finally, all statistics from each time domain representation are concatenated to form a DNA fingerprint. For CNN-based DNA fingerprinting, the same time domain representations are calculated for each sub-Nyquist signal, but no further processing is conducted, which leaves feature learning and selection to the CNN. The authors of [
175] consider both one- and two-dimensional CNN-based DNA fingerprinting. The one-dimensional case uses only the time domain DNA fingerprints, while the two-dimensional case uses both the time and frequency domain representations of the sub-Nyquist signals. The authors include results generated from Nyquist-sampled signals to facilitate comparative assessment. In the end, the two-dimensional CNN-based DNA fingerprinting process proves superior, identifying legitimate Wireless HART emitters at an average percent correct classification rate of 91.6% or better at SNRs of -9 dB and higher. This same process correctly identifies counterfeit emitters at an average rate of 91.5% or higher at SNRs of -9 dB and higher. These results are achieved at a sub-Nyquist sampling rate of 1/205th of the Nyquist rate. The Wireless HART signals are collected at a sampling rate of 1 GHz, which yields a sub-Nyquist rate of roughly 4.88 MHz. When considering the sub-Nyquist sampling rate and the results presented by the authors of [
175], the presented sub-Nyquist DNA fingerprinting process appears to provide a viable method for alleviating or reducing the burden that the current Nyquist and higher sampling rate-based passive SEI approaches place on IoT devices and infrastructure.
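The statistical fingerprint construction described above can be sketched directly. Assuming complex baseband input, the routine below computes variance, skewness, and kurtosis over eighteen sub-regions plus the whole response for the magnitude, phase, and frequency representations; details such as normalization in [175] may differ.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def dna_fingerprint(iq, n_sub=18):
    """RF-DNA-style fingerprint: variance, skewness, and kurtosis of the
    instantaneous magnitude, phase, and frequency, computed over n_sub
    equal-length sub-regions plus the full response, then concatenated."""
    mag = np.abs(iq)
    phase = np.unwrap(np.angle(iq))
    freq = np.diff(phase)                      # instantaneous frequency
    feats = []
    for resp in (mag, phase, freq):
        regions = np.array_split(resp, n_sub) + [resp]  # sub-regions + whole
        for r in regions:
            feats.extend([np.var(r), skew(r), kurtosis(r)])
    return np.asarray(feats)                   # 3 stats x 19 regions x 3 = 171
```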
The authors of [
176] present a lightweight SEI approach built on the Gated and sliding Local self-attention transFormer (GLFormer). The authors’ approach is inspired by the successful use of the Transformer [
177], a self-attention mechanism that provides DL architectures the ability to capture interactions and persistent dependencies in sequential data such as time series or signals. GLFormer differs from the Transformer and many of its derivatives in that it requires fewer parameters and its computational complexity is linear rather than quadratic. GLFormer divides the input signals into shorter sequences or patches, embeds the patches into a token sequence via an embedding layer, and extracts SEI features using the combination of a gated attention unit and a sliding local self-attention mechanism. The authors collect signals emitted by fifty maritime vessels’ Automatic Identification Systems (AIS) and extract the signals’ transient and steady-state portions. The authors compare their GLFormer-based SEI approach to four alternative SEI approaches and three Transformer-based approaches. The four SEI approaches are Square Integral Bispectrum (SIB) with Support Vector Machines (SVM), Bi-LSTM, a modified version of ResNet [
164,
178], and InceptionTime [
179]. While the conventional Transformer from [
177] is used along with the Swin-Transformer [
180], and Convolutions to Vision Transformers (CvT) [
181] as alternatives to GLFormer. In terms of average percent correct classification performance, the authors’ GLFormer-based SEI approach proves superior to all alternative approaches with an accuracy of 96.3% when extracting SEI features from the transient portion of the AIS signals and is second only to Inception-based SEI when using the AIS signals’ steady-state portion (90.1% versus 89.4%). However, the real advantage of the GLFormer-based SEI approach is the computational complexity reduction it provides. The authors measure computational complexity in millions of FLOPs, and GLFormer requires the fewest: for transient-based SEI, GLFormer requires 33 MFLOPs, which is 25 MFLOPs fewer than the next lowest, the Swin-Transformer-based SEI approach. GLFormer requires 66 MFLOPs when performing SEI using the AIS signals’ steady-state portion, which is 1,735 MFLOPs fewer than the Inception-based SEI approach (the highest SEI performance) and 50 MFLOPs fewer than the Swin-Transformer-based approach. GLFormer’s computational complexity reduction makes it attractive for IoT deployments, especially if re-training or transfer learning can be performed via cloud, edge, or fog computing resources. Despite these advantages, the authors do not assess GLFormer-based SEI under degrading noise or channel conditions, though they state that future work will investigate GLFormer’s performance under degrading SNR conditions. Such a study is necessary to ensure GLFormer is a viable SEI approach.
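GLFormer’s gated attention unit and exact sliding scheme are more elaborate than can be reproduced here, but the following minimal, single-head sketch (no learned projections) illustrates why restricting self-attention to a local window yields cost linear, rather than quadratic, in sequence length.

```python
import numpy as np

def sliding_local_attention(x, window=8):
    """Each token attends only to a window of neighbors, so the cost is
    O(n * window * d) instead of the Transformer's O(n^2 * d)."""
    n, d = x.shape
    out = np.zeros_like(x)
    half = window // 2
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        scores = x[lo:hi] @ x[i] / np.sqrt(d)   # local similarity scores
        w = np.exp(scores - scores.max())       # numerically stable softmax
        w /= w.sum()
        out[i] = w @ x[lo:hi]                   # weighted local average
    return out
```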
In [
182], the authors present a Mahalanobis distance and Chi-squared distribution RF fingerprinting approach focused on providing SEI-based authentication within 5G IoT next-generation networks. The authors show that their approach requires lower training times and fewer resources than five other SEI approaches while achieving a higher average accuracy. The five alternate approaches span traditional and DL-based techniques: Multiple Discriminant Analysis/Maximum Likelihood (MDA/ML), SVM,
k-Nearest Neighbors (
kNN), LSTM, and a multi-sample CNN. Additionally, the authors test their approach on an open-source, 5G management and orchestration stack using cloud computing. The authors make use of a simulated signal set–generated using MATLAB
®’s Wireless Waveform Generator toolbox–comprised of up to 450 emitters with 100 signals per emitter. The simulated SEI features consist of CFO, amplitude mismatch on the IQ components, phase offset on the IQ components, clock skew, and DC offset. SEI performance is assessed using as few as three to as many as seven of these features; however, results for only CFO, amplitude mismatch, and phase offset are provided. The authors’ Mahalanobis distance and Chi-squared distribution RF fingerprinting approach achieves the highest average percent correct classification performance of 99.35% with the poorest performance of 95% being generated by the MDA/ML-based SEI approach. The authors also assess their SEI approach under degrading SNR from 30 dB down to 15 dB and as the number of emitters increases from 50 to 450. Overall, the average percent correct classification performance remains consistent as the number of emitters increases but is negatively impacted by lower SNR values. The average percent correct classification performance is between 92% and 94% at an SNR of 15 dB, which is on average 4% lower than the 20 dB results regardless of the number of emitters. It would have been beneficial to see individual emitter percent correct classification performance because it would have shown cases of confusion between multiple emitters. The authors assert that their approach is intended to authenticate legitimate emitters and detect illegitimate emitters but do not provide any results supporting the latter claim. This is important because the authors use CFO as an emitter identifying feature and CFO is vulnerable to exploitation by adversaries (see
Section 4), thus further research should investigate the viability of authenticating legitimate and detecting illegitimate emitters when the CFO is not used as an emitter identifying feature. The authors state that future work will consider alternate channel conditions and signals collected from real IoT emitters.
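The general distance-plus-threshold idea can be sketched as follows; this is a generic illustration of Mahalanobis-distance authentication with a Chi-squared acceptance threshold, not the authors’ exact pipeline, and the feature layout and significance level are assumptions.

```python
import numpy as np
from scipy.stats import chi2

def fit_emitter(features):
    """Per-emitter model: mean and (regularized) inverse covariance of its
    training feature vectors (e.g., CFO, IQ amplitude/phase offsets)."""
    mu = features.mean(axis=0)
    cov = np.cov(features, rowvar=False) + 1e-6 * np.eye(features.shape[1])
    return mu, np.linalg.inv(cov)

def authenticate(x, mu, cov_inv, alpha=0.01):
    """Accept x as the claimed emitter if its squared Mahalanobis distance
    falls below the Chi-squared threshold at level alpha (d^2 ~ chi2(k)
    for approximately Gaussian features of dimension k)."""
    d2 = (x - mu) @ cov_inv @ (x - mu)
    return d2 <= chi2.ppf(1.0 - alpha, df=x.size)
```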
The papers reviewed in this section employ a variety of techniques to reduce the computational and resource requirements associated with SEI in an attempt to make it a viable IoT security approach. Most of them focus on reducing training time and complexity; however, SEI training, at least the initial training, can be performed offline, where training times and computational resource constraints are less of a factor. Such an approach can be advantageous, with trained models updated either by repeating the offline training process as new signals or data become available or through transfer learning. A few of these papers did explore the use of edge, fog, and cloud computing, which is important as 5G and next-generation networks are deployed and such computing resources are integrated to facilitate network operation and management. Future SEI research needs to consider the challenges associated with transferring and integrating the trained SEI model(s) within IoT devices. For instance, will the SEI model reside on the individual edge IoT devices or at a central location such as an access point, a base station, or a purpose-built device tasked with monitoring a specific portion of the IoT infrastructure? Such considerations will impact how an SEI model is communicated to the employing device, especially in cases where the edge device lies dormant for long periods and communicates only when necessary to preserve or extend battery life. Communication of an SEI model will add network overhead, which adds complexity. Lastly, the SEI model-employing device(s) will need to store the trained model, which will not only consume limited onboard memory but also more than likely require the weights, biases, or other model values to be quantized. Quantization will impact the SEI model’s accuracy, so future SEI work will need to consider the extent of this impact and how to compensate for it.
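To make the quantization concern concrete, the following sketch applies uniform, symmetric post-training quantization to a hypothetical weight vector and reports the resulting round-off error; real deployments would use a scheme matched to the target hardware.

```python
import numpy as np

def quantize_weights(w, n_bits=8):
    """Uniform symmetric quantization of trained weights, illustrating
    the on-device storage/accuracy trade-off discussed above."""
    scale = np.abs(w).max() / (2 ** (n_bits - 1) - 1)
    q = np.round(w / scale).astype(np.int8 if n_bits <= 8 else np.int32)
    return q, scale                            # dequantize with q * scale

rng = np.random.default_rng(1)
w = rng.standard_normal(10_000).astype(np.float32)
q, s = quantize_weights(w, 8)
err = np.abs(w - q * s)
print(f"8-bit storage: 4x smaller, max round-off error {err.max():.4f}")
```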
7.2. Receiver-Agnostic SEI
Despite the amount of SEI research conducted over the past twenty-five-plus years, the attention paid to “receiver-agnostic” SEI has been limited to a handful of publications [
178,
183,
184,
185,
186,
187]. This is attributed to the fact that SEI research has primarily focused on investigating or developing novel signal representations, feature generation approaches, feature selection techniques, machine learning algorithms, communications standards, or a combination thereof, thus only a single receiver is employed and its unintentional features have little to no impact on the SEI process because they are consistent across all of its received signals. However, when considering large IoT deployments in which devices change base stations or access points due to mobility or entering, leaving, and re-entering the network, then the use of a single receiver is no longer feasible to ensure effective SEI-based security. Such scenarios create the need for an SEI process–built on a single model or trained NN–to be distributed throughout the IoT infrastructure to reduce complexity and simplify development, deployment, and updates in much the same way Tesla
® updates the Artificial Intelligence (AI) of its cars [
188,
189]. This creates a situation in which the receiver collecting signals for the deployed SEI process differs from that used to train it. Since each receiver’s RF front end is comprised of its own set of components, sub-systems, and systems, then each will impart its own set of unintentional features that differ from those of the receiver used to collect the training signal set. This mismatch between receivers’ features leads to poor SEI performance even when the only change is the use of another receiver [
183]; thus, effective SEI-based IoT security can benefit from a process that learns a set of signal features independent of the receiver used for signal collection. The result is commonly referred to as “receiver-agnostic” SEI. This section summarizes works that investigate receiver-agnostic SEI.
The earliest receiver-agnostic SEI investigation is presented in [
183]. The authors of [
183] adopt a calibration-based approach to achieve receiver-agnostic SEI. Calibration is facilitated by training a Residual Neural Network (RNN) using a set of signals collected by a “golden” receiver. The trained RNN is used to manipulate the receiver-specific features in another receiver’s collected signals to match those present in the golden receiver’s collected signals. The authors consider ten receivers that span a range of capabilities from a high-end signal and spectrum analyzer down to mid-range SDRs, which provides a broad assessment of the presented approach. Each receiver is used to collect signals transmitted by twenty-five ZigBee emitters. The authors compare their approach to the simple case of using an augmented signal set, constructed from signals collected by multiple receivers, to train the SEI process. The authors’ calibration-based approach achieves superior receiver-agnostic SEI performance compared to this simple case. The authors also assess their calibration-based approach when the signals are collected by one or more receivers and under degrading SNR conditions; the result is improved SEI performance when using multiple receivers. The authors only use the high-end spectrum analyzer as the golden receiver, so it is unclear whether similar receiver-agnostic SEI performance can be achieved when the golden receiver is a lower-end receiver (in terms of SWaP-C). Such an investigation can serve to determine the minimum cost needed for implementation by IoT device manufacturers or IoT infrastructure/network administrators. Lastly, the authors do not consider the presence of an RFF-mimicking adversary (see
Section 4). The RNN’s “re-coloring” nature may increase the similarity between an adversary’s mimicked signal features and those present in the original/targeted (a.k.a., the one being mimicked) emitter’s signals, thus increasing attack success.
The authors of [
184] investigate mitigation of receiver-specific unintentional signal features using a cooperative approach. The authors consider an SEI process tasked with identifying multiple emitters using signals collected by multiple receivers under the assumption that only a single, unknown emitter is operating while all receivers are collecting its signals. The authors decompose the received signals using Empirical Mode Decomposition (EMD), Variational Mode Decomposition (VMD), or Intrinsic Time-scale Decomposition (ITD), which is followed by calculation of the skewness and kurtosis of the decomposed signals. SEI is performed using Support Vector Machines (SVM), a Back Propagation (BP) Neural Network (NN), and a Long Short-Term Memory (LSTM) NN. The authors train an SVM for each known receiver (i.e., one SVM per receiver) and identify the emitters using a “maximum wins” voting process, while the single BP-NN and the LSTM are trained using the signals collected by all receivers. Thus, the latter two classifiers are trained to learn features that enable receiver-agnostic SEI. The LSTM using skewness and kurtosis calculated from the ITD-decomposed signals achieves the highest average accuracy. The authors do not provide individual emitter performance. Also, the authors simulate the emitters’ and receivers’ effects on ideal signals: the emitter effects considered are IQ imbalance, a spurious tone and carrier leakage, and the power amplifier’s non-linear distortion, while for the receiver the authors simulate phase noise, quantization noise, and sampling jitter. Although this approach is good for a proof-of-concept demonstration, it is of limited practicality in real-world IoT deployments because emitter features have been shown to change from one transmission to another during normal operation [
49]. Additionally, the authors did not investigate cases in which the signals collected by one or more receivers are not available to the SEI process due to conditions that would stimulate re-transmission, a common occurrence in wireless communications. IoT deployments can form Wireless Ad hoc NETworks (WANETs) and Mobile Ad hoc NETworks (MANETs), thus the number of emitters may be equal to or less than the number of receivers. How these topologies impact receiver-agnostic SEI remains an open research question.
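Assuming the modes have already been produced by EMD, VMD, or ITD (e.g., via a package such as PyEMD, which is not shown here), the feature construction and the “maximum wins” fusion described above reduce to a few lines.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def mode_statistics(modes):
    """Skewness/kurtosis feature vector from a signal's decomposed modes
    (one pair of statistics per EMD/VMD/ITD mode)."""
    return np.concatenate([[skew(m), kurtosis(m)] for m in modes])

def max_wins_vote(predictions):
    """'Maximum wins' fusion of the per-receiver SVM predictions."""
    vals, counts = np.unique(np.asarray(predictions), return_counts=True)
    return vals[np.argmax(counts)]
```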
In [
185], the authors present a Separated Batch Normalization-Deep Adversarial Neural Network (SepBN-DANN) for receiver-agnostic SEI. The authors consider the case in which the receiver used to collect the training signals differs from the one used to collect the testing signals; thus, the approach in [185] only considers a two-receiver case. The receivers are not specified but are stated to be of the same manufacturer and model. The two-receiver case serves as the impetus behind the authors’ use of SepBN because the distributions of the receiver-specific features are not identical across the two receivers’ signal sets. In each experiment, one receiver’s collected signals are used as the training set while the other receiver’s signals serve as the testing set. In addition to the use of two receivers, the authors collect the signals of twenty unspecified emitters over a three-day period. Although they collect signals over multiple days, it does not appear that the authors perform cross-collection (a.k.a., multi-day) SEI (see
Section 5.2). The fact that the authors do not provide emitter specifics makes it impossible to determine if the emitters are of the same manufacturer, model, or some combination of manufacturers and models. Such information would indicate the SEI difficulty level because serial number discrimination (a.k.a., all emitters are of the same manufacturer and model) remains the most challenging SEI case. Despite this, the authors show that their SepBN-DANN approach can achieve average SEI accuracies of 90% or higher for each of the three days and regardless of which receiver’s signals are used for training. The overall average SEI accuracy–computed across days and receiver used to collect the training signals–is 95.03% for SepBN-DANN versus 90.18% when using only the DANN and 68.22% when using a CNN. The authors do not provide individual emitter accuracy, the specifics of the signal collection setup (e.g., wired connection, wireless, in an anechoic chamber, etc.), or the SNR of the signals and resulting SEI performance. The use of two receivers can initially appear to be a limiting factor but not if the training receiver is considered the “golden” receiver and the testing receiver the deployed IoT device performing SEI, thus providing an opportunity for receiver-agnostic SEI in WANET and MANET configured IoT deployments.
The work in [
178] achieves receiver-agnostic SEI by compiling a large data set whose contents include the authorized emitters’ signals collected by all receivers. A total of ten LoRa nodes are used as authorized emitters and twenty SDRs as receivers. The set of receivers consists of two USRP N210s, two USRP B210s, two USRP B200s, two USRP B210 Minis, two ADALM Pluto SDRs, and nine RTL-SDR receivers, thus the receivers span a wide range of SWaP-C requirements. The authors’ approach to receiver-agnostic SEI uses an adversarial training architecture comprised of a feature extractor and two classifiers. One classifier is tasked with authorized emitter identification and the other with receiver classification. Each signal undergoes CFO correction and normalization to unit energy. Following CFO correction and energy normalization, data augmentation is conducted in accordance with [
190]. Data augmentation is applied to the training signals and achieved by passing each of them through a simulated multipath channel with Doppler effects to improve the feature extractor’s and both classifiers’ robustness to a range of conditions that can be present within an operating environment. Every training and testing signal is then represented using its spectrogram [
190]. The authors assess their receiver-agnostic SEI approach under various configurations and conditions, including the number of receivers represented in the training signal set, SNR, homogeneous and heterogeneous receiver configurations within and across the training and testing signal sets, and a six-emitter operational wireless network set up within an office environment without Line-of-Sight (LoS) between the emitters and any of the three receivers. The greatest receiver-agnostic SEI success is achieved using a collaborative approach in which the emitter identity predictions of multiple receivers are combined to form a “fused” prediction. The authors note that SEI accuracy increases as the number of predictions being fused increases. Despite the encouraging results presented in [
178], the authors only present average accuracy results, thus there is no way to know how well their approach identifies individual emitters. Overall, the construction of a large signal set that spans all receivers is not an issue so long as the receivers do not change (e.g., replaced) and are known before training the SEI process. However, this may not be practical in operational IoT infrastructures because every new receiver deployment would necessitate the collection of large signal data sets and computationally expensive retraining. Another observation is that the best receiver-agnostic SEI performance occurs when the training and testing receivers are of the same manufacturer and model (e.g., only N210s are used) or when the training receivers are of higher SWaP-C than those used for testing. An example of the latter is when training is conducted using signals collected by the N210s and B210s but the testing signals are collected using the RTL-SDR receivers. Lastly, it is unclear how the approach–presented by the authors of [
178]–would fare or be implemented in a WANET- or MANET-configured IoT deployment because such configurations face even stricter onboard limitations (e.g., memory, power, computation) and the receivers can, and more than likely would, change location(s) within a given portion of the network, thus changing the authorized emitter signals that can be collected by a given receiver at any point in time. As previously noted, the latter would require the collection of large signal data sets and the retraining of the affected feature extractors and classifiers.
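As a simplified sketch of the per-signal pre-processing described above, the routine below normalizes a signal to unit energy and forms a log-magnitude spectrogram; CFO correction and the multipath/Doppler augmentation of [190] are omitted, and the window length is an assumption.

```python
import numpy as np
from scipy.signal import spectrogram

def preprocess(iq, fs):
    """Unit-energy normalization followed by a spectrogram representation,
    mirroring part of the pre-processing chain described above."""
    iq = iq / np.sqrt(np.sum(np.abs(iq) ** 2))    # normalize to unit energy
    f, t, S = spectrogram(iq, fs=fs, nperseg=64, return_onesided=False)
    return 10 * np.log10(np.abs(S) + 1e-12)       # log-magnitude image
```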
The authors of [
186] investigate two approaches for achieving receiver-agnostic SEI, which they designate Statistical Distance-based Receiver Agnostic (SD-RXA) and GAN-RXA. Both aim to train a feature extractor that extracts receiver-agnostic features from the signals of a set of emitters regardless of the receiver used to collect them. SD-RXA is built on the assumption that the receiver- and emitter-specific features are uncorrelated due to the asymmetry between their features and random receiver-emitter pairing. However, the authors conclude that SD-RXA is difficult to work with because the statistical distance between the receiver and emitter feature distributions is nontrivial to handle, selecting an appropriate distance between two distributions is challenging, and, most importantly, the feature extractor’s effectiveness in achieving receiver-agnostic SEI cannot be evaluated during training. The GAN-inspired GAN-RXA approach overcomes these difficulties and achieves an average percent correct classification performance of 68% when the GAN-RXA feature extractor is trained using forty emitters and twenty-five receivers and tested using ten emitters and one receiver. This is relatively poor when considering the preponderance of SEI works that achieve average percent correct classification performances of 90% and higher. There may be reasons for this performance disparity. First, the emitters and receivers used for training are mutually exclusive of those used for testing (i.e., no emitter or receiver is used for both training and testing). This is important because the feature extractor’s learned features will be heavily influenced by the features present in the signals of the training emitters. Although the testing emitters can be of the same manufacturer and model (a.k.a., they only differ in serial number), there are still differences between each emitter’s signal features. The impact of these differences is not investigated by the authors of [
186]. Second, they use a portion of the publicly available data set provided by the authors of [
161], which includes signals collected over multiple days. This is important because SEI performance has been shown to suffer when training and testing are conducted using signals collected at different times (e.g., across multiple days) even when a single receiver is used (see
Section 5.2). This makes it difficult to determine if the poor performance is due to the GAN-RXA approach’s inability to learn receiver-agnostic SEI features, issues surrounding cross-collection (a.k.a., multi-day) SEI, or a combination of the two. Lastly, the authors do not remove CFO from the signals, and it is unclear whether signal energy is normalized to unity. The presence of either or both may bias SEI performance, positively or negatively, and may make the presented approach susceptible to adversary exploitation (see
Section 4).
The authors of [
187] aim to achieve receiver-agnostic SEI by treating the features of the different receivers as a data augmentation technique to train a simple Siamese model using unsupervised learning [
191]. The simple Siamese model is then optimized using Local Maximum Mean Discrepancy (LMMD) regularization [
192] to capture emitter-specific SEI features. The authors’ receiver-agnostic SEI approach achieves an average accuracy of 95% at an unspecified SNR. It is important to note that the authors primarily use simulated emitter and receiver SEI features, though they do evaluate their approach using two USRP X310s as the receivers and four USRP N210s as the emitters. This is a very limited case and not indicative of a realistic IoT deployment, which will be comprised of tens to hundreds of devices that are not constructed using high-end SDRs (costing roughly $3,000 to $15,500 per unit). It is also unclear which results correspond to the simulated emitters and receivers versus the SDR-based evaluation; many of the presented results show three or more receivers, which indicates the simulated emitters and receivers are used. The use of simulated SEI features is a valid approach, but pairing them with actual hardware-based results is essential to determining the value of any SEI-focused contribution.
A truly receiver-agnostic SEI process needs to be able to accept signals collected by receivers that were not present during the SEI training process. Also, all of these approaches assume the solution rests in the development of more sophisticated DL algorithms. Although the interest in DL is well-founded, it may not be the best approach; thus, future work needs to investigate alternatives that are less demanding in terms of computation, memory, time, and resources to make receiver-agnostic SEI better suited for IoT deployments.