1. Introduction
In recent decades, conventional car keys have been replaced by remote keys, or key fobs, for keyless ignition systems. A Passive Entry Passive Start (PEPS) module makes it easier to unlock the car and start the engine. Users first pair their car and key fob using a specific method. After that, when the driver approaches the car with the key fob, the door unlocks automatically. The driver gets into the car and presses the start button, and the engine starts, as shown in Figure 1. This is more convenient in several situations. Elderly drivers or arthritis patients may have difficulty gripping a car key and inserting it into the keyhole. People may also have their hands full, making it inconvenient to take the car key out of a pocket or bag. Keyless ignition systems help considerably in these scenarios. Wireless network technologies, such as NFC (Near Field Communication), UWB (Ultra-Wide Band), and BLE (Bluetooth Low Energy), have been adopted in the past few years [1,2,3,4]. They have the advantage of low power consumption, so key fobs using regular batteries can work for several months. These technologies also support secure data transmission and accurate positioning: the car can detect whether a paired key fob is within a certain distance of the car or inside it. However, new types of attack have also emerged [5]. A skilled car thief can capture the unique wireless signal from the key fob over the air and make a duplicate key without having the key in hand, which makes forgery easier than with conventional car keys. In another scenario, since the driver does not have to remove a key from the keyhole, he or she may sometimes leave the key fob in the car. Anyone can then start the engine and drive the car away.
Biometrics [6], such as fingerprint, voice, face, and iris, have been widely used for authentication on laptop computers and handheld devices. In this paper, we propose to use smart watches capable of sensing the electrocardiogram (ECG or EKG) [7] as smart car keys. When a driver wearing a smart watch walks close to the paired car, the PEPS module locates the smart watch, receives the driver’s ECG signals measured by the smart watch, and recognizes the driver. Once the driver is identified and authorized, the PEPS module can unlock the car automatically and make the engine ready to start.
ECG is a recording of the electrical activity of the heart. The 12-lead ECG, commonly used in hospitals, uses 10 electrodes placed on the patient’s limbs and on the surface of the chest to measure the electrical potential of the heart from twelve different angles. A normal cardiac cycle (or heartbeat) has three main components: the P wave, the QRS complex, and the T wave, as shown in Figure 2. Features such as the PR interval, ST segment, and QT interval are often used in ECG analysis. ECG is useful in diagnosing many heart diseases, such as cardiac arrhythmia [7].
Early research also shows the potential of ECG for personal identification [8,9,10] on some public ECG datasets. However, it is more difficult to measure and record ECG accurately with smart watches. Smart watches are small. They are worn on the wrist, where they may move quickly or slowly in different ways, and they sometimes do not fit well against the skin. The ECG measured by smart watches is therefore typically less precise than the 12-lead ECG measured in hospitals.
In this paper, we adopt deep learning to recognize the driver by ECG. ECG recordings measured from 15 subjects are first preprocessed, segmented into ECG cycles, and recognized by two deep learning models, Long Short-Term Memory (LSTM) [11] and Auto Encoder [12], with different training strategies. The experiment results show that LSTM models achieve the best accuracy for identity recognition (91%) when a single ECG cycle is used. However, training them takes at least 30 minutes. Training a personalized Auto Encoder model takes only 5 minutes. When 15 continuous ECG cycles, sensed in less than 20 seconds, are used, the personalized model achieves 100% identification accuracy.
2. Related Works
For biometric authentication, ECG has characteristics such as universality, uniqueness, permanence, and collectability. Specifically, universality indicates that every individual has ECG information; uniqueness indicates that each individual’s ECG information is distinctive; permanence indicates that some properties of an individual’s ECG information remain unaltered over a period of time, or are even unmodifiable; and collectability indicates that ECG information can be acquired conveniently. Two additional characteristics are required for using and trusting ECG-based personal identification. One is the speed of acquiring ECG from an individual and processing the information. The other is the accuracy of the ECG-based personal identification [10].
As with computer-aided heart disease diagnosis based on ECG signal analysis, recent research on ECG-based biometrics is moving toward deep neural networks, such as CNN (Convolutional Neural Network), RNN (Recurrent Neural Network), and LSTM (Long Short-Term Memory).
Salloum et al. [13] applied LSTM to ECG-based biometrics on two pre-filtered open datasets: ECG-ID [14] and MIT-BIH [15]. The recordings in the ECG-ID dataset were digitized at 500 Hz with 12-bit resolution over a nominal ±10 mV range, while the recordings in MIT-BIH were sampled at 360 Hz with 11-bit resolution. The Pan-Tompkins algorithm [16] was used to detect the QRS complexes. Once R peaks were detected, ECG segments (or cycles) were formed by concatenating the samples before and after each R peak. The waveform of each ECG segment was fed into an LSTM model, and the softmax function was used in the output layer. When 9 continuous ECG waveforms were processed, the LSTM model achieved nearly 100% accuracy.
Cabra et al. [17] evaluated different machine learning algorithms for ECG authentication and gender recognition on the ECG-ID [14] and CYBHi [18] datasets. The recordings in the CYBHi dataset were digitized at 1 kHz with 12-bit resolution. ECG recordings were first bandpass filtered and then segmented by the Pan-Tompkins algorithm [16]. Then, 10 features were extracted from each ECG segment. The experiment results showed that the accuracy of Support Vector Machine (SVM) in ECG authentication was 99.2%, and the accuracy of k-Nearest Neighbors (KNN) in gender recognition was 95.1%.
Lee et al. [10] adopted ensemble learning for ECG authentication on a non-public dataset, CU-ECG. The sample rate was 500 kHz. Deep learning models, such as LSTM and CNN, were stacked together into various structures and trained to find the best ensemble for ECG authentication. In the preprocessing, a low-pass filter was applied, and then ECG cycles were segmented with respect to R peaks. The signals were also transformed into 2D images using different time-frequency transforms, such as STFT (Short-time Fourier Transform), Scalogram, FSST (Fourier Synchrosqueezing Transform), and WSST (Wavelet Synchrosqueezed Transform). The images were fed into 2D-CNN models, such as VGG-19, ResNet-101, and GoogleNet. The experiment results showed that the LSTM-2D-CNN ensemble model outperformed the individual LSTM and 2D-CNN models.
Table 1 shows their results in ECG authentication on different datasets.
In addition, Ryan and Howes showed the relations between alcohol consumption, heart rate, and heart rate variability [19]. Using ECG and PPG (photoplethysmogram) monitoring, Wang et al. [20] used SVM to predict alcohol consumption and classify the subject into three classes: normal, light drinking, and drinking. As in the studies above, ECG and PPG data were bandpass filtered and segmented. Four features were extracted from each ECG segment, and another four features were extracted from each PPG segment. In total, eight features were considered by SVM, and the accuracy of drinking classification was 95%.
3. Method
In this paper, we adopt deep learning to recognize the driver based on the ECG measured by the smart watch the driver wears. LSTM [11] and Auto Encoder [12] are investigated.
To make the PEPS scenario more realistic, we use a commercial smart watch to measure the ECG of the 15 subjects recruited in our experiments. We note that the availability and affordability of the smart watch on the public market is a key concern in this study. As a result, the sample rate of the resultant ECG recordings is significantly lower than that of the datasets used in [13,17], and so is the precision. Nevertheless, speed and accuracy are required for using and trusting ECG-based personal identification [10].
Similar to the approaches used in [13,17,20], each ECG recording is first preprocessed: it is bandpass filtered and then segmented. ECG segments are divided into training and test sets for machine learning. LSTM and Auto Encoder models are trained using the training set, and the test set is fed into the resultant models for evaluation.
In the PEPS scenario, two model types are considered in this study. The first is one model for all drivers, and the second is one model for each driver. We note that the differences between the two model types may raise different management and security issues in practice. These issues are beyond the scope of this paper. In the experiments, different training strategies are tested to examine the speed and accuracy of the models.
3.1. ECG Measurement
In this study, the ASUS VivoWatch [21] is used to measure the subject’s ECG. It has built-in micro-electrical and optical sensors for ECG and PPG (photoplethysmogram), respectively, and can record Lead I-like single-channel ECG. The sample rate of the smart watch is only 60 Hz. Measurements are taken under two scenarios. In the first, measurements are taken several times at random moments in the subject’s daily life. In the second, measurements are taken twice, 15 and 30 minutes after drinking, because the human body begins to absorb alcohol 10 to 15 minutes after drinking, and the liver begins to metabolize it 25 to 30 minutes after drinking [19,20].
3.2. Data Preprocessing
There are three common types of noise in ECG. Muscle artifacts usually come from noise between the electrode and the skin, or are caused by muscular activity. Baseline wander is primarily caused by respiration, body movements, sweating, poor electrode contact, and skin-electrode impedance. Electromagnetic interference (EMI) can be caused by nearby electronic devices, high-voltage power sources, electromagnetic waves, metallic substances within the body, and static interference in dry environments. Muscle artifacts and baseline wander are low-frequency noise, whereas electromagnetic interference is high-frequency noise. Similar to [13,17,20], each ECG recording is bandpass filtered.
We use the Pan-Tompkins algorithm [16] to find R peaks in this study. In practice, however, since we do not have to consider every heartbeat in the PEPS scenario, R peaks can also be found simply by thresholding, as shown in Figure 3(a). A normal heart rate for adults is between 60 and 100 beats per minute. Assuming 80 beats per minute on average, there are 45 sample points per beat when the sample rate is 60 Hz. Here, we extract a fixed-length ECG segment for each R peak. The system looks 22 sample points backward and forward from the R peak and forms a fixed-length ECG segment, as shown in Figure 3(b).
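The threshold-based alternative and the fixed-length segmentation described above can be sketched as follows. This is only an illustrative sketch, not the implementation used in the experiments: the function names, the `threshold` value, the `refractory` spacing between accepted peaks, and the convention of taking 22 samples before the R peak and 22 samples starting at the R peak (44 samples in total) are all assumptions made for the example.

```python
from typing import List

SAMPLE_RATE_HZ = 60   # sample rate of the smart watch stated in the text
HALF_WINDOW = 22      # samples taken on each side of the R peak


def samples_per_beat(bpm: float, sample_rate_hz: float = SAMPLE_RATE_HZ) -> float:
    """Average number of samples in one heartbeat at the given heart rate."""
    return sample_rate_hz * 60.0 / bpm


def find_r_peaks(signal: List[float], threshold: float, refractory: int = 20) -> List[int]:
    """Return indices of local maxima that exceed `threshold`.

    `refractory` (a hypothetical parameter) is the minimum number of samples
    between two accepted peaks, so one heartbeat is not counted twice.
    """
    peaks: List[int] = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if signal[i] > threshold and is_local_max:
            if not peaks or i - peaks[-1] >= refractory:
                peaks.append(i)
    return peaks


def extract_segments(signal: List[float], peaks: List[int],
                     half_window: int = HALF_WINDOW) -> List[List[float]]:
    """Cut a window of 2 * half_window samples around each R peak.

    Peaks too close to either end of the recording are skipped, so every
    returned segment has the same length.
    """
    segments = []
    for p in peaks:
        start, end = p - half_window, p + half_window
        if start >= 0 and end <= len(signal):
            segments.append(signal[start:end])
    return segments
```

With these assumptions, `samples_per_beat(80)` reproduces the 45 samples per beat mentioned above, and each extracted segment has 44 samples with the R peak at index 22.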
3.3. Deep Learning Models
In this paper, two deep neural networks are investigated for ECG-based personal identification: LSTM [11] and Auto Encoder [12]. Two training strategies are tested. The first is to train one model for all users. The second is to train one model for each user.
3.3.1. Long Short-Term Memory (LSTM)
RNN (Recurrent Neural Network) is designed to handle sequence data. However, when the sequence is long, exploding and vanishing gradient problems may occur when training an RNN. LSTM extends RNN to mitigate these problems [11].
Figure 4 shows the architecture of LSTM, in which the input X_t of an LSTM cell is a vector at time t of a time sequence X = [X_1, X_2, …, X_n], h_t is the output vector of the LSTM cell, and C_t is the derived information kept in the memory of the LSTM cell. We note that h_t is also referred to as the hidden state of the cell. Three gates are used to deal with the gradient vanishing problem. The forget gate determines whether the past information from its predecessors should be erased from the cell; the input gate determines whether new information is kept in the cell; and the output gate determines which information is output. σ() and tanh() are the sigmoid and hyperbolic tangent functions, respectively. They serve as activation functions in the LSTM cell.
Equations (1)-(6) show how the LSTM cell updates. In addition to X_t, h_{t-1} and C_{t-1} are also taken into account, which are the hidden state and the past information kept in the memory of its immediate predecessor. W_C and B_C are the weight and bias for the new information calculation. W_f and B_f are the weight and bias of the forget gate; W_i and B_i are the weight and bias of the input gate; and W_o and B_o are the weight and bias of the output gate.
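For reference, the widely used formulation of the LSTM cell update, written with the symbols defined above, is given below. It is a reconstruction from the standard LSTM definition and should be read as such: the ordering of the six relations, the intermediate symbols f_t, i_t, o_t, and \tilde{C}_t, and the concatenation notation [h_{t-1}, X_t] are assumptions and may differ from the original numbering and notation.

```latex
\begin{align}
f_t &= \sigma\left(W_f \cdot [h_{t-1}, X_t] + B_f\right) \\
i_t &= \sigma\left(W_i \cdot [h_{t-1}, X_t] + B_i\right) \\
\tilde{C}_t &= \tanh\left(W_C \cdot [h_{t-1}, X_t] + B_C\right) \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \\
o_t &= \sigma\left(W_o \cdot [h_{t-1}, X_t] + B_o\right) \\
h_t &= o_t \odot \tanh\left(C_t\right)
\end{align}
```

Here f_t, i_t, and o_t are the activations of the forget, input, and output gates, \tilde{C}_t is the candidate new information, and \odot denotes element-wise multiplication.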
Figure 5 shows the proposed LSTM model for ECG-based personal identification. The input is the waveform of a single ECG segment, which is a one-dimensional time sequence with 44 data points. The dimensions of the input and output vectors of each LSTM cell are 1 and 64, respectively. The output of the LSTM layer is then fed into three fully-connected layers, which use ReLU (Rectified Linear Unit) activation functions. In the last layer, p is the number of subjects, and one additional class refers to no match. Finally, the output Y indicates whether the input ECG segment matches one of the subjects. In this study, unidirectional LSTM and the cross-entropy loss function are adopted.
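To illustrate how an output layer with p + 1 classes can be read, the following sketch applies a softmax to the raw scores of the last layer, evaluates the cross-entropy loss for one labeled example, and maps the most probable class to a subject index or to no match. The function names, the use of softmax at this point, and the choice of the last index as the no-match class are assumptions made for the example and are not taken from Figure 5.

```python
import math
from typing import List, Optional


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over the p + 1 output scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs: List[float], target: int) -> float:
    """Cross-entropy loss for one example whose true class index is `target`."""
    return -math.log(probs[target])


def predict_subject(logits: List[float]) -> Optional[int]:
    """Return a subject index in 0..p-1, or None when the no-match class wins.

    Assumption: the no-match class is stored at the last index.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return None if best == len(probs) - 1 else best
```

For example, with p = 3 subjects the model emits four scores; `predict_subject` returns 0, 1, or 2 when one of the subjects has the highest probability and `None` otherwise.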
3.3.2. Auto Encoder
Auto Encoder [12] is usually used in unsupervised learning. It can be a multi-layer artificial neural network consisting of an encoder and a decoder. The encoder transforms the input vector into a more efficient representation, referred to simply as a code, while the decoder tries to recreate the input vector from the code, as shown in Figure 6. In other words, an Auto Encoder learns two functions: an encoding function that transforms the input vector X into a code Z, and a decoding function that reconstructs X from Z. The learning objective is to minimize the difference between the input X and the output X’. In this setting, there is no need to annotate output labels for the training and test data.
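The learning objective can be written compactly as follows. This is a sketch under stated assumptions: the names f and g for the encoding and decoding functions, the parameter symbols \theta and \phi, and the index k over N training segments are introduced here for illustration, and the squared-error form is chosen to match the SSE loss used by the proposed model.

```latex
\begin{align}
Z &= f_{\theta}(X), \qquad X' = g_{\phi}(Z) \\
(\theta^{*}, \phi^{*}) &= \arg\min_{\theta,\,\phi} \sum_{k=1}^{N}
  \left\lVert X^{(k)} - g_{\phi}\!\left(f_{\theta}\!\left(X^{(k)}\right)\right) \right\rVert_2^{2}
\end{align}
```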
When the length of Z is shorter than that of X, the encoder performs dimensionality reduction, much as PCA (Principal Component Analysis) does. As a result, the output of the encoder is sometimes referred to as a feature vector, which can be used by a subsequent classifier.
Figure 7 shows the concept of Auto Encoder-based classification, where the encoder of a well-trained Auto Encoder is adopted as the feature extractor. Then, a classifier can be trained using supervised learning algorithms, such as SVM and neural networks.
Figure 8 shows the proposed Auto Encoder model for ECG-based personal identification. Again, the input is the waveform of a single ECG segment. After two LSTM layers, the encoder transforms the waveform into a feature vector of length 16. In the decoder, the RepeatVector layer duplicates the feature vector and feeds the copies into another two LSTM layers. Finally, the Dense layer reconstructs an output waveform, which should be very similar to the input. The loss function is SSE (the sum of squares due to error).
In this study, when the training strategy is one model for each user, ECG-based personal identification becomes a one-class classification problem. For Auto Encoder models, we can identify the user based on the difference between the input X and the output X’. Figure 9 shows the distribution of the SSE loss of the Auto Encoder model for one of the subjects recruited in the experiments.
In Figure 9, the SSE loss between the reconstructed waveform and the input ECG segment of the target subject is usually significantly smaller than the loss for other subjects. It is therefore easy to compute an SSE threshold that distinguishes the target subject from the others. In this case, the threshold is 4.
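The one-class decision described above can be sketched as follows. The function names are hypothetical, the reconstructed waveform is assumed to come from the subject’s personalized Auto Encoder, and the default threshold of 4 is the value reported for the single subject in Figure 9; other subjects would presumably need their own thresholds.

```python
from typing import Sequence


def sse(x: Sequence[float], x_rec: Sequence[float]) -> float:
    """Sum of squared errors between an ECG segment and its reconstruction."""
    if len(x) != len(x_rec):
        raise ValueError("segment and reconstruction must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, x_rec))


def is_target_subject(x: Sequence[float], x_rec: Sequence[float],
                      threshold: float = 4.0) -> bool:
    """Accept the segment when its reconstruction loss is below the threshold."""
    return sse(x, x_rec) < threshold
```

A segment that the personalized model reconstructs well yields a small loss and is accepted, while a segment from another person is expected to reconstruct poorly and be rejected.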
4. Experiment
4.1. ECG Measurement
In this study, 15 subjects are recruited. The ASUS VivoWatch [21] is used to measure their ECG. Each subject takes about 40 measurements, and each measurement lasts 15 seconds. After preprocessing, i.e., after the ECG data are bandpass filtered and segmented, there are 12,140 normal ECG segments and 2,080 abnormal ECG segments recorded after drinking. They are divided into 80% for training and 20% for testing.
4.2. Experiment Environment and Setup
All experiments are performed on a generic personal computer. The hardware and software for ECG processing are shown in
Table 2.
For the two training strategies, several experiments are performed. Each experiment includes ECG segments from a different number of subjects. For the first strategy, a single model is trained to recognize all subjects in the dataset. For the second strategy, a personalized model is trained for each subject. Since pairing the smart watch with the PEPS car must be easy, the training time must be short, and the recognition accuracy must be high enough.
For all models, the learning rate is 0.001, the batch size is 8, and the number of epochs is 150. The Adam optimizer is used.
4.3. Experiment Results
For the first strategy, a single model is trained to recognize all subjects in the dataset. The training time of both LSTM and Auto Encoder models increases as the number of subjects increases, whereas the accuracy of subject recognition decreases. LSTM models perform slightly better than Auto Encoder models. However, the training time of LSTM models is significantly longer than that of Auto Encoder models. The accuracies of all models are about 90%, which is lower than those reported in [13,17]. We note that the precision of the ECG recordings measured by the smart watch is typically lower than that of the 12-lead ECG measured in hospitals, and the sample rate of the ASUS VivoWatch [21] ECG is only 60 Hz.
For the second strategy, one personalized model is trained for each subject in the dataset. The training time of LSTM and Auto Encoder models is stable. Again, training Auto Encoder models is significantly faster than training LSTM models. Neither model type is consistently better in subject recognition: LSTM and Auto Encoder models each outperform the other in some experiments. The accuracies of these models are about 90%.
In the following experiment, for each subject, we build a personalized model.
Although the accuracies of these personalized models in subject recognition are about 90% when one ECG segment is used, the accuracies reach 100% when 15 continuous ECG segments are taken into consideration. Similar to [13], a collective decision based on continuous ECG segments significantly enhances the performance of these models, as shown in Figure 10.
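One simple way to form such a collective decision is a majority vote over the per-segment decisions, sketched below. The voting rule itself is an assumption, since the combination rule is not specified here; the function names are hypothetical, and `majority_accuracy` additionally assumes that per-segment decisions are statistically independent, which consecutive heartbeats may not satisfy.

```python
from math import comb
from typing import Sequence


def collective_decision(segment_accepts: Sequence[bool], window: int = 15) -> bool:
    """Majority vote over the most recent `window` per-segment decisions."""
    if len(segment_accepts) < window:
        raise ValueError("not enough ECG segments for a collective decision")
    recent = segment_accepts[-window:]
    return 2 * sum(recent) > window


def majority_accuracy(per_segment_accuracy: float, window: int = 15) -> float:
    """Probability that a strict majority of `window` independent decisions is correct."""
    p = per_segment_accuracy
    need = window // 2 + 1
    return sum(comb(window, k) * p ** k * (1 - p) ** (window - k)
               for k in range(need, window + 1))


def acquisition_seconds(n_beats: int, bpm: float) -> float:
    """Time needed to observe `n_beats` heartbeats at a given heart rate."""
    return n_beats * 60.0 / bpm
```

Under the independence assumption, a per-segment accuracy of 0.9 gives a majority-vote accuracy above 0.9999 for 15 segments, which is in line with the observed improvement. At heart rates of 60 to 100 beats per minute, 15 heartbeats take between 15 and 9 seconds to acquire.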
Table 4 summarizes the experiment results. The training time of a personalized Auto Encoder model is stable and significantly shorter than that of a personalized LSTM model. As a personalized Auto Encoder model is an unsupervised one-class recognizer, it can be built from only the driver’s ECG recording, and in a more controllable, simple, and secure environment. This greatly simplifies the whole procedure for managing ECG recordings. In the current setting, when a driver wants to pair his or her smart watch with a PEPS car, the driver is required to measure a 5-minute ECG recording and upload the recording to a specific secure cloud computing service. The cloud can build a personalized Auto Encoder model for the driver and the watch without any other material. Then, the model can be downloaded and installed in the car according to a secure protocol, such as the Fast Identity Online (FIDO) standard [22]. Finally, the recording and the model on the cloud can be erased completely. We will further study secure protocols for the PEPS system, such as FIDO.
When a driver walks close to the paired PEPS car, the smart watch is triggered to sense the ECG for 20 seconds and transmit it to the car for personal authentication. If the driver passes authentication, the car unlocks the door and is ready to start.
5. Conclusions
In this paper, we propose to use a smart watch as a novel car key for PEPS vehicles. Wireless network technologies, such as NFC, UWB, and BLE, support secure data transmission and accurate positioning of the smart watch when it is near or in the car. The new PEPS module can use the ECG signal measured by the driver’s smart watch to perform biometric authentication. The experiment results show that a personalized Auto Encoder model for subject recognition can be trained quickly for each driver. Although the precision of the ECG recording measured by the smart watch is typically lower than that of the 12-lead ECG used in hospitals, the accuracy of subject recognition reaches 100% when 15 continuous ECG segments are taken into consideration. As a personalized Auto Encoder model is an unsupervised one-class recognizer, it can be built from only the driver’s ECG recording, which could greatly simplify the management of ECG recordings.
Currently, we are investigating the Fast Identity Online (FIDO) standard [22] for password-free biometrics, such as the ECG-based approach proposed in this article. Since drunk driving is very dangerous, drunk driving detection is important. We are also conducting experiments on drunk driving detection based on the ECG signal measured by the driver’s smart watch.
Author Contributions
Conceptualization, R.-I Chang and J.-W. Lin; methodology R.-I Chang; software, T.-C. Lin; validation, J.-W. Lin; formal analysis, T.-C. Lin; investigation, R.-I Chang and J.-W. Lin; writing—original draft preparation, T.-C. Lin; writing—review and editing, J.-W. Lin; visualization, T.-C. Lin.
Funding
This research was partially supported by NAME OF FUNDER, grant number XXX.
Data Availability Statement
The ECG-ID [14], MIT-BIH Arrhythmia [15], and CYBHi [18] datasets are publicly available.
Conflicts of Interest
The authors declare no conflict of interest.
References
1. Cao, Y.; Lu, X.; Zhao, Z.; Ji, X.; Yang, J.; Pang, X. A comparative study of BLE-based fingerprint localization for vehicular application. 2018 Ubiquitous Positioning, Indoor Navigation and Location-Based Services, Wuhan, China, 22–23 March 2018.
2. Attaran, A.; Altunyurt, N.; Locke, J.; DeLong, A. NFC reader antenna design and considerations for automotive applications. 2021 Antenna Measurement Techniques Association Symposium, Daytona Beach, FL, USA, 24–29 October 2021.
3. Hong, J.; Shin, J.; Lee, J. Strategic management of next-generation connected life: Focusing on smart key and car–home connectivity. Technol. Forecast. Soc. Change 2016, 103, 11–20.
4. Bae, H.J.; Choi, L. Environment aware localization with BLE fingerprinting for the next generation PEPS system. 2019 IEEE Wireless Communications and Networking Conference, Marrakesh, Morocco, 15–18 April 2019.
5. Patel, J.; Das, M.L.; Nandi, S. On the security of remote key less entry for vehicles. 2018 IEEE International Conference on Advanced Networks and Telecommunications Systems, Indore, India, 16–19 December 2018.
6. Singla, D.; Verma, N. Machine and Deep learning in Biometric Authentication: A Review. 2023 International Conference on Advancement in Computation & Computer Technologies, Gharuan, India, 5–6 May 2023.
7. Luz, E.; Schwartz, W.R.; Cámara-Cháveza, G.; Menotti, D. ECG-based heartbeat classification for arrhythmia detection: A survey. Comput. Methods Programs Biomed. 2016, 127, 144–164.
8. Chan, A.D.C.; Hamdy, M.M.; Badre, A.; Badee, V. Wavelet distance measure for person identification using electrocardiograms. IEEE Trans. Instrum. Meas. 2008, 57, 248–253.
9. Prakash, A.J.; Patro, K.K.; Hammad, M.; Tadeusiewicz, R.; Pławiak, P. BAED: A secured biometric authentication system using ECG signal based on deep learning techniques. Biocybern. Biomed. Eng. 2022, 42, 1081–1093.
10. Lee, J.A.; Kwak, K.-C. Personal Identification Using an Ensemble Approach of 1D-LSTM and 2D-CNN with Electrocardiogram Signals. Appl. Sci. 2022, 12, 2692.
11. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
12. Hinton, G.E.; Zemel, R.S. Autoencoders, minimum description length and Helmholtz free energy. In Advances in Neural Information Processing Systems 6; pp. 3–10.
13. Salloum, R.; Kuo, C.-C.J. ECG-based biometrics using recurrent neural networks. 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, New Orleans, LA, USA, 5–9 March 2017.
14. ECG-ID Database. Available online: https://physionet.org/content/ecgiddb/1.0.0/.
15. MIT-BIH Arrhythmia Database. Available online: https://www.physionet.org/content/mitdb/1.0.0/.
16. Pan, J.; Tompkins, W.J. A Real-Time QRS Detection Algorithm. IEEE Trans. Biomed. Eng. 1985, BME-32, 230–236.
17. Cabra, J.-L.; Mendez, D.; Trujillo, L.C. Wide machine learning algorithms evaluation applied to ECG authentication and gender recognition. 2018 2nd International Conference on Biometric Engineering and Applications, Amsterdam, Netherlands, 16–18 May 2018.
18. Check Your Biosignals Here Initiative (CYBHi) Dataset. Available online: https://zenodo.org/record/2381823.
19. Ryan, J.M.; Howes, L.G. Relations between alcohol consumption, heart rate, and heart rate variability in men. Heart 2002, 88, 641–642.
20. Wang, W.-F.; Yang, C.-Y.; Wu, Y.-F. SVM-based classification method to identify alcohol consumption using ECG and PPG monitoring. Pers. Ubiquitous Comput. 2018, 22, 275–287.
21. ASUS VivoWatch. Available online: https://www.asus.com/mobile-handhelds/wearable-healthcare/asus-vivowatch/.
22. Feng, H.; Guan, J.; Li, H.; Pan, X.; Zhao, Z. FIDO gets verified: A formal analysis of the universal authentication framework protocol. IEEE Trans. Dependable Secur. Comput. 2022, 20, 4291–4310.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).