IoHT and Edge Computing Aided Pandemic-Compliant, Resilient and Perceptive Platform for Smart-City Human Habitat

Atlanta Choudhury ---; Kandarpa Kumar Sarma; Debashis Dev Misra; Koushik Guha; Jacopo Iannacci

doi:10.20944/preprints202409.1035.v1

Submitted:

12 September 2024

Posted:

13 September 2024

You are already at the latest version

Abstract

The COVID-19 pandemic has highlighted the need for a robust medical infrastructure and crisis management strategy as part of smart-city applications, with technology playing a crucial role. The Internet of Things (IoT) has emerged as a promising solution, leveraging sensor arrays, wireless communication networks, and artificial intelligence (AI)-driven decision-making. Advancements in Edge Computing (EC), Deep Learning (DL), and Deep Transfer Learning (DTL) have made IoT more effective in healthcare and pandemic-resilient infrastructure. When combined with medically oriented IoT setups, DL-architectures are suitable for integrating into pandemic-compliant medical infrastructure. The development of intelligent pandemic-compliant infrastructure requires combining IoT, edge and cloud computing, image processing, and AI tools to monitor adherence to social distancing norms, mask-wearing protocols, and contact tracing. The proliferation of 5G wireless communication has enabled ultra-wide broadband wireless communication, with high reliability and low latency, thereby enabling seamless medical support as part of smart-city applications. Such set-ups are designed to be ever-ready to deal with virus-triggered pandemic-like medical emergencies. This study presents the design of a pandemic-compliant mechanism leveraging IoT optimized for healthcare applications, edge- and cloud-computing frameworks, and a suite of DL-tools. The framework uses a composite attention-driven framework incorporating various DL-pre-trained models (DPTM) for protocol adherence and contact tracing. When connected to public networks, it can detect specific cyber-attacks. The results confirm the effectiveness of the proposed methodologies.

Keywords:

deep learning

;

deep transfer learning

;

contact tracing

;

facemask

;

social distancing

;

edge computing

;

cyber-attack

;

Federated Learning

Subject:

Engineering - Electrical and Electronic Engineering

1. Introduction

Globally, the COVID-19 pandemic situation has improved dramatically, with infection rates declining but not completely disappearing. Despite widespread immunization efforts, the emergence of new variants and continued infections indicate that COVID-19 may persist alongside other influenza-like illnesses as an ongoing challenge for humanity [1,2]. Experts emphasize the importance of curtailing virus transmission through human-to-human contact, including aerosol transmission, which is typical of many influenza-like diseases. Guidelines from the World Health Organization (WHO) recommend measures such as maintaining a minimum distance of six feet or two meters between individuals in crowded areas and wearing masks to prevent virus spread [1,61]. Given these recommendations, there is a need for the development of monitoring tools to reinforce mask-wearing standards in public places and uphold prescribed bio-safety measures [2]. Contact tracing remains a crucial component in preventing virus spread, with governments and agencies advocating for the use of privacy-preserving contact-tracing applications to collect relevant data while safeguarding user privacy [3]. Research indicates that contact tracing and associated policies are effective in preventing the spread of COVID-19 and similar contagious diseases [4]. By enabling infected individuals to voluntarily or otherwise quarantine themselves, these measures help mitigate the impact of community contamination, reducing overall suffering within colonies of human habitats and nations [5,6].

The combination of maintaining social distance, wearing masks, and implementing contact tracing have been recognized as crucial in preventing widespread infection of influenza-like viruses, including COVID-19 [7]. It is equally important to implement measures for monitoring adherence to social distancing and mask-wearing guidelines [8]. Similarly, robust contact tracking and database maintenance systems that allow frequent updates and accessibility are required to help reduce the transmission of viruses linked with influenza outbreaks. These elements are essential in designing pandemic-resilient infrastructure for human habitation.

Among the array of technological solutions, the Internet of Things (IoT) stands out, leveraging sensor packs, wireless communication networks, and AI-driven automated decision-making [9,10]. Furthermore, advancements in edge-computing [11], deep learning (DL) [12], and deep transfer learning (DTL) [13] have empowered IoT to play a decisive role in healthcare, particularly in combating outbreaks of influenza-like viruses such as COVID-19. The application of IoT in healthcare has given rise to Medical IoT (MIoT) or Internet of Healthcare Things (IoHT), offering significant benefits to stakeholders. These technologies facilitate effective tracking of agents (patients, medical professionals, and resources), automatic data sensing, authentication, and diagnostic decision support within health management systems [14].

Many published studies have concentrated on inventing and implementing technology for observing social distancing, mask-wearing, and contact tracing, and these activities are frequently carried out individually. Despite the extensive use of AI tools in these endeavors, there exists an opportunity to integrate IoT, edge, and cloud computing along with AI to create comprehensive frameworks for effectively monitoring adherence to protocols related to influenza viruses, including COVID-19 outbreaks. In this work, we present the design of a pandemic-compliant mechanism for monitoring adherence to influenza virus protocols, including social distancing norms and mask-wearing, as well as contact tracing. We utilize the Grove AI HAT and Raspberry Pi 4 combination, connecting multiple sensors to form edge nodes that collaborate with a cloud server hosting various ML/DL tools within a residential premise. In particular, we use DL models that have already been trained, like RESNET-50, MobileNetV2, and SocialdistancingNet-19, in a framework called Hybrid Multi-Head Attention-Aided Multi-Tasking Deep Network with Diffusion Stability (HMAAHDNDS), which is based on attention and DTL. This framework utilizes samples from open-source databases to monitor protocol adherence and conduct contact tracing. Additionally, it generates images from text descriptions of violators and is equipped to detect cyber-attacks when connected to public networks. We benchmark our approach using various AI models, including feed-forward (FF) Artificial Neural Networks (ANN), Convolutional Neural Networks (CNN), VGG-16, Random Forest (RF), Decision Tree (DT), ResNet-50, and MobileNetV2. The results obtained from this integrated system, referred to as the Internet of Intelligent Infrastructure Things (IoIIT), demonstrate the effectiveness of our proposed approach.

The specific contributions of the work are:

Design of a pandemic-compliant mechanism for effective monitoring of the adherence to bio-safety norms notified for influenza-type virus infections. The mechanism is configured to perform facemask-wearing detection, social-distance norm adherence, contact tracing etc.
Edge nodes designed using Grove AI Hat and Raspberry Pi4 trained, tested and synchronized with cloud resident DL tools are deployed as part of a residential complex.
Most importantly, several pre-trained DL models are configured to formulate a hybrid platform named HMAADNDS which effectively performs multiple tasks including the detection of cyber-attacks.

2. Proposed Pandemic Compliant Perceptive Infrastructure Design

Extensive analysis of existing literature highlights the existence of a significant opportunity for devising solutions that seamlessly integrate various aspects such as face mask detection, mask classification, selection of appropriate mask types, adherence to social distancing guidelines, and contact tracing in the context of influenza viruses, including COVID-19. An AI-driven approach holds the potential to automate these processes, manage accumulated data effectively, generate insightful analytics, ensure higher reliability, and establish a connection with confirmed cases of infection. These are essential features that should be incorporated in the design of a pandemic-resilient intelligent infrastructure as part of a human residential complex. Figure 1 illustrates an integrated methodology that combines adherence to social distancing norms, recognition of face masks, and contact tracing about influenza viruses, including COVID-19. Notably, our focus extends to developing a comprehensive strategy for monitoring protocol adherence during pandemics, encompassing social distancing norms and correct mask usage. This involves leveraging the capabilities of the Grove AI HAT and Raspberry Pi 4 combination, which connects multiple sensors to form edge nodes operating in tandem with a cloud server. These components are integral to a residential setting and employ a suite of machine learning (ML) and DL tools. To elucidate the operational dynamics of the proposed approach, a concise overview of the relevant background is provided below.

2.1. Setting Up IoT, Edge, and Cloud Computing Framework for Protocol Monitoring and Contact Tracing

The primary focus of our proposed solution centers around the development of an IoT and edge computing setup (depicted in Figure 2) tailored for monitoring compliance with mask-wearing and social distancing guidelines, alongside enabling efficient contact tracing, text-to-scene generation, and cyber-threat detection. The edge computing infrastructure comprises a Grove AI HAT [64], a Raspberry Pi 4 (model B), and an assortment of sensors. Each entry point in a residential premise (with four entrances) is equipped with an IoT pack containing a camera, an IR thermometer, an ultrasonic sensor, and a proximity sensor. The Grove AI HAT interfaces with the camera via a 24-pin serial communication connector, which is crucial for monitoring adherence to physical distancing norms. In case of violations, a loudspeaker alerts individuals, prompting necessary action. This forms the basis of the IoIIT. Similarly, an infrared thermometer measures individuals’ temperatures non-invasively as they enter the premises. The ultrasonic sensor identifies unauthorized entry through the gates. The edge node has sufficient AI-aided decision-making capability. The Raspberry Pi handles data recording, and processing, and maintains a tally of individuals passing through. Additionally, there is a provision for integrating a display to showcase temperature readings, social distancing metrics, compliance status, etc.

The ultrasonic and infrared temperature sensors, along with the Raspberry Pi, are interconnected, ultimately linking the Grove AI HAT and a cloud server via Wi-Fi. Remote control of the setup is facilitated through VNC Connect software [64]. Utilizing edge computing, calculations are performed during the processing stage for each sensor’s data about an individual scan. Subsequently, in a multitasking scenario, the edge computing platform selects a sensor node based on its location and updates the cloud server’s records accordingly, guided by the sensor data analysis. The sensor locations are fixed and are marked in terms of port connections in the Raspberry Pi and the subsequent feed to the cloud server.

The combination of IoT, AI, and Wi-Fi access completes the process of data collection, processing, local decision-making, and communication. Typically, both wired and wireless networks are utilized by various applications to directly interface with the edge node and the cloud server computing platform. Additionally, both the cloud servers and edge nodes record all node parameters, enabling them to calculate results based on a repository of comprehensive information. In our experiments, we deployed four such edge-computing nodes at the entrance of a residential complex, as illustrated in Figure 3. As individuals enter, each sensor takes readings and transmits them to the edge-computing node. Initial decisions regarding actions to be taken, including monitoring compliance with mask-wearing and social distancing norms, and executing contact tracing, are made at the edge node level. This information is shared with the cloud server and stored in a database. For each of these actions, the edge node provides decision support, aided by dedicated Deep Neural Networks (DNNs) deployed on the cloud server. These DNNs undergo end-to-end learning and are trained to handle tasks as outlined above. Details of each DNN type configured for these purposes are provided in subsequent sections. The Wi-Fi access is through data rates of 50 Mbps with an optical fiber backhaul of 1 Gbps. Latency is a crucial aspect of the setup. We tested the setup with three Wi-Fi configurations (50 Mbps, 40 Mbps, and 30 Mbps), 4G (10 Mbps), and 3G (1 Mbps), providing latency ranging from 3 milliseconds (at 50 Mbps) to 100 milliseconds. The configuration of the edge node and cloud servers is outlined in Table 1, while the different sensors/ components used in the system are detailed in Table 2.

You only Look once version 4 (YOLOV4) is used at the edge node level for local decision-making based on vision captured by the cameras especially to perform face-mask detection, classification type, proper wearing, and social distance monitoring. The YOLOV4 uses a pre-trained architecture, carries out one-shot training and detection, and is reliable for fast real-time applications. The range of the sensors and other related connectors are shown in Table 2. The cameras (Philips, HSP3500) for face-mask handling are effective upto 25 meters in daylight and 10 meters in low light.

The IoT framework designed for this purpose comprises four layers: the perception layer, transport layer, processing layer, and application layer. The perception layer consists of a camera, IR thermometer, ultrasonic sensor, and proximity sensor, each deployed at the entrance of the gates. The transport layer facilitates communication between the nodes and the cloud server, primarily utilizing Wi-Fi access (XLT240170), which has a maximum range of 150 meters with various backhaul support options. A detailed summary of the sensor types used is provided in Table 2. The processing layer is constituted by the proposed architecture outlined in , which undertakes tasks such as detecting facemasks, classifying masks, ensuring proper mask-wearing, monitoring social distancing norms, executing contact tracing, generating images of violators, and defending against cyber-attacks. The application layer is constituted by the edge nodes (Raspberry Pi + Grove AI-Hat) deployed with an assortment of sensors at the four entrances of a residential premise and connected to a cloud- server so as to ensure synchronized operations with decision support executed by the DL tools. This essentially formulates the IoIIT.

2.2. Proposed Hybrid Multi-Head Attention-Aided Multi-Tasking Deep Network with Diffusion Stability (HMAAHDNDS)

The central focus of our proposed work lies in the design of a Hybrid Multi-Head Attention-Aided Multi-Tasking Deep Network with Diffusion Stability (HMAAHDNDS) framework (Figure 4). At its core, the design incorporates several deep task processing modules which are DPTMs namely RESNET-50, MobileNetV2, and SocialdistancingNet-19, each trained independently to address tasks such as face-mask detection, proper face-mask recognition, face-mask classification, monitoring social distancing norms, and contact tracing. Additionally, the HMAAHDNDS, as a composite setup, is equipped to perform tasks such as image generation from text inputs and detection of cyber-attack types (Table 4). Figure 4 provides a detailed layout of this architecture. The incorporation of attention layers within the network facilitates optimized processing, outlier cancellation, and focused analysis of specific feature maps. Following this, several fully connected (FC) layers generate composite feature maps and text outputs, which are then applied into a generator network for the synthesizing the images of the violators, improper mask-wearing by specific individuals (samples), and creation of the contextual scenarios within the camera range connected to the IoT package. Furthermore, the setup is adept at detecting cyber-attacks, with each network core (DPTM) trained for this purpose. Details regarding the outcomes of cyber-attack detection will be discussed later. Operating on a cloud platform, each primary task, such as face-mask recognition, social distancing monitoring, and contact tracing, has been rigorously tested and is elaborated in the subsequent sections. The inputs are images and sensor feeds. Images are coloured and sensor feeds are taken at intervals of 5 seconds. A summary of the key activities and decision states of the proposed system is shown in Table 3 and Cyber-attacks associated with data driven approaches of healthcare system is shown in 4.

2.3. Face Mask Recognition Utilizing Multiple ML/DL Methods Deployed on Cloud Platforms

As discussed above, proper face mask wearing and its recognition are crucial in the design of pandemic-compliant facilities and related infrastructure. For the design of a system that recognizes a proper face mask, it is necessary to configure an appropriate computer vision-based system with relevant decision-support mechanisms. The fabrication of such a mechanism first in standalone form is key to the design of pandemic-compliant infrastructure. It prepares a DPTM to be part of the proposed HMAAHDNDS (discussed in Section 2.2). The proposed method of proper face mask recognition is summarised using the block diagram shown in Figure 1.

The data used for configuring the model is first separated for training, testing, and validation. This content has a set of images having regions of interest (RoI) marked and appropriately labeled. As already mentioned, we have used DPTMs, namely ResNet-50 and MobileNetV2, for performing the face mask detection to make the system suitable for real-world situations. The other mentioned discrimination methods are used as benchmark approaches (Table 5). The details of the different parameters of the benchmark methods and DPTMs used for face mask detection are shown in Table 5.

Here, the term "batch size" refers to the number of samples processed in a single iteration of a ML/DL network. Instead of feeding the entire dataset into the model at once, which can be computationally expensive and memory-intensive, the dataset is divided into smaller subsets called batches. Further, the layers in the front section of the input block are used to derive feature maps from the available spatial resolution of the images captured by the cameras.

Along with face mask detection, proper wearing of a face mask is also crucial. We have performed some experiments to ascertain the correctness of wearing of the face mask. The methods and parameters used for this purpose are provided in Table 6.

The working of the system is based on certain steps. Kaggle data sets (https://www.kaggle.com/mrviswamitrakaushik/facedatahybrid) are adopted for the work. The data set provides both faces with and without face masks. Around 7000 images overall, of which 3725 depict people wearing face masks and 3828 depict people without them, are gathered into one set. Additionally, the images come in a variety of sizes, colours, backdrops, brightness levels, and contrast levels to suit any circumstance. There are two groupings made up of these datasets. The training set, which has roughly 6042 photos (80%), is one of them. Both the validation and the testing data sets contain 755 photos (20%). The outputs of the pre-processing portions have been applied to train the range of learning-based systems mentioned in Table 5. We have used dropout rates as part of multi-modal factorized bilinear pooling (MFBP) with ResNet-50 [46] and MobileNet [47] to avoid over-fitting. Then, to assess the learning accomplished and to compare the performances before and after training, we employ the mean square error (MSE) as the cost factor and vary the learning rates using the adaptive moment estimation optimizer. Figure 1 shows the position of the facemask detection in the overall scheme of the proposed design. The related parameters and specifications are shown in Table 5 and Table 6.

Similarly, we have performed a series of experiments to ascertain the classification of face masks used by people during pandemic situations in public places. After executing a series of trials, it is found that the ResNet-50 performs best in terms of accuracy and processing time which becomes the main cloud-resident tool for carrying our decision support related to face masks.

2.4. Monitoring Compliance with Social Distancing Norms (SDN)

To prevent the spread of infectious diseases, such as those transmitted through droplets and micro-droplets, adherence to social distancing measures is crucial, as recommended by the WHO [61]. Governments and concerned agencies require mechanisms to continuously monitor compliance with social distancing norms. Automated monitoring frameworks based on pattern recognition, object detection, and AI tools have been considered for this purpose. Some popular methods include region-based and unified-based algorithms [38]. The Region-Based Convolutional Neural Network (RCNN), based on recurrent clustering, is effective but relatively slower due to the need to run procedures for each generated region [60]. An alternative, the Fast-RCNN, enhances speed by processing all regions simultaneously and utilizing a regional proposal network (RPN) instead of targeted search [60,67]. Improved performance is expected from several popular DPTMs, including MobileNetV2, ResNet-50, and Social DistancingNet-19 [67]. The logic for detecting compliance with social distancing norms is depicted in Figure 8, where inputs include images and video streams. Relevant parameters and configuration details are provided in Table 7.

Utilizing the YoLo4 network, renowned for its capability of enabling real-time detection and enhancing the accuracy of identifying small objects, is a specific technology employed in a unified approach [50]. Another efficient method, the Single Shot Multibox Detector (SSD), was also suggested by the authors in [36]. These techniques utilize Convolutional Neural Networks (CNNs) to construct feature maps from images for object detection. By employing surveillance cameras installed in public areas, YOLO or SSD techniques can be utilized to detect individuals. A system linked to these cameras continuously monitors the live image stream, enabling the identification of individuals who violate social distancing regulations. As mentioned above, the YOLO4 network is particularly relevant to the edge nodes constituted by the Raspberry Pi 4 (Table 1) and is deployed in the four entrances to monitor local activity.

As already mentioned, the overall system includes a social distance algorithm. As a result, once the server recognizes an infected person through a cough signal and a sneeze action, it notifies the decision support system, edge computing nodes, local populace and triggers certain procedures linked with the social distancing guidelines. The cough and sneeze actions of a specific person are learned by the system and if the person enters the gate of the residential premises, it records the necessary vitals through the installed sensors. This process helps in placing the person under observation.

In Figure 1, the logical approach to the design and development of the smart social distancing system is summarized. The social distance observance is described by locating people wearing masks and then finding the physical separation between them.

The system initially focuses on detecting masks worn by individuals. Identifying a mask on a face serves as a confirmation of a person’s presence. Subsequently, the distance between two individuals wearing masks is calculated through pixel-to-spatial separation conversion. Facial recognition is then employed to ascertain the identity of potential violators. A record of these violations is generated, and notifications are sent to relevant stakeholders, including the individuals involved. This mechanism is crucial for facilitating contact tracing. After conducting a series of standalone trials, it was determined that SocialDistancingNet-19 is the most suitable model for monitoring compliance with social distancing norms. It is chosen as the cloud-resident tool to oversee adherence to these norms. In this capacity, it functions as a DPTM and undergoes further end-to-end learning using online samples received from edge nodes. At the edge node level, the Yolo4 network provides initial decisions while the DPTM extracts, stores, and shares initial features captured by video cameras. The higher-level inputs are then used by the cloud-resident SocialDistancingNet-19 to continuously train along with the inputs received from all the remaining feed points (as depicted in Figure 4).

2.5. Implementing Contact Tracing in Public Spaces and Residential Communities

During the peak period (2020–2022) of the severe acute respiratory syndrome caused by SARS-CoV-2, traditional contact tracing methods became overwhelmed. Public health organizations resorted to manual contact tracing by interviewing infected individuals to identify potential contacts. Exposed contacts were advised to seek medical attention, undergo treatment, and self-quarantine if necessary. While this strategy helped reduce transmission, it faced challenges and opportunities for technological interventions to enhance reliability. Digital contact tracing emerged as a solution to scalability, notification delays, recall errors, and identifying contacts in public spaces. It utilizes electronic data to trace exposures to illnesses. Most COVID-19 contact-tracing apps estimate proximity between smartphones using Bluetooth signal strength and determine exposure status based on proximity duration and distance from an infected individual. However, privacy and security concerns prompted debates over the architecture used for data gathering in tracking apps. Three main system architectures—centralized, decentralized, and hybrid—have been employed for COVID-19 tracing applications, with the final architecture often blending elements of both centralized and decentralized approaches. The proposed contact tracing technique’s block diagram is depicted in Figure 5. Seven alternative architectures (VGG-16, ResNet-50, MobileNetV2, SVM, DT, KNN, and LR) have been tested to determine the most reliable network for executing contact tracing within a cloud-resident framework integrated with intelligent edge nodes. In this approach, potential contacts are identified based on received inputs, initiating forward tracing. Infected individuals are isolated and treated, while symptom recognition during quarantine ensures continued care. Contacts of isolated cases are identified, and infected individuals are monitored for symptom severity to determine appropriate treatment levels. Quarantine, testing, and medication for contacts of infected individuals occur concurrently, with updates continuously logged in the database. Table 8 outlines the settings and configurations for the various ML and DL architectures utilized in the contact tracing efforts.

2.6. Text to Image Generation

Generating images from text is crucial for visually representing data obtained from continuous monitoring data and deriving proper interpretation of the situations. It aids in raising public awareness by illustrating challenging circumstances and provides training samples for enhancing network efficiency. To accomplish text-to-image generation, the input text is converted into a meaningful representation, such as a feature vector, which is then utilized to produce an image that aligns with the description.

The system initially receives text input and conducts tokenization to extract key textual features. Subsequently, images and text data from the dataset are combined before proceeding to the training phase. Here, the generator and discriminator play pivotal roles: the generator generates images while the discriminator classifies them. This iterative process continues until the generator can proficiently create images based on the input data. Ultimately, a trained model capable of producing visuals from textual input is obtained [80]. The flow logic and constituent blocks are illustrated in detail in Figure 4.

The Long Short-Term Memory (LSTM) layer is a crucial component within the block, serving for sequence modeling and context retention. Within the system, the LSTM layer takes in text embeddings, which are then processed through a dense layer and concatenated with a latent vector for generating a response. To enrich the dataset, a normal distribution is employed for sampling additional data from the latent vector. The output of the dense layers undergoes reshaping and concatenation with the activation volumes of convolution blocks in the generator. Passing the embedded text through various dense layers with outputs of different sizes achieves this. The generator includes an up-sampling layer after a convolution block housing two

3 \times 3

kernel-sized convolution layers. Additionally, a

1 \times 1

convolutional layer with three kernels is applied to the activation volumes of the generator. Pixel-wise normalization is utilized within the generator. In the discriminator, after an average pooling layer, the convolution block comprises two convolution layers with

3 \times 3

kernel sizes. Finally, the resulting output passes through a linear activation block, which determines whether the image is authentic or false.

2.7. Enhancing Cyber-Attack Detection

The risk of cyber-attacks looms large whenever a computer-driven system is connected to a broader network with partial, temporary or complete access to the internet. In this context, the composite system’s cyber-attack detection capability is developed to continuously monitor data traffic, discerning unauthorized access or activities within a networked environment. Cyber-attacks can manifest in two primary phases—during training and testing. Several common cyber-attacks targeting IoT-based setups include phishing, log access breaches, social engineering attacks, and brute force attacks aimed at passwords [80].

The provision for the detection of cyber-attacks has been incorporated into the system because this is an ICT-driven framework. Such a framework is bound to be affected at some point in time by a cyber-intruder which may be external or internal. Moreover, there is always a possibility that mobile device access to information generated by the system should be added as an extension of the work. In such a situation, cyber-threat shall be a reality. Further, internal cyber-threat will always be present.

Database creation is necessary as the records of sensor readouts and decisions generated for individuals will continuously expand. As a result, cyber-threats will always be present. The system can be operated as a standalone system and may not be connected to mobile devices/apps. But then updates and information dissimination will not be automatic and real-time. In the present form, the experiments are only intended to validate the system’s capability against generalized cyber-threats.

There are different types of cyber-threat detection methods. Here, ResNet-50 and MobilenetV2 are used as classifiers to detect cyber-attacks. However, Linear Regression (LR) and VGG-16 classifiers are taken as benchmark methods for similar purposes to ascertain the capabilities of DPTMs. For this, a total of 30,190 (CSC2010HTTP) samples (https://www.kaggle.com/akashkr/phishing-website-dataset) are taken out of which 70% are used for training and 30% are used for testing. These samples are balanced. Social distancing monitoring and cyber-attacks can be done through a mobile app. Residential blocks are required to have internet connectivity. Mobile devices will be connected through the internet and records will be stored on the cloud. So these combinations and features of the networks make the cyber- threat detection a necessary element of the proposed method. Cyber-attacks that are reported to be associated with data-driven approaches of healthcare monitoring systems are shown in Table 4.

3. Results and Discussion

In this section, we discuss the results obtained using different approaches adopted for the detection of facemask wear, the observance of social distance norms, and contact tracing. Further, we report the impact analysis of the proposed work with regard to its deployment in a residential premise. The validation of the system to perform the discrimination of cyber-attacks has also been included. We report the results in a sequence, starting with the work related to facemask detection (Section 3.1), the correct or incorrect wearing of a facemask, and the classification of a facemask, the observance of social distance (Section 3.2), and contact tracing (Section 3.3).

3.1. Facemask Detection Experimental Findings

The cornerstone of an effective design for facemask detection hinges on various subsystems within the system, with the reliability of the ML/DL-aided decision support system being paramount. In our approach, we have leveraged a spectrum of ML and DL techniques to identify the optimal method for dependable facemask detection. Our methodology also encompasses capabilities that are used for distinguishing between correct and incorrect facemask usage. Training and testing have been conducted using a dataset comprising over 6000 samples across multiple iterations. Approximately 80 percent of the data has been allocated for training purposes, with the remaining data reserved for validation and testing. During real-time testing, samples gathered from field visits to potential deployment locations have been utilized. Images of individuals both wearing and not wearing masks, as depicted in Figure 6a and Figure 6b, respectively, have been instrumental in training and testing DL models for facemask detection. Additionally, Figure 6c showcases datasets illustrating improper facemask usage, which serve as valuable resources for refining and enhancing facemask detection algorithms and models, constituting a crucial aspect of our ongoing efforts. These datasets contribute to the robustness of the DL models.

Medical facemasks are commonly crafted from materials like cloth, non-woven fabric, or disposable paper. These can be either disposable or reusable, serving as essential personal protective equipment (PPE) to safeguard both patients and healthcare workers against the transmission of infectious diseases. Consequently, medical face masks find extensive utilization in healthcare settings such as hospitals and clinics. Examples of datasets featuring normal masks and medical facemasks are depicted in Figure 7a and Figure 7b, respectively.

Table 9 shows the summary results obtained from a class of methods trained to detect correct facemask wear. We clearly observe that VGG-16, conventional CNN, ResNet-50, and MobileNetV2 provide better reliability.

In our experiments concerning the detection of proper facemask wear through computer vision and learning-based techniques, we noted that MobileNetV2 outperforms ResNet-50 in terms of accuracy, as evidenced by the summarized results presented in Table 10. Notably, Table 9 showcases a reported 100% accuracy by [15]. Upon implementing the approach detailed in [16], we confirmed achieving 100% accuracy with small datasets. However, when working with larger datasets, this accuracy significantly decreases.

Additional Experiments related to Facemask Type Detection: Table 11 illustrates the outcomes obtained from ResNet-50 and MobileNetV2. Further, it showcases the ResNet-50’s superior accuracy over MobileNetV2. We carried out the implementation of the system as detailed in [47], with parameter adjustments and rigorous training and achieved the accuracy reported in Table 11.

Comparison of ResNet-50 Results with Previous Studies: Our study aligns with prior research, utilizing identical datasets, methodologies, and testing procedures, as depicted in Table 12. Our ResNet-50 based model exhibits a 4% decrease in performance due to extensive data diversity, inclusion of on-field samples, background, interaction, sensor quality fluctuations, and variations in illumination. Anticipated enhancements in results are foreseeable with increased exposure to similar samples, especially those with complex backgrounds. Table 12 exclusively features studies employing ResNet-50, highlighting its widespread utilization in such applications.

We conducted comprehensive model implementations across various datasets and diversity settings, yielding comparable results. [Figure 8,Figure 9,Figure 10] illustrate model accuracies, and training time losses for ANN, CNN, VGG-16, MobileNetV2, and ResNet-50, alongside classification performance metrics for ResNet-50 and MobileNetV2.

3.2. Evaluation of Social Distancing Measures

The goal of social distancing is to slow down the transmission of the disease and reduce the burden on the healthcare system. Using day and night feeds from the cameras effective performance has been obtained upto 20 meters. Accuracy and F1-score are shown in Table 13.

Table 13 summarizes the average performance for person detection and social distancing using specific methods. The model training accuracies for ResNet-50, MobileNetV2, and SocialDistancingNet-19 are compared, with SocialDistancingNet-19 achieving the highest accuracy at 94.5%. Table 7 displays the parameters of different DL methods employed for social distancing, reflecting the combinations and tuned states of the models utilized in this study. Ultimately, SocialDistancingNet-19 emerged as the most suitable option and has been deployed as a cloud-resident network for social distance decision support.

3.3. Contact Tracing: Tracking and Managing Exposure Risks

The objective of contact tracing is to mitigate virus transmission, avert outbreaks, and decelerate disease spread. To facilitate contact tracing, the VGG-16, MobileNetV2, and ResNet-50 networks undergo training using pre-existing data. Table 5 outlines the configurations of these networks employed for this purpose. Current data are continually updated with previously known details, enriching the training process. Among these models, ResNet-50 achieves an accuracy of 96.1%. Table 14 illustrates a comparison of results attained from proposed methods, encompassing VGG-16, MobileNetV2, ResNet-50, and several other previously reported methods.

3.4. Impact Assessment of the Proposed System Architecture

Table 15 provides a comprehensive overview of the optimal performance of the proposed multi-node edge and cloud computing setup tailored for residential premises. It encompasses the analysis of deviations in sensor readings, decision support for facemask recognition, adherence to social distancing norms, contact tracing capabilities, and latency response across various load conditions at both node and system levels. These deviations stem from the diverse array of tasks managed by each node, the continual variation in sample types, the operation of different network types, and synchronization challenges. Despite these complexities, the system exhibits reliability. However, the primary bottleneck emerges from Wi-Fi access limitations and the performance of pickup devices. Notably, varying error levels occur at both node and server levels. Learning and decision-making processes are shared between the nodes and the server. Errors may occur when an individual, who typically utilizes a specific gate on one day, appears at a different gate. In such instances, the server may make a true negative decision, which gradually stabilizes over time toward the desired outcome.

Table 17 presents a comprehensive overview of the effectiveness of various strategies implemented in the proposed system, evaluated in residential settings over a six-month period. The post-validation and deployment phases demonstrate response speeds (in milliseconds) suitable for real-world applications, despite extensive training and testing duration (in seconds). Similarly, a detailed analysis has been conducted regarding social distance measurements using the SOCIALDISTANCINGNET-19. The system’s results exhibit slight discrepancies due to inaccuracies in distance calibration and spatial calculations derived from pixel values. The DL method exhibits a maximum error margin of 12% between projected and actual distances. The devastation caused by the COVID-19 pandemic between 2020 and 2022 underscores the importance of mitigating the spread of contagious diseases by minimizing interpersonal contact. Hence, the development of intelligent infrastructure capable of continuous monitoring of social distance compliance, proper facemask usage, and contact tracing is imperative, particularly in high-risk areas.

The proposed method demonstrates high accuracy rates for facemask detection, ranging from 90% to 96% across different DL models. This facilitates reliable surveillance in crowded environments and provides guidance on correct mask usage. Additionally, the system employs human detection techniques, achieving accuracies of 87% with ResNet-50 and 90% with MobileNetV2, enhancing social distance monitoring reliability. Moreover, the contact tracing accuracy rates using VGG-16, MobileNetV2, and ResNet-50 are reported as 92%, 93%, and 96% respectively, ensuring robustness and dependability. The collective effectiveness of each component underscores the system’s capability to facilitate continuous COVID-related behavioural monitoring as part of intelligent infrastructure compliant with pandemic regulations. Given the challenges and limitations associated with manual intervention, the proposed system emerges as pivotal in the creation of intelligent human environments, offering a solution that is efficient, safe, and accurate.

3.5. Validation of Component Block Capabilities in Detecting Cyber-attacks within the Proposed System

Confusion matrix of various classifiers tested for malicious and normal traffic is shown in Table 18.

A total number of 30,190 samples are taken for detecting cyber-attacks in the proposed system. The accuracy rates for cyber-attack using MobileNetV2, and ResNet-50 are 94.89% and 90.07%, respectively. However, VGG-16 is used as a benchmark to train this model and the summary performance is shown in Table 19.

The ability of the system to provide image output from text input is shown in Figure 11. For four different text descriptions, how the composite system is able to generate the corresponding representative images is shown in Figure 11. The text embedding that contains the decision states obtained from the DPTMs are shown in Table 20. These text descriptions assist the text-to-image transformer in using the principles of diffusion stability to provide synthetic visual representations related to bio-safety norms as part of a pandemic-compliant, resilient and perceptive framework. The key attributes of the proposed method compared to previously reported works have been summarized in Table 21. The advantage of the proposed method is obvious.

4. Conclusion

We present a system architecture comprising edge computing nodes, an IoHT system equipped with sensor packs, and a composite attention-driven framework integrating various DL pre-trained models (DPTM) such as RESNET-50, MobileNetV2, and SocialdistancingNet-19. This framework forms an IoIIT and executes monitoring protocol adherence, including social distancing norms and mask-wearing, as well as for contact tracing utilizing samples from open-source databases. Additionally, the composite framework generates scene images, and context, and identifies violators with text descriptions. Furthermore, it detects cyber-attacks when connected to public networks. Configured and trained to effectively address real-world challenges within residential premises, the constituent systems undergo rigorous testing. Notably, the accuracy rates for detecting mask presence and proper usage are 92% (ResNet-50) and 97.2% (MobileNetV2). However, facial recognition encounters difficulties when faces are inclined at certain angles, regardless of mask usage, necessitating the resolution of this issue. To discern between fabric masks and N-95 masks, ResNet-50 and MobileNetV2 are employed. Parallelism enables the simultaneous execution of mask categorization and social distancing calculations. Despite the system’s robustness, insufficient and non-diverse datasets limit its adaptability. For instance, inadequate representation of beards may lead to occasional confusion with masks. Access to more comprehensive datasets could facilitate the training of more potent models. Contact tracing for COVID-19 and influenza involves identifying individuals with significant contact with confirmed virus carriers and monitoring them for symptoms. This data aids in virus transmission prevention and outbreak management. Public health experts leverage contact tracing results to understand virus spread within communities and inform pandemic response strategies. System performance is constrained by network latency, varying with traffic load. Enhancing bandwidth and physical connectivity effectively mitigates this limitation.

Author Contributions

For research articles with several authors, a short paragraph specifying their individual contributions must be provided. The following statements should be used “Conceptualization, Atlanta Choudhury and Kandarpa Kumar Sarma; methodology, Atlanta Choudhury; software, Atlanta Choudhury; validation, Atlanta Choudhury, Kandarpa Kumar Sarma.; formal analysis, Kandarpa Kumar Sarma; data curation, X.X.; writing—original draft preparation, Atlanta Choudhury.; writing—review and editing, Atlanta Choudhury and Kandarpa Kumar Sarma; visualization, Kandarpa Kumar Sarma.; supervision, Kandarpa Kumar Sarma.; funding acquisition, Atlanta Choudhury

Funding

This work is supported by SERB, DST, Government of India, under the project reference no. SUR/2022/001704.

Data Availability Statement

(https://www.kaggle.com/akashkr/phishing-website-dataset)

Acknowledgments

This work is supported by SERB, DST, Government of India, under the project reference no. SUR/2022/001704.

Conflicts of Interest

Declare conflicts of interest or state “The authors declare no conflicts of interest.”

References

C.Turni. et al.,“COVID-19 vaccines-An Australian Review," in Journal of Clinical & Experimental Immunology, vol. 7, no. 3, pp. 491-508, August 2020.
H. Abid, J. Mohd, and V.Raju, “Effects of COVID 19 pandemic in daily life," in Curr. Med. Res. Pract., 2020. [CrossRef]
Y. Wang, Z. Deng and d. shi, “How effective is a mask in preventing COVID?19 infection?" in Medical devices & sensors, vol. 4, no. 1, 2021, e10163. [CrossRef]
O. Kwon, “Evidence of the importance of contact tracing in fighting COVID-19," in Epidemiology and Health, 44, 2022. [CrossRef]
K. Jones and R. Thompson, “To use or not to use a COVID-19 contact tracing app: Mixed methods survey in Wales." in JMIR mHealth and uHealth, vol. 9, no. 11, 2021, e29181. [CrossRef]
O. Saidi, D. Malouche, P. Saksena, L. Arfaoui, “Impact of contact tracing, respect of isolation, and lockdown in reducing the number of cases infected with COVID-19. " in International Journal of Infectious Diseases, vol. 113, pp. 26-33, May 2020. [CrossRef]
R. M.Stuart, R. G. Abeysuriya, C. C. Kerr, D. Mistry et al., "Role of masks, testing and contact tracing in preventing COVID-19 resurgences: a case study from New South Wales, Australia." in BMJ open, vol. 11, no. 4, e045941, 2021. [CrossRef]
A. Prasad, & D. Kotz, "ENACT: encounter-based architecture for contact tracing." in Proceedings of the 4th International on Workshop on Physical Analytics, pp. 37-42, June 2017.
R. Vinuesa, H. Azizpour, I. Leite et al., "The role of artificial intelligence in achieving the Sustainable Development Goals." in Nature communications, vol. 11, no. 1, pp. 1-10, 2020.
"Water, sanitation, hygiene and waste management for COVID-19: technical brief", in World Health Organization, 03 March 2020, (No.WHO/2019 – NcOV/IPCWASH/2020.1).
R. Ellis, "WHO changes stance, says public should wear masks." in WebMD.
R. Jahromi, V. Mogharab, H. Jahromi & A. Avazpour, "Synergistic effects of anionic surfactants on coronavirus (SARS-CoV-2) virucidal efficiency of sanitizing fluids to fight COVID-19." in Food and Chemical Toxicology, vol. 145, 111702, 2020. [CrossRef]
S. Feng, C. Shen, N. Xia et al., "Rational use of face masks in the COVID-19 pandemic." in The Lancet Respiratory Medicine, vol. 8, no. 5, pp. 434-436, 2020. [CrossRef]
"Advice on the use of masks in the context of COVID-19: interim guidance," in World Health Organization, 5 June 2020, (No.WHO/2019 – nCoV/IPC_Masks/2020.4).
A. C. Prajapati, M. M. Patel, H. J. Khanpara et al., "A Hospital based cross sectional study to find out factors associated with disease severity and length of hospital stay in COVID-19 patients in Tertiary Care Hospital of Ahmedabad city." in Indian Journal of Community Health, vol. 33, no. 2, pp. 256-259.
B. Qin & D. Li, "Identifying facemask-wearing condition using image super-resolution with classification network to prevent COVID-19." in Sensors, vol. 20, no. 18, pp. 5236. [CrossRef]
M. S. Ejaz, M. R. Islam et al., "Implementation of principal component analysis on masked and non-masked face recognition." in 1st international conference on advances in science, engineering and robotics technology (ICASERT),IEEE, pp. 1-5, 2019.
J. S. Park, Y. Oh et al., "Glasses removal from facial image using recursive error compensation." inIEEE transactions on pattern analysis and machine intelligence, vol. 27, no. 5, pp. 805-811, 2005. [CrossRef]
C. Li, R. Wang et al., "Face detection based on YOLOv3. In Recent Trends in Intelligent Computing," in Communication and Devices: Proceedings of ICCD pp. 277-284, Springer, Singapore, 2018.
N. U. Din, K. Javed et al., "A novel GAN-based network for unmasking of masked face." in IEEE Access, 8, pp. 44276-44287, 2020.
H. Habibzadeh, K. Dinesh et al., "A survey of healthcare Internet of Things (HIoT): A clinical perspective." in IEEE Internet of Things Journal, vol. 7, no. 1, pp. 53-71, 2019. [CrossRef]
T. Wu, F. Wu, et al., "A rigid-flex wearable health monitoring sensor patch for IoT-connected healthcare applications." in IEEE Internet of Things Journal, vol. 7, no. 8, pp. 6932-6945,2020.
S. Esmaeili, S.Tabbakh, et al., "A priority-aware lightweight secure sensing model for body area networks with clinical healthcare applications in Internet of Things." in Pervasive and Mobile Computing, vol. 69, 101265, 2020. [CrossRef]
W. Huifeng, S. N. Kadry, et al., "Continuous health monitoring of sportsperson using IoT devices based wearable technology." in Computer Communications, vol. 160, pp. 588-595, 2020.
C. M. Dourado, S. P. P. da Silva, et al., "An open IoHT-based deep learning framework for online medical image recognition." in IEEE Journal on Selected Areas in Communications, vol. 39, no. 2, pp. 541-548, 2020.
P. P. Ray, N. Thapa, et al., "Novel implementation of IoT based non-invasive sensor system for real-time monitoring of intravenous fluid level for assistive e-healthcare." in Circuit World, vol. 45, no. 3, pp. 109-123.
J. Das, S. Ghosh, et al., "RESCUE: enabling green healthcare services using integrated IoT?edge?fog?cloud computing environments." in Software: Practice and Experience, vol. 52, no.7, pp. 1615-1642, 2022.
A. Kumar, K. Sharma, K., et al., "A drone-based networked system and methods for combating coronavirus disease (COVID-19) pandemic. " in Future Generation Computer Systems, vol. 115, pp. 1-19, 2021. [CrossRef]
S. Tuli, S., R. Tuli, & S. S. Gill, "Predicting the growth and trend of COVID-19 pandemic using machine learning and cloud computing." in Internet of things, vol. 11, 100222, 2020. [CrossRef]
N. Khan, N., A. Ullah, & K. Polat, "DCA-IoMT: Knowledge-Graph-Embedding-Enhanced Deep Collaborative Alert Recommendation Against COVID-19." in IEEE Transactions on Industrial Informatics, vol. 18, no. 12, pp. 8924-8935, 2022.
X. Lin, J. Wu, J., et al., "FairHealth: Long-term proportional fairness-driven 5G edge healthcare in Internet of medical things." in IEEE Transactions on Industrial Informatics, vol. 18, no. 12, pp. 8905-8915, 2022.
F. Desai, D. Chowdhury, et al., "Health Cloud: A system for monitoring health status of heart patients using machine learning and cloud computing." in Internet of Things, vol. 17, 100485, 2022.
N. K. Dewangan & P. Chandrakar, "Patient-centric token-based healthcare blockchain implementation using secure internet of medical things." in IEEE Transactions on Computational Social Systems, 2022.
W. Lv, S. Wu, et al., "Towards large-scale and privacy-preserving contact tracing in COVID-19 pandemic: a blockchain perspective." in IEEE Transactions on Network Science and Engineering, vol. 9, no. 1, pp. 282-298, 2020. [CrossRef]
S. Kumar, R. D. Raut, et al.,"The impact of IoT on the performance of vaccine supply chain distribution in the COVID-19 context." in IEEE Transactions on Engineering Management, 2022.
W. Hariri, "Efficient masked face recognition method during the covid-19 pandemic." in Signal, image and video processing, vol. 16, no. 3, pp. 605-612, 2022. [CrossRef]
X. Wang, E. G. Ferro, et al., "Association between universal masking in a health care system and SARS-CoV-2 positivity among health care workers." in Jama, vol. 324, no. 7, pp. 703-704, 2020.
N. H. Leung, D. K.Chu, et al., "Respiratory virus shedding in exhaled breath and efficacy of face masks. Nature medicine", vol. 26, no. 5, pp. 676-680, 2020. [CrossRef]
S. Feng, S. Shen, et al., "Rational use of face masks in the COVID-19 pandemic." in The Lancet Respiratory Medicine, vol. 8, no. 5, pp. 434-436, 2020. [CrossRef]
R. B. McFee, "COVID-19 medical management including World Health Organization (WHO) suggested management strategies." in Disease-a-Month, vol. 66, no. 9, 101068, 2020. [CrossRef]
B. Kneis, "Face detection for crowd analysis using deep convolutional neural networks." In Engineering Applications of Neural Networks: 19th International Conference, Springer International Publishing, EANN 2018, Bristol, UK, Proceedings 19, pp. 71-80, 2018.
S. K. Addagarla, G. K. Chakravarthi, et al.,(2020) "Real time multi-scale facial mask detection and classification using deep transfer learning techniques." in International Journal, vol. 9, no.4, pp. 4402-4408, 2020.
X. Su, M. Gao, et al., "Facemask detection and classification via deep transfer learning." in Multimedia Tools and Applications, pp. 1-20, 2022.
A. Krizhevsky, I. Sutskever, et al., "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems, vol. 25, 2012.
C. Vimal, & N. Shirivastava, "Face and Face-mask Detection System using VGG-16 Architecture based on Convolutional Neural Network." in International Journal of Computer Applications, vol. 975, 8887, 2022.
A. Faisal, T. N. A. Dharma, et al., "Comparative study of VGG16 and MobileNetv2 for masked face recognition." in JurnalIlmiah Teknik ElektroKomputer dan Informatika (JITEKI), pp. 230-237, 2021.
M. Utomo, & F. Violita,"Face Mask Wearing Detection Using Support Vector Machine (SVM)." in IJID (International Journal on Informatics for Development), vol. 10, no. 2, pp. 72-81, 2021.
P. Sermanet, D. Eigen, "Overfeat: Integrated recognition, localization and detection using convolutional networks." arXiv preprint arXiv:1312.6229. [CrossRef]
D. Erhan, et al., "Scalable object detection using deep neural networks." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2147-2154, 2014.
J. Redmon, S. Divvala, et al., "You only look once: Unified, real-time object detection." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 779-788, 2016.
M. Inamdar, & N. Mehendale, "Real-time face mask identification using facemasknet deep learning network." Available at SSRN 3663305, 2020.
S. Qiao, C. Liu, et al., "Few-shot image recognition by predicting parameters from activations." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 7229-7238, 2018.
A. Kumar, Z. J. Zhang, & H. Lyu, "Object detection in real time based on improved single shot multi-box detector algorithm." in EURASIP Journal on Wireless Communications and Networking, pp. 1-18, 2020.
A. Morera, A. Sanchez, et al. "SSD vs. YOLO for detection of outdoor urban advertising panels under multiple variabilities." in Sensors, vol. 20, no. 16, 4587, 2020. [CrossRef]
K. Zhang, Z. Zhang, et al. "Joint face detection and alignment using multi task cascaded convolutional networks." in IEEE signal processing letters, vol. 23, no. 10, pp. 1499-1503, 2016.
A. Alzu’bi, F. Albalas, et al., "Masked face recognition using deep learning: A review." in Electronics, vol. 10, no. 21, 2666, 2021.
C. R. MacIntyre, & A. A. Chughtai, "Facemasks for the prevention of infection in healthcare and community settings", in Bmj, vol.350. [CrossRef]
G. H. Christa, et al., "CNN-based mask detection system using openCV and MobileNetV2." In 2021 3rd International Conference on Signal Processing and Communication (ICPSC), pp. 115-119, IEEE, (2021, May).
S. E. Snyder, & G. Husari, "Thor: A deep learning approach for face mask detection to prevent the COVID-19 pandemic." In Southeast Con, pp. 1-8, IEEE, March 2021.
S. Hussain, Y. Yu, et al., "IoT and deep learning based approach for rapid screening and face mask detection for infection spread control of COVID-19." in Applied Sciences, vol. 11, no. 8, 3495, 2021.
I. Ahmed, I., M. Ahmad, & G. Jeon, "Social distance monitoring framework using deep learning architecture to control infection transmission of COVID-19 pandemic." in Sustainable cities and society, vol. 69, 102777, 2021. [CrossRef]
E. Fonseca, et al., "General-purpose tagging of free sound audio with audio set labels: Task description, dataset, and baseline." arXiv preprint arXiv:1807.09902, 2018. [CrossRef]
R. Girshick, "Fast r-cnn." In Proceedings of the IEEE international conference on computer vision, pp. 1440-1448.
B.Pfitzner, N. Steckhan & B.Anrich,"Federated Learning in a Medical Context: A Systematic Literature Review", University of Potsdam, in Germany ACM Trans., Internet Technol., Vol. 21, No. 2, Article 50, Publication date: May 2021.DOI: . [CrossRef]
I. Goodfellow, Y. Bengio, & A. Couville, "Deep Learning (Adaptive Computation and Machine Learning series." in Nature., 2016.
J. Park, J., J. Jung, S. Eun, & S. Y. Yun, "Ui elements identification for mobile applications based on deep learning using symbol marker" in The J Ins Int Broadcasting Communication.3rd ed., pp. 89-95, 2020. [CrossRef]
R. Keniya, & N. Mehendale, "Real-time social distancing detector using SocialdistancingNet-19 deep learning network." in SSRN., 2020. [CrossRef]
W. Vijitkunsawat & P. Chantngarm, "Study of the Performance of Machine Learning Algorithms for Face Mask Detection." in IEEE, ISBN NO 978-1-7281-6694-0/20, 2020 IEEE, 2020.
"COVID-19 Map—Johns Hopkins Coronavirus Resource Center." at Johns Hopkins University. Available online: https://coronavirus.jhu.edu/map.html (accessed on 8 May 2012)., (2020, July 30).
S. V. Militante & N. V. Dionisio, N. V. "Deep Learning Implementation of Facemask and Physical Distancing Detection with Alarm Systems." in The Third International Conference on Vocational Education and Electrical Engineering (ICVEE)., 2020. [CrossRef]
R. Jahromi, V. Mogharab, et al., "Synergistic effects of anionic surfactants on corona virus (SARS-CoV-2) viricidal efficiency of sanitizing fluids to fight COVID-19." BioRxiv, 2020. [CrossRef]
R. Ellis, "WHO Changes Stance, Says Public Should Wear Masks." News. hAvailable online: ttps://www.webmd.com/lung/news/20200608/who-changes-stance-says-public-should-wear-masks (accessed on 8 May 2012).
K. He, G. Gkioxari, et al., "Mask r-cnn "In Proceedings of the IEEE International Conference on Computer Vision., pp. 2961-2969.
W. Liu, & D. Anguelov, "Ssd: Single shot multibox detector." In European Conference on Computer Vision., Springer.pp. 21-37,2016.
J. Redmon & A. Farhadi, "Yolov3: An incremental improvement.", IEEE. Available online: https://doi.org/10.48550/arXiv.1804.02767 (accessed on 8 May 2012)., 2018.
S. Ren, K. He, R. Girshick, & J. Sun, "Faster r-CNN: Towards real-time object detection with region proposal networks." IEEE. arXiv preprint arXiv:1506.01497, 2015. [CrossRef]
A. Naezullaev, "Contact Tracing of Infectious Diseases using Wi-Fi signals and Machine Learning classifications." in IEEE, 2020. [CrossRef]
S. Mondal & P. Mitra, P. "The Role of Emerging Technologies to Fight against COVID?19 Pandemic: An Exploratory Review." in Transactions of the Indian National Academy of Engineering, 7th ed., p. 157-174, 2022. [CrossRef]
H. Lukas, Y. Yu, W. Gao & C. Xu, "Emerging Telemedicine Tools for Remote COVID-19 Diagnosis, Monitoring, and Management." , pp. 16180-16193. American Chemical Society. Available online: https://dx.doi.org/10.1021/acsnano.0c08494 (accessed on 8 May 2012)., 2020.
A. A. Alzubi, M. A. Matiah & A. Alarifi, "Cyber-attack detection in healthcare using cyber-physical system and machine learning system." in Computing, Springer Nature., 2021. [CrossRef]
L. Khriji, & S. Bouaafia, "Secure Convolutional Neural Network-Based Internet-of-Healthcare Applications." in IEEE Access, 2023. [CrossRef]
H. Ku & M. Lee, "Text Control GAN: Text to image synthesis with controllable Generative Adversarial Networks." in Applied Sciences, MDPI, 2023. [CrossRef]
R. Sawant, A. Shaikh, S. Sabat , & V. Bhole, "Text to Image Generation using GAN.",2nd International Conference on IoT Based Control Networks and Intelligent Systems,2021.
C. H. Loke, M. S. Adam, et al., "Physical distancing device with edge computing for covid-19 (paddie-c19)." in Sensors, vol. 22, no. 1, 279, 2021. [CrossRef]

Figure 1. Proposed block diagram

Figure 2. Schematic showing the connection of the IoT-edge computing node to cloud server for monitoring of the compliance with face mask wearing and social distance norms and execute contact tracing

Figure 3. Schematic outlining the deployment of the edge computing nodes and connection with a cloud server as part of a residential block.

Figure 4. Hybrid Multi- Head Attention Aided Hybrid Deep Network with Diffusion Stability (HMAAHDNDS)

Figure 5. Block diagram of proposed contact tracing technique

Figure 6. Examples of face masks datasets (a) with masks, (b) without mask and (c) incorrect face mask

Figure 7. Datasets of (a) NG facemasks and (b) medical facemask.

Figure 8. Model accuracy and loss of facemask wearing (a)-(b) ANN, (c)-(d) CNN, (e)-(f) VGG-16, (g)-(h) MobileNetV2 and (i)-(j) ResNet50

Figure 9. Facemask classification accuracy and loss (a)-(b) ResNet-50 (c)-(d) mobileNetV2 (e)-(f) CNN

Figure 10. Incorrect facemask wearing accuracy and loss (a)-(b) MobileNetV2 (c)-(d) ResNet-50

Figure 11. Text to image transform for four different cases of wearing masks

Table 1. Configuration of the edge node and the cloud servers

Item	Description	Node
DNN model	YOLOv4	Raspberry Pi 4 (Edge)
Programming based on	Python
RAM in GB	4GB
Processor	ARM Cortex-72
Clock in GHz	1.5
GPU	Pi 3 A+: Broadcom VideoCore IV; 400 MHz
DNN model	Kmodel	Grove AI HAT (Edge)
Programming based on	MicroPython
RAM in GB	0.008
Processor	M1 K210 RISC-V
Clock in GHz	0.4-0.6
GPU	KPU
Maximum training time	121 minutes
Response time after training	2 seconds
DNN model	RESNET-50, YOLOv4, MobileNet, VGG-16, SocialDistanceNet-19 etc	Google Colab
Programming based on	Python
RAM in GB	32
Processor	Intel® Xeon® Gold 5318N
Clock in GHz	2.25
GPU	Tesla T4

Table 2. Features of different sensors

Sl.No	Sensor Type	Specifications
1	Ultrasonic sensor	HC-SR04-SEN-15569 Analog and digital connection. Baud Rate=9600 I/P voltage=5v Working temp= -150-700c Sensing angle=300 cone Ultrasonic frequency=40KHz; Range=2cm-400cm
2	Infrared Temperature Sensor	MLX90614 Vcc pin of the sensor with the Vin of the node Operating voltage=3.6v to 5 v Supply current=1.5mA Object temperature range= -700-382.20 c Ambient temperature range= -400c to 1250 c FOV=800 Distance between object and sensor= 2 cm to 5 cm
3	Proximity Sensor	PSAM8/ HC-SR04 Vcc-à +5v GND of sensor is connected to GND of Raspberry Pi; Data Pin(IR sensor) is connected to PIN 16 Supply voltage=10v to 30v Frequency=800Hz Working temp= -250-700c Long range detection Range=30mm-50mm Range upto 600cm
4	IR camera	MLX90640 Vcc-à +5v GND of sensor to GND of Raspberry Pi OutàGPIO pin Small size; low cost; Supply voltage= 3v; FOV(2 options): 550x350 and 1100x 750 target temperature= -400 c-3000c 32x24 pixels in IR array; I2C protocol, Range=1cm-100cm
5	Wi-Fi Router	XLT24017 Wireless LAN Transmission range= 150 m; Frequency= 2.4GHz; High stability built in PLL; Low clutter leakage; Size(L x W): 18.3x17.6 mm, Range=30m-100m
6	Conventional CCTV Camera	Philips HSP 3500 Wireless Low light range-10 meters; Day time -20-25 meters

Table 3. Various activities and their decision states

Sl. No	Task	Input Types	Output Types
1	Facemask detection	Image	Yes/no binary decision
2	Proper facemask wearing	Image	Yes/no Male/female Indoor/outdoor
3	Classification of facemask	Image	Class 1, Class 2
4	Social distancing	Image and sensor data	Violation Yes/no
5	Contact tracing	Image and sensor data	Data
6	Cyberattack	Text data, image	Class labels
7	Text to image generator	Text description from (1), (2), (3), (4) obtained in table and applied as CSV file	Image
8	Action detection	Image	Cough, sneeze

Table 4. Cyber-attacks associated with data driven approaches of healthcare

Sl. No	Cyber-attacks on data gathering phase	Cyber-attacks at Network Phase	Cyber-attacks at storage phase
1	Phishing	Eavesdropping of health record	Cross site scripting
2	Log Access	Man in the middle attack	Weak authentication attack
3	Social Engineering Network	Data tempering	SQL injection attack
4	Brute force attack on passwords	Denial of the service attack
		Data Interception
		Spoofing and sniffing attack

Table 5. Parameters of the different benchmark methods and DPTMs used for facemask detection (MSE 10-6)

Sl. No	Method	Layer type	Batch Size	Trainable parameters	Epoch
1	ANN [67]	Pooling layer, dense layer	5	430562	20
2	CNN	Pooling layer, dense layer	32	449059	25
3	RF	Pooling layer, dense layer	32	425256	15
4	DT	Pooling layer, dense layer	32	497852	20
5	SVM [66]	Pooling layer, dense layer	32	422285	30
6	RESNET-50	ResNet-50, pooling layer, dense layer	32	23591810	10
7	MOBILENET	MobileNetV2, pooling layer, dense layer	32	2260546	20
8	VGG-16 [65]	VGG-16-layer, pooling layer, dense layer	32	14715714	20

Table 6. Specifications of a few DL methods used for ascertaining the correctness of wearing facemask

Sl.No	Models	Trainable parameters	Layer type	Batch size
1	MOBILENET	2261827	Conv2d, max pooling layer, dropout, dense layer, flatten layer	7x7x1280
2	RESNET-50	449059		244x244x32
3	CNN	1759842		244x244x32

Table 7. Parameterization of Various Deep Learning Models for Continuous Monitoring of Social Distancing Norms (MSE 10-6)

Sl.No	Model	Parameters
Sl.No	Model	No. of layer	Epoch	Optimizer	Training dataset	Learning objective	Composition of layers
1	ResNet-50	Input layer-1 Zero padding layer-2 Relu layer-1 Dense layer-6	20	Adaptive moment estimation (Adam)	4098	Scale conjugate gradient	Conv layer 224 x 224 x 32 Max-pooling layer 112x112x32
2	MobileNetV2	Input layer-1 Zero padding layer-2 Relu layer-2 Dense layer-2	20	Adaptive moment estimation (Adam)	3843	Scale conjugate gradient	7 x 7 x 1280
3	SocialdistancingNet-19	Input layer-1 Zero padding layer-2 Relu layer-1 Dense layer-2	20	Adaptive moment estimation (Adam)	2598	Scale conjugate gradient	Conv layer 244 x 244 x 32 Maxpooling layer 112 x 112 x 32

Table 8. Parameters of different DL models for contact tracing (MSE 10-6)

Sl.No	Model	Parameters
Sl.No	Model	No. of layers	Training	Epoch	Optimizer	Learning objective	Composition of layers
1	ResNet-50	Input layer-1 Zero padding layer-2 Relu layer-1 Dense layer-6	4098	20	Adaptive moment estimation (Adam)	Scale conjugate gradient	Conv layer 224x224x32; max-pooling layer 112x112x32
2	MobileNetV2	Input layer-1 Zero padding layer-2 Relu layer-2 Dense layer-2	3843	25	Adaptive moment estimation (Adam)	Scale conjugate gradient	7x7x1280
3	VGG-16	Input layer-1 Zero padding layer-2 Relu layer-1 Dense layer-2	2598	25	Adaptive moment estimation (Adam)	Scale conjugate gradient	Conv layer 244x244x32 max pooling layer 112x112x32
4	SVM	Conv2d, max pooling, dropout, dense layer, flatten layer	8450	20	Adaptive moment estimation (Adam)	Scale conjugate gradient	Conv layer 244x244x32 max pooling layer 122x122x32
5	DT		8450	20
6	KNN		8450	20
7	LR		8450	20

Table 9. Comparison results of eight different classifiers for mask wearing detection (MSE

10 - 6

)

Table 9. Comparison results of eight different classifiers for mask wearing detection (MSE

10 - 6

)

SL. No	MODEL	C L A S S	R E C A L L	S U P P O R T	CONFUSION Matrix	Fl Score	P R E C I S I O N	A C C U R A C Y (%)	ACCURACY (%) (PREVIOUS WORK)
1	ANN	0	0.33	761	[634 127]	0.81	0.79	80	75.5 [14]
		1	0.78	750	[168 5321	0.80	0.82	80
2	DT	0	0.71	729	[519 210]	0.70	0.68	70	100[15]
		1	0.69	782	[240 542]	0.71	0.72	70
3	RT	0	0.83	729	[605 124]	0.78	0.73	77	100 [15]
		1	0.72	782	[222 560]	0.75	0.82	77
4	SVM	0	0.31	729	[594 135]	0.75	[594 135]	74	100 [15]
		1	0.67	782	[261 521]	0.72	[261 521]	74
5	VGG-16	0	0.99	761	[588 5]	0.98	[588 5]	98	96 [16]
		1	0.96	750	[23 592]	0.98	[23 592]	98
6	CNN	0	0.96	761	[568 25]	0.94	[568 25]	94	93.4 [16]
		1	0.96	0.93	0.94	750	[43 572]	94
7	RESNET50	0	0.99	0.89	0.93	745	[651 94]	94	90.1 [16]
		1	0.89	0.99	0.94	765	[4 751]	94
8	MOBILE NETV2	0	0.92	0.93	0.93	756	[704 52]	92	92.64[16]
		1	0.93	0.92	0.92	754	[62 692]	92

Table 10. Comparative analysis of ResNet-50 and MobileNetV2 for facemask incorrect wearing detection

Sl. No	Model	Class	F1 score	Precision	Support	Recall	Confusion Matrix	Accuracy (%)
1	ResNet-50	0	0.88	0.99	147	0.80	[117 23 7]	92
		1	0.90	0.85	137	0.97	[1 133 3]	92
		2	0.96	0.93	131	0.99	[0 1 130]	92
2	MobileNetV2	0	0.91	0.92	147	0.87	[131 9 7]	97.2
		1	0.94	0.90	137	0.97	[3 133 1]	97.2
		2	0.92	0.94	131	0.90	[8 5 118]	97.2
3	ResNet-50 [47]	-	-	-	-	-	-	47.91
4	MobileNetV2 [46]	-	-	-	-	-	-	92.64

Table 11. Assessment of ResNet-50 and MobileNetV2 performance in facemask classification

Sl. No	Method	Class	F1- score	Confusion Matrix	Recall	Precision	Accuracy (%)
1	ResNet-50	0	0.93	[252 17]	0.94	0.92	94
		1	0.95	[22 357]	0.94	0.95	94
2	MobileNetV2	0	0.63	[146 123]	0.54	0.74	75
		1	0.79	[52 327]	0.86	0.73	75
3	CNN	0	0.74	[198 71]	0.71	0.74	79
		1	0.82	[68 311]	0.82	0.81	79
4	MobileNetV2 [46]	-	-	-	-	-	97.08
5	ResNet-50 [47]	-	-	-	-	-	92.49
6	CNN [20]	-	-	-	-	-	70

Table 12. Performance Evaluation of ResNet-50 with proposed methods (Facemask Wearing)

Reference	Model	Classification	Accuracy (%)
[1]	ResNet-50	YES	89
[2]	ResNet-50	YES	81
[3]	ResNet-50	YES	95.8
[4]	ResNet-50	YES	88.2
[6]	ResNet-50	YES	98.6
[7]	ResNet-50	YES	97.9
[8]	ResNet-50	YES	92.5
This work	Proposed Model	YES	93.5

Table 13. Accuracy and performance of Social Distancing

Sl. No	Action	Model	Performance
Sl. No	Action	Model	F1-Score	Accuracy (%)
1	Person detection	ResNet-50	0.90	87.6
		MobileNetV2	0.95	89.9
2	Social distancing	SocialdistancingNet-19	0.89	94.5
3	[67]	SocialdistancingNet-19	—	92.8

Table 14. Comparison of results obtained from proposed method with previously reported works (Contact tracing)

Sl. No	Model	Accuracy of previously reported work (%)	Accuracy (%)
1	VGG-16 [45]	89.6	92.3
2	MobileNetV2[46]	78.5	93.4
3	ResNet-50[61]	83.4	96.1
4	SVM [47]	91	94.6
5	DT [77]	87	91.3
6	KNN [77]	88	92.5
7	LR [77]	89	93.6

Table 15. Summary of Optimal Performance in a Multi-Node Edge and Cloud Computing Configuration for Residential Premises

Sl. No.	Block	Specific item	Average error at system level	Average error at node level
1	Sensor type	Camera	3.80%	4%
		IR thermometer	3.50%	5%
		Proximity sensor	2.80%	3%
		Ultra-sonic	2.40%	3%
2	Decision support for facemask	Low (30 person/ hour)	8.5%	9.2%
		Medium (50 persons/ hour)	8.8%	11%
		High (70 persons / hour)	9.3%	12%
3	Decision support for social distance	Low (30 person/ hour)	3.40%	9%
		Medium (50 persons/ hour)	4.50%	12%
		High (70 persons / hour)	4.50%	13%
4	Decision support for contact tracing	Low (30 person/ hour)	3.80%	8.5%
		Medium (50 persons/ hour)	3.40%	10%
		High (70 persons / hour)	4.50%	12%
5	Latency variation (average)	Low (30 person/ hour)	7.5%	9%
		Medium (50 persons/ hour)	11%	14%
		High (70 persons / hour)	14%	15%

Table 16. Analysis of Computational Complexity across Various Deep Learning Methods

Sl. No	Specific Purpose	Training Data Set (80%)	Testing Data Set (20%)	Technique	Response Time (ms)	Testing Time (s)	Training Time (s)
1	Face Mask wearing	6042	1510	ANN	22.4	446	1790
				CNN	24.05	432	1875
				RF	20.9	446	1700
				SVM	22	430	1750
				DT	22.4	446	1790
				VGG-16	141.75	775	9300
				RESNET-50	133	770	8750
				MOBILENET	137.5	775	9025
2	Improper wearing of Face Mask	1664	415	RESNET-50	36.67	800	3000
				MOBILENET	41.33	820	3300
3	Classification of Face Mask	2598	648	CNN	24.58	475	1950
				RESNET-50	39.25	645	3000
				MOBILENET	42.16	670	3200
4	Social Distance Detection	4098	970	RESNET-50	60.5	570	4200
		3843	705	MOBILENET	65.16	590	4500
		2598	648	SOCIAL DISTANCING NET-19	58.83	570	4100
5	Contact Tracing	8450	2050	DT	112.83	430	7200
		8450	2050	RF	112.83	430	7200
		8450	2050	LR	112.83	430	7200
		8450	2050	SVM	112.83	430	7200
		2598	648	VGG-16	25.67	460	2000
		3843	705	MOBILENET	36.33	470	2650
		4098	970	RESNET-50	38.5	490	2800

Table 17. Analysis of Response Variability in Social Distance Enforcement by SOCIALDISTANCINGNET-19

Sl. No	Samples	k- Value	Actual Distance (cm) (A)	Distance from k-Value (cm) (B)	% error between (A) & (C)	Value Predicted by AI technique (cm) (C)	Dimension	Pixel Size (mm)
1	1	85.7	210	200	10.0	189	224 x 224 x 32	2.565
	2	43.85	220	210	9.3	199.5	112 x 112 x 32	5.131
	3	44.54	195	185	9.9	175.7	256 x 256 x 64	4.49
2	1	0.051	161	150	11.8	142	7 x 7 x 1280	3284
	2	63.48	182	170	11.3	161.5	96 x 96 x 32	2.993
	3	91.31	200	185	12.0	176	256 x 256 x 32	2.245
3	1	82.8	180	170	10.6	161	244 x 244 x 32	2.355
	2	81.87	205	195	9.8	185	224 x 224 x 32	2.565
	3	97.99	210	200	10.0	189	512 x 512 x 64	2.245

Table 18. Comparing Confusion Matrices from Different Classifiers

Sl. No	Classifier		Predicted
Sl. No	Classifier		normal	Attack
1	Linear Regression (benchmark)	normal	32421	56
		attack	145	22415
2	VGG-16 (benchmark)	normal	32545	23864
		attack	125	56
3	MobileNetV2	normal	32572	48
		attack	125	23864
4	ResNet-50	normal	30071	2549
		attack	119	23870

Table 19. Assessment of the proposed model’s performance in predicting malicious activity

Sl.No	Pre-trained Model	Parameters
Sl.No	Pre-trained Model	Sensitivity (%)	Precision (%)	F-Measure (%)	Specificity (%)	Accuracy (%)
1	ResNet-50	90.66	74.26	75.73	97.07	90.07
2	MobileNetV2	93.93	90.56	82.12	97.82	94.89
3	VGG-16(benchmark)	92.32	74.05	80.35	96.28	88.36

Table 20. Generation of text description from the decision states of the DPTMs

Sl. No.	Facemask Detection		Proper Facemask wearing		Classification of Facemask		Social Distancing		Text Description
	Y1	N1	Y2	N2	Y3	N3	Y4	N4
1	0	0	0	0	0	0	0	0	Null
2	0	0	0	0	0	0	0	1	Social Distance Not Violated
3	0	0	0	0	0	0	1	0	Social Distance Violated
4	0	0	0	0	0	0	1	1	Null
5	0	0	0	0	0	1	0	0	Not Proper Facemask
6	0	0	0	0	0	1	0	1	Not Proper Face Mask Social Distance Not Violated
7	0	0	0	0	0	1	1	0	Not Proper Face Mask Social Distance Violated
8	0	0	0	0	0	1	1	1	Null
9	0	0	0	0	1	0	0	0	Proper Facemask
10	0	0	0	0	1	0	0	1	Proper Facemask Social Distance Violated

Table 21. Key attributes of the proposed method via previously reported works

Attributes

[6]

[7]

[46]

[47]

[59]

[60]

[67]

[64]

[80]

[81]

[82]

[83]

Proposed
Method

Facemask Detection

Y

N

Y

Proper Facemask wearing

N

Y

N

Y

N

Y

Classification of Facemask

N

Y

N

Y

N

Y

Social Distancing

N

Y

N

Y

Contact Tracing

Y

N

Y

Cloud Computing

N

Y

N

Y

N

Y

Edge Computing

N

Y

N

Y

N

Y

N

Y

Text to Image converter

N

Y

N

Y

Cyber Attack Detection

N

Y

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.