1. Introduction
Human Activity Recognition (HAR) is the field of study that focuses on developing computational methods for automatically identifying and naming activities from sensory observations, i.e., recognizing and understanding what a person is doing at a given time from the information that is available [1,2]. It plays a crucial role in various domains, including healthcare [3,4,5], sports [6,7], workout exercises [8,9,10,11,12,13,14], security [15], and human–computer interaction [16]. HAR methods can be categorized into two primary groups [10,17,18,19,20,21]: sensor-based and vision-based. In the vision-based method, cameras situated in the human environment capture images and video streams from which human activity features are extracted [22]. In the sensor-based method, data are gathered using sensors such as accelerometers, gyroscopes, and magnetometers [20]. Unlike the equipment required for vision-based methods, sensors are lightweight and portable [22].
The HAR research field aims to continually push recognition performance to its highest level [19], while simultaneously addressing the challenge of attaining high recognition accuracy with minimal computational resources [23]. The demand for deeper and wider models with more layers and neurons has led to significant improvements in accuracy, but it has also created challenges related to computational and memory requirements, particularly for mobile and embedded systems with strict constraints, necessitating novel solutions that minimize network size and ensure fast inference without sacrificing accuracy [24].
This paper introduces an automatic activity recognition system that efficiently classifies human activities by reducing the number of features and parameters. To attain this objective, we employ the Linear Discriminant Analysis (LDA) dimensionality reduction technique along with a straightforward neural network architecture that contains significantly fewer parameters than other existing models.
The primary contributions of this paper can be summarized as follows:
- Our hybrid feature extraction method combines LDA and a Multi-Layer Perceptron (MLP) for feature extraction and utilizes an SVM optimized through Stochastic Gradient Descent (SGD) for classification, addressing two challenges: reducing feature vector dimensionality and accurately classifying smartphone-based human activities. The combined use of these techniques aims to enhance the classifier's generalization performance and mitigate overfitting issues.
- We perform a comparative analysis of various machine learning algorithms, including LR, GNB, DT, SVM, KNN, the SVM-SGD classifier, and RF. This comparison is conducted on a dataset consisting of original features, features extracted by LDA, and features extracted by LDA and MLP.
- Our work involves the development of a neural network model with a limited number of parameters, surpassing previous deep learning-based approaches in accurately recognizing human activities.
- To validate our model, we compare it with several models from state-of-the-art studies conducted on the UCI-HAR dataset. The experimental results demonstrate that our proposed method outperforms the best-performing method on this dataset.
2. Literature Review
In this section, we discuss the background research associated with sensor-based human activity recognition, along with various methods such as Machine Learning-based HAR, Deep Learning-based HAR, Hybrid Deep Learning-based Model HAR, Ensemble Learning-based HAR, and Dimensionality Reduction-based HAR.
2.1. Machine Learning-Based HAR
In recent years, the utilization of smartphone sensors for HAR has attracted much attention. The results of [25], using DT, LR, and MLP for accurate and fast detection of activities on the WISDM dataset, showed that although none of the three learning algorithms was best for all six performed activities, the MLP was better overall, with 91.7% accuracy. A comparison between KNN, MLP, Naive Bayes (NB), and SVM showed the best performance for the multi-class SVM [26]. Owing to the good results in [27,28], SVMs have also been used for HAR.
Studies on feature selection have also shown that choices in this area can significantly affect the final results; the works in [29,30,31] demonstrated the effect of features on the accuracy of classification models. A comparison of the performance of different supervised classification algorithms using five inertial sensors for human activity recognition showed 93.5% accuracy for a support vector machine with a quadratic kernel and 94.6% accuracy for an ensemble classifier with bagging and boosting [32]. Another comparison in [33] between RF, KNN, and convolutional neural networks (CNN), together with a feature selection method, principal component analysis (PCA), indicated that the CNN result was the best on both the WISDM and UCI-HAR datasets. Using an optimal feature selection method and SVM to identify human activities showed an average classification performance of 96.81% [34]. Garcia-Gonzalez et al. [35] proposed an SVM model using smartphone sensors for real-life human activity recognition. An RF classifier, compared to Artificial Neural Networks (ANN), DT classifiers, and KNN for activities such as sit-ups, walking downstairs, walking upstairs, walking on the toes, walking normally, and walking on the heels, achieved an impressive accuracy score of 97.67% across all evaluation metrics [36].
Although most studies were conducted in laboratory settings rather than real environments, the results of real-life human activity recognition showed that tree-based models, such as Random Forest with an accuracy of 92.97%, performed better than other models [37]. The modified Guided Regularized Random Forest (mGRRF), a novel RF-based feature selection method, achieved average accuracies of 98.19% for KNN, 97.77% for RF, 99.29% for SVM, 99.03% for LR, and 99.36% for XGB [38].
2.2. Deep Learning-Based HAR
In addition to producing strong results, the absence of manual feature selection makes deep learning more attractive, and the use of CNNs and LSTMs has yielded outstanding results in this field. A HAR solution combining handcrafted features with a CNN achieved 90.42% accuracy on the WISDM dataset and 94.35% on the UCI dataset, and 95.32% accuracy without handcrafted features on the UCI dataset [33]. CNN-based methods achieved 98.33% accuracy for nine activities [39], and 93.926% and 97.5% on the UCI dataset [40,41]. For recognizing complex activities, [42] proposed the FRDCNN architecture, a fast and robust deep convolutional neural network. It uses signal processing algorithms and a data compression module to provide fast and accurate recognition, predicting activities in real time with 95.27% accuracy in 0.0029 seconds.
Teng et al. [43] demonstrated that a layer-wise CNN with local loss outperforms the global loss on five publicly available datasets, namely UniMib-SHAR, UCI-HAR, OPPORTUNITY, WISDM, and PAMAP2. By identifying optimal hyper-parameters, the F1 score of a deep neural network was evaluated at 95.78%, 92.63%, and 95.85% for the UCI-HAR, OPPORTUNITY, and WISDM datasets, respectively [44].
A waist-mounted wearable device detecting six daily activities (walking upstairs, walking downstairs, walking, standing, laying, and sitting) through three parallel convolutional neural networks, which extract local features and create feature-combination models with different kernel sizes, showed an accuracy of 97.49% on the UCI dataset and 96.27% on self-recorded data [45]. Using the UCI-HAR dataset, an efficient and lightweight CNN-LSTM model, which has better activity detection capability than traditional algorithms, was shown to achieve an accuracy of 97.89% [46]. A channel-selectivity CNN for sensor-based HAR tasks achieved lower test errors than static layers on five public HAR datasets: the OPPORTUNITY, UCI-HAR, UniMib-SHAR, WISDM, and PAMAP2 datasets. The real-time human activity classification method in [47] showed better performance for CNN compared to SVM, BLSTM, LSTM, and MLP models on the PAMAP2 and UCI datasets. Another application of deep learning was automatic activity recognition based on wearable inertial sensors for the activities of construction workers using the ConIoT-VTT dataset; the proposed WorkerNeXt model achieved an accuracy of 99.71% with an F1 score of 99.72% [48].
The results in [49] showed that CNNs have a more suitable structure than RNN variants, namely LSTM, Bi-LSTM, and GRU; tuning meta-parameters more intensively and using more complex preprocessing techniques when generating samples also contribute to the improvement. The method in [50], using the AReM, PAMAP2, and MHealth datasets with a Bi-LSTM neural network, obtained average F1 scores of 95.46%, 93.41%, and 95.79%, respectively, and showed that choosing the window size and implementing an appropriate voting method have a significant effect on the improvement.
In [51], applying batch normalization and hyperparameter tuning via Bayesian optimization to an LSTM-based deep model yielded 97.71% accuracy on the public PAMAP2 dataset, with an F1 score, precision, and recall of 96.66%, 96.85%, and 96.55%, respectively.
2.3. Hybrid Deep Learning-Based Model HAR
The study of combining CNNs and RNNs in [52] showed that integrating CNNs with four powerful RNNs, namely LSTM, BiLSTM, GRU, and BiGRU, on the PAMAP2 dataset achieves an outstanding level of performance in terms of F-score, accuracy, sensitivity, and specificity; models including bidirectional RNNs perform better than models based on unidirectional RNNs.
The results of the 4-layer CNN-LSTM network in [53] showed a high accuracy rate of 99.39% on the UCI-HAR dataset. The authors in [54] proposed a multi-input CNN-GRU-based human activity recognition model; the accuracies obtained on the PAMAP2, WISDM, and UCI-HAR datasets were 95.27%, 97.21%, and 96.20%, respectively. A CNN-BiLSTM model, using CNNs of different dimensions and bidirectional long short-term memory (BiLSTM) kernels to capture features at different resolutions, showed 97.05% accuracy on the UCI-HAR and 98.53% on the WISDM datasets [55].
The hybrid deep learning model consisting of CNN, LSTM, and BiLSTM in [56] exhibited accuracies of 98.38% on the Human Activity dataset with transitional and basic activities and 96.11% on the HAPT dataset. The article [57] proposed a hybrid deep CNN-LSTM with self-attention model using wearable sensors for the classification of daily activities, achieving accuracies of 93.11% and 98.76% on the UCI-HAR and MHEALTH datasets. In [58], a CNN-based Bi-LSTM parallel model with an attention mechanism was designed for human activity recognition with noisy data; with proper meta-parameter settings, the ConvBLSTM-PMwA model achieved 96.71% accuracy with a time consumption of 14.71 ms on the UCI dataset and 95.86% accuracy with 12.11 ms on the WISDM dataset. Hybrid models combining LSTM and CNN techniques achieved a peak accuracy of 94.80% on a real-life dataset [59]. An innovative DL-based HAR model named HAR-DeepConvLG [60] achieved classification accuracies of 98.48%, 97.52%, 98.55%, and 97.85% on the WISDM, UCI-HAR, USC-HAR, and PAMAP2 datasets, respectively. In [61], evaluations on two datasets, StanWiFi and CSI-HAR, showed accuracies of 99.62% and 98.66% for the CNN-GRU-AttNet hybrid model, which automatically extracts spatio-temporal characteristics, compared to other deep learning models (CNN, BiLSTM, LSTM, GRU, and BiGRU). A residual multi-feature fusion shrinkage network (RMFSN) [62] achieved accuracies of 98.35%, 93.89%, and 98.13% on three public datasets: WISDM, OPPORTUNITY, and UCI-HAR. UC Fusion, presented in [63], is a method focusing on the fusion of unique and common features in wearable multisensor systems for HAR; its performance on the UCI-HAR and WISDM datasets demonstrated average recognition accuracies of 96.84% and 98.85%.
2.4. Ensemble Learning-Based HAR
The study in [64] explores deep LSTM ensembles on three benchmarks, Opportunity, PAMAP2, and Skoda, introducing random parameter estimation and different loss functions for the base learners to improve classification performance. In [65], an ensemble learning model using Softmax Regression, Random Forest, XGBoost, and Extra Trees was proposed; experimental results show that the proposed model achieves around 95% accuracy. Among nine methods compared to find the best model, the best result, 98.6% accuracy, corresponded to a simple fully connected network (multi-layer perceptron) with data compressed using Fisher linear discriminant analysis (FLDA) [66].
An ensemble method has been proposed to enhance user identification efficiency. Two basic deep learning models, CNN and LSTM, were adopted to test the proposed method using data from the UCI-HAR and USC-HAD datasets. The results showed high accuracy levels for all users, with 91.78% and 92.43% accuracy for the two models, respectively, while on the USC-HAD dataset the model demonstrated an acceptable level, with 95.86% accuracy for walking-related activities [67]. In [68], a multi-branch CNN-BiLSTM model with automatic feature extraction from the raw sensor data and minimal pre-processing achieved accuracies of 96.05%, 94.29%, and 96.37% on the WISDM, PAMAP2, and UCI-HAR datasets. Experimental results obtained from an ELA model [69], consisting of a DNN, a stacked CNN + GRU, and a GRU, achieved recall, precision, accuracy, and F1-score of 96.8%, 96.8%, 96.7%, and 96.8%, respectively. Ensem-HAR, an ensemble of four classification models, "ConvLSTM-net", "CNN-net", "StackedLSTM-net", and "CNNLSTM-net", achieved 98.70%, 97.45%, and 95.05% accuracy on three benchmark datasets: WISDM, PAMAP2, and UCI-HAR [70]. Attention-based residual BiLSTM networks [71] achieved overall accuracies of 97.89%, 99.01%, and 98.37% on three public datasets: KU-HAR, WISDM, and UCI-HAR, respectively.
In [72], for gait recognition using multimodal wearable inertial sensors, a sequential convolutional LSTM network called SConvLSTM was proposed, achieving F1-scores of 99.3%, 96.6%, and 97.6% on three datasets: WISDM, UCI-HAR, and HuGaDB.
A hybrid model named DWCNN, consisting of a deep convolutional neural network (DCNN) and the continuous wavelet transform (CWT), was developed to learn time-frequency domain features; tested on five public datasets, it achieved F1 scores of 98.56%, 99.56%, 97.26%, 93.52%, and 83.52%, respectively [73]. DCapsNet, a model utilizing convolution layers and batch normalization (BN) to accelerate convergence, was proposed for the recognition of human activities and gait [74]; the model's accuracy was 97.92% and 99.30% on the UCI-HAR and WISDM datasets, and 94.7% and 97.16% on WhuGAIT datasets No. 1 and No. 2, respectively. An ensemble learning framework based on the extreme learning machine (ELM), including self-learning dimensionality reduction and the Subsampled Randomized Hadamard Transform (SRHT), achieved 99.16% accuracy on UCI-HAR [75]. A ConvLSTM using sparse learning, 1D CNNs, LSTM layers, and a self-attention mechanism achieved 98.90%, 98.09%, and 96.5% accuracy on the Opportunity, WISDM, and UCI-HAR datasets [76].
2.5. Dimensionality Reduction-Based HAR
A robust model using the Enveloped Power Spectrum (EPS) for feature extraction and LDA for dimensionality reduction, with a multi-class support vector machine (MCSVM) for HAR classification, achieved a 98.71% F1 score and 98.67% accuracy on the UCI-HAR dataset, and a 100% F1 score and 100% accuracy on the DU-MD dataset [77]. Another example of feature extraction and dimensionality reduction using EPS, LDA, and a multi-class SVM to classify human activities achieved an overall accuracy of 98.67% on the UCI-HAR dataset [78]. The suitability of CNNs for automating feature extraction, LSTMs for modeling time series, and AEs for reducing dimensionality motivated a new framework (HAR-CAEL) with better performance, built by combining AE, CNN, and LSTM architectures. Evaluating the HAR-CAEL model with several meta-parameters, including batch size, optimizer type, and number of epochs, showed accuracies of 98.57% and 97.98% on the WISDM and UCI datasets [79]. DMEFAM is a new framework for effectively extracting multiple features and improving HAR accuracy by combining Bi-GRU, SA, CBAM, and ResNet18. The best experimental results show accuracies of 99.2% on the DAAD dataset, 96.0% on the UCI-HAR dataset, and 97.9% on the WISDM dataset with three layers each for the Bi-GRU and SA components [80]. Classification of similar activities based on standardization and data normalization led to a model whose first layer applies a random forest classifier based on the XGBoost feature selection algorithm, while the second layer extracts similar human activities using Kernel Fisher Discriminant Analysis (KFDA) and classifies them with an SVM model with feature mapping, obtaining 97.69%, 97.92%, 98.12%, and 90.6% detection accuracy on the UCI DSA, UCI HAR, WISDM, and IM-WSHA databases [81].
3. Methodology
3.1. Overview
The proposed HAR system follows a comprehensive workflow that includes data collection, data preprocessing, feature extraction, model training, feature extraction for other algorithms, classification, and evaluation of models.
In the data collection step, we gather data from the widely accepted UCI-HAR dataset, which is a standard public resource commonly used in activity recognition research. The collected data then undergoes preprocessing to ensure its quality and suitability for analysis. This involves standardizing the feature values. To effectively represent the data and reduce its dimensionality, we employ LDA with five components in the feature extraction step. LDA is used to maximize the separation between different activity classes while efficiently reducing the dimensionality of the dataset. This technique helps capture the most discriminative information from the data.
We create a neural network with five hidden layers in the model training step. This neural network architecture is designed to learn complex relationships between the features and the corresponding activity labels. By iteratively adjusting the network’s parameters during the training process, the model becomes capable of recognizing patterns and making accurate predictions. The output from the fourth hidden layer of the neural network is then extracted as features for utilization in other machine learning algorithms. These extracted features capture high-level representations that can be effectively utilized in subsequent analysis and classification tasks. In the classification step, we employ various machine learning algorithms such as RF, DT, KNN, SVM, SVM-SGD, GNB, and LR. Each algorithm is trained on the extracted features to determine its effectiveness in accurately classifying different activities.
The trained models are evaluated using suitable evaluation metrics such as accuracy, precision, recall, and F1 score in the evaluation step. This helps us understand the strengths and weaknesses of each algorithm and choose the most appropriate one for accurate classification.
The workflow depicted in Figure 1 illustrates the sequential and interconnected nature of the proposed framework.
Figure 1. Workflow diagram of the proposed method.
3.2. Data Collection and Data Preprocessing
We assess the performance of our model using a commonly used activity recognition dataset known as UCI-HAR [82]. The UCI-HAR dataset holds widespread recognition and acceptance among the academic community, guaranteeing a comprehensive evaluation of our model across diverse backgrounds and application scenarios. The following sections provide detailed information about the dataset under investigation and the preprocessing methods we employed.
3.2.1. Dataset Overview
The UCI-HAR dataset was collected using the built-in MEMS IMU of a Samsung Galaxy S2 smartphone, which captured triaxial acceleration and triaxial gyroscope data. A total of 30 subjects participated in the study and performed six different gait activities: WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, and LAYING. The smartphone was securely attached to each subject’s waist, and data were recorded at a sampling rate of 50 Hz. Video recordings were made during the data acquisition process to facilitate the manual labeling of the gait activities. Before analysis, the sensor signals (accelerometer and gyroscope) underwent pre-processing. Noise filters were implemented to enhance the quality of the data. The processed signals were then divided into fixed-width sliding windows with a duration of 2.56 seconds and a 50% overlap, resulting in 128 readings per window. A Butterworth low-pass filter was used to separate the sensor acceleration signal into two distinct signals: body acceleration and gravity. The cutoff frequency of the filter was set to 0.3 Hz, as gravity primarily consists of low-frequency components. From each window, a comprehensive set of features was derived by analyzing variables in both the time and frequency domains. The dataset comprises a total of 10,299 samples, which were divided into training and test sets by the dataset publisher in a 7:3 ratio. This division resulted in 7,352 samples for training and 2,947 samples for testing.
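To make this segmentation scheme concrete, the sketch below generates fixed-width sliding windows with 50% overlap from a raw 50 Hz signal; the function and array names are illustrative, not part of the dataset's original tooling.

```python
import numpy as np

def sliding_windows(signal, window_size=128, overlap=0.5):
    """Segment a (num_samples, num_channels) signal into fixed-width windows.

    At 50 Hz, 128 samples correspond to the 2.56 s windows used in UCI-HAR.
    """
    step = int(window_size * (1 - overlap))  # 64 samples for 50% overlap
    starts = range(0, len(signal) - window_size + 1, step)
    return np.stack([signal[s:s + window_size] for s in starts])

# Example: 10 s of synthetic triaxial accelerometer data sampled at 50 Hz
acc = np.random.randn(500, 3)
print(sliding_windows(acc).shape)  # (6, 128, 3)
```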
3.2.2. Standardize Features
The process of standardizing features by removing the mean and scaling to unit variance is a common preprocessing technique in machine learning. This technique ensures that features are on a similar scale, which can be beneficial for certain algorithms and models.
The standard score, also known as the z-score, is used to calculate the standardized value of a sample:

$$z = \frac{x - u}{s}$$

In this formula, x represents the value of the sample, u represents the mean of the training samples, and s represents the standard deviation of the training samples.
By applying this standardization process, we ensure that features are centered around zero and have a unit variance, which can help in achieving a more consistent and comparable scale across different features.
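As a minimal sketch of this step with scikit-learn's StandardScaler (the matrices below are random stand-ins for the real feature tables), note that the scaler is fitted on the training split only, so u and s come from the training samples as the formula requires:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Illustrative stand-ins for the UCI-HAR train/test feature matrices
X_train = np.random.rand(7352, 561)
X_test = np.random.rand(2947, 561)

scaler = StandardScaler()                    # computes z = (x - u) / s per feature
X_train_std = scaler.fit_transform(X_train)  # u and s estimated from training data
X_test_std = scaler.transform(X_test)        # the same u and s reused on the test set
```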
3.3. Feature Extraction
The objective of this paper is to investigate the enhancement of human physical activity recognition performance through a hybrid feature extraction approach combining LDA and MLP. Furthermore, this paper utilizes SVM optimization through Stochastic Gradient Descent (SGD) to accurately classify the activities.
LDA is a statistical technique that seeks a new feature space onto which to project the data, aiming to maximize the separation between classes and effectively predict the class labels of test features [83]. It extracts k independent features from the d original features in the dataset to effectively separate the classes; the number of produced components is at most the number of classes minus one [84].
Mathematically, LDA aims to find a projection matrix $W$ that maximizes the between-class scatter and minimizes the within-class scatter. This can be formulated as the following optimization problem:

$$W^{*} = \arg\max_{W} \frac{\left| W^{T} S_{B} W \right|}{\left| W^{T} S_{W} W \right|}$$

where $S_{B}$ and $S_{W}$ represent the between-class scatter matrix and the within-class scatter matrix, respectively. To solve the optimization problem, one can perform an eigenvalue decomposition of the matrix $S_{W}^{-1} S_{B}$ and choose the eigenvectors associated with the largest eigenvalues.
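A minimal sketch of this projection with scikit-learn follows; the feature matrix and labels are random stand-ins. With six activity classes, LDA yields at most five components, which matches the five components used in this work.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Illustrative stand-ins for the standardized training features and labels
X_train_std = np.random.randn(7352, 561)
y_train = np.random.randint(0, 6, size=7352)  # six activity classes

# Six classes => at most (classes - 1) = 5 discriminant components
lda = LinearDiscriminantAnalysis(n_components=5)
X_train_lda = lda.fit_transform(X_train_std, y_train)  # supervised projection
print(X_train_lda.shape)  # (7352, 5)
```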
Figure 2 and Figure 3 illustrate pairwise plots visualizing the relationships between the five LDA components that explain the variance between different labels. Moreover, Figure 4 depicts a three-dimensional visualization of LDA using five components for both the training and test datasets.
In addition to LDA, this paper also investigates the use of MLP for feature extraction. MLP is a neural network architecture that consists of multiple layers of interconnected neurons, and it learns complex nonlinear relationships between input features and labels through the process of backpropagation. Mathematically, an MLP can be represented as a composition of affine transformations and activation functions, where the output of each neuron is computed using the input feature vector, weight matrix, bias vector, and activation function. The weights and biases of the MLP are optimized iteratively by minimizing a predefined loss function, such as mean squared error or cross-entropy.
An MLP can be used for feature extraction through its hidden layers, which act as feature extractors by learning and identifying patterns in the input data. Mathematically, the feature extraction process in an MLP can be represented as follows:
Assume an input vector $x$ of dimension $5 \times 1$, representing the five input features. The layer's weights and biases are denoted by $W$ and $b$, respectively. Here, $W$ is a matrix of dimensions $H \times 5$, where $H$ denotes the number of hidden units, and $b$ is a vector of dimensions $H \times 1$. The output of the hidden layer can be expressed as:

$$h = f(Wx + b)$$

where $f$ denotes the activation function, such as the rectified linear unit (ReLU) or the sigmoid function. The vector $h$ thus contains the features extracted from the input $x$, which can be used as input to subsequent layers for further processing or analysis. Each component $h_i$ represents the output of the corresponding hidden unit after processing the input $x$; these $h_i$ values are the extracted features and can be considered a new representation of the input data, in which the MLP has learned to emphasize certain aspects of the input features based on the training data.
Figure 5 presents the architecture of the MLP model. In this study, we obtained six features by extracting information from the fourth hidden layer, which comprises six neurons.
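The sketch below illustrates this extraction step in Keras. The five inputs, the six-neuron fourth hidden layer, and the six output classes follow the description above; the sizes of the remaining hidden layers are assumptions for illustration, not the exact architecture.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(5,)),                                   # five LDA components
    layers.Dense(32, activation="relu"),                       # hidden layer 1 (assumed size)
    layers.Dense(16, activation="relu"),                       # hidden layer 2 (assumed size)
    layers.Dense(12, activation="relu"),                       # hidden layer 3 (assumed size)
    layers.Dense(6, activation="relu", name="feature_layer"),  # hidden layer 4: six neurons
    layers.Dense(6, activation="relu"),                        # hidden layer 5 (assumed size)
    layers.Dense(6, activation="softmax"),                     # six activity classes
])

# After training, the fourth hidden layer serves as a feature extractor
feature_extractor = keras.Model(inputs=model.inputs,
                                outputs=model.get_layer("feature_layer").output)
features = feature_extractor(np.random.randn(10, 5))  # shape: (10, 6)
```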
By combining LDA and MLP, the proposed methodology aims to leverage the strengths of both techniques. The LDA step reduces the dimensionality of the feature space, while preserving the discriminative information necessary for accurate activity recognition. The reduced feature set is then fed into the MLP, which learns the complex relationships between the features and activity labels. The weights and biases of the MLP are optimized to minimize the classification error, resulting in a more accurate activity recognition system.
3.4. Classification Algorithms
Classification is a machine learning method in which the number of classes is predetermined, and each instance is labeled with one of these classes to guide the algorithm (the existing labels provide guidance for the algorithm; hence classification is a supervised learning approach). By learning from the training data, a mapping between the predefined classes and the training data can be established [85]. In this research, machine learning algorithms such as LR, GNB, DT, SVM, KNN, the SVM-SGD classifier, and RF were applied to the datasets.
Logistic Regression. Although it is called a regression model, logistic regression is regarded as a classification model [86]. In supervised classification problems, the objective of the algorithm is to determine the boundary between classes, given the discrete nature of the classes [87]. Decision boundaries vary in geometric shape from simple to complex, depending on the problem instance, and different machine learning algorithms generally treat their shape differently. In logistic regression, decision boundaries are assumed to be linear. As long as the classes are linearly separable, it is effortless to implement and yields outstanding performance. The binary output variable in logistic regression can be extended to include more than one class, but the basic version produces a binary output [88].
Support Vector Machine. The SVM divides the sample space into two categories based on a hyperplane, which is discovered through training and learning [89]. Maximizing the margin between the hyperplane and the support vectors is the objective of the SVM [90]. Using Cover's theorem, the SVM maps input data into a high-dimensional space to find the hyperplane with the maximum margin [91]. Cover's theorem states that [92] "we can map the input space to a high-dimensional space, in which a linear function will be found".
Stochastic Gradient Descent. Machine learning algorithms can be trained efficiently using Stochastic Gradient Descent (SGD) [93]. It is applicable to optimizing SVM and logistic functions, providing a rapid method to minimize loss functions [94].
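In scikit-learn terms, such an SVM trained with SGD corresponds to SGDClassifier with a hinge loss; the following is a minimal sketch on random stand-in data (the hyperparameters actually used in our experiments are listed in Section 4.2):

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

X = np.random.randn(200, 6)            # illustrative feature vectors
y = np.random.randint(0, 6, size=200)  # six activity classes

# loss="hinge" makes SGDClassifier minimize a linear SVM objective via SGD
svm_sgd = SGDClassifier(loss="hinge", penalty="elasticnet",
                        alpha=0.1, max_iter=1000, tol=1e-3)
svm_sgd.fit(X, y)
print(svm_sgd.predict(X[:5]))
```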
Gaussian Naive Bayes. Gaussian Naive Bayes is a particular type of Naive Bayes algorithm [96], used specifically when the features are continuous [95]. Its main advantage is that it requires only a small amount of training data to estimate the parameters needed for classification (its assumption of parameter independence may allow it to be used with fewer parameters than other techniques) [97].
Decision Trees. As both a classifier and a regression model, decision trees can solve classification and regression problems by dividing the data into smaller segments and filling the leaf nodes with results [98]. Decision trees are popular because they provide a simple, easily interpretable representation of complex problems and support inference tasks by producing logic rules for classification [99].
Random Forest. The random forest algorithm acts as an ensemble of tree classifiers [100], constructing multiple decision trees. It leverages decision trees by generating a multitude of them through sampling individuals and variables from the training dataset [101]. In a random forest, nodes are split using the best split from a randomly chosen subset of predictors, whereas standard trees consider the best split among all variables for each node [102].
K-Nearest Neighbors. KNN is built upon the most basic assumption underlying all prediction: instances with the same characteristics will most likely produce similar results [103]. When new data are provided, the algorithm compares them to the k closest data points to determine the class [104]. Though simple, the method is grounded in non-parametric density estimation [105] and outperforms more sophisticated methods in many cases [106].
3.5. Evaluation Metrics
To evaluate the effectiveness of the method, the study used performance metrics such as accuracy, precision, recall, and F1 score. These metrics were calculated using the generated confusion matrix values. The confusion matrix is a representation that illustrates how well a model performs in classifying samples for each category in classification. It provides information on true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN), which indicate correct and incorrect predictions for positive and negative classes. TP and TN signify correct predictions, while FP and FN represent incorrect predictions.
The performance metrics of accuracy, precision, recall, and F1-score are defined, respectively, as:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\text{Precision} = \frac{TP}{TP + FP}$$

$$\text{Recall} = \frac{TP}{TP + FN}$$

$$\text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
Accuracy represents the proportion of correct predictions relative to the overall number of input samples. Precision measures how accurate the model is at classifying positive samples; in other words, it reflects the goal of identifying all positive samples as positive without misidentifying negative samples as positive. Recall, on the other hand, is concerned only with how positive samples are classified, regardless of what happens to negative samples: if all positive samples are labeled positive by the model, the recall will be 100%, even if all negative samples are incorrectly labeled positive. Because of this trade-off, it is always possible to maximize one or the other but not both, since high recall encourages the generation of incorrect positive results, which reduces precision [107]. Therefore, the F1 score becomes necessary to maintain a trade-off between them.
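These metrics can be computed directly from the predicted and true labels, for example with scikit-learn; the labels below are illustrative, and macro averaging is one common choice for the multi-class setting:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [0, 1, 2, 2, 1, 0]  # illustrative ground-truth activity labels
y_pred = [0, 1, 2, 1, 1, 0]  # illustrative model predictions

print(accuracy_score(y_true, y_pred))                    # 5/6 correct = 0.833...
print(precision_score(y_true, y_pred, average="macro"))  # per-class precision, averaged
print(recall_score(y_true, y_pred, average="macro"))     # per-class recall, averaged
print(f1_score(y_true, y_pred, average="macro"))         # per-class F1, averaged
```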
4. Results and Discussion
In this section, we delve into the specifics of the experimental configuration and showcase the experimental results using metrics such as accuracy, precision, recall, and F-score.
4.1. Experimental Configuration
This article provides an implementation of the MLP using the Keras framework with TensorFlow as the backend. The efficiency of the deep learning models utilized in this study was evaluated through experiments conducted on the Google Colab platform.
The model is trained by minimizing the sparse categorical crossentropy loss function, and the ModelCheckpoint callback is used to save the model during training. The implementation utilizes an SGD optimizer, with a maximum of 100 epochs and a batch size of 32. Default values are maintained for the remaining hyperparameters.
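Under these settings, the training step might look like the sketch below, continuing from the model and LDA-reduced arrays sketched in Section 3.3; the checkpoint file name and the monitored quantity are illustrative assumptions rather than the exact configuration.

```python
from tensorflow import keras

model.compile(optimizer=keras.optimizers.SGD(),  # default SGD settings
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

checkpoint = keras.callbacks.ModelCheckpoint("best_mlp.keras",        # illustrative path
                                             monitor="val_accuracy",  # assumed criterion
                                             save_best_only=True)

history = model.fit(X_train_lda, y_train,
                    validation_data=(X_test_lda, y_test),
                    epochs=100, batch_size=32,
                    callbacks=[checkpoint])
```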
Additionally, this article utilizes the scikit-learn library for implementing various machine learning algorithms, including LDA. The matplotlib library is employed to visualize the results of the LDA analysis.
4.2. Hyperparameter Selection
The selected hyperparameter for RF for all feature types (original features, LDA-based features, and LDA-MLP-based features) is the number of trees in the model's forest (n_estimators), which has been set to 50.
The selected hyperparameter for DT for these feature sets is the maximum depth of the tree (max_depth), which is set to 10.
The hyperparameters chosen for KNN for all feature types are metric='manhattan' and n_neighbors=7. The 'metric' parameter refers to the distance metric used to measure the similarity between data points, determining how the algorithm calculates distances. The 'n_neighbors' parameter specifies the number of nearest neighbors considered when making predictions and thus influences the size of the neighborhood used by the algorithm.
In the SVM algorithm, for all feature types, the hyperparameters kernel='rbf', C=1000, and gamma=0.0001 were selected. The kernel='rbf' setting indicates the use of a radial basis function as the kernel, which allows for more complex decision boundaries. The parameter C=1000 controls the penalty for misclassifications, with higher values indicating a stricter penalty. Lastly, gamma=0.0001 defines the influence of each training example, with smaller values resulting in a smoother decision boundary.
The selected hyperparameters for LR for all feature types were max_iter=3000, solver='liblinear', and penalty='l2'. The max_iter parameter represents the maximum number of iterations allowed for the solver to converge, while solver='liblinear' specifies the algorithm used for optimization. Furthermore, penalty='l2' denotes the incorporation of L2 regularization, which aids in mitigating overfitting by introducing a penalty term to the loss function.
For the SVM-SGD classifier, it is worth noting that different hyperparameters were selected for each feature type. For the original features, the selected hyperparameters were max_iter=1000, tol=1e-3, alpha=0.001, and penalty='elasticnet'. The max_iter parameter determines the maximum number of iterations for convergence, while tol specifies the tolerance for early stopping. The alpha parameter controls the regularization strength, and penalty='elasticnet' indicates the use of an elastic net penalty, combining L1 and L2 regularization. For the LDA-based features, the chosen hyperparameters were max_iter=1000, tol=1e-3, alpha=0.1, and penalty='l2', where penalty='l2' indicates L2 regularization. Similarly, for the LDA-MLP-based features, the selected hyperparameters were max_iter=1000, tol=1e-3, alpha=0.1, and penalty='elasticnet'.
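Collected in one place, the configurations above can be instantiated in scikit-learn as in the sketch below; only the hyperparameters named in this section are set, and everything else keeps the library defaults (the SVM-SGD entry shows the variant used with the LDA-MLP-based features).

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.naive_bayes import GaussianNB

classifiers = {
    "RF": RandomForestClassifier(n_estimators=50),
    "DT": DecisionTreeClassifier(max_depth=10),
    "KNN": KNeighborsClassifier(metric="manhattan", n_neighbors=7),
    "SVM": SVC(kernel="rbf", C=1000, gamma=0.0001),
    "LR": LogisticRegression(max_iter=3000, solver="liblinear", penalty="l2"),
    "GNB": GaussianNB(),
    "SVM-SGD": SGDClassifier(loss="hinge", max_iter=1000, tol=1e-3,
                             alpha=0.1, penalty="elasticnet"),
}
```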
4.3. Comparison between the Models
Table 1 presents the classification accuracy, precision, recall, and F1-score obtained by utilizing original features, LDA-based feature extraction, and LDA-MLP-based feature extraction on the UCI-HAR dataset. It is evident that employing LDA as a feature extraction method across all classifiers and evaluation metrics has resulted in improved outcomes. Additionally, combining LDA with MLP has demonstrated enhanced results across all evaluation metrics compared to the original features. However, when applied to K-Nearest Neighbors, Support Vector Machine, Gaussian Naive Bayes, and Logistic Regression, the combined method yielded lower results compared to using LDA alone as a feature extractor. Lastly, in terms of classifier comparison, the SVM-SGD algorithm, in combination with LDA and MLP as feature extractors, achieved the highest scores across all metrics. It attained an accuracy of 99.52%, precision of 99.55%, recall of 99.53%, and F1-score of 99.54%.
4.4. Comparison with State-of-the-Art
To evaluate the classification performance of the proposed method, we performed a detailed comparative analysis against several state-of-the-art approaches. We aimed to assess the effectiveness and efficiency of our method in comparison to existing techniques. The assessment was specifically conducted on the UCI-HAR dataset, which is widely used in the field of human activity recognition.
For our comparative analysis, we carefully selected a set of recent studies that were published since 2021. This ensured that we included the most up-to-date and relevant approaches in our evaluation. We included studies that employed different methodologies, algorithms, and techniques, ranging from traditional machine learning models to deep learning architectures. It is important to note that our analysis focused on the most recent studies, ensuring that the comparison was based on the latest advancements in the field.
To present the results of our comparative analysis, we organized the findings in a comprehensive manner in Table 2. This table includes crucial information such as the publication dates of the analyzed papers, the methods utilized in each study, and the corresponding performance metrics, such as accuracy, precision, recall, and F1-score. These metrics provide insights into the strengths and weaknesses of each method, allowing for a thorough evaluation of their classification capabilities.
Among the methods examined, the proposed approach, called LDA-MLP-SVM-SGD, demonstrated the best performance across all metrics. It achieved an accuracy of 99.52%, precision of 99.55%, recall of 99.53%, and an F1 score of 99.54%. These results highlight the effectiveness of the proposed method in classifying the data. A few other methods also showed strong performance. For instance, SRHT-ELM-HAR [75] achieved an accuracy of 99.16% with a precision of 98% and recall of 99%. Another method, mGRRF features + XGB [38], performed even better, with an accuracy of 99.36%, precision of 99.32%, recall of 99.30%, and an F1 score of 99.31%.
Overall, the results demonstrate the advancements in classification methods, with the proposed LDA-MLP-SVM-SGD method outperforming other state-of-the-art techniques in terms of accuracy, precision, recall, and F1-score. However, the mGRRF features + XGB method [38] also exhibits strong performance and is a competitive alternative. It is worth mentioning that, according to Table 1, LDA with SVM-SGD achieved an accuracy of 99.46%, precision of 99.47%, recall of 99.46%, and an F1-score of 99.47%. These results indicate that even LDA with SVM-SGD outperforms mGRRF features + XGB in terms of classification performance.
Table 2. Analyzing Several State-of-the-Art Methods for Comparison.
| References | Year | Method | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%) |
|---|---|---|---|---|---|---|
| [66] | 2021 | MLP FLDA data | 98.6 | 99 | 99 | 99 |
| [68] | 2021 | CNN-BiLSTM | 96.37 | - | - | 96.31 |
| [54] | 2021 | Multi-Input CNN-GRU | 96.20 | - | - | 96.19 |
| [58] | 2022 | ConvBLSTM-PMwA | 96.71 | - | - | - |
| [71] | 2023 | 1DCNN+ResBLSTM+Attention | 98.37 | 98.42 | 98.43 | 98.42 |
| [72] | 2023 | SConvLSTM | 96.60 | 96.60 | 96.60 | 96.60 |
| [60] | 2023 | HAR-DeepConvLG | 97.52 | 97.58 | - | 97.51 |
| [76] | 2024 | Self-attention deep ConvLSTM | 98.09 | - | - | 98.13 |
| [63] | 2024 | UC Fusion | 96.84 | 96.35 | 96.22 | 96.27 |
| [74] | 2024 | DCapsNet | 98.43 | - | - | 98.43 |
| [62] | 2024 | RMFSN | 98.13 | - | - | 98.22 |
| [75] | 2024 | SRHT-ELM-HAR | 99.16 | 98 | 99 | 99 |
| [38] | 2024 | mGRRF features + XGB | 99.36 | 99.32 | 99.30 | 99.31 |
| Proposed method | 2024 | LDA-MLP-SVM-SGD | 99.52 | 99.55 | 99.53 | 99.54 |
Table 3 provides a comparison of different methods, including their respective publication references, years of publication, method names, accuracy, F1-scores, and the number of parameters used. Among the listed methods, the proposed LDA-MLP method demonstrates the highest performance.
In 2022, the ResNet+HC method introduced by [108] achieved an accuracy of 97.01%. In 2023, the DMEFAM method proposed by [80] achieved an accuracy of 96.00% and an F1-score of 95.80%; with 1.6 million parameters, this method utilizes a larger number of parameters compared to the ResNet+HC method but still falls short in terms of accuracy. Also in 2023, the GRU-INC method introduced by [109] achieved an accuracy of 96.27% and an F1-score of 96.26%. Although it demonstrates a stronger performance than the previous methods, it still lags behind the proposed LDA-MLP method; the GRU-INC method utilizes 666,112 parameters. The Bi-HAR method presented by [110] in 2023 achieved an accuracy of 97.89% and an F1-score of 96.47%, surpassing the previous methods in terms of accuracy; however, it requires a significantly larger number of parameters, totaling 15,017,152. In 2024, the RMFSN method developed by [62] achieved an accuracy of 98.13% and an F1-score of 98.21%. While it demonstrates high accuracy, it still does not outperform the proposed LDA-MLP method; the RMFSN method utilizes 239,846 parameters.
The proposed LDA-MLP method surpasses all other methods in terms of accuracy and F1-score. It achieves an exceptional accuracy of 99.49% and an F1-score of 99.50%, outperforming all other methods by a significant margin. Impressively, the LDA-MLP method achieves this performance while utilizing only 1,494 parameters, demonstrating its efficiency and effectiveness.
Table 3. Analyzing Several State-of-the-Art Methods for Comparison with Number of Parameters.
| References | Year | Method | Accuracy (%) | F1-score (%) | No. of Parameters |
|---|---|---|---|---|---|
| [108] | 2022 | ResNet+HC | 97.01 | - | 0.42 M |
| [80] | 2023 | DMEFAM | 96.00 | 95.80 | 1.6 M |
| [109] | 2023 | GRU-INC | 96.27 | 96.26 | 666,112 |
| [110] | 2023 | Bi-HAR | 97.89 | 96.47 | 15,017,152 |
| [62] | 2024 | RMFSN | 98.13 | 98.21 | 239,846 |
| Proposed method | 2024 | LDA-MLP | 99.49 | 99.50 | 1,494 |
5. Conclusions and Future Work
Our research introduces a hybrid feature extraction approach that combines the LDA and MLP methods. This hybrid technique effectively addresses the challenges of reducing feature vector dimensionality and accurately classifying smartphone-based human activities. By leveraging SVM Optimization with Stochastic Gradient Descent for classification, we aim to enhance the generalization performance of the classifier and mitigate overfitting issues.
Additionally, our work focuses on developing a neural network model with a constrained parameter count. This model outperforms prior deep learning methods in accurately recognizing human activities, showcasing its potential for real-world applications.
To validate the effectiveness of our proposed model, we conducted experiments comparing it with several models from state-of-the-art studies conducted on the UCI-HAR dataset. The experimental results provide compelling evidence that our approach outperforms the best-performing method on this dataset, reinforcing the superiority of our proposed method in terms of accuracy and performance.
Future work for our research will primarily focus on further exploring and refining the feature extraction and selection techniques employed in our study. We plan to investigate different feature selection algorithms to enhance the discriminative power of the selected features. Techniques like Recursive Feature Elimination (RFE), LASSO regression, or genetic algorithms can be employed to identify the most relevant and informative features for accurate classification.
References
- Raj, R.; Kos, A. An improved human activity recognition technique based on convolutional neural network. Scientific Reports 2023, 13, 22581. [Google Scholar] [CrossRef]
- Beddiar, D.R.; Nini, B.; Sabokrou, M.; Hadid, A. Vision-based human activity recognition: a survey. Multimedia Tools and Applications 2020, 79, 30509–30555. [Google Scholar] [CrossRef]
- Taylor, W.; Shah, S.A.; Dashtipour, K.; Zahid, A.; Abbasi, Q.H.; Imran, M.A. An intelligent non-invasive real-time human activity recognition system for next-generation healthcare. Sensors 2020, 20, 2653. [Google Scholar] [CrossRef] [PubMed]
- Serpush, F.; Menhaj, M.B.; Masoumi, B.; Karasfi, B.; others. Wearable sensor-based human activity recognition in the smart healthcare system. Computational intelligence and neuroscience 2022, 2022. [Google Scholar] [CrossRef] [PubMed]
- Mukherjee, A.; Bose, A.; Chaudhuri, D.P.; Kumar, A.; Chatterjee, A.; Ray, S.K.; Ghosh, A. Edge-based human activity recognition system for smart healthcare. Journal of The Institution of Engineers (India): Series B 2022, 103, 809–815. [Google Scholar] [CrossRef]
- Hsu, Y.L.; Yang, S.C.; Chang, H.C.; Lai, H.C. Human daily and sport activity recognition using a wearable inertial sensor network. IEEE Access 2018, 6, 31715–31728. [Google Scholar] [CrossRef]
- Zhuang, Z.; Xue, Y. Sport-related human activity detection and recognition using a smartwatch. Sensors 2019, 19, 5001. [Google Scholar] [CrossRef] [PubMed]
- Koskimäki, H.; Siirtola, P.; Röning, J. Myogym: introducing an open gym data set for activity recognition collected using myo armband. Proceedings of the 2017 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2017 ACM International Symposium on Wearable Computers, 2017, pp. 537–546.
- Koskimäki, H.; Siirtola, P. Recognizing Unseen Gym Activities from Streaming Data-Accelerometer Vs. Electromyogram. Distributed Computing and Artificial Intelligence, 13th International Conference. Springer, 2016, pp. 195–202.
- Vazan, M.; Masoumi, F.S.; Ou, R.; Rawassizadeh, R. Augmenting Vision-Based Human Pose Estimation with Rotation Matrix. arXiv preprint arXiv:2310.06068 2023, arXiv:2310.06068 2023. [Google Scholar]
- Ganesh, P.; Idgahi, R.E.; Venkatesh, C.B.; Babu, A.R.; Kyrarini, M. Personalized system for human gym activity recognition using an RGB camera. Proceedings of the 13th ACM International Conference on PErvasive Technologies Related to Assistive Environments, 2020, pp. 1–7.
- Fu, B.; Kirchbuchner, F.; Kuijper, A. Performing Realistic Workout Activity Recognition on Consumer Smartphones. Technologies 2020, 8, 65. [Google Scholar] [CrossRef]
- Soro, A.; Brunner, G.; Tanner, S.; Wattenhofer, R. Recognition and repetition counting for complex physical exercises with deep learning. Sensors 2019, 19, 714. [Google Scholar] [CrossRef]
- Morris, D.; Saponas, T.S.; Guillory, A.; Kelner, I. RecoFit: using a wearable sensor to find, recognize, and count repetitive exercises. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 2014, pp. 3225–3234.
- Sunil, A.; Sheth, M.H.; Shreyas, E. ; others. Usual and unusual human activity recognition in video using deep learning and artificial intelligence for security applications. 2021 Fourth International Conference on Electrical, Computer and Communication Technologies (ICECCT). IEEE, 2021, pp. 1–6.
- Malibari, A.A.; Alzahrani, J.S.; Qahmash, A.; Maray, M.; Alghamdi, M.; Alshahrani, R.; Mohamed, A.; Hilal, A.M. Quantum Water Strider Algorithm with Hybrid-Deep-Learning-Based Activity Recognition for Human–Computer Interaction. Applied Sciences 2022, 12, 6848. [Google Scholar] [CrossRef]
- Dang, L.M.; Min, K.; Wang, H.; Piran, M.J.; Lee, C.H.; Moon, H. Sensor-based and vision-based human activity recognition: A comprehensive survey. Pattern Recognition 2020, 108, 107561. [Google Scholar] [CrossRef]
- Almusawi, A.; Ali, A.H. Efficient Human Activity Recognition System Using Long Short-Term Memory. International Conference of Reliable Information and Communication Technology. Springer, 2021, pp. 73–83.
- Zhao, J.; Suleiman, B.; Alibasa, M.J. Feature Encoding by Location-Enhanced Word2Vec Embedding for Human Activity Recognition in Smart Homes. International Conference on Mobile and Ubiquitous Systems: Computing, Networking, and Services. Springer, 2022, pp. 191–202.
- Biswal, A.; Nanda, S.; Panigrahi, C.R.; Cowlessur, S.K.; Pati, B. Human activity recognition using machine learning: A review. Progress in Advanced Computing and Intelligent Engineering: Proceedings of ICACIE 2020 2021, 323–333. [Google Scholar]
- Islam, M.M.; Nooruddin, S.; Karray, F.; Muhammad, G. Human activity recognition using tools of convolutional neural networks: A state of the art review, data sets, challenges, and future prospects. Computers in Biology and Medicine 2022, 106060. [Google Scholar] [CrossRef]
- Wang, X.; Shang, J. Human Activity Recognition Based on Two-Channel Residual–GRU–ECA Module with Two Types of Sensors. Electronics 2023, 12, 1622. [Google Scholar] [CrossRef]
- Zebin, T.; Scully, P.J.; Ozanyan, K.B. Human activity recognition with inertial sensors using a deep learning approach. 2016 IEEE sensors. IEEE, 2016, pp. 1–3.
- Lattanzi, E.; Contoli, C.; Freschi, V. Do we need early exit networks in human activity recognition? Engineering Applications of Artificial Intelligence 2023, 121, 106035. [Google Scholar] [CrossRef]
- Kwapisz, J.R.; Weiss, G.M.; Moore, S.A. Activity recognition using cell phone accelerometers. ACM SigKDD Explorations Newsletter 2011, 12, 74–82. [Google Scholar] [CrossRef]
- Wu, Z.; Zhang, A.; Zhang, C. Human activity recognition using wearable devices sensor data, 2015.
- Chen, Z.; Zhu, Q.; Soh, Y.C.; Zhang, L. Robust human activity recognition using smartphone sensors via CT-PCA and online SVM. IEEE transactions on industrial informatics 2017, 13, 3070–3080. [Google Scholar] [CrossRef]
- Anguita, D.; Ghio, A.; Oneto, L.; Parra, X.; Reyes-Ortiz, J.L. Training computationally efficient smartphone–based human activity recognition models. Artificial Neural Networks and Machine Learning–ICANN 2013: 23rd International Conference on Artificial Neural Networks Sofia, Bulgaria, September 10-13, 2013. Proceedings 23. Springer, 2013, pp. 426–433.
- Seto, S.; Zhang, W.; Zhou, Y. Multivariate time series classification using dynamic time warping template selection for human activity recognition. 2015 IEEE symposium series on computational intelligence. IEEE, 2015, pp. 1399–1406.
- Sousa, W.; Souto, E.; Rodrigres, J.; Sadarc, P.; Jalali, R.; El-Khatib, K. A comparative analysis of the impact of features on human activity recognition with smartphone sensors. Proceedings of the 23rd Brazillian Symposium on Multimedia and the Web, 2017, pp. 397–404.
- Wang, A.; Chen, G.; Yang, J.; Zhao, S.; Chang, C.Y. A comparative study on human activity recognition using inertial sensors in a smartphone. IEEE Sensors Journal 2016, 16, 4566–4578. [Google Scholar] [CrossRef]
- Zebin, T.; Scully, P.J.; Ozanyan, K.B. Evaluation of supervised classification algorithms for human activity recognition with inertial sensors. 2017 IEEE SENSORS. IEEE, 2017, pp. 1–3.
- Ignatov, A. Real-time human activity recognition from accelerometer data using Convolutional Neural Networks. Applied Soft Computing 2018, 62, 915–922. [Google Scholar] [CrossRef]
- Ahmed, N.; Rafiq, J.I.; Islam, M.R. Enhanced human activity recognition based on smartphone sensor data using hybrid feature selection model. Sensors 2020, 20, 317. [Google Scholar] [CrossRef] [PubMed]
- Garcia-Gonzalez, D.; Rivero, D.; Fernandez-Blanco, E.; Luaces, M.R. A public domain dataset for real-life human activity recognition using smartphone sensors. Sensors 2020, 20, 2200. [Google Scholar] [CrossRef]
- Nia, N.G.; Kaplanoglu, E.; Nasab, A.; Qin, H. Human activity recognition using machine learning algorithms based on IMU data. 2023 5th International Conference on Bio-engineering for Smart Technologies (BioSMART). IEEE, 2023, pp. 1–8.
- Garcia-Gonzalez, D.; Rivero, D.; Fernandez-Blanco, E.; Luaces, M.R. New machine learning approaches for real-life human activity recognition using smartphone sensor-based data. Knowledge-Based Systems 2023, 262, 110260. [Google Scholar] [CrossRef]
- Thakur, D.; Biswas, S. Permutation importance based modified guided regularized random forest in human activity recognition with smartphone. Engineering Applications of Artificial Intelligence 2024, 129, 107681. [Google Scholar] [CrossRef]
- Zhou, B.; Yang, J.; Li, Q. Smartphone-based activity recognition for indoor localization using a convolutional neural network. Sensors 2019, 19, 621. [Google Scholar] [CrossRef]
- Dhanraj, S.; De, S.; Dash, D. Efficient smartphone-based human activity recognition using convolutional neural network. 2019 International conference on information technology (ICIT). IEEE, 2019, pp. 307–312.
- Jiang, X.; Lu, Y.; Lu, Z.; Zhou, H. Smartphone-based human activity recognition using CNN in frequency domain. Web and Big Data: APWeb-WAIM 2018 International Workshops: MWDA, BAH, KGMA, DMMOOC, DS, Macau, China, July 23–25, 2018, Revised Selected Papers 2. Springer, 2018, pp. 101–110.
- Qi, W.; Su, H.; Yang, C.; Ferrigno, G.; De Momi, E.; Aliverti, A. A fast and robust deep convolutional neural networks for complex human activity recognition using smartphone. Sensors 2019, 19, 3731. [Google Scholar] [CrossRef] [PubMed]
- Teng, Q.; Wang, K.; Zhang, L.; He, J. The layer-wise training convolutional neural networks using local loss for sensor-based human activity recognition. IEEE Sensors Journal 2020, 20, 7265–7274. [Google Scholar] [CrossRef]
- Xia, K.; Huang, J.; Wang, H. LSTM-CNN architecture for human activity recognition. IEEE Access 2020, 8, 56855–56866. [Google Scholar] [CrossRef]
- Yen, C.T.; Liao, J.X.; Huang, Y.K. Feature fusion of a deep-learning algorithm into wearable sensor devices for human activity recognition. Sensors 2021, 21, 8294. [Google Scholar] [CrossRef]
- Ankita.; Rani, S.; Babbar, H.; Coleman, S.; Singh, A.; Aljahdali, H.M. An efficient and lightweight deep learning model for human activity recognition using smartphones. Sensors 2021, 21, 3845. [CrossRef]
- Huang, W.; Zhang, L.; Teng, Q.; Song, C.; He, J. The convolutional neural networks training with channel-selectivity for human activity recognition based on sensors. IEEE Journal of Biomedical and Health Informatics 2021, 25, 3834–3843. [Google Scholar] [CrossRef] [PubMed]
- Mekruksavanich, S.; Jitpattanakul, A. Automatic Recognition of Construction Worker Activities Using Deep Learning Approaches and Wearable Inertial Sensors. Intelligent Automation & Soft Computing 2023, 36. [Google Scholar]
- Papadopoulos, K.; Jelali, M. A Comparative Study on Recent Progress of Machine Learning-Based Human Activity Recognition with Radar. Applied Sciences 2023, 13, 12728. [Google Scholar] [CrossRef]
- Tehrani, A.; Yadollahzadeh-Tabari, M.; Zehtab-Salmasi, A.; Enayatifar, R. Wearable Sensor-Based Human Activity Recognition System Employing Bi-LSTM Algorithm. The Computer Journal 2023, bxad035. [Google Scholar] [CrossRef]
- El Ghazi, M.; Aknin, N. Optimizing Deep LSTM Model through Hyperparameter Tuning for Sensor-Based Human Activity Recognition in Smart Home. Informatica 2024, 47. [Google Scholar] [CrossRef]
- Abbaspour, S.; Fotouhi, F.; Sedaghatbaf, A.; Fotouhi, H.; Vahabi, M.; Linden, M. A comparative analysis of hybrid deep learning models for human activity recognition. Sensors 2020, 20, 5707.
- Mekruksavanich, S.; Jitpattanakul, A. LSTM networks using smartphone data for sensor-based human activity recognition in smart homes. Sensors 2021, 21, 1636.
- Dua, N.; Singh, S.N.; Semwal, V.B. Multi-input CNN-GRU based human activity recognition using wearable sensors. Computing 2021, 103, 1461–1478.
- Nafea, O.; Abdul, W.; Muhammad, G.; Alsulaiman, M. Sensor-based human activity recognition with spatio-temporal deep learning. Sensors 2021, 21, 2141.
- Irfan, S.; Anjum, N.; Masood, N.; Khattak, A.S.; Ramzan, N. A novel hybrid deep learning model for human activity recognition based on transitional activities. Sensors 2021, 21, 8227.
- Khatun, M.A.; Yousuf, M.A.; Ahmed, S.; Uddin, M.Z.; Alyami, S.A.; Al-Ashhab, S.; Akhdar, H.F.; Khan, A.; Azad, A.; Moni, M.A. Deep CNN-LSTM with self-attention model for human activity recognition using wearable sensor. IEEE Journal of Translational Engineering in Health and Medicine 2022, 10, 1–16.
- Yin, X.; Liu, Z.; Liu, D.; Ren, X. A Novel CNN-based Bi-LSTM parallel model with attention mechanism for human activity recognition with noisy data. Scientific Reports 2022, 12, 7878.
- Garcia-Gonzalez, D.; Rivero, D.; Fernandez-Blanco, E.; Luaces, M.R. Deep learning models for real-life human activity recognition from smartphone sensor data. Internet of Things 2023, 24, 100925.
- Ding, W.; Abdel-Basset, M.; Mohamed, R. HAR-DeepConvLG: Hybrid deep learning-based model for human activity recognition in IoT applications. Information Sciences 2023, 646, 119394.
- Mekruksavanich, S.; Phaphan, W.; Hnoohom, N.; Jitpattanakul, A. Attention-based hybrid deep learning network for human activity recognition using WiFi channel state information. Applied Sciences 2023, 13, 8884.
- Zeng, F.; Guo, M.; Tan, L.; Guo, F.; Liu, X. Wearable Sensor-Based Residual Multifeature Fusion Shrinkage Networks for Human Activity Recognition. Sensors 2024, 24, 758.
- Liu, K.; Gao, C.; Li, B.; Liu, W. Human activity recognition through deep learning: Leveraging unique and common feature fusion in wearable multi-sensor systems. Applied Soft Computing 2024, 151, 111146.
- Guan, Y.; Plötz, T. Ensembles of deep LSTM learners for activity recognition using wearables. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 2017, 1, 1–28.
- Xu, S.; Tang, Q.; Jin, L.; Pan, Z. A cascade ensemble learning model for human activity recognition with smartphones. Sensors 2019, 19, 2307.
- Kaspour, S.; Raj, N.; Mishra, A.; Yassine, A.; Eustaquio Alves De Oliveira, T. Searching Efficient Models for Human Activity Recognition. Proceedings of the 2021 6th International Conference on Biomedical Imaging, Signal Processing, 2021, pp. 40–45.
- Mekruksavanich, S.; Jitpattanakul, A. Biometric user identification based on human activity recognition using wearable sensors: An experiment using deep learning models. Electronics 2021, 10, 308.
- Challa, S.K.; Kumar, A.; Semwal, V.B. A multibranch CNN-BiLSTM model for human activity recognition using wearable sensor data. The Visual Computer 2022, 38, 4095–4109.
- Tan, T.H.; Wu, J.Y.; Liu, S.H.; Gochoo, M. Human activity recognition using an ensemble learning algorithm with smartphone sensor data. Electronics 2022, 11, 322.
- Bhattacharya, D.; Sharma, D.; Kim, W.; Ijaz, M.F.; Singh, P.K. Ensem-HAR: An ensemble deep learning model for smartphone sensor-based human activity recognition for measurement of elderly health monitoring. Biosensors 2022, 12, 393.
- Zhang, J.; Liu, Y.; Yuan, H. Attention-Based Residual BiLSTM Networks for Human Activity Recognition. IEEE Access 2023.
- Shi, L.F.; Liu, Z.Y.; Zhou, K.J.; Shi, Y.; Jing, X. Novel deep learning network for gait recognition using multimodal inertial sensors. Sensors 2023, 23, 849.
- Vuong, T.H.; Doan, T.; Takasu, A. Deep Wavelet Convolutional Neural Networks for Multimodal Human Activity Recognition Using Wearable Inertial Sensors. Sensors 2023, 23, 9721.
- Sezavar, A.; Atta, R.; Ghanbari, M. DCapsNet: Deep capsule network for human activity and gait recognition with smartphone sensors. Pattern Recognition 2024, 147, 110054.
- Thakur, D.; Pal, A. Subsampled Randomized Hadamard Transformation-based Ensemble Extreme Learning Machine for Human Activity Recognition. ACM Transactions on Computing for Healthcare 2024, 5, 1–23.
- Ullah, S.; Pirahandeh, M.; Kim, D.H. Self-attention deep ConvLSTM with sparse-learned channel dependencies for wearable sensor-based human activity recognition. Neurocomputing 2024, 571, 127157.
- Ahmed Bhuiyan, R.; Ahmed, N.; Amiruzzaman, M.; Islam, M.R. A robust feature extraction model for human activity characterization using 3-axis accelerometer and gyroscope data. Sensors 2020, 20, 6990.
- Bhuiyan, R.A.; Amiruzzaman, M.; Ahmed, N.; Islam, M.R. Efficient frequency domain feature extraction model using EPS and LDA for human activity recognition. 2020 3rd IEEE International Conference on Knowledge Innovation and Invention (ICKII). IEEE, 2020, pp. 344–347.
- Thakur, D.; Roy, S.; Biswas, S.; Ho, E.S.; Chattopadhyay, S.; Shetty, S. A Novel Smartphone-Based Human Activity Recognition Approach using Convolutional Autoencoder Long Short-Term Memory Network. 2023 IEEE 24th International Conference on Information Reuse and Integration for Data Science (IRI). IEEE, 2023, pp. 146–153.
- Wang, Y.; Xu, H.; Liu, Y.; Wang, M.; Wang, Y.; Yang, Y.; Zhou, S.; Zeng, J.; Xu, J.; Li, S.; others. A Novel Deep Multifeature Extraction Framework Based on Attention Mechanism Using Wearable Sensor Data for Human Activity Recognition. IEEE Sensors Journal 2023, 23, 7188–7198.
- Tan, Q.; Qin, Y.; Tang, R.; Wu, S.; Cao, J. A Multi-Layer Classifier Model XR-KS of Human Activity Recognition for the Problem of Similar Human Activity. Sensors 2023, 23, 9613.
- Anguita, D.; Ghio, A.; Oneto, L.; Parra, X.; Reyes-Ortiz, J.L.; others. A public domain dataset for human activity recognition using smartphones. ESANN, 2013, Vol. 3, p. 3.
- Elakkiya, M.K.; others. Toward improving the accuracy in the diagnosis of schizophrenia using functional magnetic resonance imaging (fMRI). In Cognitive Systems and Signal Processing in Image Processing; Elsevier, 2022; pp. 293–318.
- Anowar, F.; Sadaoui, S.; Selim, B. Conceptual and empirical comparison of dimensionality reduction algorithms (PCA, KPCA, LDA, MDS, SVD, LLE, Isomap, LE, ICA, t-SNE). Computer Science Review 2021, 40, 100378.
- Bi, Z.J.; Han, Y.Q.; Huang, C.Q.; Wang, M. Gaussian naive Bayesian data classification model based on clustering algorithm. 2019 International Conference on Modeling, Analysis, Simulation Technologies and Applications (MASTA 2019). Atlantis Press, 2019, pp. 396–400.
- Subasi, A. Practical machine learning for data analysis using Python; Academic Press, 2020.
- Gudivada, V.N.; Irfan, M.T.; Fathi, E.; Rao, D.L. Cognitive analytics: Going beyond big data analytics and machine learning. In Handbook of statistics; Elsevier, 2016; Vol. 35, pp. 169–205.
- Bartosik, A.; Whittingham, H. Evaluating safety and toxicity. In The Era of Artificial Intelligence, Machine Learning, and Data Science in the Pharmaceutical Industry; Elsevier, 2021; pp. 119–137.
- Li, S.; others. Text classification based on machine learning methods, 2019.
- Gove, R.; Faytong, J. Machine learning and event-based software testing: classifiers for identifying infeasible GUI event sequences. In Advances in computers; Elsevier, 2012; Vol. 86, pp. 109–135.
- Vaibhaw, J.S.; Pattnaik, P. Brain-computer interfaces and their applications. An industrial IoT approach for pharmaceutical industry growth 2020, 2, 31–54.
- Yeung, D.S.; Cloete, I.; Shi, D.; Ng, W.W.Y. Sensitivity analysis for neural networks; Springer, 2010.
- Sakr, C.; Patil, A.; Zhang, S.; Kim, Y.; Shanbhag, N. Minimum precision requirements for the SVM-SGD learning algorithm. 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2017, pp. 1138–1142.
- Wilbur, W.J.; Kim, W. Stochastic gradient descent and the prediction of MeSH for PubMed records. AMIA Annual Symposium Proceedings. American Medical Informatics Association, 2014, Vol. 2014, p. 1198.
- Yadav, N.; Kumar, A.; Bhatnagar, R.; Verma, V.K. City crime mapping using machine learning techniques. The International Conference on Advanced Machine Learning Technologies and Applications (AMLTA2019) 4. Springer, 2020, pp. 656–668.
- Jahromi, A.H.; Taheri, M. A non-parametric mixture of Gaussian naive Bayes classifiers based on local independent features. 2017 Artificial intelligence and signal processing conference (AISP). IEEE, 2017, pp. 209–212.
- Makruf, M.; Bramantoro, A.; Alyamani, H.J.; Alesawi, S.; Alturki, R. Classification methods comparison for customer churn prediction in the telecommunication industry. International Journal of Advanced and Applied Sciences 2021, 8.
- Basha, S.M.; Rajput, D.S. Survey on evaluating the performance of machine learning algorithms: Past contributions and future roadmap. In Deep Learning and Parallel Computing Environment for Bioengineering Systems; Elsevier, 2019; pp. 153–164.
- Amor, N.B.; Benferhat, S.; Elouedi, Z. Qualitative classification with possibilistic decision trees. In Modern Information Processing; Elsevier, 2006; pp. 159–169.
- Adetiloye, T.; Awasthi, A. Predicting short-term congested traffic flow on urban motorway networks. In Handbook of Neural Computation; Elsevier, 2017; pp. 145–165.
- Pink, C.M. Forensic ancestry assessment using cranial nonmetric traits traditionally applied to biological distance studies. In Biological distance analysis; Elsevier, 2016; pp. 213–230.
- Liaw, A.; Wiener, M.; others. Classification and regression by randomForest. R News 2002, 2, 18–22.
- Richman, J.S. Multivariate neighborhood sample entropy: a method for data reduction and prediction of complex data. In Methods in enzymology; Elsevier, 2011; Vol. 487, pp. 397–408.
- Chanal, D.; Steiner, N.Y.; Petrone, R.; Chamagne, D.; Péra, M.C. Online diagnosis of PEM fuel cell by fuzzy C-means clustering, 2022.
- Nadkarni, P. Core Technologies: Data Mining and "Big Data". In Clinical Research Computing; Elsevier, 2016; Chapter 10.
- Ebbels, T.M. Non-linear methods for the analysis of metabolic profiles. In The Handbook of Metabonomics and Metabolomics; Elsevier, 2007; pp. 201–226.
- Chakraborty, S.; Sambhavi, S.; Nandy, A. Deep Learning in Gait Abnormality Detection: Principles and Illustrations. Bioinformatics and Medical Applications: Big Data Using Deep Learning Algorithms 2022, 63–72.
- Han, C.; Zhang, L.; Tang, Y.; Huang, W.; Min, F.; He, J. Human activity recognition using wearable sensors by heterogeneous convolutional neural networks. Expert Systems with Applications 2022, 198, 116764.
- Mim, T.R.; Amatullah, M.; Afreen, S.; Yousuf, M.A.; Uddin, S.; Alyami, S.A.; Hasan, K.F.; Moni, M.A. GRU-INC: An inception-attention based approach using GRU for human activity recognition. Expert Systems with Applications 2023, 216, 119419.
- Venkatachalam, K.; Yang, Z.; Trojovskỳ, P.; Bacanin, N.; Deveci, M.; Ding, W. Bimodal HAR-An efficient approach to human activity analysis and recognition using bimodal hybrid classifiers. Information Sciences 2023, 628, 542–557.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).