1. Introduction
In recent years, with the continuous development of artificial intelligence (AI) technology, image and speech emotion recognition have become important applications. AI technologies are widely applied in many fields, such as driver monitoring [1], fraud detection [2], medical care [3], and education [4]. Embedded devices play an important role in many of these application scenarios, so achieving neural network model inference on embedded devices such as FPGA chips has become particularly important.
For data with temporal dependencies, such as consecutive images, speech, and natural language, using a temporal model can enhance performance. Therefore, the model architecture used in this paper first applies CNNs to extract local features from the data. To enhance the model's learning of the temporal structure, LSTM neural networks are then used to model the feature sequences. Finally, an FCNN is employed to classify the results.
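A minimal sketch of such a CLDNN, written with the Keras API used in this paper, is shown below. The layer settings follow Table 7 (six LFLBs with 16 filters each, an LSTM with 20 units, and an 8-class softmax output) and assume 30 grayscale frames of 100×100 pixels per sample; the ReLU activation, "same" padding, optimizer choice, and the use of TimeDistributed to package the 30 per-frame feature vectors are illustrative assumptions rather than details confirmed by the source.

```python
# Illustrative Keras reconstruction of the CLDNN described in this paper (see Table 7).
from tensorflow.keras import layers, models

FRAMES, H, W = 30, 100, 100               # 30 consecutive 100x100 grayscale frames per sample

def lflb(x, kernel_size, pool_size):
    """Local feature learning block: Conv2D + BatchNormalization + MaxPooling2D."""
    x = layers.Conv2D(16, kernel_size, strides=1, padding="same", activation="relu")(x)
    x = layers.BatchNormalization()(x)
    return layers.MaxPooling2D(pool_size, strides=2, padding="same")(x)

# Per-frame CNN: LFLB 1-4 use 5x5 kernels/pools, LFLB 5-6 use 3x3 (Table 7).
frame_in = layers.Input(shape=(H, W, 1))
x = frame_in
for k, p in [(5, 5)] * 4 + [(3, 3)] * 2:
    x = lflb(x, k, p)
frame_features = layers.Flatten()(x)      # 64 local features per frame (Table 6)
frame_cnn = models.Model(frame_in, frame_features)

# Sequence model: package the 30 per-frame features, then LSTM + softmax classifier.
seq_in = layers.Input(shape=(FRAMES, H, W, 1))
x = layers.TimeDistributed(frame_cnn)(seq_in)
x = layers.LSTM(20)(x)
x = layers.BatchNormalization()(x)
out = layers.Dense(8, activation="softmax")(x)   # 8 emotion classes (RAVDESS)

model = models.Model(seq_in, out)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```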
Emotion recognition is a technology that utilizes various signals, such as facial images and speech, to analyze and identify emotional states. When facial images are used for emotion recognition, the current mainstream approach is to extract features from the images using 2-D convolutional neural networks, followed by prediction and classification, as demonstrated in [5,6]. Emotion recognition technologies are crucial for understanding and analyzing human emotions: they can help us better understand the emotional state of the user and provide more accurate and human-like solutions in related application areas [7]. Unlike speech recognition, where data preprocessing is crucial and involves steps such as audio cropping, noise reduction, and feature extraction, image recognition requires less preprocessing, since raw images are typically used with minimal manipulation. Speech recognition depends heavily on high-quality preprocessing to handle variability, while image recognition can often learn features directly from raw images. Many studies have shown that the quality of data preprocessing and feature extraction significantly affects the performance of speech emotion recognition models; for example, the experimental results in [8] show that the accuracy of machine learning models varies with the speech features used for training. Consequently, in order to obtain accurate and stable emotion recognition performance, this paper focuses on emotion recognition based on consecutive facial images.
However, the accuracy of facial emotion recognition depends largely on the quality of the input signals, such as the contrast, brightness, and focus of the images. Consequently, the reliability of emotion recognition may be significantly reduced if any element of the image fails to meet standard requirements, for example because of overexposure or blurring. For these reasons, this paper proposes consecutive facial pre-processing and recognition methods, so that emotion recognition can be performed in a way better suited to the user's environment.
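As one illustration of the kind of input-quality screening suggested above, the short sketch below estimates exposure and focus for a grayscale frame using OpenCV; the function name and threshold values are hypothetical and are not taken from this paper.

```python
# Hypothetical pre-check of image quality (exposure and focus) before emotion recognition.
import cv2

def frame_quality_ok(gray, bright_range=(40, 220), min_sharpness=100.0):
    """Return True if the grayscale frame is neither over-/under-exposed nor blurred.

    bright_range and min_sharpness are illustrative thresholds, not values from the paper.
    """
    brightness = float(gray.mean())                    # mean intensity as a crude exposure measure
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()  # variance of Laplacian as a focus measure
    return bright_range[0] <= brightness <= bright_range[1] and sharpness >= min_sharpness
```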
1.1. Field-Programmable Gate Array
The implementation of intelligent systems such as automatic emotion recognition in embedded systems faces many challenges, e.g., real-time requirements, resource constraints, and low power consumption requirements. Therefore, when implementing facial emotion recognition, the choice of hardware platform is crucial to the system's efficiency and performance. FPGAs (Field-Programmable Gate Arrays) have become one of the ideal platforms for implementing emotion recognition owing to their high customizability, parallel processing capabilities, and low power consumption.
The FPGA is a reconfigurable embedded device commonly used in digital logic and digital signal processing applications. The FPGA's high flexibility and programmability enable its wide application in various fields, including IC testing [9], embedded systems [10,11,12], and the IoT (Internet of Things) [13].
The features of FPGA include:
Reconfigurability: FPGAs are reconfigurable [14]; developers can define the digital logic circuits through programming and repeatedly redesign the FPGA's functions according to application requirements.
High parallel processing capability: FPGAs have multiple independent logic circuits and data paths that can run in parallel, enabling them to efficiently perform parallel processing for multiple tasks and hence provide high-performance computing power.
Low latency and high-frequency operation: Because an FPGA's logic circuits are composed of gate arrays and can be highly optimized, it can achieve low latency and high-frequency operation, making it ideal for applications requiring high-speed processing.
Customizability: FPGAs are highly flexible in customization and can be designed and optimized according to application requirements, including the design of logic circuits, data paths, memory, and interfaces.
Software and hardware co-design: FPGAs provide the ability to co-design software and hardware on a single chip [15], which offers higher system integration and performance.
Suitable for rapid development and testing: FPGAs have a rapid development cycle, allowing developers to develop and test designs within a short period [16].
1.2. Experimental Protocol
This paper utilizes two deep learning frameworks, TensorFlow and Keras, to train emotion recognition models for consecutive facial image signals on a PC. The parameters of the trained models are transferred to an FPGA chip, where neural network model inference algorithms reproduce the computations performed by the deep learning frameworks and yield the final classification results. For consecutive facial emotion recognition, this paper dynamically captures 30 frames from a video as one consecutive image data sample. The facial regions are extracted using the open-source face detection model from OpenCV. The CLDNN (Convolutional Long Short-Term Memory Fully Connected Deep Neural Networks) model architecture proposed in [17] is used to build and train the ML model, and the trained model is then deployed on the FPGA chip for inference.
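A minimal sketch of this capture-and-crop step is given below, using the Haar cascade frontal face model bundled with OpenCV. The 30-frame count and 100×100 face size follow the text; sampling the frames at a fixed interval computed from the total frame count is one plausible reading of the procedure in Figure 3, and the [0, 1] normalization is an assumption.

```python
# Hypothetical sketch: sample 30 frames from a video and crop the face region with OpenCV.
import cv2
import numpy as np

N_FRAMES, SIZE = 30, 100
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def video_to_sequence(path):
    cap = cv2.VideoCapture(path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    interval = max(total // N_FRAMES, 1)             # frame interval, as in Figure 3
    frames = []
    for i in range(N_FRAMES):
        cap.set(cv2.CAP_PROP_POS_FRAMES, i * interval)
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        if len(faces) == 0:
            continue                                  # skip frames with no detected face
        x, y, w, h = faces[0]
        face = cv2.resize(gray[y:y + h, x:x + w], (SIZE, SIZE))
        frames.append(face.astype(np.float32) / 255.0)
    cap.release()
    return np.stack(frames) if len(frames) == N_FRAMES else None  # (30, 100, 100)
```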
6. Conclusions
This paper proposed deep-learning-based methods for consecutive facial emotion recognition. The proposed model was implemented on an embedded system with an FPGA chip, without the need for a deep learning framework during the model inference process. For consecutive facial emotion recognition, this paper captured 30 frames of an image sequence to represent one consecutive image segment. The Haar cascade frontal face detection model from OpenCV was utilized to extract the facial regions from the images, followed by grayscale conversion and resizing to reduce the computational burden on the embedded device. The preprocessed images were fed into local feature learning blocks to extract local features from individual frames, and these features were packaged into a feature sequence representing a consecutive image segment. The feature sequence was then passed through an LSTM layer for temporal sequence learning, and finally a fully connected layer was used for classification.
Next, the parameters of the deep learning models for consecutive facial emotion recognition, as well as the test dataset, were loaded into the FPGA's memory for model inference. This research implemented the neural network model inference algorithms in Python; through high-level synthesis, the algorithms were then automatically transformed from the high-level language into circuit functionality. This allowed model inference to be realized on the embedded device without the need for deep learning frameworks. For the model inference of consecutive facial emotion recognition, the proposed method achieved the same test accuracy as that obtained on a PC using deep learning frameworks, indicating that the proposed neural network model inference algorithms can match the performance of the deep learning frameworks. The average testing time for a single consecutive image sample was 11.70 seconds, i.e., an average of 0.39 seconds per 100×100-pixel image, which corresponds to 2.56 FPS for the implemented hardware. The experimental results for the designed FPGA chip verify that the implemented AI (Artificial Intelligence) chip based on FPGA is feasible and suitable for AI edge computing applications.
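For illustration, the sketch below shows what a framework-free inference kernel can look like: a NumPy-only 2-D convolution with "same" padding and stride 1, the operation type that dominates the layer-wise execution time in Table 18. It is a hypothetical reconstruction written for clarity, not the exact algorithm that was synthesized to hardware.

```python
# Hypothetical NumPy-only 2-D convolution ("same" padding, stride 1), illustrating
# framework-free inference of a Conv2D layer; not the exact FPGA implementation.
import numpy as np

def conv2d_same(x, w, b):
    """x: (H, W, C_in), w: (k, k, C_in, C_out), b: (C_out,) -> (H, W, C_out)."""
    H, W, _ = x.shape
    k, _, _, c_out = w.shape
    pad = k // 2                                        # valid for odd kernel sizes (3, 5)
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    out = np.zeros((H, W, c_out), dtype=x.dtype)
    for i in range(H):
        for j in range(W):
            patch = xp[i:i + k, j:j + k, :]             # k x k x C_in window
            out[i, j, :] = np.tensordot(patch, w, axes=3) + b
    return out
```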
Finally, according to the experimental results in Section 4, the proposed deep learning model achieves much higher recognition rates on the RAVDESS, BAUM-1s, and IEMOCAP databases than those reported in other papers. This demonstrates that the proposed methods outperform the methods in the related literature.
Figure 1.
The CLDNN model architecture, composed of multiple CNNs, LSTM neural networks, and DNNs.
Figure 2.
The database is split into training, validation, and testing datasets.
Figure 3.
Calculating the frame interval from the total number of frames in a video file and capturing 30 frames at that interval to represent one consecutive image data sample.
Figure 4.
Pre-processing of the videos in the proposed facial emotion recognition method.
Figure 5.
Using the OpenCV face detection model to capture the facial region in the images.
Figure 6.
The LFLB used in this paper, including a 2-D convolutional layer, a batch normalization layer, and a max pooling layer.
Figure 7.
The flowchart of the proposed method for consecutive facial emotion recognition.
Figure 8.
The relationship between the number of memory units in the LSTM layer and the accuracy of the proposed consecutive facial emotion recognition model.
Figure 9.
The confusion matrix before (a) and after (b) normalization, obtained from 10-fold cross-validation of the consecutive facial emotion recognition method proposed in this paper.
Figure 10.
Up-sampling for the BAUM-1s database: (a) before up-sampling; (b) after up-sampling.
Figure 11.
The confusion matrix before (a) and after (b) normalization, obtained from 10-fold cross-validation of the consecutive facial emotion recognition method proposed in this paper on the BAUM-1s database.
Figure 12.
The confusion matrix before (a) and after (b) normalization, obtained from 10-fold cross-validation of the consecutive facial emotion recognition method proposed in this paper on the eNTERFACE’05 database.
Table 2.
The experimental environment of the proposed emotion recognition methods.
| Experimental environment | |
| --- | --- |
| CPU | Intel® Core™ i7-10700 CPU @ 2.90 GHz (Intel Corporation, Santa Clara, CA, USA) |
| GPU | NVIDIA GeForce RTX 3090 32 GB (NVIDIA Corporation, Santa Clara, CA, USA) |
| IDE | Jupyter Notebook (Python 3.7.6) |
| Deep learning frameworks | TensorFlow 2.9.1, Keras 2.9.0 |
Table 3.
The quantity and proportion of data for each emotion in the RAVDESS database.
| Label | Number of Data | Proportion |
| --- | --- | --- |
| Angry | 376 | 15.33% |
| Calm | 376 | 15.33% |
| Disgust | 192 | 7.83% |
| Fear | 376 | 15.33% |
| Happy | 376 | 15.33% |
| Neutral | 188 | 7.39% |
| Sad | 376 | 15.33% |
| Surprised | 192 | 7.83% |
| Total | 2,452 | 100% |
Table 4.
The quantity and proportion of data for each emotion in the BAUM-1s database.
| Label | Number of Data | Proportion |
| --- | --- | --- |
| Angry | 59 | 10.85% |
| Disgust | 86 | 15.81% |
| Fear | 38 | 6.99% |
| Happy | 179 | 32.90% |
| Sad | 139 | 25.55% |
| Surprised | 43 | 7.90% |
| Total | 544 | 100% |
Table 5.
The quantity and proportion of data for each emotion in the eNTERFACE’05 database.
| Label | Number of Data | Proportion |
| --- | --- | --- |
| Angry | 211 | 16.71% |
| Disgust | 211 | 16.71% |
| Fear | 211 | 16.71% |
| Happy | 208 | 16.47% |
| Sad | 211 | 16.71% |
| Surprised | 211 | 16.71% |
| Total | 1,263 | 100% |
Table 6.
The impact of using different numbers of LFLBs on model accuracy.
| Number of LFLBs | Number of Local Features | Accuracy |
| --- | --- | --- |
| 3 | 2704 | 28.89% |
| 4 | 784 | 52.96% |
| 5 | 256 | 88.58% |
| 6 | 64 | 99.51% |
| 7 | 16 | 32.68% |
Table 7.
The parameters of the proposed CLDNN model for consecutive facial emotion recognition.
| Model Architecture | | Information |
| --- | --- | --- |
| LFLB 1 | Conv2d (Input), Batch_normalization, Max_pooling2d | Conv2d: Filters = 16, Kernel_size = 5, Strides = 1; Max_pooling2d: Pool_size = 5, Strides = 2 |
| LFLB 2 | Conv2d, Batch_normalization, Max_pooling2d | Conv2d: Filters = 16, Kernel_size = 5, Strides = 1; Max_pooling2d: Pool_size = 5, Strides = 2 |
| LFLB 3 | Conv2d, Batch_normalization, Max_pooling2d | Conv2d: Filters = 16, Kernel_size = 5, Strides = 1; Max_pooling2d: Pool_size = 5, Strides = 2 |
| LFLB 4 | Conv2d, Batch_normalization, Max_pooling2d | Conv2d: Filters = 16, Kernel_size = 5, Strides = 1; Max_pooling2d: Pool_size = 5, Strides = 2 |
| LFLB 5 | Conv2d, Batch_normalization, Max_pooling2d | Conv2d: Filters = 16, Kernel_size = 3, Strides = 1; Max_pooling2d: Pool_size = 3, Strides = 2 |
| LFLB 6 | Conv2d, Batch_normalization, Max_pooling2d | Conv2d: Filters = 16, Kernel_size = 3, Strides = 1; Max_pooling2d: Pool_size = 3, Strides = 2 |
| Concatenation | | Packages every 30 image features into a consecutive facial image feature sequence |
| Flatten | | |
| Reshape | | |
| LSTM | | Unit = 20 |
| Batch_normalization | | |
| Dense (Output) | | Unit = 8, Activation = "softmax" |
Table 8.
The loss and accuracy of the proposed consecutive facial emotion recognition method during training, validation, and testing.
| | Training Loss | Training Acc | Validation Loss | Validation Acc | Testing Loss | Testing Acc |
| --- | --- | --- | --- | --- | --- | --- |
| Fold 1 | 0.0308 | 1.0000 | 0.0749 | 1.0000 | 0.4998 | 0.9919 |
| Fold 2 | 0.0366 | 1.0000 | 0.0745 | 1.0000 | 0.4517 | 1.0000 |
| Fold 3 | 0.0192 | 1.0000 | 0.0415 | 1.0000 | 0.1363 | 1.0000 |
| Fold 4 | 0.0206 | 1.0000 | 0.0428 | 1.0000 | 0.2667 | 0.9959 |
| Fold 5 | 0.0369 | 1.0000 | 0.0593 | 1.0000 | 0.2978 | 0.9919 |
| Fold 6 | 0.0310 | 1.0000 | 0.0703 | 1.0000 | 0.4527 | 0.9959 |
| Fold 7 | 0.0179 | 1.0000 | 0.0382 | 1.0000 | 0.1913 | 0.9959 |
| Fold 8 | 0.0118 | 1.0000 | 0.0206 | 1.0000 | 0.0459 | 0.9959 |
| Fold 9 | 0.0225 | 1.0000 | 0.0378 | 1.0000 | 0.1757 | 0.9919 |
| Fold 10 | 0.0348 | 1.0000 | 0.0769 | 1.0000 | 0.2238 | 0.9919 |
| Average | 0.0262 | 1.0000 | 0.0536 | 1.0000 | 0.2741 | 0.9951 |
Table 9.
The accuracy, precision, recall, and F1-score of each emotion calculated by the confusion matrix of the proposed consecutive facial emotion recognition method.
| Label | Accuracy | Precision | Recall | F1-score |
| --- | --- | --- | --- | --- |
| Angry | 0.9992 | 0.9967 | 0.9967 | 0.9967 |
| Calm | 0.9967 | 0.9949 | 0.9850 | 0.9899 |
| Disgust | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| Fear | 0.9992 | 0.9960 | 1.0000 | 0.9980 |
| Happy | 0.9996 | 0.9976 | 1.0000 | 0.9988 |
| Neutral | 0.9976 | 0.9559 | 1.0000 | 0.9774 |
| Sad | 0.9992 | 1.0000 | 0.9917 | 0.9959 |
| Surprised | 0.9988 | 1.0000 | 0.9875 | 0.9937 |
| Average | 0.9988 | 0.9926 | 0.9951 | 0.9938 |
Table 10.
Comparison of the cross-validation results of the proposed consecutive facial emotion recognition method with other related studies on the RAVDESS database.
| Method | Classes | Accuracy |
| --- | --- | --- |
| E. Ryumina, et al. [18] | 8 | 98.90% |
| F. Ma, et al. [20] | 6 | 95.49% |
| A. Jaratrotkamjorn, et al. [22] | 8 | 96.53% |
| Z. Q. Chen, et al. [23] | 7 | 94% |
| Proposed model | 8 | 99.51% |
Table 11.
The loss and accuracy of the proposed consecutive facial emotion recognition method during training, validation, and testing on BAUM-1s database.
| | Training Loss | Training Acc | Validation Loss | Validation Acc | Testing Loss | Testing Acc |
| --- | --- | --- | --- | --- | --- | --- |
| Fold 1 | 0.0505 | 0.9137 | 0.9556 | 0.9327 | 0.5868 | 0.8600 |
| Fold 2 | 0.0623 | 0.9951 | 0.2346 | 0.9405 | 0.8079 | 0.8600 |
| Fold 3 | 0.0852 | 0.9764 | 0.5582 | 0.9428 | 0.6573 | 0.8800 |
| Fold 4 | 0.0705 | 0.9553 | 0.4763 | 0.9053 | 0.4489 | 0.8600 |
| Fold 5 | 0.0792 | 0.9202 | 0.8127 | 0.9492 | 0.5791 | 0.8400 |
| Fold 6 | 0.0801 | 0.9015 | 0.6274 | 0.9266 | 0.4527 | 0.8600 |
| Fold 7 | 0.0928 | 0.9589 | 0.3468 | 0.9134 | 0.4802 | 0.9200 |
| Fold 8 | 0.0893 | 0.9668 | 0.7501 | 0.9431 | 0.6054 | 0.9000 |
| Fold 9 | 0.0934 | 0.9015 | 0.4307 | 0.9519 | 0.4205 | 0.9400 |
| Fold 10 | 0.0683 | 0.9907 | 0.6919 | 0.9203 | 0.6596 | 0.8600 |
| Average | 0.0772 | 0.9480 | 0.5884 | 0.9326 | 0.5698 | 0.8780 |
Table 12.
The accuracy, precision, recall, and F1-score of each emotion of the proposed consecutive facial emotion recognition method on BAUM-1s database.
| Label | Accuracy | Precision | Recall | F1-score |
| --- | --- | --- | --- | --- |
| Angry | 0.9520 | 0.9600 | 0.6857 | 0.8000 |
| Disgust | 0.9340 | 0.9140 | 0.7727 | 0.8374 |
| Fear | 0.9920 | 0.7143 | 1.0000 | 0.8333 |
| Happy | 0.9960 | 0.9722 | 1.0000 | 0.9859 |
| Sad | 0.8860 | 0.7887 | 0.9333 | 0.8550 |
| Surprised | 0.9960 | 1.0000 | 0.9667 | 0.9831 |
| Average | 0.9593 | 0.8915 | 0.8931 | 0.8825 |
Table 13.
Comparison of the cross-validation results of the proposed consecutive facial emotion recognition method with other related studies on the BAUM-1s database.
| Paper | Classes | Accuracy |
| --- | --- | --- |
| F. Ma, et al. [20] | 6 | 64.05% |
| B. Pan, et al. [30] | 6 | 55.38% |
| P. Tiwari [31] | 8 | 77.95% |
| Proposed model | 6 | 87.80% |
Table 14.
The loss and accuracy of the proposed consecutive facial emotion recognition method during training, validation, and testing on eNTERFACE’05 database.
| | Training Loss | Training Acc | Validation Loss | Validation Acc | Testing Loss | Testing Acc |
| --- | --- | --- | --- | --- | --- | --- |
| Fold 1 | 0.0437 | 0.9752 | 0.2978 | 0.9563 | 0.4727 | 0.9603 |
| Fold 2 | 0.0389 | 0.9747 | 0.1925 | 0.9632 | 0.2063 | 0.9683 |
| Fold 3 | 0.0471 | 0.9968 | 0.3194 | 0.9491 | 0.2167 | 0.9762 |
| Fold 4 | 0.0423 | 0.9604 | 0.3751 | 0.9578 | 0.4335 | 0.9603 |
| Fold 5 | 0.0312 | 0.9823 | 0.3530 | 0.9684 | 0.5808 | 0.9603 |
| Fold 6 | 0.0430 | 0.9521 | 0.2496 | 0.9467 | 0.3345 | 0.9524 |
| Fold 7 | 0.0488 | 0.9816 | 0.1693 | 0.9546 | 0.4768 | 0.9683 |
| Fold 8 | 0.0495 | 0.9873 | 0.3847 | 0.9619 | 0.3880 | 0.9762 |
| Fold 9 | 0.0456 | 0.9768 | 0.2319 | 0.9443 | 0.2920 | 0.9841 |
| Fold 10 | 0.0345 | 0.9765 | 0.3890 | 0.9691 | 0.4589 | 0.9762 |
| Average | 0.0424 | 0.9763 | 0.2962 | 0.9571 | 0.3860 | 0.9682 |
Table 15.
The accuracy, precision, recall, and F1-score of each emotion of the proposed consecutive facial emotion recognition method on eNTERFACE’05 database.
| Label | Accuracy | Precision | Recall | F1-score |
| --- | --- | --- | --- | --- |
| Angry | 0.9817 | 0.9377 | 0.9862 | 0.9613 |
| Disgust | 0.9976 | 1.0000 | 0.9842 | 0.9920 |
| Fear | 0.9786 | 0.9794 | 0.9154 | 0.9463 |
| Happy | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| Sad | 0.9905 | 0.9341 | 1.0000 | 0.9659 |
| Surprised | 0.9881 | 0.9793 | 0.9450 | 0.9618 |
| Average | 0.9894 | 0.9718 | 0.9718 | 0.9712 |
Table 16.
Comparison of the cross-validation results of the proposed consecutive facial emotion recognition method with other related studies on the eNTERFACE’05 database.
| Paper | Classes | Accuracy |
| --- | --- | --- |
| F. Ma, et al. [20] | 6 | 80.52% |
| B. Pan, et al. [30] | 6 | 86.65% |
| P. Tiwari [31] | 7 | 61.58% |
| Proposed model | 6 | 96.82% |
Table 17.
Accuracies and execution time of testing the proposed consecutive facial emotion recognition model on FPGA.
| | Accuracy (%) | Execution time (sec) |
| --- | --- | --- |
| Fold 1 | 99.19 | 11.19 |
| Fold 2 | 100.00 | 12.04 |
| Fold 3 | 100.00 | 12.15 |
| Fold 4 | 99.59 | 11.24 |
| Fold 5 | 99.19 | 11.20 |
| Fold 6 | 99.59 | 12.17 |
| Fold 7 | 99.59 | 12.06 |
| Fold 8 | 99.59 | 11.45 |
| Fold 9 | 99.19 | 11.92 |
| Fold 10 | 99.19 | 11.59 |
| Average | 99.51 | 11.70 |
Table 18.
Execution time and proportions of each layer in proposed consecutive facial emotion recognition model on FPGA.
| Layer | Execution time (sec) | Proportion (%) |
| --- | --- | --- |
| Conv2D_1 | 1.3649 | 11.66 |
| Batch_Normalization_1 | 0.0008 | Less than 0.01 |
| Max_Pooling2D_1 | 0.8733 | 7.46 |
| Conv2D_2 | 6.9281 | 59.21 |
| Batch_Normalization_2 | 0.0010 | Less than 0.01 |
| Max_Pooling2D_2 | 0.2260 | 1.93 |
| Conv2D_3 | 1.5648 | 13.37 |
| Batch_Normalization_3 | 0.0009 | Less than 0.01 |
| Max_Pooling2D_3 | 0.0755 | 0.64 |
| Conv2D_4 | 0.4721 | 4.03 |
| Batch_Normalization_4 | 0.0006 | Less than 0.01 |
| Max_Pooling2D_4 | 0.0386 | 0.32 |
| Conv2D_5 | 0.0468 | 0.40 |
| Batch_Normalization_5 | 0.0006 | Less than 0.01 |
| Max_Pooling2D_5 | 0.0248 | 0.21 |
| Conv2D_6 | 0.0322 | 0.27 |
| Batch_Normalization_6 | 0.0005 | Less than 0.01 |
| Max_Pooling2D_6 | 0.0221 | 0.18 |
| LSTM | 0.0071 | 0.06 |
| Batch_Normalization_7 | 0.0000 | Less than 0.01 |
| Dense (Softmax) | 0.0002 | Less than 0.01 |
| Total | 11.70 | 100 |