1. Introduction
The human small bowel (SB) has a complex looped configuration and is extremely long (around 6 m). In the diagnosis of SB diseases, endoscopy can be used to determine the causes of tumors, cancer, bleeding, and Crohn’s disease [
1]. In 2001, Capsule Endoscopy (CE, also called Wireless Capsule Endoscopy, WCE) was approved by the Food and Drug Administration in the United States. It is a noninvasive technology designed primarily to provide diagnostic imaging of the SB, a part of the human body that is difficult to inspect through instrumental examinations. It represents the latest endoscopic technique and has revolutionized the diagnosis and treatment of diseases of the upper gastrointestinal (GI) tract, SB, and colon. The CE device consists of a CMOS camera sensor with a microchip, an LED, an RF transmitter, and a battery. A clinical examination involving CE can be performed in an ambulatory or hospital setting on an outpatient basis. After fasting overnight (8–12 hours), the patient swallows a small capsule. The capsule provides a wireless circuit and micro-imaging video technology for the acquisition and transmission of images. The system also includes software that localizes the device during its passage through the intestine. The capsule is propelled through the SB by peristaltic movements. While it moves along the GI tract, images are captured at a fixed frame rate of 2 frames per second (fps), although the newest CE model manufactured by Given Imaging (PillCam SB 3 capsule) can achieve a frame rate of 2–6 fps depending on the capsule speed as it travels through the SB [
2]. These images are transferred to a data recorder worn on a belt outside the patient’s body, and about eight hours after swallowing, the patient returns to the clinic, where the data and images are downloaded. Within 24–48 hours, the capsule is passed in the patient’s stool.
CE has been considered a first-line examination tool for diagnosing various kinds of diseases, including ulcers, polyps, bleeding, and Crohn’s disease [
3]. A single scan may include tens of thousands of images of the GI tract for each patient, but evidence of abnormalities may appear in only a few of them. A very common abnormality found in the GI tract is bleeding [
4]. To detect it, many researchers have contributed high-performance classifiers. Detecting bleeding at an early stage is critical, since it is a precursor of inflammatory bowel diseases such as Crohn’s disease and ulcerative colitis (UC). Bleeding is not limited to the stomach; it can occur anywhere in the GI tract [
5]. It is a common abnormality detected by CE and is often defined as "bleeding of unknown origin that recurs or persists or is visible after an upper endoscopy and/or negative endoscopy result" [
6]. The major challenge is that residual traces and blood spots don't have any typical shapes or textures, and their colors can range from light red to dark intense red and brown, making it difficult to distinguish the blood from other digestive contents. This diversity of color might depend on the position of the camera, the bleeding timing [
7], and the neighboring condition of the intestinal content [
8]. Bleeding is not a single pathology, and it may be caused by a variety of small intestinal diseases, such as open wounds, vascular lesions, angiodysplasia, ulcers, Crohn's disease, and tumors. To discriminate pathology, both texture and color features have been used.
Since a CE examination captures over 57,000 images, manually reviewing them to detect bleeding regions is a labor-intensive and time-consuming task for physicians [
9], which may involve several challenges including complex background and low contrast, variations in the lesion, and color. This may affect the accuracy of subsequent classification and segmentation [
10,
11]. These issues complicate objective disease diagnosis and necessitate the opinions of many specialists to avoid misdiagnosis.
As a result, there is a strong need for an alternative technique that detects bleeding in the GI tract automatically. Some research has gone into the automated inspection and analysis of CE images. Software suites that use computational techniques are often supplied with a particular brand of capsule and are widely used. Their benefit lies in the efficiency and availability of a tool that can detect bleeding regions automatically and improve diagnostic accuracy. Commercial software built by Given Imaging aims to recognize active blood automatically, although the reported sensitivity and specificity are not satisfactory [
8]. Despite these advantages, research on CE technology is not yet widespread. For instance, it is currently challenging for physicians to go through the entire collection of more than 50,000 frames in order to find a disease. Due to visual fatigue and the relatively small size of the lesion region, the disease may go undetected in its early stages. It should also not be ignored that the software packages already available on the market are based on low-level, hand-crafted feature extraction algorithms that generalize poorly. Additionally, because the feature extraction and classification phases are separated in hand-crafted feature-based techniques, it is difficult to make reliable diagnostic decisions.
Several informative original articles and reviews on bleeding detection from CE images have been published over the last 15 years. The authors of [
12] reviewed the clinical applications and developments of small bowel CE, i.e., small bowel tumors, Celiac disease, and Crohn’s disease. They gave an insight into the potential future prospects of small bowel CE. In [
13], the authors discussed different methods, including signal processing, color and image processing, and artificial intelligence, for representing, analyzing, and evaluating CE images. The study [
14] calculated performance metrics (accuracy, positive and negative predictive values, sensitivity, and specificity) and compared the diagnostic accuracy of video CE and double-balloon enteroscopy in cases of obscure GI bleeding of vascular origin. Another study [
15] discussed the CE models available on the market, along with the diagnostic yield, safety profile, image quality, and technical evolution of small-bowel CE. In [
16], the authors reviewed and analyzed computational methods from the literature that can be applied in software to improve the diagnostic yield of video CE. Another research group [
17] reviewed deep-learning-based approaches for CE that are used to address a variety of problems, e.g., detection of polyps, ulcers, and cancer; bleeding, hemorrhage, and angiectasia; and hookworms. In [
18], the authors reviewed the state-of-the-art approaches to machine-vision-based analysis of CE video, concentrating mainly on shot boundary detection and GI pathology detection. The authors of [
19] assessed the accuracy of video CE in identifying active hemorrhage in the upper GI tract. Another article [
20] discussed deep learning methods for WCE, using only the PubMed repository for article selection. However, none of these review articles focused specifically on bleeding detection algorithms for CE. The main contributions of this paper are summarized as follows:
A taxonomy of computer-aided bleeding detection algorithms for capsule endoscopy is identified.
The various color spaces and feature extraction techniques used to boost bleeding detection performance are discussed in depth.
Based on observations from the existing literature, a direction for the computer-aided bleeding detection research community is provided.
This work focuses only on state-of-the-art bleeding detection algorithms for CE, which differentiates it from various recent review papers. The review gathers the required information from recent research works, organizes it according to the taxonomy, analyzes the performance of bleeding detection methods, and provides a path for future research. Moreover, it is hoped that this effort will capture advanced techniques that are more acceptable in real-life applications and thereby improve the current acceptance of computer-aided bleeding detection algorithms for CE.
Color Space
A color space is a specific organization of colors. There are many different color spaces, such as RGB (red-R, green-G, blue-B), HSV (hue-H, saturation-S, value-V), YIQ (luminance-Y, chrominance-IQ: in phase-I and quadrature-Q), YCbCr, CMYK (cyan-C, magenta-M, yellow-Y, and key (black)), CIE-Lab, and CIE XYZ. From the literature review, the color spaces used for feature extraction can be categorized into four groups: RGB, HSV, other, and combined color spaces.
RGB: Images are represented in the RGB color space as an m-by-n-by-3 numeric array whose components indicate the intensity levels of the red, green, and blue color channels. The range of numeric values is determined by the image's data type. Different types of RGB color space are available, such as linear RGB, sRGB (standard red, green, blue), and Adobe RGB. In CE images or videos, a bleeding zone is distinguished by the presence of a bright red or dark red zone. Many studies utilized RGB color to extract features for the identification of bleeding images or regions from CE images. The studies [
46,
47] presented an automated obscure bleeding detection technique on the GI tract based on statistical RGB color features that can classify bleeding and non-bleeding images from CE images. By using the same RGB components of each pixel of CE images, the study [
48] presented a system to automatically detect bleeding zones in the CE images. From the first-order histogram in RGB planes, the approach extracted bleeding color information from CE image zones by calculating mean, standard deviation, skew, and energy [
49]. Zhao et al. presented two-dimensional color coordinate systems in RGB color space to segment abnormality in the CE video. The approach combined two descriptors to extract features: the first was based on image color content, while the second was based on image edge information [
37]. Another research group proposed using color vector similarity coefficients to evaluate the color similarity in RGB color space in order to detect bleeding in CE images [
50]. Yun et al. presented a method using color spectrum transformation (CST) for the identification of bleeding in CE images. This approach included a parameter compensation step that used a color balance index (CBI) in RGB color space to compensate for irregular image conditions [
51]. The study [
52] suggested an automatic bleeding image detection technique that uses an RGB color histogram as a feature extractor and bit-plane slicing to separate bleeding from non-bleeding images in CE videos. The authors of [
53] utilized superpixel segmentation in RGB color space to extract bleeding information for an automatic obscure bleeding detection technique. According to [
21], an automated bleeding detection approach was presented by using color-based per-pixel feature extraction techniques. Ghosh et al. [
36] presented an automatic bleeding detection approach based on an RGB color histogram of block statistics to extract features from CE videos. To reduce computational complexity and improve flexibility, the approach utilized blocks of surrounding pixels rather than individual pixel values. Traditional machine learning models used single pixels as training and testing data, so they were unable to eliminate some very small zones that were judged to be bleeding but were not. To address this issue, some studies used feature extraction techniques based on clusters of pixels to extract features from bleeding CE images. Using a cluster of pixels in RGB color space instead of single pixels in an automatic bleeding classification system improved the sensitivity [
54].
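The first-order histogram statistics mentioned above (mean, standard deviation, skew, and energy per RGB plane) can be illustrated with a minimal sketch. It is not the implementation of any cited study: the function names, the 8-bit value range, and the flat list-of-tuples image representation are assumptions made for illustration, and only the Python standard library is used.

```python
import math


def channel_histogram_stats(values, levels=256):
    """First-order histogram statistics of one channel.

    `values` is a flat list of integer intensities in [0, levels - 1]
    (8-bit data is an assumption of this sketch).
    """
    n = len(values)
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    p = [h / n for h in hist]  # normalized histogram (sums to 1)

    mean = sum(i * p[i] for i in range(levels))
    variance = sum((i - mean) ** 2 * p[i] for i in range(levels))
    std = math.sqrt(variance)
    # Skewness is undefined for a constant channel; report 0.0 in that case.
    if std > 0:
        skew = sum((i - mean) ** 3 * p[i] for i in range(levels)) / std ** 3
    else:
        skew = 0.0
    energy = sum(q * q for q in p)
    return {"mean": mean, "std": std, "skew": skew, "energy": energy}


def rgb_histogram_features(pixels):
    """Concatenate the four statistics of the R, G, and B planes (12 values)."""
    pixels = list(pixels)
    features = []
    for channel in range(3):
        stats = channel_histogram_stats([p[channel] for p in pixels])
        features.extend(
            [stats["mean"], stats["std"], stats["skew"], stats["energy"]]
        )
    return features
```

A feature vector of this kind would typically be passed to a classifier; which statistics are most discriminative depends on the data set and is not implied by this sketch.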
Instead of directly using RGB color space, a G/R composite color plane was utilized to extract features from the CE images [
35]. Another study [
38] extracted statistical features from the overlapping spatial blocks of the CE images based on the G/R color plane. A transformed color plane based on the R/G pixel intensity ratio was utilized for the extraction of bleeding information from the CE images [
55]. Rather than considering individual pixels, Ghosh et al. considered the surrounding neighborhood block of that individual pixel and the R/G ratio plane for bleeding feature extraction from CE images [
56]. Shi et al. [
39] used a temporal red-to-green ratio (R/G) feature value to detect bleeding regions.
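As a rough illustration of the ratio-plane idea discussed above, the following sketch computes a per-pixel R/G plane and flags pixels whose ratio exceeds a threshold. The offset `eps` (used to avoid division by zero) and the threshold value `2.5` are hypothetical choices for this example and are not taken from the cited studies.

```python
def rg_ratio_plane(image, eps=1.0):
    """Return the per-pixel R/G ratio of an image.

    `image` is a list of rows, each row a list of (R, G, B) tuples.
    `eps` is an assumed small offset that avoids division by zero.
    """
    return [[r / (g + eps) for (r, g, _b) in row] for row in image]


def flag_candidate_pixels(image, threshold=2.5):
    """Mark pixels whose R/G ratio exceeds a (hypothetical) threshold."""
    ratio = rg_ratio_plane(image)
    return [[value > threshold for value in row] for row in ratio]


# A strongly red pixel next to a pale, nearly gray pixel.
example = [[(200, 40, 30), (120, 110, 90)]]
mask = flag_candidate_pixels(example)
```

In this toy example the first pixel has a ratio of about 4.9 and is flagged, while the second has a ratio of about 1.1 and is not. Block-based variants would aggregate the ratio over a neighborhood instead of thresholding single pixels.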
According to [
57], the authors presented an average pixel intensity ratio in RGB color space to extract features in CE images for an automatic bleeding detection approach. The study [
58] presented rapid bleeding detection from CE video. The red ratio (RR) in RGB space was used to extract a feature from each superpixel of CE images. Also, the RR feature for individual pixels was utilized for feature extraction of bleeding in CE images [
25]. Coefficients of RGB color space for bleeding and non-bleeding superpixel blocks can be computed in the RG, RB, and GB two-dimensional spaces. Liu et al. [
59] presented an automatic gastric hemorrhage detection system based on the coefficient of variation in the RG two-dimensional color space for the different superpixel blocks of CE images. Another transformation of the RGB color space is the OHTA color space. In [
60], OHTA color space was utilized to extract the feature of bleeding from CE images.
A custom RGB color space was proposed by [
31], which was similar to the CMYK color space and was used to extract features for automatic blood detection in CE video. Kundu et al. [
61] presented a normalized RGB color space histogram-based feature extraction method to identify bleeding in CE images. Another research group employed a two-stage saliency map extraction method to localize the bleeding areas in the CE images. The first-stage saliency map was constructed using a color channel mixer, and the second-stage saliency map was derived from the RGB color space's visual contrast [
62].
Some studies applied clustering algorithms in RGB color space to extract features from CE images or videos. Using advanced pattern recognition techniques, a MapReduce framework was presented for the identification of bleeding frames and the segmentation of bleeding zones. For classification, the system encodes RGB color information from the raw data of CE images using a K-means clustering algorithm, and for the segmentation of the bleeding zone, a density-based algorithm (DBSCAN) was utilized [
63]. Hwang et al. presented an automatic bleeding region detection system by using the Expectation Maximization (EM) clustering algorithm in RGB color space [
64].
HSV: HSV (hue-H, saturation-S, value-V) is an alternative representation of the RGB color space that corresponds more closely to human color perception. The HSV color space is generated from Cartesian RGB primaries, and its components and colorimetry are related to the color space from which it is derived. Several studies employed the HSV color space histogram as the color feature descriptor to extract features from CE images or videos in an automated bleeding detection system [
65,
66]. In [
67], the RGB image was transformed into HSV color space, and several statistical parameters, such as variance, kurtosis, skewness, and entropy, were calculated from the histogram of CE images. The extracted features were applied in a bleeding detection method. The contrast, cluster shade, cluster prominence, and entropy were computed from the Gray Level Co-occurrence Matrix (GLCM) in HSV color space to extract bleeding features [
68]. Usman et al. suggested a pixel-based method for the detection of bleeding regions in CE videos. The HSV color space was utilized to compute the bleeding information [
29]. The study [
69] presented a color-based segmentation using HSV color space to detect the bleeding regions in CE images. Giritharan et al. presented a bleeding detection method based on the HSV color space, using dominant colors and the co-occurrence of dominant colors as features to classify bleeding lesions [
70]. In [
71], the authors found that HSV color moments used to extract bleeding features from CE images achieved the highest accuracy compared with the local binary pattern (LBP), local color moments, and Gabor filter. Using a block-based color saturation approach in HSV color space, the CE images were classified as bleeding or non-bleeding [
72]. A two-stage analysis system was proposed in [
73], where both blocks and pixels-based color saturation methods used HSV color space to extract bleeding features. Color saturation and hue were obtained for the study by converting the input videos or images to the HSV color space. In [
74], authors compared texture features with color features extracted from HSV color space for the classification of bleeding and showed color features provided better results. A fuzzy logic edge detection technique was applied in HSV color space by [
1] to extract features of bleeding and non-bleeding from the CE images. According to [
9], a feature selection strategy was proposed based on HSV color transformation to extract geometric features from CE images for classifying bleeding images. To extract the color features, another research group [
75] applied the HSV color scale-invariant feature transform (HSV-SIFT) for CE abnormality detection.
Various statistical features are computed from the hue, saturation, and value channels of the HSV color space. For example, the hue channel (H) provides a useful feature for colored objects or surfaces. The hue channel was utilized by [
76] to extract features of CE images for an automatic bleeding detection approach. Another strategy used a hue-saturation (HS) color histogram with relevant features (64 bins) to extract information for identifying suspected blood abnormality [
77]. An HSI (hue-H, saturation-S, intensity-I) color space is another variation of HSV color space. A binary feature vector in HSI color space was more effective to extract features in a bleeding detection approach [
78]. A segmentation approach used the average saturation from the HSI color space, together with the skewness and kurtosis of the uniform LBP histogram, as features to detect bleeding in CE images [
42]. Another research group suggested an HSI color histogram to follow a moving background and bleeding color distributions through time in the first stage. Cui et al. [
79] presented six color features in HSI color space to classify bleeding and normal CE images.
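A hue-saturation histogram of the kind mentioned above can be sketched with the standard-library `colorsys` module. The split of the 64 bins into 8 hue bins by 8 saturation bins is an assumption of this example (the text only states the total of 64 bins), as are the function name and the 8-bit RGB input.

```python
import colorsys


def hs_histogram(pixels, hue_bins=8, sat_bins=8):
    """Normalized joint hue-saturation histogram of 8-bit RGB pixels.

    The 8 x 8 layout (64 bins in total) is an assumed binning.
    Returns a flat list of length hue_bins * sat_bins.
    """
    hist = [0.0] * (hue_bins * sat_bins)
    count = 0
    for r, g, b in pixels:
        # colorsys works on floats in [0, 1] and returns h, s, v in [0, 1].
        h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Clamp so that h == 1.0 or s == 1.0 falls into the last bin.
        h_index = min(int(h * hue_bins), hue_bins - 1)
        s_index = min(int(s * sat_bins), sat_bins - 1)
        hist[h_index * sat_bins + s_index] += 1
        count += 1
    if count == 0:
        return hist
    return [value / count for value in hist]
```

Dropping the value channel in this way makes the descriptor less sensitive to illumination changes, which is one common motivation for HS histograms; whether that holds for a given CE data set would need to be checked empirically.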
Other color spaces: Besides the most popular RGB and HSV color space, a few articles utilized different color spaces to extract bleeding features from CE images such as YIQ (luminance-Y, chrominance-IQ: in phase-I and quadrature-Q), YCbCr, CIE Lab, CIE XYZ and K-L (Karhunen-Loeve transform) color space. The study [
80] analyzed only the Q value of the YIQ color scheme to determine the ROI section. Then a composite space Y.I/Q of YIQ color space was presented to extract bleeding by computing the mean, median, skewness, and minima of the pixel values. Based on the YIQ color histogram, another article proposed an automatic bleeding detection scheme from CE images [
81]. A YCbCr color space was presented to collect information from CE images in order to identify images with lesions [
82]. Yuan et al. investigated several color histograms, including RGB, HSV, YCbCr, and LAB, and proposed YCbCr color space to extract bleeding features for discrimination of bleeding images from normal CE images [
83]. The study of [
84] suggested a second component I of CIE Lab color space for localization of the bleeding region in CE images. Mathew et al. proposed a bleeding zone detection system based on contourlet transform in CIE XYZ color space [
85]. Another feature extraction color space was Karhunen-Loeve (K-L) transformation which was utilized for fuzzy region segmentation of CE images [
86]. The study [
87] proposed a computer-aided bleeding and ulcer detection approach based on the covariance of second-order statistical features in K-L color space. A K-means color group was suggested as a color feature extractor for super-pixel segmentation to find bleeding regions from CE videos [
88].
Combined multiple color spaces: To detect bleeding, a group of color features was computed using multiple color spaces of CE images. The method of [
26] employed two distinct enhancing operations: the first is for RGB, and the second is for grayscale color space for identifying bleeding images in CE. The study of [
89] determined the ROI of bleeding CE images using YIQ color space and extracted features from the ROI using CMYK color space. Based on RGB and HSV color spaces, the CE images were defined using statistical characteristics to extract bleeding features [
33,
90]. A 9-D feature was extracted at the superpixel level from the RGB and HSV color spaces during the segmentation stage of the study [
91]. Some studies utilized a combination of RGB and HSI color spaces to extract features for the bleeding detection approach in CE images [
92,
93,
94,
95]. Five color spaces (RGB, HSV, Lab, YCbCr, and CMYK) were used to extract the features in [
96]. In the study [
7], authors proposed the R channel with respect to the G and B channels and the ratio of G and B channels as features in RGB color space, and HSV color space was chosen for the saturation feature. The color features were extracted by using the color components X = {R, G, B, L, a, b, H, S, V, F1, F2, F3} in RGB, CIE-Lab, and HSV color space from each super-pixel of CE images [
97]. In [
98], the color components H, S, a, and b from the HSV and Lab color spaces and the Ros (Rosenfeld-Troy) metric were used. Ten features, including normalized excessive red (NER), hue, sum RGB, and chroma, were used to analyze CE video frames in [
99]. For the segmentation of bleeding region from bleeding CE images, deltaE color differences were used to extract features which applied 9 color shades (red, orange, brown, maroon, purple, pink, mahogany, brown, bittersweet) for characterizing different types of bleeding [
44]. A probability density function (PDF) fitting-based feature extraction technique was used in the YIQ, HSV, and CIE-Lab color spaces [
100]. In [
101], 40 features were extracted from five different channels including R in RGB, V in HSV, Cr in YCbCr, and a, L in Lab color spaces. The article [
102] investigated 21 color components for feature extraction of CE images such as U/Y, V/Y, I/Y, and Q/Y in RGB, YUV, YIQ, HSB, CIE XYZ, CIE L*a*b* color spaces. In [
103], a combination of HSV and YIQ color spaces was demonstrated using normal PDF to detect GI diseases from CE videos. According to [
30], HSV color space was used for threshold analysis of the classification model, and CIE-Lab color space was used in the trainable model for edge detection of images.
Texture
Texture features are used to partition images into ROIs and to classify those regions as bleeding or non-bleeding. They provide information about the repeating spatial pattern of colors or intensities in an image. In [
104,
105] a traditional texture representation model named uniform local binary pattern (LBP) was used to differentiate bleeding and normal regions. The study of [
34] extracted texture features (LBP) from the suspicious areas of images and their surroundings for classifying bleeding. Zhao et al. [
37] extracted an LBP based on contourlet transforms as a texture feature to segment abnormalities in WCE. As a color texture feature, Li et al. [
12] integrated chrominance moments and uniform LBP to discriminate bleeding regions from normal regions. Charfi et al. [
69] also extracted texture features (LBP) for the segmentation of WCE images in order to prevent false detections. For recognizing bleeding regions, a 6D color texture feature vector {x = (R, G, B, H, S, I)} was developed in the article [
94]. Pogorelov et al. [
106] presented a bleeding detection system that computed texture features to extract additional information from the captured image frames. By using a histogram on the index image, a distinguishable color texture feature was developed in [
52] for automatic bleeding image detection.
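To make the uniform LBP descriptor referred to above more concrete, the sketch below computes the basic 8-neighbor LBP code of each interior pixel of a grayscale image and accumulates a histogram with one bin per uniform code plus one shared bin for all non-uniform codes. The neighbor ordering, the `>=` comparison, and the function names are conventional choices assumed for this example, not details taken from the cited studies.

```python
# Clockwise 8-neighbour offsets (dy, dx), starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(gray, y, x):
    """8-bit LBP code of the interior pixel (y, x) of a 2-D list `gray`."""
    center = gray[y][x]
    code = 0
    for bit, (dy, dx) in enumerate(OFFSETS):
        if gray[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code


def is_uniform(code):
    """A code is uniform if its circular bit string has at most 2 transitions."""
    bits = [(code >> i) & 1 for i in range(8)]
    transitions = sum(bits[i] != bits[(i + 1) % 8] for i in range(8))
    return transitions <= 2


def uniform_lbp_histogram(gray):
    """Histogram over interior pixels: 58 uniform bins + 1 non-uniform bin."""
    uniform_codes = [c for c in range(256) if is_uniform(c)]
    bin_of = {code: i for i, code in enumerate(uniform_codes)}
    other_bin = len(uniform_codes)
    hist = [0] * (other_bin + 1)
    for y in range(1, len(gray) - 1):
        for x in range(1, len(gray[0]) - 1):
            hist[bin_of.get(lbp_code(gray, y, x), other_bin)] += 1
    return hist
```

Statistics of this histogram (for example its skewness and kurtosis) can then serve as texture features; border pixels are simply skipped here, which is one of several possible boundary treatments.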
The gray-level co-occurrence matrix (GLCM) is a statistical approach to assessing texture that considers the spatial interaction of pixels. The GLCM functions describe the texture of CE images by computing how frequently pairs of pixels with given values and in a specified spatial relationship appear in an image, generating a GLCM, and afterward extracting statistical measures from this matrix. In the study [
107], the authors proposed an efficient normalized GLCM for extracting the bleeding features of the CE images. The authors of [
108] proposed a texture-feature-descriptor-based algorithm that operates on the normalized GLCM of the magnitude spectrum of the images for a real-time computerized GI hemorrhage detection system. The study of [
77] compared two types of texture feature which are GLCM and homogeneous texture descriptor (HTD) with various numbers of color histogram bins. Rathnamala et al. [
44] extracted texture attributes from the Gaussian mixture model superpixels of the WCE images.
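The GLCM construction described above (counting how often pairs of gray levels occur at a given spatial offset, normalizing, and then extracting statistics) can be sketched as follows. The horizontal offset of one pixel, the assumption that the image is already quantized to `levels` gray levels, and the choice of contrast and entropy as example statistics are illustrative assumptions.

```python
import math


def normalized_glcm(gray, levels, dy=0, dx=1):
    """Normalized gray-level co-occurrence matrix for the offset (dy, dx).

    `gray` is a 2-D list of integers in [0, levels - 1]; the default
    offset (0, 1) pairs each pixel with its right-hand neighbour.
    """
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    rows, cols = len(gray), len(gray[0])
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                counts[gray[y][x]][gray[ny][nx]] += 1
                total += 1
    if total == 0:
        raise ValueError("no pixel pairs exist for this offset")
    return [[c / total for c in row] for row in counts]


def glcm_contrast(p):
    """Sum of (i - j)^2 * p(i, j): large when neighbours differ strongly."""
    n = len(p)
    return sum((i - j) ** 2 * p[i][j] for i in range(n) for j in range(n))


def glcm_entropy(p):
    """Shannon entropy (base 2) of the co-occurrence distribution."""
    return -sum(v * math.log2(v) for row in p for v in row if v > 0)
```

Other statistics named in this review, such as cluster shade or cluster prominence, would be computed from the same normalized matrix in a similar way.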
Extraction Domain
According to the taxonomy, the extraction domain describes the part of a CE image from which bleeding features are extracted. All the reviewed studies are categorized into three groups depending on the extraction domain. Global features are extracted from whole frames or images. Local features are extracted at the pixel level or from a portion of an image (a specific block size, ROI, or POI). Combined local and global features are extracted at both the pixel level and the image level.
Global feature: The entire image information was used in the global feature extraction technique. Using statistical features (such as mean, mode, variance, moment, entropy, energy, skewness, kurtosis, etc.), several articles [
1,
67,
76] extracted bleeding features from whole CE images. In [
49], statistical color features of bleeding images were extracted from the RGB plane's first-order histogram. In another study [
68], statistical features were measured from Gray Level Co-occurrence Matrix after applying an Undecimated Double Density Dual Tree Discrete Wavelet Transform on CE images. Cui et al. [
79] applied six color statistical features to identify bleeding features from the full image feature. Zhou et al. [
78] utilized color information to extract the bleeding features from the image feature. In [
9], color, shape, and surf were used for feature extraction from whole images.
Local feature: A pixel-level feature extraction approach was proposed in several studies in order to accurately identify bleeding images [
29,
35,
50,
55,
57]. Instead of computing different features from each pixel, a few researchers proposed block-based local feature extraction techniques for reducing time and computational cost [
36,
46]. The study [
38] investigated various overlapping block sizes: 3×3, 5×5, 7×7 and 9×9 and proposed a 7×7 block size to extract features of CE images. Maghsoudi et al. [
109] divided the original image of 512×512 pixels into 256 sub-images with a resolution of 32×32 pixels for feature extraction. In [
74], a 576×576 pixel input was sliced into nine non-overlapping blocks with 64×64 pixels. Another research group divided each CE image into blocks of 64×64 pixels and analyzed the 64×64 = 4096 pixels in each block to recognize hemorrhage [
73]. CE images are surrounded by a large black background that contributes unwanted features and, as a result, reduces the performance of the model. To address this, a few studies introduced regions of interest (ROI) for proper feature extraction. In [
83] and [
88], the authors selected an ROI as the maximum square inside the circular CE images, without loss of the main information. The ROI was 180×180 pixels in size, chosen from a total of 256×256 pixels. An elliptical ROI was selected inside the image to extract local features [
33]. According to [
89], an ROI of the bleeding CE image is determined using YIQ color space. After that, CMYK (cyan-C, magenta-M, yellow-Y, and black-K) values are computed within the ROI pixels which was applied to discriminate the bleeding and non-bleeding pixels. An ROI was selected based on the Q value of the YIQ color space and a composite space Y.I/Q was used to capture bleeding information from the ROI section of CE images [
80]. Pixels of interest (POI), which are selected according to the intensity values of pixels, offer another technique for extracting local features. The studies [
100,
103] utilized POI instead of the whole CE image to extract features for the classification of bleeding images.
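The block-based and ROI-based strategies above involve simple geometry that a short sketch can make explicit: splitting a 512×512 image into 32×32 non-overlapping blocks yields 16 × 16 = 256 blocks, and the largest square inscribed in a circular field of view of diameter 256 pixels has a side of about 256/√2 ≈ 181 pixels, which is consistent with the 180×180 ROI reported above. The function names and the list-of-lists image representation are assumptions of this example.

```python
import math


def split_into_blocks(image, block):
    """Split a 2-D list into non-overlapping block x block tiles.

    Rows or columns that do not fill a complete tile are dropped.
    """
    rows, cols = len(image), len(image[0])
    blocks = []
    for top in range(0, rows - block + 1, block):
        for left in range(0, cols - block + 1, block):
            tile = [row[left:left + block] for row in image[top:top + block]]
            blocks.append(tile)
    return blocks


def inscribed_square_side(diameter):
    """Side (in whole pixels) of the largest square inside a circle."""
    return int(diameter / math.sqrt(2))


def center_crop(image, side):
    """Crop a centered side x side square ROI from a 2-D list."""
    rows, cols = len(image), len(image[0])
    top, left = (rows - side) // 2, (cols - side) // 2
    return [row[left:left + side] for row in image[top:top + side]]


image = [[0] * 512 for _ in range(512)]
tiles = split_into_blocks(image, 32)   # 256 tiles of 32 x 32 pixels
```

Features would then be computed per tile or inside the cropped ROI; overlapping blocks, as used in some of the cited work, would require a step smaller than the block size.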
Combined local and global features: Studies used local and global features to develop a robust and accurate computer-aided bleeding detection system. The studies of [
64,
94,
95] proposed a bleeding detection application software that was tested on pixel level and image level. A region-level block-based and an image-level global feature extraction technique were applied in [
38] to identify bleeding images. There were two stages presented in [
106], the first used only local color features to categorize bleeding images, while the second included global texture and color features to classify bleeding pixels. In [
36], a block-based local feature extraction technique was presented, and then global features were extracted using a color histogram to classify bleeding and non-bleeding images. Ghosh et al. [
56] presented an article that used the maximum pixel value of each proposed spatial block and global feature of an image to extract the bleeding features. Another study [
58] used pixels to remove the edge zone and group pixels adaptively based on the red ratio in RGB color space for superpixel segmentation. Another study proposed a global feature descriptor based on magnitude spectrum entropy and a local textural descriptor based on contrast, sum entropy, sum variance, difference variance, and difference average that operate on the normalized GLCM [
108]. A few researchers proposed various machine learning and deep learning algorithms to extract both global and local features of CE images. The study of [
22] proposed a bleeding detection method by using a genetic algorithm for the feature selection of CE images. Using an unsupervised K-means clustering algorithm, some studies extracted features for automatic bleeding detection in CE [
38], [
63]. Three pre-trained deep convolutional neural networks (CNNs), namely ResNet50, VGG19, and InceptionV3, were used to extract features of CE images in [
23].
Algorithm
Initially, researchers proposed various threshold values to detect bleeding [
50,
51]. Various machine learning (ML) and deep learning (DL) algorithms are currently being used for accurate bleeding detection.
Machine Learning (ML): Several ML algorithms are applied in computer-aided bleeding detection systems to detect bleeding effectively from CE images or videos, such as Support Vector Machine (SVM), K-Nearest Neighbors (KNN), K-means clustering, Naïve Bayes, Random Tree, Random Forest, Artificial Neural Networks (ANN), Probabilistic Neural Networks (PNN), and Multilayer Perceptron (MLP).
SVM is one of the most popular supervised ML algorithms that is used to detect bleeding and non-bleeding images or zones from CE images or videos. The majority of studies used SVM based on the information of extracted features of input images including color space and texture. In 2008, Liu et al. [
47] developed an automated obscure bleeding detection technique on the GI tract that could classify bleeding and non-bleeding images from CE images by using SVM algorithm. An automated bleeding detection approach was presented by [
46] which provided an accuracy of 97.67%. Another research group suggested an automatic bleeding image detection technique utilizing an SVM classifier to detect bleeding and non-bleeding frames from CE videos. The approach reported 94.50% accuracy, 93.00% sensitivity and 94.88% specificity [
52]. The study [
35] trained an SVM classifier with 200 bleeding and 200 non-bleeding CE images and achieved an accuracy, sensitivity, and specificity of 97.96%, 97.75%, and 97.99%, respectively. Studies in [
48,
110] suggested systems that automatically detect bleeding in CE images by using the SVM classifier. A recent study proposed a quadratic support vector machine (QSVM) classifier for automated bleeding detection [
1]. A fuzzy logic technique was applied to extract the features of the images. The model achieved an accuracy, sensitivity, and specificity of 98.2%, 98%, and 98%, respectively. Joshi et al. [
111] presented an SVM classification model based on an improved Bag of Visual Words to detect bleeding from CE images.
Different kernel functions of the SVM algorithm, including linear, polynomial (homogeneous), and radial basis function (RBF) kernels, have been used in research on bleeding detection in CE images. An SVM classifier with a linear kernel was utilized in a real-time computerized gastrointestinal hemorrhage detection method for CE videos, which obtained an accuracy, sensitivity, and specificity of 99.19%, 99.41%, and 98.95%, respectively [
108]. Liu et al. [
59] presented an automatic gastric hemorrhage detection system using an SVM classifier with RBF as the kernel function, which achieved an accuracy of 95.8%, a sensitivity of 87.5%, a specificity of 98.1%, a miss detection rate of 12.5%, and a false detection rate of 1.9%. Another study [
21] also used the RBF kernel of the SVM classifier to discriminate between bleeding and non-bleeding images and achieved an accuracy, specificity, and sensitivity of 98.0%, 97.0%, and 98.0%, respectively. In addition, in [
93], an SVM classifier was used to detect bleeding with chi-square and histogram intersection kernels. The combination of spatial pyramids with a robust hue histogram improved the accuracy by about 8%.
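For reference, the kernel functions named above can be written out as follows. This is a minimal sketch using common textbook definitions; the parameter names and default values are assumptions, and the exact kernel variants (for example, the form of the chi-square kernel) used in the cited studies are not specified here.

```python
import math


def linear_kernel(x, y):
    """Linear kernel: the dot product x . y."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=2):
    """Homogeneous polynomial kernel: (x . y) ** degree."""
    return linear_kernel(x, y) ** degree


def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def histogram_intersection_kernel(x, y):
    """Histogram intersection: sum of element-wise minima."""
    return sum(min(a, b) for a, b in zip(x, y))


def chi_square_kernel(x, y, gamma=1.0):
    """Exponential chi-square kernel for non-negative histograms.

    Bins where both histograms are zero are skipped to avoid 0/0.
    """
    dist = sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b > 0)
    return math.exp(-gamma * dist)


# Two toy normalized hue histograms (hypothetical values).
h1 = [0.5, 0.3, 0.2]
h2 = [0.4, 0.4, 0.2]
```

The histogram intersection and chi-square kernels are designed for histogram-like features, which is consistent with their use alongside a hue histogram in the study above.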
KNN is the second most popular supervised ML algorithm for detecting bleeding from CE images or videos. The algorithm stores the labeled pixels of CE images and classifies new pixels based on their similarity to them. In [
81], a KNN classifier was also trained on CE videos and achieved 97.50% accuracy with 94.33% sensitivity and 98.21% specificity. Kundu et al. employed a KNN model for detecting bleeding from CE images, which achieved an accuracy of 98.12%, a sensitivity of 94.98%, and a specificity of 98.55%. The article [
91] presented a bleeding detection approach for CE images and compared various ML algorithms such as SVM, AdaBoost, and KNN; KNN achieved the best results with 99.22% accuracy. In another study, a KNN classifier was employed by [
76] to distinguish the characteristics of bleeding and non-bleeding images. The classifier was trained with 200 color CE images and achieved 99.0% accuracy.
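The similarity-based voting that KNN performs can be sketched in a few lines. The example below is a minimal illustration, assuming Euclidean distance and two-dimensional feature vectors; the function name, the feature values, and the choice of k are hypothetical and do not reproduce any cited study.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training samples nearest to query.

    train : list of (feature_vector, label) pairs
    query : feature vector with the same length as the training vectors
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy training set of hypothetical (red ratio, green ratio) features.
train = [
    ((0.80, 0.10), "bleeding"),
    ((0.75, 0.15), "bleeding"),
    ((0.70, 0.12), "bleeding"),
    ((0.40, 0.35), "non-bleeding"),
    ((0.38, 0.33), "non-bleeding"),
    ((0.42, 0.30), "non-bleeding"),
]
label = knn_predict(train, (0.78, 0.11), k=3)
```

Because every prediction compares the query against all stored samples, the cost grows with the size of the training set, which matters when the samples are individual pixels.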
Artificial Neural Network (ANN), or Neural Network (NN), is an ML algorithm inspired by biological neural networks. The approach in [
49] employed an ANN with 3 input neurons, 22 hidden neurons, and 2 output neurons and a minimum squared error loss function. An ANN classifier was also applied in the preprocessing step to analyze the pixels in the CE images [
25]. An NN cell classifier was applied in [
65] to categorize bleeding and non-bleeding patches from CE images. Another study proposed a back-propagation NN to detect bleeding regions, which achieved 97% sensitivity and 90% specificity. A Probabilistic Neural Network (PNN) is a feedforward neural network based on radial basis functions and Bayesian theory. One such study [
95] applied a PNN to detect bleeding zones in CE images and achieved 93.1% sensitivity and 85.6% specificity. Multilayer Perceptron (MLP) is a type of fully connected feedforward ANN that is widely used in statistical pattern recognition. Several articles [
32,
105,
109,
112,
113] employed MLP neural networks to classify bleeding images and bleeding regions from CE images. The article [
102] proposed a Vector Supported Convex Hull classification algorithm, which was compared with SVM configured with two alternative feature selection approaches. The model achieved a sensitivity and specificity of 98% for bleeding detection. Another study [
90] suggested a computer-aided color feature-based bleeding detection technique using a modified ant colony optimization algorithm. The model achieved an accuracy, sensitivity, and specificity of 98.82%, 99.66%, and 98.01%, respectively.
Other ML: The Naive Bayes classifier is another ML classification algorithm based on Bayes' theorem. The studies [
53] and [
88] used a Naive Bayes classifier to detect bleeding from CE images. In [
37], another ML algorithm, K-means clustering, was applied to extract important features for summarizing CE video clips. Random Tree and Random Forest are tree-based ML algorithms for making decisions. In the study [
67], a Random Tree classifier was trained with 100 bleeding and 100 non-bleeding images for a computer-aided bleeding detection system. The classifier achieved an accuracy of 99%, sensitivity of 98%, and specificity of 99%. In [
68], both Random Tree and Random Forest classifier models outperformed MLP and Naive Bayes models for bleeding detection. Both models provided an accuracy, sensitivity, and specificity of 99.5%, 99%, and 100%, respectively. The Random Forest model was also used in [
66], which achieved 95.7% sensitivity and 92.3% specificity.
Combined multiple ML: In the study [
98], a block-based segmentation technique using local features was presented, and several ML algorithms, such as linear discriminant analysis, SVM, random decision forests, and AdaBoost, were applied to discriminate between bleeding and non-bleeding images. Using SVM and the K-means algorithm, a GI bleeding detection approach was presented in [
63] to detect bleeding images and the bleeding regions within them; it reported less computation time with 98.04% accuracy and 84.88% precision. According to [
77], an automatic detection system was designed to identify suspected blood indicators in CE images. The authors compared various ML classifiers, such as SVM, NN, and ISVM, which were trained with 136 normal and 214 abnormal images, and the TAF-SVM algorithm achieved a maximum accuracy of 98.13%. In addition, an automatic bleeding detection approach for CE video was suggested by Ghosh et al. using a cluster-based feature. The clustering information of the CE images was applied to the SVM classifier to detect bleeding zones in CE images and obtained a precision of 97.05%, an FPR of 1.1%, and an FNR of 22.38% [
38].
Deep Learning (DL): The most widely used deep learning approach for image classification and segmentation is the convolutional neural network (CNN). The CNN model has been utilized in a number of studies to detect bleeding in CE images [
40,
114,
115,
116,
117,
118]. The study of [
27] presented a Fully Convolutional Neural Network (FCN) model for automatic blood region segmentation. Another study [
119] proposed a Look-Behind FCN algorithm for abnormality detection (polyps, ulcers, and blood) in CE images that achieved an accuracy of 97.84%. In [
120], a LeNet model was trained and pre-trained AlexNet, VGG-Net, and GoogLeNet models were adopted to identify intestinal hemorrhage. In [
45], the authors applied a pre-trained AlexNet model for identification and a SegNet model for the segmentation of intestinal bleeding. Another CNN architecture, U-Net, was proposed in [
28,
43] to segment bleeding areas from CE images and videos. Xing et al. [
24] proposed a Saliency-aware Hybrid Network algorithm based on two densely connected convolutional networks (DenseNets) for an automatic bleeding detection system. The authors of [
121] developed a CNN model for detecting bleeding zones that was trained using SegNet layers with bleeding, non-bleeding, and background classes. Blood content detection using a ResNet architecture with 50 layers was suggested in [
122], which achieved an accuracy of 99.89%, a sensitivity of 96.63%, and a specificity of 99.96%. Hwang et al. used a CNN model based on VGGNet to identify lesions with 96.83% accuracy [
123]. Another CNN model for classifying bleeding images from CE was provided in [
41]. The model was created utilizing MobileNet and a custom-built CNN. To identify small-bowel angioectasia, the authors of [
124] used a 16-layer Single Shot MultiBox Detector (SSD) deep CNN method.
Combined ML & DL algorithms: In the study [
125], deep CNNs (VGG16 and VGG19) were applied to extract features from CE images. A KNN algorithm was proposed to classify bleeding images and achieved an accuracy of 99.42% and a precision of 99.51%. An automatic bleeding region segmentation technique was presented by [
126] using MLP and CNN models individually. In another study [
23], the authors proposed pre-trained deep CNN models (VGG19, InceptionV3, and ResNet50) to extract bleeding features, and ML algorithms (SVM, KNN, Linear Regression) were utilized to distinguish bleeding from non-bleeding images. In [
127], DenseNet was applied for feature extraction, and an MLP was trained on the extracted features to classify GI tract abdominal infections. In addition, the study [
128] applied the CNN model to extract bleeding features and the SVM classifier was used to detect bleeding.
4. Discussion
A CE device typically records about 8 hours of video in the GI tract. Few studies utilize video to detect bleeding abnormalities.
Figure 3 shows an overview of the domains and algorithms used in the papers reviewed in this study. To detect bleeding from CE images, researchers addressed three tasks: classification, segmentation, and combined (classification + segmentation). Articles that proposed a classification task showed an average accuracy of 95.89% ± 3.11, sensitivity of 94.91% ± 4.46, and specificity of 94.95% ± 4.35. Segmentation algorithms achieved an average accuracy of 94.95% ± 2.91, sensitivity of 95.18% ± 3.99, and specificity of 96.30% ± 1.98. Articles that proposed a combined task achieved an average accuracy of 97.17% ± 2.08, sensitivity of 96.10% ± 4.89, and specificity of 96.62% ± 4.15. Based on the above literature analysis, the combined task performed better. One significant benefit of the current methods is their ability to identify bleeding in images/frames from CE and pinpoint the specific bleeding region. However, a limitation of these methods is their inability to measure the extent or depth of the bleeding area.
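The performance measures compared throughout this review follow from the confusion matrix of a binary bleeding detector. The sketch below uses the standard definitions; the function name and the example counts are hypothetical. It also shows why complementary pairs appear in some reported results, for example an 87.5% sensitivity alongside a 12.5% miss detection rate.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute standard binary detection metrics from confusion-matrix counts.

    tp: bleeding samples detected as bleeding
    fp: non-bleeding samples detected as bleeding
    tn: non-bleeding samples detected as non-bleeding
    fn: bleeding samples detected as non-bleeding
    Assumes each denominator below is non-zero.
    """
    sensitivity = tp / (tp + fn)      # also called recall
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": tp / (tp + fp),
        "miss_rate": 1.0 - sensitivity,             # false negative rate
        "false_positive_rate": 1.0 - specificity,
    }


# Hypothetical counts for 100 bleeding and 100 non-bleeding test images.
metrics = detection_metrics(tp=90, fp=5, tn=95, fn=10)
```

Because accuracy depends on the class balance of the test set, sensitivity and specificity are usually reported alongside it.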
Feature extraction is an essential part of bleeding detection from CE images. The feature values are extracted from the color channels of CE images, and the performance of bleeding detection algorithms depends directly on them. Many color spaces have been presented to identify bleeding. RGB is the most popular color space for feature extraction because it is the default color space. Apart from RGB, several studies proposed individual color channels (R or G or B, etc.), color channel pixel ratios (R/G, G/R, etc.), or various other color spaces (HSV, YIQ, YCbCr, CIE Lab, CIE XYZ, K-L, etc.) to extract appropriate bleeding features. A few researchers applied two or more color spaces together to extract features. According to the taxonomy, all the suggested color spaces were categorized into four groups: RGB, HSV, Combined (multiple color spaces), and Other (YIQ, YCbCr, CIE Lab, CIE XYZ, etc.). It is important to acknowledge that the performance results presented in this article are taken directly from the original papers. The box plots in all the figures compare the performance between groups of methods, rather than individual algorithms. Statistical measures such as the mean, median, 25th percentile, and 75th percentile are used for each group. The performance results for bleeding detection with different color spaces are shown in
Figure 4 using a box plot. According to the figure, the choice of color space does not provide any clear performance benefit. All the color spaces provided similar results except the ‘Other’ color space. When comparing all color spaces, the RGB space has a slightly higher recall value. It should be noted that recall is the most important performance criterion in the detection of bleeding. In addition, the RGB color space achieved lower variance for accuracy, recall, and specificity. The current methods make a significant contribution by investigating all potential color spaces to detect bleeding in capsule endoscopy.
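To make the color features discussed above concrete, the following minimal sketch derives channel ratios and converts one RGB pixel to HSV and YIQ with Python's standard `colorsys` module. The function name, the small epsilon used to avoid division by zero, and the example pixel are assumptions for illustration; YCbCr, CIE Lab, and CIE XYZ are not covered by `colorsys` and would need additional conversion code.

```python
import colorsys


def color_features(r, g, b, eps=1e-6):
    """Return channel ratios and HSV / YIQ coordinates for one RGB pixel.

    r, g, b are 8-bit channel values in [0, 255].
    eps is a hypothetical guard against division by zero.
    """
    rn, gn, bn = r / 255.0, g / 255.0, b / 255.0
    return {
        "R/G": r / (g + eps),
        "G/R": g / (r + eps),
        "hsv": colorsys.rgb_to_hsv(rn, gn, bn),   # each component in [0, 1]
        "yiq": colorsys.rgb_to_yiq(rn, gn, bn),
    }


# A hypothetical reddish pixel, not taken from a real CE image.
features = color_features(200, 100, 50)
```

Each conversion adds a per-pixel computation on top of the RGB values that the capsule already delivers, so working directly in RGB avoids that extra step.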
A typical approach for extracting bleeding features from whole CE images is the global feature extraction domain. Because the approach analyzes the entire image at once, the complexity and processing time are increased. It provides an average result, which is shown in
Figure 5 using a box plot. To address the problem, various studies proposed a pixel-level feature extraction domain, in which each pixel of the CE image is analyzed. This technique improved the result but did not reduce the complexity and computation time. Several authors selected a portion of the CE image (a specific block size, ROI, or POI) during preprocessing for feature extraction, which is known as the local feature extraction domain. This technique improved the result and also reduced the complexity and processing time. Recently, a few researchers applied both global and local feature extraction domains in a computer-aided bleeding detection system, which significantly enhanced the detection accuracy compared to either individual domain. Texture and statistical values (mean, mode, variance, moment, entropy, energy, skewness, kurtosis, etc.) are calculated from the feature values. Finally, a classification or segmentation algorithm uses the extracted values to detect bleeding from CE images. According to
Figure 5, the combined feature extraction domain outperformed the other domains in terms of accuracy, sensitivity, and specificity because it was tested at both the pixel and image levels. Also, a CNN model is used to train the model at the pixel level, and after that, a classification model is applied to detect bleeding. The current feature extraction methods have certain limitations, such as introducing bias (which is influenced by the chosen algorithm), increased complexity, overfitting, and reduced generalizability.
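The statistical values listed above can be computed from the intensities of a block or region of interest. The sketch below is a minimal illustration using population moments and a histogram-based entropy and energy; the function name and these particular definitions are assumptions, and individual studies may define the features differently.

```python
import math
from collections import Counter


def statistical_features(values):
    """Compute first-order statistics of a non-empty list of intensities."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(variance)
    # Standardized third and fourth moments; defined as 0.0 for a flat block.
    skewness = sum((v - mean) ** 3 for v in values) / (n * std ** 3) if std else 0.0
    kurtosis = sum((v - mean) ** 4 for v in values) / (n * std ** 4) if std else 0.0
    # Relative frequency of each distinct intensity value.
    probs = [count / n for count in Counter(values).values()]
    entropy = sum(p * math.log2(1.0 / p) for p in probs)
    energy = sum(p * p for p in probs)
    return {
        "mean": mean,
        "variance": variance,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "entropy": entropy,
        "energy": energy,
    }


# Toy block of red-channel intensities (hypothetical values).
block = [1, 1, 3, 3]
stats = statistical_features(block)
```

Applying the same function to the whole image gives a global descriptor, while applying it to a selected block or ROI gives a local one.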
The majority of the reviewed literature proposed various ML algorithms that were trained with texture and statistical features to identify bleeding in CE images. Before the introduction of advanced ML algorithms, researchers set a threshold on the extracted feature values to detect bleeding in CE. The review found that the largest number of articles used KNN, SVM, MLP, NN, and CNN algorithms. Besides these algorithms, a few ML algorithms (PCA, Random Tree, Random Forest, fuzzy means, Expectation Maximization clustering, and Supported Convex Hull) were used, which are called "Other ML" in this study. Also, a few DL algorithms (FCN, SegNet, U-Net, DenseNet, ResNet50, VGGNet, and MobileNet) were used, which are referred to as "Other DL". However, the performance of the ML techniques depends on the features extracted from the color channels, and the color channel intensity values of bleeding and non-bleeding pixels overlap. As a result, utilizing ML approaches to distinguish between bleeding and non-bleeding is problematic. In the last few years, researchers have proposed DL techniques to identify bleeding from CE images. DL is an end-to-end classification and segmentation approach that extracts features automatically at the pixel level. Unlike ML, the DL approach does not require a separate feature extraction stage, and it extracts features automatically to provide more efficient outcomes. The performance results for different state-of-the-art ML and DL algorithms are shown in
Figure 6 using a box plot. From the figure, it is observed that both the KNN and DL algorithms outperformed the other algorithms. While the existing methods have made significant contributions to the development of classification and segmentation algorithms for detecting bleeding with satisfactory performance, they have often been tested on a limited number of test samples, such as images. Furthermore, deep learning algorithms have not yet incorporated attention mechanisms to further enhance their performance.
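The group-level summaries behind such box plots (mean, median, 25th percentile, and 75th percentile) can be computed as in the sketch below. The function name and the accuracy values are hypothetical placeholders, not results from the reviewed papers, and the "inclusive" quantile method is one of several possible conventions.

```python
import statistics


def box_plot_summary(values):
    """Summarize one group of reported results for a box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": median,
        "q1": q1,     # 25th percentile
        "q3": q3,     # 75th percentile
    }


# Hypothetical accuracies (%) reported by five papers in one algorithm group.
group_accuracy = [90.0, 92.0, 94.0, 96.0, 98.0]
summary = box_plot_summary(group_accuracy)
```

Comparing groups in this way describes the spread of reported results, but the underlying papers use different datasets, so the comparison is indicative rather than a controlled benchmark.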
The color spaces most commonly used by available state-of-the-art algorithms for bleeding detection from CE images are RGB, HSV, YIQ, YCbCr, CIE Lab, CIE XYZ, K-L, etc. Among them, the above review shows that the RGB space has a slightly higher recall value and achieved lower variance for accuracy, recall, and specificity. Using the RGB color space also improves the execution speed of a computer-aided system because no color conversion operations are required. Therefore, for practical use, the RGB color space is the best option, as there is no need to convert images into other color domains. For feature extraction, the combined global and local feature extraction domain showed greater detection accuracy than either individual domain, which makes it more suitable for practical use. For bleeding detection from CE images, most of the literature proposed ML algorithms, including SVM, KNN, PCA, MLP, NN, Random Tree, and Random Forest. Most of the DL articles used CNN, FCN, SegNet, U-Net, DenseNet, ResNet50, VGGNet, and MobileNet. From the above review of the literature, the KNN and DL algorithms outperformed the other algorithms. For ML algorithms, the color channel intensity values overlap between bleeding and non-bleeding pixels, whereas the DL approach does not require a separate feature extraction stage and extracts features automatically, which is more effective for practical use.