1. Introduction
Endoscopic technology plays an important role in the diagnosis and treatment of diseases. However, the practical implementation of medical endoscopy is frequently impeded by low-light environments, stemming from the intricate physiological structures of internal organs and the utilization of point directional light sources. This impedes the ability of physicians to accurately identify and localize lesions or areas of pathology. Low-light image enhancement has emerged as an effective method to address endoscopic image quality issues, aiming to enhance the visibility and interpretability of the images.
Image enhancement methods typically fall into two main categories: traditional algorithms and deep learning approaches. Traditional image enhancement models include histogram equalization (HE) [1,2,3,4,5,6] and the Retinex model [7,8,9,10,11,12,13,14]. Histogram equalization enhances contrast by remapping the dynamic pixel range of an image toward a uniform distribution. Because histogram equalization ignores the spatial relationships between pixels, it can cause information loss and over-enhancement of illumination. To address this problem, Ibrahim et al. [
2] proposed a histogram regionalization method that assigns a new dynamic range to each partition. The Contrast Limited Adaptive Histogram Equalization (CLAHE) [
1] algorithm enhances image contrast by adaptively adjusting the histogram of local regions, preventing the over-amplification of noise and improving overall visibility. The Retinex method aims to present the image in a way consistent with human perception; it assumes that the image is composed of two parts: reflectance and illumination. Typically, reflectance is treated as a constant and only the illumination is estimated. The approach centers on the illumination component, improving perceived image quality by emphasizing the role of lighting in our visual experience. Guo et al. proposed the LIME algorithm [
11] to enhance low-light images by estimating the illumination map. Tan et al. [
15] decomposed the image into two layers: a detail layer and a base layer. Vascular information is enhanced through the channels of the detail layer, while adaptive illumination correction is applied to the base layer. In EIEN [
16], the image is decomposed into illumination and reflectance components, which are then processed separately; the reconstructed image is obtained by multiplying the enhanced illumination and reflectance components. Tanaka et al. [
17] proposed a gradient-based low-light image enhancement algorithm that emphasizes gradients when enhancing dark regions. Wang et al. [
13] proposed an initial illumination weighting method that improves the illumination uniformity of an image by incorporating the inverse square law of illumination while controlling exposure, color deviation, and noise. The method effectively improves the illumination and uniformity of endoscopic images in terms of both visual perception and objective evaluation. Fang et al. [
3] proposed a conventional algorithm to enhance the illumination of endoscopic images, based on a modified unsharp mask and the CLAHE algorithm. Acharya et al. [
4] presented an adaptive histogram equalization technique based on a genetic algorithm. The framework incorporates a genetic algorithm, histogram segmentation, and a modified probability density function. LR3M [
18] considers noise generated during low-light image or video enhancement and applies two stages to enhance the image and suppress noise, respectively. These traditional algorithms provide the benefits of high reliability and interpretability. Nevertheless, they often involve manual feature selection in their physical models, and the effectiveness of enhancement results depends on the accuracy of the selected features.
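To make the basic HE technique concrete, the following NumPy snippet is a generic sketch (not any of the cited variants): it builds the normalized cumulative histogram of an 8-bit image and uses it as an intensity look-up table, which pushes the output distribution toward uniform.

```python
import numpy as np

def equalize_histogram(img: np.ndarray) -> np.ndarray:
    """Global histogram equalization for an 8-bit grayscale image.

    Maps each intensity through the normalized cumulative histogram (CDF),
    so that frequently occurring intensity ranges are stretched apart.
    """
    hist = np.bincount(img.ravel(), minlength=256)   # per-intensity counts
    cdf = hist.cumsum()
    cdf = cdf / cdf[-1]                              # normalize CDF to [0, 1]
    lut = np.round(cdf * 255).astype(np.uint8)       # intensity look-up table
    return lut[img]

# A dark synthetic image: intensities crowded into the low end of the range.
rng = np.random.default_rng(0)
dark = np.clip(rng.normal(40, 10, (64, 64)), 0, 255).astype(np.uint8)
enhanced = equalize_histogram(dark)                  # dynamic range is stretched
```

Because the mapping depends only on the global histogram, spatial pixel relationships are ignored, which is exactly the limitation that partition-based methods such as CLAHE address by equalizing local regions with a clip limit on the histogram.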
Deep learning methods have made significant advances in enhancement quality in recent years, owing to their capacity to automatically learn features from large image datasets for low-light image enhancement. LLNet [
19] was the first deep learning network designed to enhance images captured in low-light natural scenes. Subsequently, numerous deep learning methods for enhancing image illumination have emerged [20,21,22,23,24,25,26]. Generative adversarial networks play an important role in synthesizing datasets to overcome the difficulty of collecting paired data, with commonly employed synthesis methods including Pix2Pix [
27] and CycleGan [
28]. Zero-DCE [
29] estimated light enhancement as a set of image-specific curves and designed non-reference loss functions for training within the dynamic range of a given image, meeting the requirements of lightweight networks. FLW [
30] designed a lightweight enhancement network with global and local feature-extraction adjustments, proving effective for enhancing low-light images. While these algorithms have yielded satisfactory results on natural images, their efficacy is limited on medical endoscopic images. The internal cavity environment exhibits weak texture owing to non-Lambertian reflections from tissue, and the structural configuration of the cavity, coupled with the use of point light sources, produces images with uneven brightness. Applying existing algorithms directly in such environments fails to brighten cavity depressions and neglects overall brightness uniformity and overexposure, both of which are critical for expanding the surgeon's field of view and executing surgical maneuvers. Existing network models also introduce a degree of smoothing of the detailed tissue-structure information in endoscopic images during brightness enhancement, yet the detail in weakly textured images is a crucial basis for diagnosis and treatment and therefore requires emphasis. Finally, unlike natural images, endoscopic applications demand strict color fidelity, and prevailing methods typically exhibit substantial color bias in such settings, rendering them unsuitable for direct application to scene brightness enhancement.
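For concreteness, the curve-based adjustment used by Zero-DCE-style methods can be sketched as an iterated quadratic mapping LE(x) = x + αx(1 − x). In the actual method, α is a per-pixel map in [−1, 1] predicted by the network; the constant map below is purely illustrative.

```python
import numpy as np

def apply_le_curve(x: np.ndarray, alpha: np.ndarray, iterations: int = 8) -> np.ndarray:
    """Iteratively apply the light-enhancement curve LE(x) = x + alpha*x*(1-x).

    With pixel values x normalized to [0, 1] and alpha in [-1, 1], each
    iteration keeps the output within [0, 1] while brightening dark pixels
    more strongly than bright ones (the quadratic term vanishes at 0 and 1).
    """
    for _ in range(iterations):
        x = x + alpha * x * (1.0 - x)
    return x

low = np.full((4, 4), 0.1)                       # uniformly dark input
bright = apply_le_curve(low, alpha=np.full((4, 4), 0.8))
```

The monotone, range-preserving form of the curve is what allows such methods to train with non-reference losses instead of paired ground truth.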
This paper designs an illumination enhancement network specifically for low-light endoscopic images. The network comprises a decomposition module, a global illumination module, a local feature extraction module with dual attention, and a denoising module. The loss function accounts for color difference, structure, and illumination. Experimental results on the Endo4IE [
31] dataset demonstrate that the proposed method outperforms existing state-of-the-art methods in terms of Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM), and Learned Perceptual Image Patch Similarity (LPIPS).
In summary, the key contributions of this work are as follows:
A novel network architecture is proposed for the global and local enhancement of low-light images in endoscopic environments. The network addresses the global brightness imbalance and the weak organizational texture commonly found in endoscopic images by integrating global illumination, local detail enhancement, and noise reduction, thereby achieving a balanced enhancement of brightness in endoscopic images;
The global illumination enhancement module mitigates the luminance inhomogeneity in endoscopic images resulting from the use of point light sources and the tissue structure environment, approaching the problem from the perspective of overall image illumination. Inspired by the Retinex methodology, the module extracts the overall image illumination through model decomposition and optimizes a higher-order curve function using histogram information to automatically supplement image luminance;
Addressing the weak texture characteristics of endoscopic images, the local enhancement module incorporates a dual-attention feature enhancement mechanism. By integrating curvilinear attention and spatial attention, it strengthens the expression of local detailed features, effectively improving the detailed expression of the image organizational structure.
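As a minimal sketch of the Retinex-style decomposition I = R ∘ L that inspires the global illumination module, the snippet below uses a LIME-style max-RGB initialization of the illumination map; this is an assumed illustration of the general principle, not the proposed network itself, and the usual illumination refinement/smoothing step is omitted for brevity.

```python
import numpy as np

def estimate_illumination(img: np.ndarray, eps: float = 1e-4):
    """Retinex-style decomposition I = R * L for an RGB image in [0, 1].

    The illumination L is initialized as the per-pixel maximum over the
    color channels (LIME-style); the reflectance R is then recovered by
    the safe division R = I / max(L, eps).
    """
    L = img.max(axis=2, keepdims=True)    # initial illumination map, shape (H, W, 1)
    R = img / np.maximum(L, eps)          # reflectance, guarded against division by zero
    return R, L

rng = np.random.default_rng(1)
img = rng.uniform(0.0, 0.3, (8, 8, 3))    # a dim synthetic image
R, L = estimate_illumination(img)
recon = R * np.maximum(L, 1e-4)           # multiplying the parts reconstructs the input
```

Enhancing L (e.g., via a gamma or curve adjustment) while keeping R fixed then brightens the scene without altering the reflectance, which is the property that motivates treating illumination and detail separately.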
In this paper, Section 2 details the proposed image enhancement method, Section 3 covers the related experiments, Section 4 provides the conclusions, and Section 5 offers discussions.
Author Contributions
Conceptualization, En Mou and Huiqian Wang; methodology, En Mou and Yu Pang; software, En Mou and Enling Cao; validation, En Mou, Huiqian Wang and Yu Pang; formal analysis, En Mou; investigation, En Mou and Meifang Yang; resources, En Mou and Huiqian Wang; data curation, En Mou; writing—original draft preparation, En Mou, Meifang Yang, and Yuanyuan Chen; writing—review and editing, En Mou and Chunlan Ran; visualization, En Mou and Yu Pang; supervision, Huiqian Wang; project administration, Huiqian Wang; funding acquisition, Huiqian Wang. All authors have read and agreed to the published version of the manuscript.