1. Introduction
Image enhancement is the process of applying specific techniques to improve an image’s visual quality. These techniques can target diverse criteria, such as increasing image contrast, reducing noise, highlighting relevant details, or adjusting brightness and color saturation. The main goal of image enhancement is to make the information within the image more easily interpretable and perceptible to human viewers, as well as to support automated applications such as pattern recognition, medical analysis, and computer vision, among others [1,2,3].
Traditional image enhancement methods are divided into two well-defined categories [4,5]. The first is spatial domain enhancement, which directly modifies pixel values to adjust aspects such as contrast and image detail, using techniques like smoothing and sharpening filters. The second is frequency domain enhancement, which transforms the image into the frequency domain (typically via the Fourier transform) to adjust its frequency components, allowing fine detail enhancement and the reduction of undesirable patterns through techniques such as high-pass or low-pass filtering. Both categories have specific applications and are chosen according to the type of improvement desired.
It is worth noting that not all images require the same enhancement process, since the enhancement strategy depends on the image’s specific characteristics. For example, a low-contrast image can benefit significantly from spatial domain enhancement techniques, such as contrast adjustment or histogram equalization, which enhance the image’s sharpness [6,7]. On the other hand, in medical images such as magnetic resonance images, highlighting fine details can be critical; therefore, frequency domain enhancement techniques are more suitable for adjusting the high frequencies and emphasizing the image’s internal structures [8]. Hence, there is no single ideal operator for all images, and there is no single quantitative metric that automatically evaluates image quality. Automatic image enhancement, which produces enhanced images without human intervention, is an extremely complicated task in image processing [9].
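As an illustration of the spatial domain techniques mentioned above, the following is a minimal sketch of global histogram equalization for an 8-bit grayscale image. The function name `equalize_histogram` and the synthetic test image are assumptions introduced here for illustration; they are not taken from the cited works.

```python
import numpy as np


def equalize_histogram(image: np.ndarray) -> np.ndarray:
    """Global histogram equalization for an 8-bit grayscale image (sketch)."""
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # cumulative count at the darkest level present
    total = image.size
    if total == cdf_min:  # constant image: nothing to equalize
        return image.copy()
    # Map each gray level through the rescaled cumulative distribution.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]


# Hypothetical low-contrast image: gray levels confined to [100, 140).
rng = np.random.default_rng(0)
low_contrast = rng.integers(100, 140, size=(64, 64), dtype=np.uint8)
equalized = equalize_histogram(low_contrast)
```

After equalization, the gray levels of the synthetic image span the full 8-bit range, which is the contrast-stretching effect that the spatial domain methods rely on.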
Image enhancement methods are commonly parametric, implying that their efficacy relies significantly on fine-tuning various parameters. Within this context, evolutionary algorithms (EAs) stand out as highly effective tools. Pioneering work in this area is that of Bhandarkar et al. [10], where a genetic algorithm (GA) is employed for an image segmentation problem. The outcomes showed that the GA outperforms traditional image segmentation methods regarding accuracy and robustness against noise. In the following years, numerous studies proposed different approaches to address image enhancement problems through EAs. In [11], an optimization-based process solved through the particle swarm optimization (PSO) algorithm is employed to enhance the contrast and details of images by adjusting the parameters of a transformation function. This adjustment takes into account the relationship between local and global information. Subsequently, in [12], an accelerated variant of PSO for the aforementioned transformation function is proposed, achieving a more efficient algorithm in terms of convergence.
The use of EAs to solve image enhancement problems has remained a common trend in recent years. This is observed in works such as [13], where the differential evolution (DE) algorithm is employed to maximize image contrast through a modified sigmoid transformation function. This function has parameters that control the contrast magnitude and the balance between bright and dark areas, with optimal values determined through the evolutionary process. Similarly, Bhandari and Maurya [14] develop a novel optimized histogram equalization method that preserves average brightness while improving the contrast by means of a cuckoo search algorithm. The proposed method uses plateau boundaries to modify the image histogram, avoiding the extreme brightness changes often caused by traditional histogram equalization. In [15], a GA is applied to optimize histogram equalization through an optimal subdivision that considers different delimited light exposure regions. In [16], directed searching optimization is applied to medical image enhancement, improving the contrast while preserving the texture in specified zones through a threshold parameter. The popularity of EAs has led to more sophisticated approaches, as in [17], where a hybridization of the whale optimization and chameleon swarm algorithms is proposed specifically to find the optimal parameters of the incomplete beta function and gamma dual correction. Several other EAs have been applied to image enhancement problems, such as monarch butterfly optimization [18], the chimp optimization algorithm [19], sunflower optimization [20], and the slime mold algorithm [21], among others.
In the last decade, multi-objective optimization in image processing has also been the subject of several investigations. The relevance of the multi-objective approach lies in the need to balance multiple quality criteria simultaneously. In many cases, improving one image characteristic may worsen another. This inherent conflict between objectives requires an approach that obtains a set of optimal trade-off solutions, known as the Pareto front. For example, in [22], a PSO variant is proposed to address a multi-objective problem aiming to simultaneously maximize the amount of available information (by means of the entropy) and minimize the resulting image distortion (measured by the structural similarity index). In [23], a GA is employed to maximize the Poisson log-likelihood function (used to measure quantitative accuracy) and the generalized scan-statistic model (used to measure detection performance). Similarly, in [24], a multi-objective cuckoo search algorithm is employed to tune the parameter of adaptive histogram equalization. The objectives were to enhance contrast by maximizing entropy and to minimize the fast noise variance estimation. In [25], the Non-dominated Sorting Genetic Algorithm based on Reference Points (NSGA-III) is used to obtain optimal parameters for anisotropic diffusion, aiming to produce effective filtering results. The proposed methodology seeks to balance two competing objectives: image noise content and contrast.
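The notion of a Pareto front used throughout these works can be illustrated with a short sketch that filters the non-dominated points from a set of candidate solutions. It assumes that both objectives are to be maximized; the function name `non_dominated_mask` and the candidate values are hypothetical and serve only as an example.

```python
import numpy as np


def non_dominated_mask(objectives: np.ndarray) -> np.ndarray:
    """Boolean mask of non-dominated rows, assuming every objective is maximized."""
    n = objectives.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        # Row j dominates row i if it is no worse in every objective
        # and strictly better in at least one.
        no_worse = np.all(objectives >= objectives[i], axis=1)
        better = np.any(objectives > objectives[i], axis=1)
        if np.any(no_worse & better):
            mask[i] = False
    return mask


# Hypothetical candidates as [objective 1, objective 2] pairs.
candidates = np.array(
    [[0.9, 0.2], [0.7, 0.6], [0.3, 0.9], [0.5, 0.5], [0.2, 0.1]]
)
mask = non_dominated_mask(candidates)
pareto_front = candidates[mask]
```

In this example the last two candidates are dominated by `[0.7, 0.6]`, so only the first three remain on the front. The pairwise comparison costs quadratic time in the number of candidates, which is acceptable for a sketch but is not how NSGA-II or NSGA-III organize their sorting.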
The Pareto front approach implemented in image enhancement tasks offers a variety of solutions. A decision-maker can select the most suitable solution based on the specific needs of the application. In the context of multi-objective optimization, the incorporation of the decision-maker’s preferences is known as preference articulation [26,27,28]. Even though preference articulation is a fundamental aspect of multi-objective optimization, its explicit use in works on image enhancement with a multi-objective optimization approach seems scarce.
Therefore, the current work focuses on image enhancement, considering two essential properties in image processing: contrast and details. Improving these properties through transformation functions generally involves a compromise, meaning that increasing contrast can lead to a significant loss of details in the image. Hence, a multi-objective optimization problem with two objective functions is established: one related to the image contrast and the other to the image details. The image contrast is measured by means of the entropy and the standard deviation, while the image details are measured by the number and intensity of pixels in high-frequency regions. The contrast enhancement is performed by the sigmoid transform function, and the detail enhancement is performed by unsharp masking and highboost filtering. To address the problem, the NSGA-II [29] is used with a posterior preference articulation, meaning that specific solutions are selected once the Pareto front has been computed. The current work offers the following contributions:
The trade-off between the enhancement of image contrast and details is set as a multi-objective problem. Unlike the traditional single-objective approach, which provides only one optimal solution with a predefined priority, the current proposal offers the best solutions regarding the compromises between both criteria along the Pareto front.
A posterior preference operator is articulated, providing three key images from the Pareto front: the image with maximum contrast, the image with maximum detail, and the image at the knee of the front, which represents the image closest to the utopia point. This operator allows the user to select the most suitable solutions to their particular needs.
An experiment is conducted with images of two categories: medical and natural scene images. Both categories represent research fields where image processing is an essential endeavor. The results of this experiment demonstrate that the NSGA-II achieves images of superior quality compared to the original instances. Furthermore, a thorough analysis is conducted regarding the suitability of the obtained images according to the established preferences. For medical images, the evaluation focuses on how the selected solutions enhance the clarity and detail of relevant structures, which is crucial for diagnostics and analysis. For natural scene images, the analysis shows how the solutions improve contrast and detail, making the images more visually appealing and impactful.
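To make the two enhancement operators named above more concrete, the following is a minimal sketch of a sigmoid contrast transform followed by unsharp masking with highboost filtering on an image normalized to [0, 1]. The exact formulations used in this work are those given in Section 2; the sigmoid form, the box blur used as the smoothing step, the order of application, and the parameter names `gain`, `cutoff`, `k`, and `radius` are all assumptions made here for illustration.

```python
import numpy as np


def sigmoid_contrast(image: np.ndarray, gain: float = 10.0, cutoff: float = 0.5) -> np.ndarray:
    """Sigmoid intensity mapping for an image in [0, 1] (assumed form)."""
    return 1.0 / (1.0 + np.exp(gain * (cutoff - image)))


def box_blur(image: np.ndarray, radius: int = 1) -> np.ndarray:
    """Mean filter with edge replication, used here as the smoothing step."""
    size = 2 * radius + 1
    padded = np.pad(image, radius, mode="edge")
    out = np.zeros(image.shape, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / (size * size)


def highboost(image: np.ndarray, k: float = 1.5, radius: int = 1) -> np.ndarray:
    """Add k times the unsharp mask (image minus its blur) back to the image.

    k = 1 corresponds to unsharp masking and k > 1 to highboost filtering.
    """
    mask = image - box_blur(image, radius)
    return np.clip(image + k * mask, 0.0, 1.0)


# Hypothetical low-contrast image with intensities in [0.4, 0.6].
rng = np.random.default_rng(1)
img = rng.uniform(0.4, 0.6, size=(32, 32))
stretched = sigmoid_contrast(img, gain=10.0, cutoff=0.5)
enhanced = highboost(stretched, k=1.5)
```

In a setting like the one described here, parameters such as `gain`, `cutoff`, and `k` would be the decision variables tuned by the optimizer rather than fixed by hand.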
The remainder of this paper is organized as follows: Section 2 outlines the sigmoid correction and unsharp masking with highboost filtering methods used in this study. In Section 3, we present the proposed multi-objective optimization model and posterior preference articulation method. Section 4 provides the benchmark results, including experimental design, graphical analysis, and quantitative evaluations. Finally, Section 5 offers conclusions and suggestions for future research directions.
4. Benchmark Results and Discussion
In this section, the fundamental aspects of the experiment are described. Subsequently, the results are presented and discussed in two parts: first, a visual analysis of the results is conducted, followed by a discussion of the numerical results for the images using well-established indicators.
4.1. Experimental Design
A set of twenty images is selected to assess the effectiveness of the method developed in this work. This set is divided into two groups. The first includes 10 natural scene images extracted from the Kodak dataset [32], specifically kodim01 to kodim10 (hereinafter referred to as Natural1 to Natural10, in that order). The second group consists of 10 medical images (referred to as Medical1 to Medical10, respectively) selected from various libraries, including brain images [33,34], blood composition images (white blood cells of the basophil and eosinophil types) [35,36], X-rays [37], ocular nodules [38], dental infections [39], microphotographs of pulmonary blood vessels [40], and traumatic forearm positioning [41].
The NSGA-II algorithm is executed for each image with a maximum of 30,000 function evaluations, aiming to produce a Pareto front containing the best solutions. From this front, solutions are extracted according to the defined preference articulation operator. The objective is to evaluate the quality of these solutions in terms of contrast and details, complemented by a visual analysis to determine the suitability of each image for specific purposes. Finally, the similarity of the enhanced images to the original ones is quantified using the structural similarity index (SSIM).
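The extraction of the three preferred solutions from a computed front can be sketched as follows. The sketch assumes that both objectives (contrast and detail) are maximized and that the knee is taken as the point closest, in Euclidean distance, to the utopia point after each objective is normalized to [0, 1]; the normalization step, the function name `select_preferred`, and the sample front are assumptions made for illustration.

```python
import numpy as np


def select_preferred(front: np.ndarray) -> dict:
    """Pick three indices from an (n, 2) front of [contrast, detail] values."""
    lo, hi = front.min(axis=0), front.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    normalized = (front - lo) / span
    utopia = normalized.max(axis=0)  # best value of each objective
    distances = np.linalg.norm(normalized - utopia, axis=1)
    return {
        "max_contrast": int(np.argmax(front[:, 0])),
        "max_detail": int(np.argmax(front[:, 1])),
        "knee": int(np.argmin(distances)),
    }


# Hypothetical Pareto front as [contrast, detail] pairs.
front = np.array([[0.95, 0.10], [0.80, 0.70], [0.50, 0.85], [0.10, 0.95]])
picks = select_preferred(front)
```

For this sample front, the extremes are the first and last points, and the second point is selected as the knee because it lies closest to the normalized utopia point.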
The parameter values used in this experiment are as follows: population size (), number of variables (), number of objective functions (), number of evaluations (), mutation probability (), crossover probability (), simulated binary crossover parameter (), polynomial mutation parameter (), details threshold (), lower bound (), and upper bound ().
4.2. Graphical Results
Table 1, Table 2, Table 3, and Table 4 present the results obtained through the multi-objective optimization image enhancement approach. Specifically, Table 1 and Table 2 show the results for natural images, while Table 3 and Table 4 show the results for medical images. The tables are organized as follows: the first and second columns list the image names and their corresponding original, unenhanced versions. The third to fifth columns show the selected points from the Pareto front, representing the maximum contrast, knee point, and maximum detail, respectively. The final column illustrates the Pareto front obtained through the optimization process, with red, green, and orange points indicating the images that achieved maximum contrast, knee point, and maximum detail, respectively.
As observed in the results, the images extracted from the Pareto front significantly improve both contrast and detail compared to the original images. In all study cases, the original image is dominated by the solutions extracted from the fronts, demonstrating the approach’s effectiveness in improving visual quality. However, the differences among the three enhanced images for each problem require a more detailed analysis.
In the natural images, the differences among the three preferred images are more subtle, given that these are high-quality images with inherently low contrast, specifically selected for contrast enhancement exercises. More pronounced differences are observed in the images Natural1, Natural6, and Natural8. For the Natural6 image, there is a general improvement in overall details. However, specific regions, such as the highlighted flower in the yellow box, may lose details compared to the original image, which retains more information. This suggests that for future work, it may be advisable to apply local and/or adaptive image enhancement techniques to preserve details in specific regions while maintaining overall image quality.
For medical images, there are instances where differences are more perceptible. For example, in the Medical3 image, the maximum contrast solution makes it difficult to visualize the internal details of the basophil (white blood cell), which could result in less accurate interpretation. In contrast, the knee solution and maximum detail solution provide a clearer view of the interior of the white blood cell. Similarly, in the Medical5 image, the maximum contrast solution highlights the bone structures of the hand and arm. However, the maximum detail image offers a more precise view of the internal structures within the bones, which is crucial for a more detailed evaluation. Another notable example is the Medical8 image, where the maximum detail solution offers a more detailed view of the internal structure of the eosinophil (another type of white blood cell). However, the maximum contrast image improves the visibility of red blood cells. As shown in the yellow box, this solution reveals a red blood cell that is nearly imperceptible in the other solutions. An interesting case is the Medical6 image, where only a few non-dominated solutions are present on the Pareto front. Despite the similarities among the preferred solutions, the nodules are much more perceptible in the enhanced images than in the original image.
The solutions extracted from the Pareto front represent optimal trade-offs between contrast and detail. For natural images, these three alternatives can be considered useful based on aesthetic criteria or in subsequent automatic processes that require prioritizing one property over another. In the case of medical images, these alternatives allow for a more precise evaluation suited to different diagnostic needs, providing a flexible approach to enhance the visualization of critical details according to the clinical context.
4.3. Quantitative Results
Table 5 and Table 6 present the numerical results for several criteria, organized as follows. For each image, whose name is presented in the first column, a set of three rows displays the outcomes of the Pareto front’s maximum contrast, knee point, and maximum detail solutions. The entropy, normalized standard deviation, number of pixels, and pixel intensity of each solution are displayed from the third to the sixth column. The seventh and eighth columns display the objective function values of the multi-objective optimization problem. The last column displays the SSIM with respect to the original image; values above 0.7 (SSIM > 0.7) are shown in boldface, indicating that these images accomplished the enhancement with an acceptable similarity to the original image.
As can be seen, the maximum contrast solutions generally yield higher entropy and normalized standard deviation, indicating a broader range of pixel intensities and greater variability in the enhanced images. In contrast, maximum detail solutions focus on enhancing the finer details within the images. This often results in lower entropy and normalized standard deviation compared to maximum contrast solutions but may increase the number and the intensity of high-frequency pixels (indicating more detailed textures). These results highlight differences in contrast and detail between the solutions extracted from the Pareto front that may not be perceptible in the previous visual discussion. If we examine some cases, we can find images such as Natural2, Medical1, and Medical4, where the extreme points on the Pareto front do not show a visual difference. However, their associated values for contrast and detail exhibit numerical differences.
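The two contrast indicators discussed above can be computed with a few lines of NumPy. The sketch below assumes 8-bit grayscale images and normalizes the standard deviation by the maximum gray level (255); the normalization actually used in this work may differ, and the function names are introduced here only for illustration.

```python
import numpy as np


def shannon_entropy(image: np.ndarray) -> float:
    """Shannon entropy (in bits) of the gray-level histogram of an 8-bit image."""
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    p = hist[hist > 0] / image.size  # probabilities of the levels present
    return float(-(p * np.log2(p)).sum())


def normalized_std(image: np.ndarray) -> float:
    """Standard deviation of intensities divided by 255 (assumed normalization)."""
    return float(image.std() / 255.0)


# Hypothetical test images: one using every gray level once, one two-level image.
ramp = np.arange(256, dtype=np.uint8).reshape(16, 16)
two_level = np.zeros((4, 4), dtype=np.uint8)
two_level[:, 2:] = 255

ramp_entropy = shannon_entropy(ramp)
two_level_std = normalized_std(two_level)
```

An image that uses all 256 gray levels equally often reaches the maximum entropy of 8 bits, whereas a constant image has zero entropy; this is why solutions with a broader intensity range score higher on this indicator.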
From the reported values of the objective functions, it can be observed that all reported solutions are non-dominated, indicating that they represent an optimal trade-off between the two evaluated criteria. Regarding the SSIM index, 65% of the solutions exhibit values above 0.7, indicating a generally high level of structural similarity. Furthermore, considering only the medical images, 85% of the solutions reached SSIM values above 0.7, indicating that the proposal is a trustworthy tool when dealing with this kind of information. Nonetheless, if only the natural scene images are evaluated, the share of solutions that achieved this SSIM value decreases to 50%. This may be due to the artificially imposed low contrast in this set of images. Consequently, future work could consider incorporating SSIM as an additional objective function, especially for images where fidelity is crucial.
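For reference, the SSIM threshold discussed above can be illustrated with a simplified, single-window version of the index. The standard SSIM averages the same expression over local windows, so the global variant below is only an approximation intended to show the structure of the formula; the function name `global_ssim` and the test images are assumptions.

```python
import numpy as np


def global_ssim(x: np.ndarray, y: np.ndarray, data_range: float = 255.0) -> float:
    """SSIM computed once over the whole image (simplified, no local windows)."""
    x = x.astype(float)
    y = y.astype(float)
    c1 = (0.01 * data_range) ** 2  # stabilizing constants from the SSIM definition
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    numerator = (2 * mx * my + c1) * (2 * cov + c2)
    denominator = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return float(numerator / denominator)


# Hypothetical images: a random 8-bit image and its photographic negative.
rng = np.random.default_rng(2)
original = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
negative = 255 - original

same_score = global_ssim(original, original)
negative_score = global_ssim(original, negative)
passes_threshold = same_score > 0.7
```

An image compared with itself scores 1, while the negative of a random image scores far below the 0.7 threshold because its structure is inverted.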
5. Conclusions
The conflict between contrast and detail in image processing is presented as a multi-objective problem. Through this approach, a set of optimal solutions is obtained, forming a Pareto front in all cases, highlighting the trade-off that can exist between these two properties. Therefore, it is demonstrated that a single-objective approach to this problem will only lead to a particular solution among all the optimal solutions obtained through the multi-objective approach.
The proposed model integrates the sigmoid transformation function and unsharp masking with highboost filtering (UMH) into the NSGA-II. Additionally, a posterior preference articulation is added, which selects three key solutions from the Pareto front: the maximum contrast solution, the maximum detail solution, and the knee point solution. These three solutions showed significant superiority in terms of contrast and detail compared to the original images. Furthermore, the outcomes demonstrated, visually and numerically, how these three solutions, though all optimal, differ in terms of entropy, standard deviation, number of detail pixels, and detail intensity. These differences influenced the perception of fundamental characteristics within the images, highlighting the relevance of the proposed preferences in different contexts.
Despite an improvement in the overall details of the images, specific regions may lose details compared to the original image. This suggests that for future work, it may be advisable to apply local and/or adaptive image enhancement techniques to preserve details in specific regions while maintaining overall image quality. It is also advisable to incorporate SSIM or a similar performance index as an objective function to improve image fidelity, especially in medical applications where the image’s information trustworthiness is crucial.