1. Introduction
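The task of locating the global minimum of a continuous and differentiable function $f: S \rightarrow \mathbb{R}$, $S \subset \mathbb{R}^n$, can be defined as:
$$x^* = \arg\min_{x \in S} f(x)$$
with $S$:
$$S = [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_n, b_n]$$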
This task finds application in a variety of real-world problems, such as problems from physics [1,2,3], chemistry [4,5,6], economics [7,8], medicine [9,10], etc. The methods aimed at finding the global minimum are divided into two major categories: deterministic methods and stochastic methods. The most frequently encountered techniques of the first category are interval techniques [11,12], which partition the initial domain of the objective function until a promising subset that contains the global minimum is found. The second category includes the vast majority of methods, among them Controlled Random Search methods [13,14,15], Simulated Annealing methods [16,17,18], Differential Evolution methods [19,20], Particle Swarm Optimization (PSO) methods [21,22,23], Ant Colony Optimization methods [24,25], etc. Furthermore, a variety of hybrid techniques have been proposed, such as hybrid Multistart methods [26,27], hybrid PSO techniques [28,29,30], etc. Also, many parallel optimization methods [31,32], as well as methods that take advantage of modern GPU processing units [33,34], have appeared in recent years.
One of the basic stochastic techniques is the genetic algorithm, initially proposed by John Holland [35]. The operation of genetic algorithms is inspired by biology: they utilize the idea of evolution through genetic mutation, natural selection, and crossover [36,37,38].
Genetic algorithms can be combined with machine learning to solve complex problems and optimize models. For example, the article by Ansari et al. deals with the recognition of digital modulation signals; there, the genetic algorithm is used to optimize machine learning models by adjusting their features and parameters to achieve better signal recognition accuracy [39]. Additionally, in the study by Ji et al., a methodology is proposed that uses machine learning models to predict amplitude deviation in hot rolling, while genetic algorithms are employed to optimize the machine learning models and select features to improve prediction accuracy [40]. Furthermore, the article by Santana, Alonso, and Nieto, which focuses on the design and optimization of 5G networks in indoor environments, uses genetic algorithms and machine learning models to estimate path loss, which is critical for determining signal strength and coverage indoors [41].
Another interesting article is by Liu et al., which discusses the use of genetic algorithms in robotics [42]. The authors propose a methodology that utilizes genetic algorithms to optimize the trajectory and motion of digital twin robots. A similar study was presented by Nonoyama et al. [43], where the research focused on optimizing energy consumption during the motion planning of a dual-arm industrial robot. The goal is to minimize energy consumption during object retrieval and placement. To achieve this, both genetic algorithms and particle swarm optimization algorithms are used to adjust the robot’s motion trajectory, thereby increasing its energy efficiency.
The use of genetic algorithms is still prevalent even in the business world. The article by Liu et al. [44] discusses the application of genetic algorithms to optimize energy conservation in a high-speed methanol spark ignition engine fueled with methanol and gasoline blends. In this study, genetic algorithms were used as an optimization technique to find the best operating conditions for the engine, such as the air-fuel ratio, ignition timing, and other engine control variables, aiming to reduce energy consumption and emissions. In another study, the placement of electric vehicle charging stations is optimized [45]. Furthermore, in the study by Chen and Hu [46], the design of an intelligent system that uses genetic algorithms to provide multiple energy sources for agricultural greenhouses is presented. Similarly, in the research by Min, Song, Chen, Wang, and Zhang [47], an energy management strategy for fuel cell hybrid electric vehicles under startup conditions is introduced, based on a neural network optimized by a genetic algorithm.
Moreover, genetic algorithms are extremely useful in the field of medicine, as they are employed in therapy optimization, medical personnel training, genetic diagnosis, and genomic research. More specifically, in the study by Doewes, Nair, and Sharma [48], data from blood analyses and other biological samples were used to extract characteristics related to the presence of the SARS-CoV-2 virus that causes COVID-19. In this article, genetic algorithms are used for data analysis and processing to extract significant characteristics that can aid in the effective diagnosis of COVID-19. Additionally, there are studies that present the design of dental implants for patients using artificial neural networks and genetic algorithms [49,50]. Lastly, the contribution of genetic algorithms is significant in both implant techniques [51,52] and surgeries [53,54].
The current work aims to improve the efficiency of the genetic algorithm in global optimization problems by introducing a new way of initializing the population’s chromosomes. In the new initialization technique, the k-means [55] method is used to find initial values of the chromosomes that will lead to finding the global minimum faster and more efficiently than chromosomes generated by some random distribution. Also, the proposed technique discards chromosomes which, after applying the k-means technique, are close to each other.
The rest of this article is organized as follows: in Section 2 the proposed method is discussed in detail, in Section 3 the test functions used as well as the experimental results are fully outlined, and finally in Section 4 some conclusions and guidelines for future work are listed.
2. The proposed method
The fundamental operation of a genetic algorithm mimics the process of natural evolution. The algorithm begins by creating an initial population of solutions, called chromosomes, each of which represents a potential solution to the objective problem. The genetic algorithm operates by reproducing and evolving populations of solutions through iterative steps. Following the analogy to natural evolution, the genetic algorithm allows optimal solutions to evolve through successive generations. The main steps of the genetic algorithm used are described below:
- Initialization Step
  - (a) Set $N_c$ as the number of chromosomes.
  - (b) Set $N_g$ as the maximum number of allowed generations.
  - (c) Initialize randomly the $N_c$ chromosomes in $S$.
  - (d) Set $p_s$ as the selection rate of the algorithm, with $p_s \le 1$.
  - (e) Set $p_m$ as the mutation rate, with $p_m \le 1$.
  - (f) Set iter=0.
- Fitness calculation Step
  - (a) For $i = 1, \ldots, N_c$ do
    - i. Calculate the fitness $f_i = f(g_i)$ of chromosome $g_i$.
  - (b) EndFor
- Genetic operations Step
  - (a) Selection procedure: The chromosomes are sorted according to their fitness values. The $(1 - p_s) \times N_c$ chromosomes with the lowest fitness values are transferred intact to the next generation. The remaining chromosomes are substituted by offspring created in the crossover procedure. During the selection process, two parents are selected from the population for each offspring using tournament selection.
  - (b) Crossover procedure: For every pair $(z, w)$ of selected parents two additional chromosomes $\tilde{z}$ and $\tilde{w}$ are produced using the following equations:
    $$\tilde{z}_i = a_i z_i + (1 - a_i) w_i$$
    $$\tilde{w}_i = a_i w_i + (1 - a_i) z_i$$
    The value $a_i$ is a randomly selected number with $a_i \in [-0.5, 1.5]$ [56].
  - (c) Mutation procedure: For each element of every chromosome, a random number $r \in [0, 1]$ is drawn. The corresponding element is altered randomly if $r \le p_m$.
- Termination Check Step
  - (a) Set iter=iter+1.
  - (b) If iter $\ge N_g$ or the proposed stopping rule of Tsoulos [57] holds, then go to the Local Search Step, else go to the Fitness calculation Step.
- Local Search Step. Apply a local search procedure to the chromosome of the population with the lowest fitness value and report the obtained minimum. In the current work the BFGS variant of Powell [58] was used as the local search procedure.
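The loop described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in this work: the function names (`genetic_minimize`, `tournament`, `crossover`, `clip`), the default parameter values, the tournament size, the blend weight drawn from $[-0.5, 1.5]$, and the decision to mutate only the offspring are all assumptions. The stopping rule of [57] and the BFGS local search are omitted, so the sketch stops after a fixed number of generations.

```python
import random


def tournament(pop, fit, rng, size=4):
    """Tournament selection: best of `size` randomly chosen chromosomes."""
    contenders = rng.sample(range(len(pop)), size)
    return pop[min(contenders, key=lambda i: fit[i])]


def crossover(z, w, rng):
    """Element-wise blend of two parents; the range of a_i is an assumption."""
    a = [rng.uniform(-0.5, 1.5) for _ in z]
    child1 = [ai * zi + (1 - ai) * wi for ai, zi, wi in zip(a, z, w)]
    child2 = [ai * wi + (1 - ai) * zi for ai, zi, wi in zip(a, z, w)]
    return child1, child2


def clip(x, bounds):
    """Project a chromosome back into the search box S."""
    return [min(max(xi, lo), hi) for xi, (lo, hi) in zip(x, bounds)]


def genetic_minimize(f, bounds, n_c=100, n_g=200, p_s=0.9, p_m=0.05, seed=0):
    """Hypothetical helper: minimize f over the box `bounds`."""
    rng = random.Random(seed)
    # Initialization step: uniform random chromosomes in S.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_c)]
    n_keep = max(1, int(round((1 - p_s) * n_c)))  # chromosomes copied intact
    for _ in range(n_g):
        # Fitness calculation step.
        fit = [f(g) for g in pop]
        order = sorted(range(n_c), key=lambda i: fit[i])
        pop = [pop[i] for i in order]
        fit = [fit[i] for i in order]
        # Selection and crossover: fill the rest of the population with offspring.
        children = []
        while len(children) < n_c - n_keep:
            z = tournament(pop, fit, rng)
            w = tournament(pop, fit, rng)
            children.extend(crossover(z, w, rng))
        children = children[: n_c - n_keep]
        # Mutation (assumption: applied to offspring only, uniform reset).
        for child in children:
            for j, (lo, hi) in enumerate(bounds):
                if rng.random() <= p_m:
                    child[j] = rng.uniform(lo, hi)
        pop = pop[:n_keep] + [clip(c, bounds) for c in children]
    fit = [f(g) for g in pop]
    best = min(range(n_c), key=lambda i: fit[i])
    return pop[best], fit[best]
```

For instance, `genetic_minimize(lambda x: sum(v * v for v in x), [(-5.0, 5.0)] * 2)` searches for the minimum of a two-dimensional sphere function; in a full implementation the returned point would then be passed to a local search procedure.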
The current work proposes a novel method to initialize the chromosomes that utilizes the well-known technique of k-means. The initial distribution plays an essential role in finding solutions across various optimization domains and techniques. Apart from genetic algorithms, the initial distribution impacts other optimization methods like Particle Swarm Optimization (PSO) [59], Evolution Strategies [60], and neural networks [61]. The initial distribution defines the starting solutions that will evolve and improve throughout the algorithm. If the initial population contains solutions close to the optimum, the evolved solutions are more likely to lie in proximity to the optimal solution. Conversely, if the initial population is distant from the optimum, the algorithm might need more iterations to reach the optimal solution or might even get stuck in a suboptimal solution. In conclusion, the initial distribution influences the stability, convergence speed, and quality of the outcomes of an optimization algorithm. Thus, selecting a suitable initial distribution is crucial for the algorithm’s efficiency and for the discovery of the optimal solution in a reasonable time [63,64].
2.1. Proposed initialization distribution
The present work replaces the random initialization of the chromosomes with an initialization based on the k-means technique. More specifically, the method takes a series of samples from the objective function, and then the k-means method is used to locate the centers of these points. These centers can then be used as chromosomes in the genetic algorithm.
The k-means algorithm was introduced in 1957 by Stuart Lloyd in the form of Lloyd’s algorithm [65], although the concept of clustering based on distance had been introduced earlier. The name ’k-means’ was introduced around 1967 by James MacQueen [66]. The k-means algorithm is a clustering algorithm widely used in data analysis and machine learning. Its primary objective is to partition a dataset into k clusters, where data points within the same cluster are similar to each other and differ from data points in other clusters. Specifically, k-means seeks cluster centers and assigns samples to each cluster, aiming to minimize the distance within clusters and maximize the distance between cluster centers [67]. The steps of the algorithm are presented in Algorithm 1.
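Algorithm 1: The k-means algorithm.
- 1. Set the number of clusters $k$.
- 2. Repeat
  - (a) Set $S_j = \emptyset$, $j = 1, \ldots, k$.
  - (b) For every point $x_i$, $i = 1, \ldots, N$ do
    - i. Set $j^* = \arg\min_{m = 1, \ldots, k} \left\| x_i - c_m \right\|$, the index of the center nearest to $x_i$.
    - ii. Set $S_{j^*} = S_{j^*} \cup \{x_i\}$.
  - (c) EndFor
  - (d) For every center $c_j$, $j = 1, \ldots, k$ do
    - i. Set $M_j$ as the number of points in $S_j$.
    - ii. Update the center: $c_j = \frac{1}{M_j} \sum_{x_i \in S_j} x_i$.
  - (e) EndFor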
- 3.
Calculate the quantities as
- 4. Stop the algorithm if there is no change in the centers $c_j$.
The algorithm terminates when the cluster centers do not change significantly between consecutive iterations, implying that the clusters have stabilized in their final form [68,69].
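The assignment and update steps of k-means can be illustrated with a short Python sketch. It is a minimal version under stated assumptions rather than the code used in this work: the function name `kmeans`, the choice of initial centers (randomly picked samples), the iteration cap `max_iter`, and the handling of empty clusters (the old center is kept) are all assumptions, and the Euclidean distance is used throughout.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Hypothetical helper: Lloyd-style k-means on a list of points."""
    rng = random.Random(seed)
    # Assumption: the initial centers are k distinct samples chosen at random.
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(max_iter):
        # Assignment step: put every point in the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for x in points:
            j_star = min(range(k), key=lambda j: math.dist(x, centers[j]))
            clusters[j_star].append(x)
        # Update step: move every center to the mean of its cluster.
        new_centers = []
        for j, members in enumerate(clusters):
            if members:
                new_centers.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                new_centers.append(centers[j])  # empty cluster: keep old center
        # Termination: stop when no center has changed.
        if new_centers == centers:
            break
        centers = new_centers
    return centers
```

Applied to two well-separated groups of points, the returned centers are the means of the two groups. A tolerance-based test on the movement of the centers could replace the exact equality check used here.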
2.2. Chromosome rejection rule
An additional technique for discarding chromosomes that are similar or close to each other is described and applied below. Specifically, each chromosome is compared to all the other chromosomes, and those with a very small or negligible Euclidean distance between them are identified, which implies their similarity. Subsequently, the algorithm discards the redundant chromosomes among them, so that only chromosomes that are not similar to each other are incorporated into the final initial distribution.
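Algorithm 2: Chromosome rejection rule
- 1. Set $C$ as the set of centers, $C = \{c_1, \ldots, c_k\}$.
- 2. Set $\epsilon > 0$, a small positive number.
- 3. For every center $c_i \in C$ Do
  - (a) For every center $c_j \in C$, $j \ne i$ Do
    - i. If $\left\| c_i - c_j \right\| \le \epsilon$ then remove $c_i$ from $C$.
  - (b) EndFor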
- (c)
If then
- 4. EndFor
- 5. Return the final set of centers $C$.
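A compact Python sketch of such a rejection rule is given below. It is an illustration under stated assumptions: the function name `reject_close_centers`, the default threshold `eps`, and the convention of keeping the first center of every group of close centers are assumptions, not details taken from this work.

```python
import math


def reject_close_centers(centers, eps=1e-3):
    """Hypothetical helper: drop centers closer than eps to a kept center."""
    kept = []
    for c in centers:
        # Keep c only if it is farther than eps from every center kept so far.
        if all(math.dist(c, other) > eps for other in kept):
            kept.append(c)
    return kept
```

With `eps = 1e-3`, the centers `(0, 0)` and `(0, 0.0005)` are treated as duplicates and only the first one is retained, while a distant center such as `(1, 1)` is left untouched.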
2.3. The proposed sampling procedure
The proposed sampling procedure has the following major steps:
1. Take random samples from the objective function using a uniform distribution.
2. Calculate the $k$ centers of the points using the k-means algorithm provided in Algorithm 1.
3. Remove from the set of centers $C$ the points that are close to each other.
4. Return the set of centers $C$ as the set of chromosomes.
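The whole procedure can be put together in one self-contained Python sketch. It should be read as an illustration under stated assumptions: the function name `kmeans_init_population`, the default values of `n_samples`, `k`, and `eps`, the random choice of initial centers, and the iteration cap are hypothetical. Only the sample positions in $S$ are clustered in this sketch; the values of the objective function are not used.

```python
import math
import random


def kmeans_init_population(bounds, n_samples=500, k=20, eps=1e-3,
                           max_iter=100, seed=0):
    """Hypothetical helper: build initial chromosomes from k-means centers."""
    rng = random.Random(seed)
    # Step 1: uniform random samples in the box S.
    samples = [[rng.uniform(lo, hi) for lo, hi in bounds]
               for _ in range(n_samples)]
    # Step 2: k-means on the samples (initial centers: random samples).
    centers = [list(p) for p in rng.sample(samples, k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for x in samples:
            nearest = min(range(k), key=lambda m: math.dist(x, centers[m]))
            clusters[nearest].append(x)
        updated = [
            [sum(col) / len(members) for col in zip(*members)]
            if members else centers[j]
            for j, members in enumerate(clusters)
        ]
        if updated == centers:
            break
        centers = updated
    # Step 3: discard centers that lie within eps of an already kept center.
    chromosomes = []
    for c in centers:
        if all(math.dist(c, other) > eps for other in chromosomes):
            chromosomes.append(c)
    # Step 4: the remaining centers form the initial population.
    return chromosomes
```

A call such as `kmeans_init_population([(-5.0, 5.0), (-5.0, 5.0)])` returns at most `k` chromosomes, all of which lie inside the search box because every center is either a sample or a mean of samples.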