1. Introduction
In scientific and engineering domains, optimization problems frequently involve objective functions that can only be evaluated through "black-box" procedures or simulations and that lack explicit derivative information. Because analytic information about the objective function is unavailable, the need to optimize increasingly varied and complex practical problems has driven the development of derivative-free global optimization (DFGO) methods [10,15,33–37]. DFGO techniques are specifically designed to optimize functions when derivatives are unavailable or unreliable; they explore the function’s behavior by sampling it at various points in the input space.
This paper considers a global optimization problem of the form

min f(x), x ∈ D,  (1)

where the objective function f is assumed to be Lipschitz-continuous with some fixed but unknown Lipschitz constant, and the feasible domain D is an n-dimensional hyper-rectangle. Solving problem (1) requires only the availability of objective function values; since no derivative information is available, numerical methods using gradient information cannot be used.
Global optimization approaches can be categorized into two main types: deterministic [1,5,6,26,28] and stochastic methods [12,42]. These methods address optimization problem (1) using various domain partition schemes, often involving hyper-rectangles [5,25,42]. While many DIRECT-type techniques employ hyper-rectangular partitions, other approaches use simplicial partitioning [18,19] (such as DISIMPL-C [16] and DISIMPL-V [17]) or diagonal sampling schemes (see [22,23,25]), such as adaptive diagonal curves [24].
DIRECT-type algorithms, such as the DIRECT (DIvide RECTangles) algorithm [7–9], are the most widely used partitioning-based algorithms for global optimization problems. One of the challenges faced by these algorithms is the selection of potentially optimal (most promising) rectangles, which can lead to inefficiencies and increased computational costs. In this paper, we provide a comprehensive review of techniques and strategies aimed at reducing the set of selected potentially optimal hyper-rectangles in DIRECT-type algorithms. We explore various approaches, including a novel grouping strategy that simplifies the identification of hyper-rectangles in the selection procedure. This strategy consists of rounding or approximating the measures (sizes) of extremely small hyper-rectangles by grouping them together into classes. This simplification can help in various computational or analytical tasks, making the problem more manageable without significantly compromising the accuracy of the analysis or optimization process.
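The grouping idea can be sketched as follows; the rounding rule, function name, and number of decimals below are illustrative assumptions for this sketch, not the paper's exact scheme.

```python
from collections import defaultdict

def group_measures(measures, decimals=3):
    """Group hyper-rectangle measures into classes by rounding, so that
    nearly identical (very small) sizes are treated as equal.
    Illustrative sketch: the rounding precision is an assumption."""
    classes = defaultdict(list)
    for i, m in enumerate(measures):
        classes[round(m, decimals)].append(i)  # class key = rounded measure
    return dict(classes)
```

For example, measures 0.12341 and 0.12339 fall into the same class at three decimals, so the selection procedure can treat the corresponding hyper-rectangles as having one common size.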
Our review highlights the importance of reducing the number of function evaluations while maintaining the algorithm’s convergence properties. The recent papers [29,38,39] provide a comprehensive overview of techniques aimed at reducing the set of potentially optimal rectangles in DIRECT-type algorithms. These works significantly contribute to the field of derivative-free global optimization and serve as a valuable resource for researchers and practitioners seeking to enhance the efficiency and effectiveness of such algorithms. Some suggested methods are summarized in [9,21,29,36,38].
In the context of optimization methods in engineering mathematics, the size of a hyper-rectangle often incorporates constraints imposed by the engineering problem. These constraints ensure that the optimization process adheres to real-world limitations, such as physical boundaries, safety margins, or resource constraints. For example, in structural engineering, the size of a hyper-rectangle could represent the permissible ranges for material properties, dimensions, or loads. In engineering optimization, reducing the size of a hyper-rectangle can represent the imposition of stricter constraints, ensuring that the optimized solution adheres to more stringent requirements, such as safety limits or design specifications.
We also use an additional assumption to improve this version, allowing the objective function to be evaluated only once at each vertex of each hyper-rectangle. The objective function values at vertices are stored in a special vertex database, and the result is retrieved directly from this database when required. In addition, a modification of the optimization domain is applied for some test problems, as used in the previous version [3].
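A minimal sketch of such a vertex database, assuming shared vertices recur with exactly equal coordinates; the class name and layout are our own, not taken from [3].

```python
class VertexDB:
    """Cache of objective values keyed by vertex coordinates, so that f
    is evaluated at most once per vertex; repeated queries become
    dictionary look-ups instead of new function evaluations."""

    def __init__(self, f):
        self.f = f          # black-box objective
        self.cache = {}     # vertex tuple -> f(vertex)
        self.evals = 0      # number of true function evaluations

    def value(self, x):
        key = tuple(x)
        if key not in self.cache:
            self.cache[key] = self.f(x)
            self.evals += 1
        return self.cache[key]
```

A vertex shared by several descendant sub-rectangles then costs one evaluation in total, which is the saving the text describes.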
The original DIRECT algorithm faces challenges when it comes to sampling points at the edges of the feasible region, which can slow down its convergence, particularly when the best solution is located at the boundary. This limitation is especially pronounced in constrained problems. Recent research [13,32,38] has emphasized the importance of addressing this issue, showing that it is possible to achieve faster convergence by employing strategies that sample points at the vertices of hyper-rectangles, especially when solutions are near the boundary.
Taking these insights into account, we have integrated one of the latest DIRECT-type algorithms into our approach: a new diagonal partitioning and sampling scheme called BIRECTv (BIsection of RECTangles with Vertices), based on the BIRECT algorithm. In the BIRECTv framework, the objective function is evaluated at specific points within the initial hyper-rectangle. Instead of evaluating the objective function only at interior points, as done in most DIRECT-type algorithms, BIRECTv samples two points along the main diagonal of the initial hyper-rectangle, one of which lies at the end of the diagonal (a vertex). This approach provides more comprehensive information about the objective function and helps to improve convergence, particularly near boundaries.
The contributions of the paper can be summarized as follows:
A review of techniques and strategies aiming to reduce the set of selected potentially optimal hyper-rectangles in DIRECT-type algorithms.
Introduction of a novel grouping strategy that simplifies the identification of hyper-rectangles in the selection procedure of DIRECT-type algorithms.
The new approach incorporates a dedicated vertex database to avoid taking more than two samples in descendant subregions.
The proposed improvements positively impact the performance of the BIRECTv algorithm.
The rest of this paper is organized as follows. Sect. 2.1 reviews the original BIRECT algorithm, while Sect. 2.2 briefly describes the new sampling and partitioning scheme of the BIRECTv algorithm. In Sect. 2.3, we incorporate a novel scheme for grouping and selecting potentially optimal hyper-rectangles in BIRECT-type algorithms. Numerical investigation and discussion of the results are given in Sect. 3. Finally, Sect. 4 concludes the paper and outlines potential directions for future work.
2. Materials and Methods
This section provides an overview of the original BIRECT algorithm and its modifications.
2.1. The original BIRECT
The BIRECT (BIsection of RECTangles) algorithm, developed in [20], employs a diagonal space-partitioning approach and involves two primary procedures: sampling on diagonals and bisection of hyper-rectangles.
In the initialization step, the algorithm begins by evaluating the objective function at two initial points, a "lower" and an "upper" point, positioned along the main diagonal of the normalized domain, considered as the unit hyper-cube. The hyper-cube representing the search space is then divided into a set of smaller hyper-rectangles according to a specific sampling and partitioning scheme using the following criteria (see Algorithm 1).
2.1.1. Selection criteria
A hyper-rectangle is potentially optimal if the lower bound for f computed by the left-hand side of (2) is optimal, for some fixed rate of change, among the hyper-rectangles of the current partition. Inequality (3) helps prevent excessive emphasis on local search [7].
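Since conditions (2) and (3) are not reproduced in this section, the sketch below implements the classic DIRECT-style potentially-optimal test in their place: index i is kept if some rate of change K makes f_i − K·d_i the best lower bound, together with a sufficient-decrease guard. Function and variable names are illustrative assumptions.

```python
def potentially_optimal(d, f, eps=1e-4):
    """Classic DIRECT-style selection of potentially optimal hyper-rectangles.
    d[i] is the measure (size) of rectangle i, f[i] its best function value.
    A stand-in sketch for conditions (2)-(3), which are not shown here."""
    f_min = min(f)
    n = len(f)
    po = []
    for i in range(n):
        # must have the lowest value among rectangles of equal measure
        if any(f[j] < f[i] for j in range(n) if d[j] == d[i]):
            continue
        k_low = max(((f[i] - f[j]) / (d[i] - d[j])
                     for j in range(n) if d[j] < d[i]), default=0.0)
        k_high = min(((f[j] - f[i]) / (d[j] - d[i])
                      for j in range(n) if d[j] > d[i]), default=float("inf"))
        if k_low > k_high:
            continue  # no admissible rate of change
        # sufficient-decrease guard, analogous in spirit to inequality (3)
        if k_high == float("inf") or f[i] - k_high * d[i] <= f_min - eps * abs(f_min):
            po.append(i)
    return po
```

Note that the largest hyper-rectangle holding the best value for its size is always selected, which preserves the global-search component of the method.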
2.1.2. Division and sampling criteria
After the initial partitioning, BIRECT proceeds to future iterations by partitioning potentially optimal hyper-rectangles and evaluating the objective function at new sampling points.
New sampling points are generated by adding and subtracting a distance equal to half the side length of the branching coordinate from the previous points. This approach allows for the reuse of old sampled points in descendant subregions.
An important aspect of the algorithm is how the selected hyper-rectangles are divided. For each potentially optimal hyper-rectangle, the set of maximal coordinates (edges) is computed, and the hyper-rectangle is bisected along the coordinate (branching variable) with the largest side length. Among such coordinates, the one with the lowest index j is selected, prioritizing directions with more promising function values.
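A minimal sketch of this bisection step, assuming hyper-rectangles are stored as (lower, upper) corner arrays; the exact placement of the reused diagonal sampling points follows [20] and is not reproduced here.

```python
import numpy as np

def bisect(lower, upper):
    """Bisect a hyper-rectangle along its longest side (np.argmax breaks
    ties by the lowest index j), returning the two descendant
    sub-rectangles as (lower, upper) corner pairs."""
    side = upper - lower
    j = int(np.argmax(side))          # branching variable
    mid = lower[j] + side[j] / 2.0
    left_upper = upper.copy()
    left_upper[j] = mid
    right_lower = lower.copy()
    right_lower[j] = mid
    return (lower, left_upper), (right_lower, upper)
```

For instance, bisecting the rectangle [0, 1] × [0, 0.5] splits along coordinate 0 at 0.5, the longest side.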
The partitioning process continues until a predefined number of function evaluations has been performed, or a stopping criterion is satisfied. The algorithm keeps track of the best (smallest) objective function value found over all sampled points in the final partition. The corresponding generated point at which this value was achieved provides an approximate solution to the optimization problem. The main steps of the BIRECT algorithm are outlined in Algorithm 1 (see [20] for a detailed pseudo-code).
The
BIRECT algorithm is a robust optimization technique that efficiently explores the search space, combines global and local search strategies, and strives to find the optimal or near-optimal solution for multidimensional optimization problems. For a more comprehensive understanding, additional details can be found in the original paper [20].
Algorithm 1. Main steps of the BIRECT algorithm.
2.2. Description of the BIRECTv Algorithm
In this subsection, we revisit one of the most recent versions of DIRECT-type algorithms, called BIRECTv, developed in [3]. One effective strategy is to sample points at the vertices of the hyper-rectangles. This ensures that points near the boundaries are explored, increasing the chances of finding solutions located there. Sampling at vertices can significantly improve convergence when the optimal solution is at or near the boundary, see [32]. A description of two different partitioning schemes used in DIRECT-type algorithms is shown in Figure 1. The original DIRECT algorithm primarily samples within the interior of the feasible region and may therefore fail to explore points near the boundary, requiring a large number of iterations to converge to the optimal solution. This slow convergence arises because the algorithm relies on subdividing hyper-rectangles within the interior, and many iterations may pass before a hyper-rectangle boundary coincides with the solution. The studies conducted in [33,40] have highlighted the significant impact of this limitation on convergence when the optimal solution lies at the boundary of the feasible region. The issue is particularly prevalent in constrained optimization problems, where solutions often lie at the boundary due to the constraints imposed on the variables.
However, a challenge arises when the newly created sampling points coincide with previously evaluated points at shared vertices. This leads to additional evaluations of the objective function, increasing the number of function evaluations per iteration. To address this issue, the paper suggested modifying the original optimization domain to obtain a good approximation of the global solution.
This approach was presented as an alternative to locate solutions that are situated near the boundary. The results of the experiments demonstrated that the proposed modification to the optimization domain positively impacted the performance of the BIRECTv algorithm. It outperformed the original BIRECT algorithm and the two popular DIRECT-type algorithms on the test problems. Additionally, the BIRECTv algorithm showed particular efficacy in solving high-dimensional problems.
2.3. Integrating Scheme for Identification of Potentially Optimal Hyper-rectangles in DIRECT-based Framework
In this section, we introduce an innovative grouping technique that streamlines the hyper-rectangle identification process during selection. This approach involves rounding or approximating the measures (sizes) of hyper-rectangles of exceedingly small dimensions, which are then organized into classes, yielding a simplification that enhances the manageability of computational and analytical tasks. Importantly, this simplification does not substantially impact the precision of the analysis or optimization process. The selection of the most promising hyper-rectangles in DIRECT-type algorithms is a crucial aspect of optimization.
Various strategies have been developed to enhance this selection process, resulting in different versions of the algorithm. In the DIRECT-l variant [2,9], the size of a hyper-rectangle is measured by the length of its longest side, which corresponds to the infinity-norm. This approach allows DIRECT-l to group more hyper-rectangles with the same measure, resulting in fewer distinct measures. Moreover, in DIRECT-l, only one hyper-rectangle from each group is selected, even if there are multiple potentially optimal hyper-rectangles in the same group. This reduces the number of divisions within a group. DIRECT-l is found to perform well for lower-dimensional problems that do not have an excessive number of local and global minima.
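As a toy illustration (the side lengths below are made up), measuring size by the longest side, as DIRECT-l does, produces fewer distinct measures than a Euclidean diagonal-based measure, so more hyper-rectangles fall into the same group:

```python
import numpy as np

# Four rectangles given by their side lengths.
sides = [(1.0, 0.5), (1.0, 0.25), (0.5, 0.5), (1.0, 1.0)]

# Infinity-norm measure (longest side): rectangles 1 and 2 share measure 1.0.
inf_measures = {max(s) for s in sides}                                # 2 distinct values

# Diagonal-length measure: every rectangle gets its own measure here.
diag_measures = {round(float(np.hypot(a, b)), 12) for a, b in sides}  # 4 distinct values
```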
The aggressive version of DIRECT takes a different approach by selecting and dividing a hyper-rectangle of every measure in each iteration. While this strategy requires more function evaluations compared to other versions of DIRECT, it may be advantageous for solving more complex problems. The PLOR algorithm simplifies the set of potentially optimal hyper-rectangles to just two: the maximal and the minimal Lipschitz constants. This reduction allows the PLOR approach to be independent of user-defined parameters. It strikes a balance between local and global search during the optimization process by considering only these two extreme cases.
In two-phase globally and locally biased algorithms, the selection procedure during one of the phases operates similarly to the original DIRECT algorithm, considering all hyper-rectangles from the current partition. However, in the second phase, the selection of potentially optimal hyper-rectangles is constrained based on their measures. Globally-biased versions [17,24] focus on larger subregions, addressing the algorithm’s first weakness, while locally-biased versions [2,14] concentrate on smaller subregions, addressing the second weakness of DIRECT-type algorithms. These adaptations and strategies aim to improve the efficiency and effectiveness of DIRECT-type algorithms in addressing optimization challenges, particularly in scenarios with complex landscapes and varying dimensions [32,33].
The authors in [29] introduced an improved scheme by extending the set of potentially optimal hyper-rectangles for DIRECT-GL algorithm. These enhanced criteria are designed to reduce the computational cost of the algorithm by focusing on the most promising regions of the search space. By implementing the improved selection criteria, the algorithm becomes more efficient in identifying regions of interest within the optimization landscape. This leads to a reduction in the number of hyper-rectangles that need to be explored, saving computational resources and time. The enhancements introduced in this work are not limited to a specific type of problem or application. They can be applied to a wide range of optimization scenarios where DIRECT-type algorithms are utilized [30,31,34,39].
Let the partition of the feasible region D at iteration k be defined as

P_k = { D_i : i ∈ I_k },

where I_k is the set of indices identifying the subsets defining the current partition P_k, and let δ_i denote a measure of the hyper-rectangle D_i. Further, let I ⊆ I_k denote the subset of indices corresponding to hyper-rectangles having almost the same measure as a given reference measure δ, within a certain tolerance (threshold ε), i.e., such that |δ_i − δ| ≤ ε.
The purpose is to identify potentially optimal hyper-rectangles: the scheme looks for hyper-rectangles (indexed by I) whose measure is very close, within the defined tolerance, to the normalized reference measure. Line 11 of the algorithm is used to reduce the set of potentially optimal hyper-rectangles: it filters the hyper-rectangles and keeps only those whose measure lies within the given tolerance of the normalized reference measure.
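This filtering step can be sketched as follows; the function and variable names are illustrative, not taken from the algorithm's pseudo-code.

```python
def group_by_measure(measures, reference, tol):
    """Indices of hyper-rectangles whose measure lies within `tol`
    of the reference (normalized) measure. Sketch of the filtering
    step that reduces the set of potentially optimal candidates."""
    return [i for i, m in enumerate(measures) if abs(m - reference) <= tol]
```

With measures [0.50, 0.505, 0.7], reference 0.5, and tolerance 0.01, only the first two hyper-rectangles are retained.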
In summary, this line of code helps to focus on potentially more promising hyper-rectangles, discarding those that are not as close to the desired normalized measure. It is an efficient way to narrow down the search space and improve the efficiency of the algorithm. An illustrative example for two different tolerance levels is given in Figure 2.
The difference between the two tolerance levels lies in the level of precision used when comparing the measures to filter the potentially optimal hyper-rectangles.
1. Tolerance 0.01:
A tolerance of 0.01 means that the algorithm will consider hyper-rectangles whose measures are within 0.01 of the reference value.
It allows for a relatively larger difference between the compared measures, meaning the algorithm will be more lenient in selecting potentially optimal hyper-rectangles.
This might result in a larger set of potentially optimal hyper-rectangles, including some with relatively larger differences in their measures.
2. Tolerance 0.0000001:
A tolerance of 0.0000001 means that the algorithm will consider hyper-rectangles whose measures are within 0.0000001 of the reference value.
It uses a much smaller tolerance, making the algorithm much stricter in selecting potentially optimal hyper-rectangles.
This will result in a smaller set of potentially optimal hyper-rectangles, only including those with extremely close measures.
The choice of tolerance depends on the specific problem and the desired level of precision in the algorithm. A larger tolerance may lead to faster execution, but it might also include some hyper-rectangles that are not truly optimal. On the other hand, a smaller tolerance will be more accurate but may require more computational effort to identify the potentially optimal hyper-rectangles. It is a trade-off between efficiency and precision in the algorithm’s behavior.
Note: The algorithm assumes a zero-based index for the array elements, and the first index found satisfying the condition is returned. If no element satisfies the condition, the algorithm returns -1.
The algorithm essentially performs a linear search through the array and stops as soon as it finds the first element within the specified tolerance level. It is important to choose an appropriate tolerance level depending on the application and the expected values in the array.
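The linear search described above can be sketched as follows (names are illustrative):

```python
def first_within_tolerance(values, target, tol):
    """Linear search: return the zero-based index of the first element
    within `tol` of `target`, or -1 if no element satisfies the
    condition, as described above."""
    for i, v in enumerate(values):
        if abs(v - target) <= tol:
            return i
    return -1
```

For example, searching [0.3, 0.51, 0.5] for a value within 0.02 of 0.5 returns index 1, since the search stops at the first match even though index 2 is an exact hit.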