
Convergence and Stability Improvement of Quasi-Newton Methods by Full-Rank Update of the Jacobian Approximates

Submitted: 15 November 2023; Posted: 24 November 2023

Abstract
A system of simultaneous multi-variable nonlinear equations can be solved by Newton's method with local q-quadratic convergence if the Jacobian is analytically available. If this is not the case, then quasi-Newton methods with local q-superlinear convergence give solutions by approximating the Jacobian in some way. Unfortunately, the quasi-Newton condition (secant equation) does not completely specify the Jacobian approximate in the multi-dimensional case, so a full-rank update is not possible with the classic variants of these methods. The suggested new iteration strategy ("T-Secant") allows a full-rank update of the Jacobian approximate in each iteration by determining two independent approximates of the solution. They are used to generate a set of new independent trial approximates, from which the Jacobian approximate can be fully updated. It is shown that the T-Secant approximate lies in the vicinity of the classic quasi-Newton approximate, providing that the solution is evenly surrounded by the new trial approximates. The suggested procedure increases the super-linear convergence of the secant method ($\varphi_S = 1.618\ldots$) to super-quadratic ($\varphi_T = \varphi_S + 1 = 2.618\ldots$) and the quadratic convergence of the Newton method ($\varphi_N = 2$) to cubic ($\varphi_T = \varphi_N + 1 = 3$) in the one-dimensional case. The Broyden-type efficiency (mean convergence rate) of the suggested method in the multi-dimensional case is an order higher than the efficiency of other classic low-rank-update quasi-Newton methods, as shown by numerical examples on a Rosenbrock-type test function with up to 1000 variables. The geometrical representation (hyperbolic approximation) in the single-variable case helps explain the basic operations, and a vector-space description is also given for the multi-variable case.
Subject: Computer Science and Mathematics - Applied Mathematics

1. Introduction

Root-finding methods are essential for solving a wide class of numerical problems, such as the data-fitting problem with $m$ sampled data $\mathbf{d} = \left[ d_j \right]$ $(j = 1, \ldots, m)$ and $n$ adjustable parameters $\mathbf{x} = \left[ x_i \right]$ $(i = 1, \ldots, n)$ with $m \geq n$. It leads to the problem of least-squares solving of an overdetermined system of nonlinear equations

$$\mathbf{f}\left( \mathbf{x} \right) = \mathbf{0} \tag{1.1}$$

($\mathbf{x} \in \mathbb{R}^n$ and $\mathbf{f}: \mathbb{R}^n \to \mathbb{R}^m$, $m \geq n$), where the solution $\mathbf{x}^*$ minimizes the difference $\left\| \mathbf{f}\left( \mathbf{x} \right) \right\|^2 = \left\| \boldsymbol{\phi}\left( \mathbf{x} \right) - \mathbf{d} \right\|^2$ between the data $\mathbf{d}$ and a computational model function $\boldsymbol{\phi}\left( \mathbf{x} \right)$. The system of simultaneous multi-variable nonlinear Equations (1.1) can be solved by Newton's method when the derivatives of $\mathbf{f}\left( \mathbf{x} \right)$ are available analytically and a new iterate

$$\mathbf{x}_{p+1} = \mathbf{x}_p - \mathbf{J}_p^{-1} \mathbf{f}_p \tag{1.2}$$

that follows $\mathbf{x}_p$ can be determined, where $\mathbf{f}_p = \mathbf{f}(\mathbf{x}_p)$ is the function value and $\mathbf{J}_p = \mathbf{J}(\mathbf{x}_p)$ is the Jacobian matrix of $\mathbf{f}$ at $\mathbf{x}_p$ in the $p$-th iteration step. It is well known that the local convergence of Newton's method is q-quadratic if the initial trial approximate $\mathbf{x}_0$ is close enough to the solution $\mathbf{x}^*$, $\mathbf{J}(\mathbf{x}^*)$ is non-singular and $\mathbf{J}(\mathbf{x})$ satisfies the Lipschitz condition

$$\left\| \mathbf{J}(\mathbf{x}) - \mathbf{J}(\mathbf{x}^*) \right\| \leq L \left\| \mathbf{x} - \mathbf{x}^* \right\| \tag{1.3}$$

for all $\mathbf{x}$ close enough to $\mathbf{x}^*$. However, in many cases the function $\boldsymbol{\phi}\left( \mathbf{x} \right)$ is not an analytical function, the partial derivatives are not known and Newton's method cannot be applied. Quasi-Newton methods are defined as the generalization of Equation (1.2) as

$$\mathbf{x}_{p+1} = \mathbf{x}_p - \mathbf{B}_p^{-1} \mathbf{f}_p \tag{1.4}$$

and

$$\mathbf{B}_p \Delta \mathbf{x}_p = -\mathbf{f}_p \tag{1.5}$$

where

$$\Delta \mathbf{x}_p = \mathbf{x}_{p+1} - \mathbf{x}_p \tag{1.6}$$

is the iteration step length and $\mathbf{B}_p$ is expected to be the approximate of the Jacobian matrix $\mathbf{J}_p$, in most cases computed without derivatives. The new iterate is then given as

$$\mathbf{x}_{p+1} = \mathbf{x}_p + \Delta \mathbf{x}_p \tag{1.7}$$

and $\mathbf{B}_p$ is updated to $\mathbf{B}_{p+1}$ according to the specific quasi-Newton method. Martínez [1] made a thorough survey of practical quasi-Newton methods. The iterative methods of the form (1.4) that satisfy the equation

$$\mathbf{B}_{p+1} \Delta \mathbf{x}_p = \mathbf{f}_{p+1} - \mathbf{f}_p \tag{1.8}$$

for all $p = 0, 1, 2, \ldots$ are called "quasi-Newton" methods, and Equation (1.8) is called the fundamental equation of quasi-Newton methods (the "quasi-Newton condition" or "secant equation"). However, the quasi-Newton condition does not uniquely specify the updated Jacobian approximate $\mathbf{B}_{p+1}$, and further constraints are needed. Different methods offer their own specific solution. One new quasi-Newton approximate $\mathbf{x}_{p+1}$ will never allow a full-rank update of $\mathbf{B}_{p+1}$, because it is an $n \times n$ matrix and only $n$ components can be determined from the secant equation, making it an underdetermined system of equations for the elements $B_{i,j,p+1}$ $(i, j = 1, \ldots, n)$ if $n > 1$.
The suggested new strategy is based on Wolfe's [2] formulation of a generalized secant method. The function

$$\mathbf{x} \mapsto \mathbf{f}(\mathbf{x}), \quad \text{where } \mathbf{x} \in \mathbb{R}^n \text{ and } \mathbf{f}: \mathbb{R}^n \to \mathbb{R}^n, \; n > 1 \tag{1.9}$$

is locally replaced by linear interpolation through $n+1$ interpolation base points $A_p$, $B_{p,k}$ $(k = 1, \ldots, n)$. The variables $\mathbf{x}$ and the function values $\mathbf{f}$ are separated into two equations and an auxiliary variable $\mathbf{q}^A$ is introduced. The Jacobian approximate matrix $\mathbf{B}_p$ is split into a variable difference matrix $\Delta \mathbf{X}_p$ and a function value difference matrix $\Delta \mathbf{F}_p$, and the zero $\mathbf{x}_{p+1}^A$ of the $p$-th interpolation plane is determined from the quasi-Newton condition (1.5) as

$$\begin{bmatrix} \Delta \mathbf{x}_{p+1}^A \\ -\mathbf{f}_p^A \end{bmatrix} = \begin{bmatrix} \Delta \mathbf{X}_p \\ \Delta \mathbf{F}_p \end{bmatrix} \mathbf{q}_p^A \tag{1.10}$$

where

$$\Delta \mathbf{x}_{p+1}^A = \mathbf{x}_{p+1}^A - \mathbf{x}_p^A \tag{1.11}$$

The auxiliary variable $\mathbf{q}_p^A$ is determined from the 2nd row of Equation (1.10) and the new quasi-Newton approximate $\mathbf{x}_{p+1}^A$ comes from the 1st row of this equation. Popper [3] made a further generalization for functions

$$\mathbf{x} \mapsto \mathbf{f}(\mathbf{x}), \quad \text{where } \mathbf{x} \in \mathbb{R}^n \text{ and } \mathbf{f}: \mathbb{R}^n \to \mathbb{R}^m, \; m \geq n > 1 \tag{1.12}$$

and suggested using the pseudo-inverse solution of the overdetermined system of linear equations (where $n$ is the number of unknowns and $m$ is the number of function values). The auxiliary variable $\mathbf{q}_p^A$ is determined from the 2nd row of Equation (1.10) as

$$\mathbf{q}_p^A = -\Delta \mathbf{F}_p^{+} \mathbf{f}_p^A \tag{1.13}$$

where $(.)^{+}$ stands for the pseudo-inverse, and the new quasi-Newton approximate $\mathbf{x}_{p+1}^A$ comes from the 1st row of this equation as

$$\mathbf{x}_{p+1}^A = \mathbf{x}_p^A - \Delta \mathbf{X}_p \Delta \mathbf{F}_p^{+} \mathbf{f}_p^A \tag{1.14}$$

The iteration then continues with $n+1$ new base points $A_{p+1}$, $B_{p+1,k}$ $(k = 1, \ldots, n)$. Details are given in Section 3.
Ortega and Rheinboldt [4] stated that a necessary condition of convergence is that the interpolation base points be linearly independent and that they remain "in general position" throughout the whole iteration process. Experience shows that low-rank update procedures often lead to a dead end because this condition is not satisfied. The purpose of the suggested new iteration strategy is to determine linearly independent base points so that the Ortega-Rheinboldt condition is satisfied. The basic idea of the procedure is that another new approximate $\mathbf{x}_{p+1}^B$ is determined from the previous approximate $\mathbf{x}_{p+1}^A$, and a new system of $n$ linearly independent base points is generated. The basic equations of the Wolfe-Popper formulation (Equation (1.10)) were modified as

$$\begin{bmatrix} \Delta \mathbf{x}_p^A \\ -\mathbf{f}_p^A \end{bmatrix} = \begin{bmatrix} \mathbf{T}_p^X & \mathbf{0} \\ \mathbf{0} & \mathbf{T}_p^F \end{bmatrix} \begin{bmatrix} \Delta \mathbf{X}_p \\ \Delta \mathbf{F}_p \end{bmatrix} \mathbf{q}_p^B \tag{1.15}$$

where

$$\Delta \mathbf{x}_{p+1} = \mathbf{x}_{p+1}^B - \mathbf{x}_{p+1}^A \tag{1.16}$$

$$\mathbf{T}_p^X = \mathrm{diag}\left( t_{p,i}^X \right) = \mathrm{diag}\left( \frac{x_{p+1,i}^B - x_{p+1,i}^A}{x_{p+1,i}^A - x_{p,i}^A} \right) \tag{1.17}$$

and

$$\mathbf{T}_p^F = \mathrm{diag}\left( t_{p,j}^F \right) = \mathrm{diag}\left( \frac{f_{p+1,j}^B - f_{p+1,j}^A}{f_{p+1,j}^A - f_{p,j}^A} \right) \tag{1.18}$$

The auxiliary variable $\mathbf{q}_p^B$ is determined from the 2nd row of Equation (1.15) as

$$\mathbf{q}_p^B = -\Delta \mathbf{F}_p^{+} \left( \mathbf{T}_p^F \right)^{-1} \mathbf{f}_p^A = \left[ -\sum_{j=1}^m \Delta F_{p,i,j}^{+} \frac{f_{p,j}^A}{t_{p,j}^F} \right] \tag{1.19}$$

and the new quasi-Newton approximate $\mathbf{x}_{p+1}^B$ comes from the 1st row of Equation (1.15) as

$$x_{p+1,i}^B = x_{p+1,i}^A + \frac{\left( \Delta x_{p,i}^A \right)^2}{\Delta x_{p,i} \, q_{p,i}^B} = x_{p+1,i}^A - \frac{\left( \Delta x_{p,i}^A \right)^2}{\sum_{j=1}^m \Delta x_{p,i} \, \Delta F_{p,i,j}^{+} \frac{f_{p,j}^A}{t_{p,j}^F}} \tag{1.20}$$
with $i = 1, \ldots, n$. The details of the proposed new strategy (the "T-Secant method") are given in Section 4. It differs from the traditional secant method in that all interpolation base points $A_p$ and $B_{p,k}$ $(k = 1, \ldots, n)$ are updated in each iteration (full-rank update), providing $n+1$ new base points $A_{p+1}$ and $B_{p+1,k}$ for the next iteration. The key idea of the method is very simple. The function value $\mathbf{f}_{p+1}^A$ (that can be determined from the new secant approximate $\mathbf{x}_{p+1}^A$) measures the "distance" of the approximate $\mathbf{x}_{p+1}^A$ from the root $\mathbf{x}^*$ (if $\mathbf{f}_{p+1}^A = \mathbf{0}$, then the distance is zero and $\mathbf{x}_{p+1}^A = \mathbf{x}^*$). The T-Secant method uses this information so that the basic equations of the secant method are modified by a scaling transformation $\mathbf{T}$, and an additional new estimate $\mathbf{x}_{p+1}^B$ is determined. The new approximates $\mathbf{x}_{p+1}^A$ and $\mathbf{x}_{p+1}^B$ are then used to construct the $n+1$ new interpolation base points $A_{p+1}$ and $B_{p+1,k}$.
Although the T-Secant procedure has been worked out for solving multi-variable problems, it can also be applied to single-variable ones. The geometrical representation of the latter gives a good view of the mechanism of the procedure, as shown in Section 5. It is a surprising result that the T-Secant modification corresponds to a hyperbolic function

$$z_p(x) = \frac{a_p}{x - x_{p+1}^A} + f_p^A \tag{1.21}$$

whose zero gives the second approximate $x_{p+1}^B$ in the single-variable case. A vector-space interpretation is also given for the multi-variable case in that section.
The general formulations of the proposed method are given in Section 6 and compared with the basic formula of classic quasi-Newton methods. It follows from Equation (1.14) that

$$\mathbf{S}_p \Delta \mathbf{x}_p^A = -\mathbf{f}_p^A \tag{1.22}$$

where

$$\mathbf{S}_p = \Delta \mathbf{F}_p \Delta \mathbf{X}_p^{-1} = \begin{bmatrix} \frac{\Delta f_{1,1,p}}{\Delta x_1} & \cdots & \frac{\Delta f_{n,1,p}}{\Delta x_n} \\ \vdots & \ddots & \vdots \\ \frac{\Delta f_{1,m,p}}{\Delta x_1} & \cdots & \frac{\Delta f_{n,m,p}}{\Delta x_n} \end{bmatrix} = \left[ \frac{\Delta f_{k,j,p}}{\Delta x_{i,p}} \right] \tag{1.23}$$

is the Jacobian approximate of the traditional secant method. It follows from the 1st and 2nd rows of Equation (1.15) of the T-Secant method and from Definition (1.23) of $\mathbf{S}_p$ that

$$\mathbf{S}_{T,p} \Delta \mathbf{x}_p^A = -\mathbf{f}_p^A \tag{1.24}$$

is the modified secant equation, where

$$\mathbf{S}_{T,p} = \mathbf{T}_p^F \mathbf{S}_p \left( \mathbf{T}_p^X \right)^{-1} = \mathbf{T}_p^F \Delta \mathbf{F}_p \Delta \mathbf{X}_p^{-1} \left( \mathbf{T}_p^X \right)^{-1} = \left[ \frac{t_{j,p}^F}{t_{i,p}^X} \cdot \frac{\Delta f_{k,j,p}}{\Delta x_{i,p}} \right] \tag{1.25}$$

It is well known that the single-variable secant method has asymptotic convergence for sufficiently good initial approximates $x^A$ and $x^B$ if $f'(x)$ does not vanish on $x \in \left[ x^A, x^B \right]$ and $f''(x)$ is continuous at least in a neighborhood of the zero $x^*$. The super-linear convergence property has been proved in different ways, and it is known that the order of convergence is $\alpha = \left( 1 + \sqrt{5} \right)/2 = \varphi$ (where $\varphi = 1.618\ldots$ is the golden ratio). The convergence order of the proposed method is determined in Section 7, and it is shown that it has super-quadratic convergence with rate $\alpha_{TS} = \varphi + 1 = \varphi^2 = 2.618\ldots$ in the single-variable case. It is also shown there for the multi-variable case that the second approximate $\mathbf{x}_{p+1}^B$ will always be in the vicinity of the classic secant approximate $\mathbf{x}_{p+1}^A$, providing that the solution $\mathbf{x}^*$ will be evenly surrounded by the $n+1$ new trial approximates and that the matrix $\mathbf{S}_{p+1}$ will be well-conditioned.
A step-by-step algorithm is given in Section 8, and the results of numerical tests with a Rosenbrock-type test function demonstrate the stability of the proposed strategy in Section 9 for up to 1000 unknown variables. The Broyden-type efficiency (mean convergence rate) of the proposed method is studied in the multi-variable case in Section 10 and compared with other classic rank-one update and line-search methods on the basis of available test data. It is shown in Section 11 how the new procedure can be used to improve the convergence of other classic multi-variable root-finding methods (the Newton-Raphson and Broyden methods). Concluding remarks are summarized in Section 12. Among others, the method has been used for the identification of vibrating mechanical systems (foundation pile driving [5], percussive drilling [6]) and was found to be very stable and efficient even in case of a large number of unknowns.
The proposed method needs $n+1$ function value evaluations in each iteration and does not use the derivative information of the function, as the Newton-Raphson method does. On the other hand, it needs $n$ more function evaluations per iteration than the traditional secant method. However, this is only an apparent disadvantage, as the convergence rate considerably increases ($\alpha_{TS} \approx 2.618\ldots$), and the stability and efficiency of the procedure are highly improved.

2. Notations

Vectors and matrices are denoted by bold-face letters. Subscripts refer to components of vectors and matrices; superscripts $A$ and $B$ refer to interpolation base points. Vectors and matrices may also be given by their general elements. $\Delta$ refers to the difference of two elements. $x$ and $\mathbf{X}$ denote unknown quantities; $f$ and $\mathbf{F}$ denote function values and matrices. $q$, $\mathbf{q}$, $t$ and $\mathbf{T}$ denote multiplier scalars, vectors and matrices. $e$, $\varepsilon$ and $E$ denote approximate errors, $p$ is the iteration counter, $\alpha$ is the convergence rate, $\varepsilon^*$ is the termination criterion. $n$ is the number of unknowns, $m$ is the number of function values; $i$, $j$, $k$ and $l$ are running indexes of matrix columns and rows. Superscripts $S$ and $TS$ refer to the traditional secant method and to the proposed T-Secant method respectively.

3. Secant method

The history of the secant method in the single-variable case goes back several thousand years; its origins can be found in ancient times. The idea of finding the scalar root $x^*$ of a scalar nonlinear function

$$x \mapsto f(x) \quad \left( \text{where } x \in \mathbb{R}^1 \text{ and } f: \mathbb{R}^1 \to \mathbb{R}^1 \right) \tag{3.1}$$

by successive local replacement of the function by linear interpolation (secant line) gives a simple and efficient numerical procedure. It has the advantage that it does not need the calculation of function derivatives, it only uses function values, and the order of asymptotic convergence is super-linear with convergence rate $\alpha_S \approx 1.618\ldots$.
The function $f(x)$ is locally replaced by linear interpolation (secant line) through interpolation base points $A$ and $B$, and the zero $x^A$ of the secant line is determined as an approximate of the zero $x^*$ of the function. The next iteration continues with new base points, selected from the available old ones. Wolfe [2] extended the scalar procedure to the multidimensional case

$$\mathbf{x} \mapsto \mathbf{f}(\mathbf{x}), \quad \text{where } \mathbf{x} \in \mathbb{R}^n \text{ and } \mathbf{f}: \mathbb{R}^n \to \mathbb{R}^n, \; n > 1 \tag{3.2}$$

and Popper [3] made a further generalization

$$\mathbf{x} \mapsto \mathbf{f}(\mathbf{x}), \quad \text{where } \mathbf{x} \in \mathbb{R}^n \text{ and } \mathbf{f}: \mathbb{R}^n \to \mathbb{R}^m, \; m \geq n > 1 \tag{3.3}$$

and suggested using the pseudo-inverse solution of the overdetermined system of linear equations (where $n$ is the number of unknowns and $m$ is the number of function values).
The zero $\mathbf{x}^*$ of the nonlinear function $\mathbf{x} \mapsto \mathbf{f}(\mathbf{x})$ has to be found, where $\mathbf{x} \in \mathbb{R}^n$ and $\mathbf{f}: \mathbb{R}^n \to \mathbb{R}^m$. Let $\mathbf{x}^A$ be the initial trial for the zero $\mathbf{x}^*$, and let the function $\mathbf{f}\left( \mathbf{x} \right)$ be linearly interpolated through $n+1$ interpolation base points $A \left[ \mathbf{x}^A, \mathbf{f}^A \right]$ and $B_k \left[ \mathbf{x}_k^B, \mathbf{f}_k^B \right]$ $(k = 1, \ldots, n)$ and be approximated / replaced by the interpolation "plane" $\mathbf{y}\left( \mathbf{x} \right)$ near $\mathbf{x}^*$. One of the key ideas of the suggested numerical procedure is that the interpolation base points $B_k \left[ \mathbf{x}_k^B, \mathbf{f}_k^B \right]$ are constructed by individually incrementing the coordinates $x_i^A$ of the initial trial $\mathbf{x}^A$ by an "initial trial increment" value $\Delta x_i$ $(i = 1, \ldots, n)$ as

$$x_{k,i}^B = x_i^A + \Delta x_i \tag{3.4}$$

or in vector form as

$$\mathbf{x}_k^B = \mathbf{x}^A + \Delta x_k \mathbf{d}_k \tag{3.5}$$

where $\mathbf{d}_k$ is the $k$-th Cartesian unit vector, as shown in Figure 1.
It follows from this special construction of the initial trials $\mathbf{x}_k^B$ that $x_{k,i}^B - x_i^A = 0$ for $i \neq k$ and $x_{k,i}^B - x_i^A = \Delta x_i$ for $i = k$, providing that

$$\Delta \mathbf{x} = \left[ x_{i,i}^B - x_i^A \right] = \left[ \Delta x_i \right] \tag{3.6}$$

is the "initial trial increment vector". Let

$$\Delta \mathbf{f}_k = \left[ \Delta f_{k,j} \right] = \left[ f_{k,j}^B - f_j^A \right] \tag{3.7}$$

$(j = 1, \ldots, m)$. Any point on the $n$-dimensional interpolation plane $\mathbf{y}\left( \mathbf{x} \right)$ can be expressed as

$$\begin{bmatrix} \mathbf{x} \\ \mathbf{y}(\mathbf{x}) \end{bmatrix} = \begin{bmatrix} \mathbf{x}^A \\ \mathbf{f}^A \end{bmatrix} + \begin{bmatrix} \Delta \mathbf{X} \\ \Delta \mathbf{F} \end{bmatrix} \mathbf{q}^A \tag{3.8}$$

where

$$\Delta \mathbf{X} = \left[ \mathbf{x}_k^B - \mathbf{x}^A \right] = \begin{bmatrix} x_{1,1}^B - x_1^A & \cdots & x_{n,1}^B - x_1^A \\ \vdots & \ddots & \vdots \\ x_{1,n}^B - x_n^A & \cdots & x_{n,n}^B - x_n^A \end{bmatrix} \tag{3.9}$$

$$\Delta \mathbf{F} = \left[ \mathbf{f}_k^B - \mathbf{f}^A \right] = \begin{bmatrix} f_{1,1}^B - f_1^A & \cdots & f_{n,1}^B - f_1^A \\ \vdots & \ddots & \vdots \\ f_{1,m}^B - f_m^A & \cdots & f_{n,m}^B - f_m^A \end{bmatrix} \tag{3.10}$$

$(k = 1, \ldots, n)$, $\mathbf{q}^A$ is a vector of $n$ scalar multipliers $q_i^A$ $(i = 1, \ldots, n)$ and, as a consequence of Equation (3.5),

$$\Delta \mathbf{X} = \left[ \Delta x_k \right] = \begin{bmatrix} \Delta x_1 & & 0 \\ & \ddots & \\ 0 & & \Delta x_n \end{bmatrix} = \mathrm{diag}\left( \Delta x_i \right) \tag{3.11}$$

is a diagonal matrix, which has a great computational advantage. It also follows from Definition (3.7) that

$$\Delta \mathbf{F} = \left[ \Delta \mathbf{f}_k \right] = \begin{bmatrix} \Delta f_{1,1} & \cdots & \Delta f_{n,1} \\ \vdots & \ddots & \vdots \\ \Delta f_{1,m} & \cdots & \Delta f_{n,m} \end{bmatrix} \tag{3.12}$$
Let $\mathbf{x}_{p+1}^A$ be the zero of the $n$-dimensional interpolation plane $\mathbf{y}_p\left( \mathbf{x} \right)$ in the $p$-th iteration with interpolation base points $A_p \left[ \mathbf{x}_p^A, \mathbf{f}_p^A \right]$ and $B_{k,p} \left[ \mathbf{x}_{k,p}^B, \mathbf{f}_{k,p}^B \right]$. Then it follows from the zero condition

$$\mathbf{y}_p(\mathbf{x}_{p+1}^A) = \mathbf{0} \tag{3.13}$$

and from the 2nd row of Equation (3.8) that

$$\Delta \mathbf{F}_p \mathbf{q}_p^A = -\mathbf{f}_p^A \tag{3.14}$$

and the vector $\mathbf{q}_p^A$ of multipliers $q_{p,i}^A$ can be expressed as

$$\mathbf{q}_p^A = -\Delta \mathbf{F}_p^{+} \mathbf{f}_p^A \tag{3.15}$$

where $(.)^{+}$ stands for the pseudo-inverse. Let

$$\begin{bmatrix} \Delta \mathbf{x}_p^A \\ \Delta \mathbf{f}_p^A \end{bmatrix} = \begin{bmatrix} \mathbf{x}_{p+1}^A - \mathbf{x}_p^A \\ \mathbf{f}_{p+1}^A - \mathbf{f}_p^A \end{bmatrix} \tag{3.16}$$

be the iteration step-size of the secant method; then it follows from the 1st row of Equation (3.8) and from Equation (3.15) that

$$\Delta \mathbf{x}_p^A = \Delta \mathbf{X}_p \mathbf{q}_p^A = -\Delta \mathbf{X}_p \Delta \mathbf{F}_p^{+} \mathbf{f}_p^A \tag{3.17}$$

and from Definition (3.16) it follows that

$$\begin{bmatrix} \mathbf{x}_{p+1}^A \\ \mathbf{f}_{p+1}^A \end{bmatrix} = \begin{bmatrix} \mathbf{x}_p^A + \Delta \mathbf{x}_p^A \\ \mathbf{f}_p^A + \Delta \mathbf{f}_p^A \end{bmatrix} \tag{3.18}$$

and the new secant approximate $\mathbf{x}_{p+1}^A$ can be expressed from Equation (3.17) as

$$\mathbf{x}_{p+1}^A = \mathbf{x}_p^A + \Delta \mathbf{x}_p^A \tag{3.19}$$
A base point $A_{p+1} \left[ \mathbf{x}_{p+1}^A, \mathbf{f}_{p+1}^A \right]$ and base vector $\mathbf{v}_{p+1}^A = \left[ \mathbf{x}_{p+1}^A, \mathbf{f}_{p+1}^A \right]$ can then be determined for the next iteration. In the single-variable case ($m = n = k = 1$) with interpolation base points $A_p \left[ x_p^A, f_p^A \right]$ and $B_p \left[ x_p^B, f_p^B \right]$, Equation (3.15) takes the form

$$q_p^A = -\frac{f_p^A}{f_p^B - f_p^A} = -\frac{f_p^A}{\Delta f_p} \tag{3.20}$$

and the new secant approximate

$$x_{p+1}^A = x_p^A + \Delta x_p q_p^A = x_p^A - \Delta x_p \frac{f_p^A}{\Delta f_p} = \frac{x_p^A f_p^B - x_p^B f_p^A}{f_p^B - f_p^A} \tag{3.21}$$

can be determined according to Equation (3.19). The procedure then continues with interpolation base points $A_{p+1} \left[ x_{p+1}^A, f_{p+1}^A \right]$ and $B_{p+1} \left[ x_{p+1}^B, f_{p+1}^B \right]$.
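The secant step described above maps directly onto a few lines of linear algebra. The following Python sketch is illustrative only (function and variable names are not from the paper); it assumes the diagonal construction (3.5) of the base points and uses the Moore-Penrose pseudo-inverse for the over-determined case $m \geq n$:

```python
import numpy as np

def secant_step(x_A, dx, f):
    """One generalized (Wolfe-Popper) secant step, Eqs. (3.5)-(3.19).

    x_A : current approximate, shape (n,)
    dx  : initial trial increment vector, shape (n,)
    f   : function R^n -> R^m, m >= n
    """
    n = x_A.size
    f_A = f(x_A)
    # n additional base points x_k^B = x^A + dx_k * d_k  (Eq. 3.5)
    dX = np.diag(dx)                                    # Eq. (3.11)
    dF = np.column_stack([f(x_A + dx[k] * np.eye(n)[:, k]) - f_A
                          for k in range(n)])           # Eq. (3.12)
    q_A = -np.linalg.pinv(dF) @ f_A                     # Eq. (3.15)
    return x_A + dX @ q_A                               # Eqs. (3.17), (3.19)
```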

4. T-Secant method

4.1. Single-variable case

The T-Secant method differs from the traditional secant method in that all interpolation base points $A_p$ and $B_{p,k}$ $(k = 1, \ldots, n)$ are updated in each iteration, providing $n+1$ new base points $A_{p+1}$ and $B_{p+1,k}$ for the next iteration. The key idea of the method is very simple. The function value $f_{p+1}^A$ (that can be determined from the new secant approximate $x_{p+1}^A$) measures the "distance" of the approximate $x_{p+1}^A$ from the root $x^*$ (if $f_{p+1}^A = 0$, then the distance is zero and $x_{p+1}^A = x^*$). The T-Secant method uses this information to determine another approximate $x_{p+1}^B$. In the single-variable case ($m = n = k = 1$) with interpolation base points $A_p$ and $B_p$, the basic equation

$$\Delta f_p q_p^A = -f_p^A \tag{4.1}$$

of the secant method (Equation (3.14) in the multi-variable case) is modified by a factor

$$t_p^f = \frac{f_{p+1}^A}{f_p^A} \tag{4.2}$$

that expresses the "improvement rate" of the new approximate $x_{p+1}^A$ relative to the original approximate $x_p^A$, providing the T-Secant modified basic equation

$$t_p^f \Delta f_p q_p^B = -f_p^A \tag{4.3}$$

Then the T-Secant multiplier

$$q_p^B = \frac{q_p^A}{t_p^f} = -\frac{\left( f_p^A \right)^2}{f_{p+1}^A \Delta f_p} \tag{4.4}$$

can be determined. The other basic equation

$$\Delta x_p^A = \Delta x_p q_p^A \tag{4.5}$$

of the secant method (Equation (3.17) in the multi-variable case) with iteration step size

$$\Delta x_p^A = x_{p+1}^A - x_p^A \tag{4.6}$$

is also modified in a similar way as in the case of Equations (4.1) and (4.3), by a factor

$$t_p^x = \frac{x_{p+1}^B - x_{p+1}^A}{x_{p+1}^A - x_p^A} = \frac{\Delta x_{p+1}}{\Delta x_p^A} \tag{4.7}$$

that expresses the "improvement rate" of the new "T-Secant approximate change" $\Delta x_{p+1}$ relative to the previous "secant iteration step size" $\Delta x_p^A$, providing a new basic equation

$$\Delta x_p^A = t_p^x \Delta x_p q_p^B \tag{4.8}$$

from which

$$\Delta x_p^A = \frac{\Delta x_{p+1}}{\Delta x_p^A} \Delta x_p q_p^B \tag{4.9}$$

and

$$x_{p+1}^A - x_p^A = -\frac{x_{p+1}^B - x_{p+1}^A}{x_{p+1}^A - x_p^A} \left( x_p^B - x_p^A \right) \frac{\left( f_p^A \right)^2}{f_{p+1}^A \left( f_p^B - f_p^A \right)} \tag{4.10}$$

By re-ordering Equation (4.10), the T-Secant approximate

$$x_{p+1}^B = x_{p+1}^A + \frac{\left( \Delta x_p^A \right)^2}{\Delta x_p q_p^B} = x_{p+1}^A - \frac{\left( x_{p+1}^A - x_p^A \right)^2 \left( f_p^B - f_p^A \right) f_{p+1}^A}{\left( x_p^B - x_p^A \right) \left( f_p^A \right)^2} \tag{4.11}$$

can be determined, and it is used to update the original interpolation base point $B_p$ to $B_{p+1}$. The new iteration then continues with new base points $A_{p+1}$ and $B_{p+1}$. Note that it follows from Equations (4.4), (4.5) and (4.11) that

$$\Delta x_{p+1} = x_{p+1}^B - x_{p+1}^A = \frac{\left( \Delta x_p^A \right)^2}{\Delta x_p q_p^B} = t_p^f \frac{\left( \Delta x_p^A \right)^2}{\Delta x_p q_p^A} = t_p^f \Delta x_p^A \tag{4.12}$$
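In the single-variable case the whole iteration reduces to Equations (3.21) and (4.11). A minimal sketch, with illustrative names and a simple residual-based stopping test (the paper itself does not fix one):

```python
def t_secant_1d(f, xA, xB, tol=1e-14, max_iter=50):
    """Single-variable T-secant iteration: Eqs. (3.21) and (4.11).

    Both base points A and B are replaced in every iteration
    (full update), unlike the classic secant method.
    """
    fA, fB = f(xA), f(xB)
    for _ in range(max_iter):
        # classic secant approximate, Eq. (3.21)
        xA_new = xA - fA * (xB - xA) / (fB - fA)
        fA_new = f(xA_new)
        if abs(fA_new) < tol:
            return xA_new
        # T-secant approximate, Eq. (4.11)
        xB_new = xA_new - ((xA_new - xA) ** 2 * (fB - fA) * fA_new
                           / ((xB - xA) * fA ** 2))
        xA, fA = xA_new, fA_new
        xB, fB = xB_new, f(xB_new)
    return xA
```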

4.2. Multi-variable case

In the multi-variable case ($m \geq n > 1$) with $n+1$ interpolation base points $A_p \left[ \mathbf{x}_p^A, \mathbf{f}_p^A \right]$ and $B_{p,k} \left[ \mathbf{x}_{p,k}^B, \mathbf{f}_{p,k}^B \right]$ $(k = 1, \ldots, n)$, the basic equations of the secant method (Equations (3.14) and (3.17)) are modified as

$$\mathbf{T}_p^F \Delta \mathbf{F}_p \mathbf{q}_p^B = -\mathbf{f}_p^A \tag{4.13}$$

and

$$\Delta \mathbf{x}_p^A = \mathbf{T}_p^X \Delta \mathbf{X}_p \mathbf{q}_p^B \tag{4.14}$$

Then a vector-based equation can be formulated, as in the case of the traditional secant method (see Equation (3.8)), in the form

$$\begin{bmatrix} \mathbf{x} \\ \mathbf{z}(\mathbf{x}) \end{bmatrix} = \begin{bmatrix} \mathbf{x}^A \\ \mathbf{f}^A \end{bmatrix} + \begin{bmatrix} \mathbf{T}^X & \mathbf{0} \\ \mathbf{0} & \mathbf{T}^F \end{bmatrix} \begin{bmatrix} \Delta \mathbf{X} \\ \Delta \mathbf{F} \end{bmatrix} \mathbf{q}^B \tag{4.15}$$

where $\Delta \mathbf{X}$ and $\Delta \mathbf{F}$ are defined in (3.9) and (3.10), $\mathbf{z}\left( \mathbf{x} \right)$ is a function with zero at $\mathbf{x}_{p+1}^B$, and the diagonal transformation matrix in the $p$-th iteration is

$$\mathbf{T}_p = \begin{bmatrix} \mathbf{T}_p^X & \mathbf{0} \\ \mathbf{0} & \mathbf{T}_p^F \end{bmatrix} \tag{4.16}$$

with sub-diagonals $\mathbf{T}_p^X$ and $\mathbf{T}_p^F$, where

$$\mathbf{T}_p^X = \mathrm{diag}\left( t_{p,i}^X \right) = \mathrm{diag}\left( \frac{x_{p+1,i}^B - x_{p+1,i}^A}{x_{p+1,i}^A - x_{p,i}^A} \right) = \mathrm{diag}\left( \frac{\Delta x_{p+1,i}}{\Delta x_{p,i}^A} \right) \tag{4.17}$$

$$\mathbf{T}_p^F = \mathrm{diag}\left( t_{p,j}^F \right) = \mathrm{diag}\left( \frac{f_{p+1,j}^B - f_{p+1,j}^A}{f_{p+1,j}^A - f_{p,j}^A} \right) \tag{4.18}$$

and $\mathbf{T}_p^F$ is approximated, with the assumption $\mathbf{f}\left( \mathbf{x} \right) \approx \mathbf{y}_p\left( \mathbf{x}_{p+1}^A \right) \approx \mathbf{z}_p\left( \mathbf{x}_{p+1}^B \right)$ and according to the conditions $\mathbf{y}_p\left( \mathbf{x}_{p+1}^A \right) = \mathbf{0}$ and $\mathbf{z}_p\left( \mathbf{x}_{p+1}^B \right) = \mathbf{0}$, as

$$\mathbf{T}_p^F \approx \mathrm{diag}\left( \frac{z_{p,j}\left( \mathbf{x}_{p+1}^B \right) - f_{p+1,j}^A}{y_{p,j}\left( \mathbf{x}_{p+1}^A \right) - f_{p,j}^A} \right) = \mathrm{diag}\left( \frac{f_{p+1,j}^A}{f_{p,j}^A} \right) \tag{4.19}$$

$(i = 1, \ldots, n$, $j = 1, \ldots, m)$, where $f_{p,j}^A \neq 0$. The vector of T-Secant multipliers

$$\mathbf{q}_p^B = -\Delta \mathbf{F}_p^{+} \left( \mathbf{T}_p^F \right)^{-1} \mathbf{f}_p^A = \left[ -\sum_{j=1}^m \Delta F_{p,i,j}^{+} \frac{f_{p,j}^A}{t_{p,j}^F} \right] \tag{4.20}$$

can be determined from Equation (4.13), where $(.)^{+}$ stands for the pseudo-inverse ($\Delta \mathbf{F}_p^{+}$ has already been calculated when $\mathbf{q}_p^A$ was determined from Equation (3.15)). The $i$-th element of the new approximate $\mathbf{x}_{p+1}^B$ can be expressed from the $i$-th row of Equation (4.14) as

$$\Delta x_{p,i}^A = \frac{\Delta x_{p+1,i}}{\Delta x_{p,i}^A} \Delta x_{p,i} q_{p,i}^B = \frac{x_{p+1,i}^B - x_{p+1,i}^A}{x_{p+1,i}^A - x_{p,i}^A} \Delta x_{p,i} q_{p,i}^B \tag{4.21}$$

and the T-Secant approximate $\mathbf{x}_{p+1}^B$ can be calculated as

$$x_{p+1,i}^B = x_{p+1,i}^A + \frac{\left( \Delta x_{p,i}^A \right)^2}{\Delta x_{p,i} q_{p,i}^B} = x_{p+1,i}^A - \frac{\left( \Delta x_{p,i}^A \right)^2}{\Delta x_{p,i} \sum_{j=1}^m \Delta F_{p,i,j}^{+} \frac{f_{p,j}^A}{t_{p,j}^F}} \tag{4.22}$$

where $\Delta x_{p,i} \neq 0$ and $q_{p,i}^B \neq 0$ $(i = 1, \ldots, n)$. The next iteration then continues with the new trial increment vector

$$\Delta \mathbf{x}_{p+1} = \mathbf{x}_{p+1}^B - \mathbf{x}_{p+1}^A \tag{4.23}$$

and with $n+1$ new interpolation base points $A_{p+1} \left[ \mathbf{x}_{p+1}^A, \mathbf{f}_{p+1}^A \right]$ and $B_{k,p+1} \left[ \mathbf{x}_{k,p+1}^B, \mathbf{f}_{k,p+1}^B \right]$ $(k = 1, \ldots, n)$. Figure 1 shows the formulation of a set of new base vectors $\mathbf{x}_{k,p+1}^B$ from $\mathbf{x}_{p+1}^A$ and $\mathbf{x}_{p+1}^B$ in the $n = 3$ case. A compact sketch of one full iteration is given below.
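The sketch below condenses one multi-variable iteration; the names and the absence of the safeguard bounds of Section 8 are simplifications:

```python
import numpy as np

def t_secant_step(f, x_A, dx):
    """One multi-variable T-secant iteration, Eqs. (3.15), (3.19),
    (4.19), (4.20) and (4.22). Returns both new approximates.
    """
    n = x_A.size
    f_A = f(x_A)
    dF = np.column_stack([f(x_A + dx[k] * np.eye(n)[:, k]) - f_A
                          for k in range(n)])            # Eq. (3.12)
    dF_pinv = np.linalg.pinv(dF)
    q_A = -dF_pinv @ f_A                                 # Eq. (3.15)
    dx_A = dx * q_A                                      # Eq. (3.17), diagonal dX
    x_A_new = x_A + dx_A                                 # Eq. (3.19)
    t_F = f(x_A_new) / f_A                               # Eq. (4.19)
    q_B = -dF_pinv @ (f_A / t_F)                         # Eq. (4.20)
    x_B_new = x_A_new + dx_A**2 / (dx * q_B)             # Eq. (4.22)
    return x_A_new, x_B_new

# the next iteration uses dx = x_B_new - x_A_new, Eq. (4.23)
```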
Table 1. Summary of the basic equations (single- and multi-variable cases).

| # | Single-variable ($m = n = 1$) | Multi-variable ($m \geq n > 1$) | Equations |
|---|---|---|---|
| 1 | $x_p^A$ | $\mathbf{x}_p^A$ | |
| 2 | $x_p^B$ | $\mathbf{x}_{p,k}^B = \mathbf{x}_p^A + \Delta x_{p,k} \mathbf{d}_k$ | 3.5 |
| 3 | $\Delta x_p = x_p^B - x_p^A$ | $\Delta \mathbf{X}_p = \left[ \mathbf{x}_{p,k}^B - \mathbf{x}_p^A \right] = \mathrm{diag}\left( \Delta x_{p,i} \right)$ | 3.9, 3.11 |
| 4 | $\Delta f_p = f_p^B - f_p^A$ | $\Delta \mathbf{F}_p = \left[ \Delta f_{p,k,j} \right]$ | 3.7, 3.10, 3.12 |
| 5 | $\Delta f_p q_p^A = -f_p^A$ | $\Delta \mathbf{F}_p \mathbf{q}_p^A = -\mathbf{f}_p^A$ | 4.1, 3.14 |
| 6 | $q_p^A = -f_p^A / \Delta f_p$ | $\mathbf{q}_p^A = -\Delta \mathbf{F}_p^{+} \mathbf{f}_p^A$ | 3.20, 3.15 |
| 7 | $x_{p+1}^A = x_p^A + \Delta x_p q_p^A$ | $\mathbf{x}_{p+1}^A = \mathbf{x}_p^A + \Delta \mathbf{X}_p \mathbf{q}_p^A$ | 3.21, 3.19 |
| 8 | $\Delta x_p^A = x_{p+1}^A - x_p^A$ | $\Delta \mathbf{x}_p^A = \mathbf{x}_{p+1}^A - \mathbf{x}_p^A$ | 4.6 |
| 9 | $t_p^f = f_{p+1}^A / f_p^A$ | $\mathbf{T}_p^F = \mathrm{diag}\left( f_{p+1,j}^A / f_{p,j}^A \right)$ | 4.2, 4.19 |
| 10 | $t_p^f \Delta f_p q_p^B = -f_p^A$ | $\mathbf{T}_p^F \Delta \mathbf{F}_p \mathbf{q}_p^B = -\mathbf{f}_p^A$ | 4.3, 4.13 |
| 11 | $q_p^B = q_p^A / t_p^f = -\left( f_p^A \right)^2 / \left( f_{p+1}^A \Delta f_p \right)$ | $\mathbf{q}_p^B = -\Delta \mathbf{F}_p^{+} \left( \mathbf{T}_p^F \right)^{-1} \mathbf{f}_p^A$ | 4.4, 4.20 |
| 12 | $t_p^x = \Delta x_{p+1} / \Delta x_p^A$ | $\mathbf{T}_p^X = \mathrm{diag}\left( \Delta x_{p+1,i} / \Delta x_{p,i}^A \right)$ | 4.7, 4.17 |
| 13 | $\Delta x_p^A = t_p^x \Delta x_p q_p^B$ | $\Delta \mathbf{x}_p^A = \mathbf{T}_p^X \Delta \mathbf{X}_p \mathbf{q}_p^B$ | 4.8, 4.14 |
| 14 | $x_{p+1}^B = x_{p+1}^A + \left( \Delta x_p^A \right)^2 / \left( \Delta x_p q_p^B \right)$ | $x_{p+1,i}^B = x_{p+1,i}^A + \left( \Delta x_{p,i}^A \right)^2 / \left( \Delta x_{p,i} q_{p,i}^B \right)$ | 4.11, 4.22 |

5. Geometry

5.1. Single-variable case

Although the T-Secant procedure has been worked out for solving multi-variable problems, it can also be applied to single-variable ones. The geometrical representation of the latter gives a good view of the mechanism of the procedure.
Find the scalar root $x^*$ of a nonlinear function $x \mapsto f(x)$, where $x \in \mathbb{R}^1$ and $f: \mathbb{R}^1 \to \mathbb{R}^1$. Let the function $f(x)$ be linearly interpolated through initial base points $A_p \left[ x_p^A, f_p^A \right]$ and $B_p \left[ x_p^B, f_p^B \right]$, providing a "secant" line $y_p(x)$ as shown in Figure 2, where $f_p^A = f(x_p^A)$ and $f_p^B = f(x_p^B)$ are the corresponding function values. An arbitrary point of the secant $y_p(x)$ can be expressed as

$$\begin{bmatrix} x \\ y_p(x) \end{bmatrix} = \begin{bmatrix} x_p^A \\ f_p^A \end{bmatrix} + \begin{bmatrix} \Delta x_p \\ \Delta f_p \end{bmatrix} q_p^A \tag{5.1}$$

where $q_p^A$ is a scalar multiplier. Let the new approximate $x_{p+1}^A$ be the root of the secant $y_p(x)$ and let

$$\Delta x_p^A = x_{p+1}^A - x_p^A \tag{5.2}$$

be the iteration step size. It follows from the condition

$$y_p(x_{p+1}^A) = 0 \tag{5.3}$$

and from the 2nd row of Equation (5.1) that

$$\Delta f_p q_p^A = -f_p^A \tag{5.4}$$

and the scalar multiplier can be determined as

$$q_p^A = -\frac{f_p^A}{\Delta f_p} \tag{5.5}$$

From the 1st row of Equation (5.1), the iteration step size is given as

$$\Delta x_p^A = \Delta x_p q_p^A \tag{5.6}$$

and the new approximate can be expressed as

$$x_{p+1}^A = x_p^A + \Delta x_p^A \tag{5.7}$$
A new base point $A_{p+1} \left[ x_{p+1}^A, f_{p+1}^A \right]$ (see Figure 2) can then be determined for the next iteration. Two of the three available base points $A_p$, $B_p$, $A_{p+1}$ are used for the next iteration, omitting either $A_p$ or $B_p$ in the case of the traditional secant method. The decision is not obvious, and it may cause the iteration to become unstable and / or not converge to the solution. Instead, an additional new approximate $x_{p+1}^B$ is determined by the T-Secant procedure as the root of a function $z_p\left( x \right)$ near the first secant approximate $x_{p+1}^A$, and the iteration continues with new base points $A_{p+1} \left[ x_{p+1}^A, f_{p+1}^A \right]$ and $B_{p+1} \left[ x_{p+1}^B, f_{p+1}^B \right]$. An arbitrary point of the function $z_p\left( x \right)$ can be expressed as

$$\begin{bmatrix} x \\ z_p(x) \end{bmatrix} = \begin{bmatrix} x_p^A \\ f_p^A \end{bmatrix} + \begin{bmatrix} t^x & 0 \\ 0 & t^f \end{bmatrix} \begin{bmatrix} \Delta x_p \\ \Delta f_p \end{bmatrix} q_p^B \tag{5.8}$$

where the transformation scalars for $\Delta x_p$ and $\Delta f_p$ at $x = x_p^B$ are

$$t_p^x = \frac{\Delta x_{p+1}}{\Delta x_p^A} = \frac{x_{p+1}^B - x_{p+1}^A}{x_{p+1}^A - x_p^A} \quad \text{and} \quad t_p^f = \frac{f_{p+1}^A}{f_p^A} \tag{5.9}$$

Then it follows from the condition

$$z_p(x_{p+1}^B) = 0 \tag{5.10}$$

and from the 2nd row of Equation (5.8) that

$$t_p^f \Delta f_p q_p^B = -f_p^A \tag{5.11}$$

and

$$q_p^B = -\frac{f_p^A}{t_p^f \Delta f_p} = -\frac{\left( f_p^A \right)^2}{f_{p+1}^A \left( f_p^B - f_p^A \right)} \tag{5.12}$$

The new approximate $x_{p+1}^B$ can then be expressed from the 1st row of Equation (5.8) as

$$x_{p+1}^B = x_{p+1}^A + \frac{\left( \Delta x_p^A \right)^2}{\Delta x_p q_p^B} \tag{5.13}$$
The new base point $B_{p+1} \left[ x_{p+1}^B, f_{p+1}^B \right]$ (see Figure 2) can then be determined. Interpolation base points $A_{p+1}$ and $B_{p+1}$ are used for the next iteration. The scalar multiplier $q_p^B$ can be expressed from Equation (5.13) as

$$q_p^B = \frac{\left( x_{p+1}^A - x_p^A \right)^2}{\left( x_p^B - x_p^A \right) \left( x_{p+1}^B - x_{p+1}^A \right)} \tag{5.14}$$

By substituting it into the 2nd row of Equation (5.8) and changing $x_{p+1}^B$ to $x$, it turns into a hyperbolic function

$$z_p(x) = \frac{a_p}{x - x_{p+1}^A} + f_p^A \tag{5.15}$$

with vertical and horizontal asymptotes $x_{p+1}^A$ and $f_p^A$, where

$$a_p = \frac{\left( x_{p+1}^A - x_p^A \right)^2 \left( f_p^B - f_p^A \right) f_{p+1}^A}{\left( x_p^B - x_p^A \right) f_p^A} \tag{5.16}$$

and the root $x_{p+1}^B$ of the function $z_p(x)$ will be in the vicinity of $x_{p+1}^A$, at an "appropriate distance" that is regulated by the function value $f_{p+1}^A$ (see Figure 2). This virtue of the T-Secant procedure provides an automatic mechanism for keeping the base vectors in general position throughout the whole iteration process, providing stable and efficient numerical performance.
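Setting $z_p(x) = 0$ in Equation (5.15) makes the link to the T-Secant formula explicit; this one-line check follows directly from (5.15) and (5.16):

$$z_p(x) = 0 \quad \Longrightarrow \quad x = x_{p+1}^A - \frac{a_p}{f_p^A} = x_{p+1}^A - \frac{\left( x_{p+1}^A - x_p^A \right)^2 \left( f_p^B - f_p^A \right) f_{p+1}^A}{\left( x_p^B - x_p^A \right) \left( f_p^A \right)^2}$$

which is exactly the T-Secant approximate $x_{p+1}^B$ of Equation (4.11).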

5.2. Multi-variable case

Find the root $\mathbf{x}^*$ of a nonlinear function $\mathbf{x} \mapsto \mathbf{f}(\mathbf{x})$, where $\mathbf{x} \in \mathbb{R}^n$ and $\mathbf{f}: \mathbb{R}^n \to \mathbb{R}^m$. Let the function $\mathbf{f}\left( \mathbf{x} \right)$ be linearly interpolated through $n+1$ base points $A_p \left[ \mathbf{x}_p^A, \mathbf{f}_p^A \right]$ and $B_{k,p} \left[ \mathbf{x}_{k,p}^B, \mathbf{f}_{k,p}^B \right]$ in the $\mathbb{R}^{n+m}$ space ($\left[ \mathbf{x}, \mathbf{f}(\mathbf{x}) \right]$ space) in the $p$-th iteration, as shown in Figure 3, where $k = 1, \ldots, n$. Given a set of approximates $\mathbf{x}_p^A$ and

$$\mathbf{x}_{k,p}^B = \mathbf{x}_p^A + \Delta x_{k,p} \mathbf{d}_k$$

in the $\mathbb{R}^n$ space ($\mathbf{x}$ space) with $k = 1, \ldots, n$, where $\mathbf{d}_k$ is the $k$-th Cartesian unit vector. Let the expression

$$\Delta \mathbf{F}_p \mathbf{q}_p^A = \begin{bmatrix} \Delta f_{1,1,p} & \cdots & \Delta f_{n,1,p} \\ \vdots & \ddots & \vdots \\ \Delta f_{1,m,p} & \cdots & \Delta f_{n,m,p} \end{bmatrix} \mathbf{q}_p^A$$

represent the linear combination $\mathbf{q}_p^A = \left[ q_{p,k}^A \right]^T$ of the $n$ column vectors

$$\left[ \Delta f_{k,j,p} \right] = \Delta \mathbf{f}_{k,p} = \mathbf{f}_{k,p}^B - \mathbf{f}_p^A$$

in the $\mathbb{R}^m$ space ($\mathbf{f}$ space), with column index $k = 1, \ldots, n$ and row index $j = 1, \ldots, m$, and the expression

$$\Delta \mathbf{X}_p \mathbf{q}_p^A = \begin{bmatrix} \Delta x_{1,1,p} & \cdots & \Delta x_{n,1,p} \\ \vdots & \ddots & \vdots \\ \Delta x_{1,n,p} & \cdots & \Delta x_{n,n,p} \end{bmatrix} \mathbf{q}_p^A$$

represent the same linear combination of the $n$ column vectors

$$\left[ \Delta x_{k,j,p} \right] = \Delta \mathbf{x}_{p,k} = \mathbf{x}_{p,k}^B - \mathbf{x}_p^A$$

with column index $k = 1, \ldots, n$ and row index $j = 1, \ldots, n$. The linear combination $\mathbf{q}_p^A$ is determined from Equation (3.15) in step $S_1$ (see Figure 3), providing a new approximate

$$\mathbf{x}_{p+1}^A = \left[ x_{p+1,k}^A \right] = \mathbf{x}_p^A + \Delta \mathbf{x}_p^A$$

for the solution $\mathbf{x}^*$, as shown in Figure 3, and the corresponding vector $\mathbf{f}_{p+1}^A$ is also determined, in step $S_2$ (see Figure 3). The column vectors $\Delta \mathbf{f}_{k,p}$ of $\Delta \mathbf{F}_p$ are then modified by a non-uniform scaling transformation

$$\mathbf{T}_p^F = \mathrm{diag}\left( \frac{f_{p+1,j}^A}{f_{p,j}^A} \right)$$

and a new linear combination $\mathbf{q}_p^B = \left[ q_{p,k}^B \right]^T$ is determined from Equation (4.20) in step $S_3$ (see Figure 3), providing a new approximate $\mathbf{x}_{p+1}^B$ for the solution $\mathbf{x}^*$ with elements

$$x_{p+1,k}^B = x_{p+1,k}^A + \frac{\left( \Delta x_{p,k}^A \right)^2}{\Delta x_{p,k} q_{p,k}^B}$$

A new set of $n+1$ approximates $\mathbf{x}_{p+1}^A$ and

$$\mathbf{x}_{k,p+1}^B = \mathbf{x}_{p+1}^A + \Delta x_{k,p+1} \mathbf{d}_k$$

$(k = 1, \ldots, n)$ can then be generated with

$$\Delta \mathbf{x}_{p+1} = \left[ \Delta x_{k,p+1} \right] = \mathbf{x}_{p+1}^B - \mathbf{x}_{p+1}^A$$
for the next iteration.

5.3. Single-variable example

An example is given with the function $x \mapsto f(x)$, where $x \in \mathbb{R}^1$, $f: \mathbb{R}^1 \to \mathbb{R}^1$ and

$$f\left( x \right) = x^3 - 2x - 5 \tag{5.26}$$

with root $x^* \approx 2.09455\ldots$. Figure 4 summarizes the results of the first two iterations (left: $x_1^A$ is the zero of $y_0(x)$, $x_1^B$ is the zero of $z_0(x)$; right: $x_2^A$ is the zero of $y_1(x)$, $x_2^B$ is the zero of $z_1(x)$). Iterations were made with initial approximates $x_0^A = 3.0$ and $x_0^B = 1.0$, providing $f_0^A = 16$ ($p = 0$). The first secant approximate $x_1^A = 1.545\ldots$ is found as the zero of the first secant $y_0(x)$, and the first T-Secant approximate $x_1^B = 1.945\ldots$ is found as the zero of the first hyperbola function $z_0(x)$ (Figure 4, left). The iteration then goes on ($p = 1$) with new interpolation base points $x_1^A = 1.545\ldots$ and $x_1^B = 1.945\ldots$, providing $f_1^A = -4.3997\ldots$, and new approximates $x_2^A = 2.158\ldots$ and $x_2^B = 2.0556\ldots$ are found as the zeros of the second secant and the second hyperbola functions $y_1(x)$ and $z_1(x)$ respectively (Figure 4, right). The next iteration ($p = 2$) then continues with interpolation base points $x_2^A = 2.158\ldots$ and $x_2^B = 2.0556\ldots$ and with $f_2^A = 0.7367\ldots$. The iterated values of $f_p^A$, $x_p^A$ and $x_p^B$ are also indicated on the diagrams. Further diagrams of this example are shown in Section 7.3.
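The single-variable sketch given after Section 4.1 reproduces these numbers; the call below is illustrative:

```python
# f(x) = x^3 - 2x - 5 with x_0^A = 3.0, x_0^B = 1.0 (Section 5.3)
root = t_secant_1d(lambda x: x**3 - 2*x - 5, 3.0, 1.0)
# intermediate approximates match Figure 4:
# x_1^A = 1.545..., x_1^B = 1.945..., x_2^A = 2.158..., x_2^B = 2.055...
print(root)  # ~2.0945514815423
```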

6. General formulations

Re-ordering Equation (3.17) gives the general equation

$$\Delta \mathbf{F} \Delta \mathbf{X}^{-1} \Delta \mathbf{x}^A = -\mathbf{f}^A \tag{6.1}$$

of the secant method. The initial trials are constructed according to Equation (3.5), providing that $\Delta \mathbf{X}$ is a diagonal matrix with elements $\Delta x_i = x_{i,i}^B - x_i^A$ $(i = 1, \ldots, n)$. Let the "Jacobian-type" matrix of the secant method be defined as

$$\mathbf{S} = \Delta \mathbf{F} \Delta \mathbf{X}^{-1} \tag{6.2}$$

$$\mathbf{S} = \begin{bmatrix} \frac{\Delta f_{1,1}}{\Delta x_1} & \cdots & \frac{\Delta f_{n,1}}{\Delta x_n} \\ \vdots & \ddots & \vdots \\ \frac{\Delta f_{1,m}}{\Delta x_1} & \cdots & \frac{\Delta f_{n,m}}{\Delta x_n} \end{bmatrix} = \left[ \frac{\Delta f_{k,j}}{\Delta x_i} \right] \tag{6.3}$$

$(i = 1, \ldots, n$, $j = 1, \ldots, m$, $k = 1, \ldots, n)$ and

$$\mathbf{S}^{+} = \Delta \mathbf{X} \Delta \mathbf{F}^{+} \tag{6.4}$$

Then Equation (6.1) simplifies to

$$\mathbf{S} \Delta \mathbf{x}^A = -\mathbf{f}^A \tag{6.5}$$

and

$$\Delta \mathbf{x}^A = -\mathbf{S}^{+} \mathbf{f}^A \tag{6.6}$$

The $i$-th element of the new approximate $\mathbf{x}_{p+1}^A$ in the $p$-th iteration will then be

$$x_{p+1,i}^A = x_{p,i}^A + \Delta x_{p,i}^A = x_{p,i}^A - \sum_{j=1}^m S_{p,i,j}^{+} f_{p,j}^A \tag{6.7}$$
$(i = 1, \ldots, n)$. It follows from the 1st row of Equation (4.15) of the T-Secant method, from Equation (4.13) and from Definition (6.4) of $\mathbf{S}^{+}$ that the $p$-th iteration step-size is

$$\Delta \mathbf{x}_p^A = -\mathbf{T}_p^X \mathbf{S}_p^{+} \left( \mathbf{T}_p^F \right)^{-1} \mathbf{f}_p^A \tag{6.8}$$

and

$$\mathbf{T}_p^F \mathbf{S}_p \left( \mathbf{T}_p^X \right)^{-1} \Delta \mathbf{x}_p^A = -\mathbf{f}_p^A \tag{6.9}$$

Let the modified "Jacobian-type" matrix of the T-Secant method be defined as

$$\mathbf{S}_{T,p} = \mathbf{T}_p^F \mathbf{S}_p \left( \mathbf{T}_p^X \right)^{-1} \tag{6.10}$$

$$\mathbf{S}_{T,p} = \begin{bmatrix} \frac{f_{p+1,1}^A}{f_{p,1}^A} & & 0 \\ & \ddots & \\ 0 & & \frac{f_{p+1,m}^A}{f_{p,m}^A} \end{bmatrix} \begin{bmatrix} \frac{\Delta f_{p,1,1}}{\Delta x_{p,1}} & \cdots & \frac{\Delta f_{p,n,1}}{\Delta x_{p,n}} \\ \vdots & \ddots & \vdots \\ \frac{\Delta f_{p,1,m}}{\Delta x_{p,1}} & \cdots & \frac{\Delta f_{p,n,m}}{\Delta x_{p,n}} \end{bmatrix} \begin{bmatrix} \frac{\Delta x_{p,1}^A}{\Delta x_{p+1,1}} & & 0 \\ & \ddots & \\ 0 & & \frac{\Delta x_{p,n}^A}{\Delta x_{p+1,n}} \end{bmatrix} \tag{6.11}$$

and in condensed form with general matrix elements (omitting the index $p$):

$$\mathbf{S}_T = \mathbf{T}^F \mathbf{S} \left( \mathbf{T}^X \right)^{-1} = \mathrm{diag}\left( t_j^F \right) \left[ \frac{\Delta f_{k,j}}{\Delta x_i} \right] \mathrm{diag}\left( \frac{1}{t_i^X} \right) = \left[ \frac{t_j^F}{t_i^X} \cdot \frac{\Delta f_{k,j}}{\Delta x_i} \right] \tag{6.12}$$

$(i = 1, \ldots, n$, $j = 1, \ldots, m$, $k = 1, \ldots, n)$ and

$$\mathbf{S}_T^{+} = \mathbf{T}^X \mathbf{S}^{+} \left( \mathbf{T}^F \right)^{-1} \tag{6.13}$$
Equations (6.8) and (6.9) can then be re-written as

$$\Delta \mathbf{x}^A = -\mathbf{S}_T^{+} \mathbf{f}^A \tag{6.14}$$

and

$$\mathbf{S}_T \Delta \mathbf{x}^A = -\mathbf{f}^A \tag{6.15}$$

in a form similar to the traditional secant method (Equations (6.6) and (6.5)). The $i$-th element $x_{p+1,i}^B$ of the 2nd new approximate $\mathbf{x}_{p+1}^B$ in the $p$-th iteration will then be

$$x_{p+1,i}^B = x_{p+1,i}^A + \Delta x_{p+1,i} = x_{p+1,i}^A - \frac{\left( \Delta x_{p,i}^A \right)^2}{\sum_{j=1}^m S_{p,i,j}^{+} \frac{f_{p,j}^A}{t_{p,j}^F}} \tag{6.16}$$

where $t_{p,j}^F \neq 0$ $(j = 1, \ldots, m$, $i = 1, \ldots, n)$. Note that the T-Secant modification in the secant method "Jacobian-type" matrix (6.3) is made with multipliers $t_{p,j}^F = f_{p+1,j}^A / f_{p,j}^A$ and $t_{p,i}^X = \Delta x_{p+1,i} / \Delta x_{p,i}^A$ applied to the difference quantities $\Delta f_{p,k,j}$ and $\Delta x_{p,i}$. The basic equations of the secant method and the T-Secant method are summarized in Table 2: rows 1-4 contain the elements (matrix $\mathbf{T}$) of the basic equations, rows 5-6 the explicit basic equations, row 7 the Jacobian-type matrices, and rows 8-9 the general formulations of the basic equations.
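The scaled matrix (6.12) is cheap to form once $\mathbf{S}$ is available, since $\mathbf{T}^F$ and $\mathbf{T}^X$ are diagonal. A small sketch with illustrative names:

```python
import numpy as np

def t_scaled_secant_matrix(S, f_A, f_A_new, dx_A, dx_new):
    """Form S_T = T^F S (T^X)^{-1} of Eq. (6.12) by row/column scaling.

    S: secant Jacobian-type matrix, shape (m, n), Eq. (6.3)
    f_A, f_A_new: f at x_p^A and x_{p+1}^A  ->  t^F, Eq. (4.19)
    dx_A, dx_new: Delta x_p^A and Delta x_{p+1}  ->  t^X, Eq. (4.17)
    """
    t_F = f_A_new / f_A              # diagonal of T^F, shape (m,)
    t_X = dx_new / dx_A              # diagonal of T^X, shape (n,)
    return (t_F[:, None] / t_X[None, :]) * S
```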

7. Convergence

7.1. Single-variable case

As shown in Section 4 (Equation (4.12)), the $p$-th iteration step length of the 2nd new approximate $x_{p+1}^B$ is

$$\Delta x_{p+1} = t_p^f \Delta x_p^A \tag{7.1}$$

The secant method is super-linearly convergent, so the new approximate $x_{p+1}^A$ is expected to be a much better approximate of the solution $x^*$ than the previous one ($x_p^A$). Thus

$$\left| f_{p+1}^A \right| \ll \left| f_p^A \right| \tag{7.2}$$

and

$$t_p^f = \frac{f_{p+1}^A}{f_p^A} \tag{7.3}$$

is expected to be a "small" number (small in absolute value). It means that the T-Secant approximate $x_{p+1}^B$ will always be in the vicinity of the classic secant approximate $x_{p+1}^A$ and the approximate errors of the new approximates will be of similar order, providing that the solution $x^*$ will be evenly surrounded by the two new trial approximates $x_{p+1}^A$ and $x_{p+1}^B$.

7.2. Convergence rate

It is well known that the single-variable secant method converges asymptotically for sufficiently good initial approximates $x^A$ and $x^B$ if $f'(x)$ does not vanish on $x \in \left[ x^A, x^B \right]$ and $f''(x)$ is continuous at least in a neighborhood of the zero $x^*$. The super-linear convergence property has also been proved in different ways, and it is known that the order of convergence is $\alpha_S = \left( 1 + \sqrt{5} \right)/2$ with asymptotic error constant

$$C = \left( \frac{1}{2} \left| \frac{f''\left( \xi \right)}{f'\left( \xi \right)} \right| \right)^{\frac{1}{\alpha}} \tag{7.4}$$

The order of convergence of the T-Secant method is determined in this section. Let $p$ be the iteration counter and let the approximate error in the $p$-th iteration be defined as

$$e_p = x_p - x^* \tag{7.5}$$

It follows from Equation (3.21) and from Definition (7.5) that the error $e_{p+1}^A$ of the new secant approximate $x_{p+1}^A$ can be expressed as

$$e_{p+1}^A = \frac{e_p^A f_p^B - e_p^B f_p^A}{f_p^B - f_p^A} = \frac{x_p^B - x_p^A}{f_p^B - f_p^A} \cdot \frac{f_p^B / e_p^B - f_p^A / e_p^A}{x_p^B - x_p^A} \, e_p^A e_p^B \tag{7.6}$$

It follows from the mean value theorem that the first factor on the right side of Equation (7.6) can be replaced by $1 / f'\left( \eta_p \right)$, where $\eta_p \in \left[ x_p^A, x_p^B \right]$, if $f(x)$ is continuously differentiable on $\left[ x_p^A, x_p^B \right]$ and $f'\left( \eta_p \right) \neq 0$. Let the function $f(x)$ be approximated around the root $x^*$ by a 2nd-order Taylor series expansion as

$$f_p = f\left( e_p + x^* \right) = f\left( x^* \right) + e_p f'\left( x^* \right) + \frac{1}{2} e_p^2 f''\left( \xi_p \right) \tag{7.7}$$

where $\xi_p \in \left[ x_p^A, x_p^B, x^* \right]$ in the remainder term. Since $f\left( x^* \right) = 0$, it follows from Equation (7.7) that

$$\frac{f_p}{e_p} = f'\left( x^* \right) + \frac{1}{2} f''\left( \xi_p \right) e_p \tag{7.8}$$
Substituting this expression into Equation (7.6), and since $e_p^B - e_p^A = x_p^B - x_p^A$, we get

$$e_{p+1}^A = \frac{1}{2} \frac{f''\left( \xi_p \right)}{f'\left( \eta_p \right)} \cdot \frac{e_p^B - e_p^A}{x_p^B - x_p^A} \, e_p^A e_p^B = C_p e_p^A e_p^B \tag{7.9}$$

where

$$C_p = \frac{1}{2} \frac{f''\left( \xi_p \right)}{f'\left( \eta_p \right)} \tag{7.10}$$

If the series $x_p^A$ converges to $x^*$, then $\xi_p, \eta_p \to x^*$ with increasing iteration counter $p$, and

$$C_p \to \frac{1}{2} \frac{f''\left( x^* \right)}{f'\left( x^* \right)} = \text{constant} \tag{7.11}$$

It follows from Equation (4.11) with Definition (7.5) and from the mean value theorem (with $\eta_{p-1} \in \left[ x_{p-1}^A, x_{p-1}^B \right]$, if $f(x)$ is continuously differentiable on $\left[ x_{p-1}^A, x_{p-1}^B \right]$) that

$$x_p^B = x_p^A - \frac{\left( x_p^A - x_{p-1}^A \right)^2}{\left( f_{p-1}^A \right)^2} f'\left( \eta_{p-1} \right) f_p^A \tag{7.12}$$

and the error $e_p^B$ of the T-Secant approximate $x_p^B$ can be expressed as

$$e_p^B = e_p^A - \frac{\left( e_p^A - e_{p-1}^A \right)^2}{\left( f_{p-1}^A \right)^2} f'\left( \eta_{p-1} \right) f_p^A \tag{7.13}$$

With the Taylor-series expansion (7.7) for $f_{p-1}^A$ and $f_p^A$, where $\xi_{p-1} \in \left[ x_{p-1}^A, x_{p-1}^B, x^* \right]$ and $\xi_p \in \left[ x_p^A, x_p^B, x^* \right]$ in the remainder terms, we get

$$e_p^B = e_p^A - \frac{e_p^A \left( e_p^A - e_{p-1}^A \right)^2}{\left( e_{p-1}^A \right)^2} \gamma_p \tag{7.14}$$
where

$$\gamma_p = \frac{\dfrac{f'\left( x^* \right)}{f'\left( \eta_{p-1} \right)} + \dfrac{1}{2} \dfrac{f''\left( \xi_p \right)}{f'\left( \eta_{p-1} \right)} e_p^A}{\left( \dfrac{f'\left( x^* \right)}{f'\left( \eta_{p-1} \right)} + \dfrac{1}{2} \dfrac{f''\left( \xi_{p-1} \right)}{f'\left( \eta_{p-1} \right)} e_{p-1}^A \right)^2} \tag{7.15}$$

and $f'\left( \eta_{p-1} \right) \neq 0$. If the series $x_p^A$ converges to $x^*$, then with increasing iteration counter $p$, $\xi_p, \xi_{p-1}, \eta_{p-1} \to x^*$ and $e_p^A, e_{p-1}^A \to 0$, which implies that

$$\frac{f'\left( x^* \right)}{f'\left( \eta_{p-1} \right)} \to \frac{f'\left( x^* \right)}{f'\left( x^* \right)} = 1 \tag{7.16}$$

and $\gamma_p \to 1$. Substituting $e_p^B$ (Equation (7.14)) into Equation (7.9) gives

$$e_{p+1}^A = C_p e_p^A \left( e_p^A - \frac{e_p^A \left( e_p^A - e_{p-1}^A \right)^2}{\left( e_{p-1}^A \right)^2} \gamma_p \right) \tag{7.17}$$

and by re-arranging

$$e_{p+1}^A = C_p e_p^A \gamma_p \left( \left( e_p^A \right)^2 \frac{2 e_{p-1}^A - e_p^A}{\left( e_{p-1}^A \right)^2} + \frac{1 - \gamma_p}{\gamma_p} e_p^A \right) \tag{7.18}$$

As $x_p^A$ converges to $x^*$, $\gamma_p \to 1$ and the above equation simplifies to

$$e_{p+1}^A = C_p \left( e_p^A \right)^3 \frac{2 e_{p-1}^A - e_p^A}{\left( e_{p-1}^A \right)^2} \tag{7.19}$$
It means that $e_{p+1}^A$ depends on $e_p^A$ and $e_{p-1}^A$, and by assuming asymptotic convergence, a power-law relationship

$$e_{p+1}^A = C \left( e_p^A \right)^{\alpha} \tag{7.20}$$

can be established, where $C$ is the asymptotic error constant and $\alpha$ is the convergence rate, also called the "convergence order" of the iterative method. It also follows from Equation (7.20) that

$$e_p^A = C \left( e_{p-1}^A \right)^{\alpha} \tag{7.21}$$

and

$$e_{p-1}^A = \left( \frac{e_p^A}{C} \right)^{\frac{1}{\alpha}} \tag{7.22}$$

Let $E = e_p^A$ be introduced for simplicity; then it follows from Equations (7.17), (7.20), (7.21) and (7.22) that

$$E^{\alpha} = \frac{C_p}{C} E^3 \, \frac{2 \left( \frac{E}{C} \right)^{\frac{1}{\alpha}} - E}{\left( \frac{E}{C} \right)^{\frac{2}{\alpha}}} \tag{7.23}$$

where $C_p$ and $C$ are constants, and if the series $x_p^A$ converges to $x^*$, then with increasing iteration counter $p$, $E \to 0^+$. Taking the logarithm of both sides of Equation (7.23) and dividing by $\ln E$ gives

$$\alpha = \frac{\ln \frac{C_p}{C}}{\ln E} + 3 - \frac{2}{\alpha} \cdot \frac{\ln \frac{E}{C}}{\ln E} + \frac{\ln \left( 2 \left( \frac{E}{C} \right)^{\frac{1}{\alpha}} - E \right)}{\ln E} \tag{7.24}$$
If the series $x_p^A$ converges to $x^*$, then with increasing iteration counter $p$, $E \to 0^+$, $\ln E \to -\infty$ and

$$\lim_{E \to 0^+} \frac{\ln \frac{C_p}{C}}{\ln E} = 0 \tag{7.25}$$

$$\lim_{E \to 0^+} \frac{\ln \frac{E}{C}}{\ln E} = \lim_{E \to 0^+} \frac{\ln E - \ln C}{\ln E} = 1 \tag{7.26}$$

$$\lim_{E \to 0^+} \frac{\ln \left( 2 \left( \frac{E}{C} \right)^{\frac{1}{\alpha}} - E \right)}{\ln E} = \frac{1}{\alpha} \tag{7.27}$$

and Equation (7.24) simplifies to

$$\alpha - 3 + \frac{1}{\alpha} = 0 \tag{7.28}$$

with root (the convergence rate of the T-Secant method):

$$\alpha_{TS} = \frac{3 + \sqrt{5}}{2} \approx 2.618033988 = \alpha_S + 1 = \varphi^2 \tag{7.29}$$

where $\alpha_S = \varphi \approx 1.618033988$ is the convergence rate of the traditional secant method and $\varphi$ is the well-known golden ratio. It follows from Equation (7.24) that the actual values $\alpha^*$ of $\alpha_{TS}$ depend on the approximate error $E = e^A$. Convergence rates $\alpha^*(E)$ were determined for different $E$ values and are shown in Figure 5. The upper bound $\alpha_{TS} = \alpha_S + 1 = 2.618$ at $E \to 0^+$ is also indicated (horizontal dashed red line).

7.3. Single-variable example

An example is given for demonstration purposes with the single-variable test function (5.26) with root $x^* \approx 2.09455\ldots$. Iterations were made with initial approximates $x_0^A = 3.5$ and $x_0^B = 2.5$, and the convergence rates $\alpha_S$, $\alpha_N$ and $\alpha_{TS}$ were determined for the traditional secant method (Table 3, Figure 6), for the Newton-Raphson method (Table 4, Figure 7) and for the T-Secant method (Table 5, Figure 8) respectively; the cumulative numbers of function value ($N_f$) and derivative function value ($N_{f'}$) calculations are also indicated in the tables. The calculated convergence rates agree well with the theoretical values $\alpha_S = 1.62\ldots$, $\alpha_N = 2.0$ and $\alpha_{TS} = 2.62\ldots$. Figure 9 summarizes the results of iterations with the three different methods (secant, Newton-Raphson and T-Secant). Two groups of graphs show the decrease of the absolute approximate error $\left| e_p^A \right|$ and the calculated convergence rates $\alpha$ for the three compared methods. The results demonstrate that the convergence rate of the T-Secant method is higher than that of the Newton-Raphson method.

7.4. Multi-variable convergence

Matrix $\mathbf{S}$ corresponds to a divided-difference approximation of the Jacobian. It is known (e.g., from Dennis and Schnabel [7]) that these values give a second-order approximation of the derivative at the midpoint. When considering Newton's iteration, it is assumed that the Jacobian has an inverse in a neighbourhood of $\mathbf{x}^*$. If that condition holds, then there are good chances that the approximate Jacobian also has an inverse in the same neighbourhood.
It follows from Equations (6.7) and (6.16) that the $i$-th elements of the iteration step lengths of the new approximates $\mathbf{x}_{p+1}^A$ and $\mathbf{x}_{p+1}^B$ in the $p$-th iteration are

$$\Delta x_{p,i}^A = -\sum_{j=1}^m S_{p,i,j}^{+} f_{p,j}^A \tag{7.30}$$

and

$$\Delta x_{p+1,i} = -\frac{\left( \Delta x_{p,i}^A \right)^2}{\sum_{j=1}^m S_{p,i,j}^{+} \frac{f_{p,j}^A}{t_{p,j}^F}} \tag{7.31}$$

$(i = 1, \ldots, n)$. It is known that the secant method is locally q-super-linearly convergent, so the new approximate $\mathbf{x}_{p+1}^A$ is expected to be a much better approximate of the solution $\mathbf{x}^*$ than the previous approximate $\mathbf{x}_p^A$. Thus

$$\left\| \mathbf{f}_{p+1}^A \right\| \ll \left\| \mathbf{f}_p^A \right\| \tag{7.32}$$

and the diagonal elements

$$\left| t_{p,j}^F \right| = \left| \frac{f_{p+1,j}^A}{f_{p,j}^A} \right| \ll 1 \tag{7.33}$$

of the transformation matrix $\mathbf{T}_p^F$ $(j = 1, \ldots, m)$ are expected to be "small numbers". Let the scalar multipliers $\mu_i$ be introduced so that

$$\mu_i \sum_{j=1}^m S_{p,i,j}^{+} \frac{f_{p,j}^A}{t_{p,j}^F} = \sum_{j=1}^m S_{p,i,j}^{+} f_{p,j}^A \tag{7.34}$$

Then Expression (7.31) for the $i$-th element of the iteration step length $\Delta \mathbf{x}_{p+1}$ of the new approximate $\mathbf{x}_{p+1}^B$ in the $p$-th iteration simplifies to

$$\Delta x_{p+1,i} = -\left( \Delta x_{p,i}^A \right)^2 \frac{\mu_i}{\sum_{j=1}^m S_{p,i,j}^{+} f_{p,j}^A} = \mu_i \frac{\left( \Delta x_{p,i}^A \right)^2}{\Delta x_{p,i}^A} = \mu_i \Delta x_{p,i}^A \tag{7.35}$$

where

$$\left| \mu_i \right| \ll 1 \tag{7.36}$$

$(i = 1, \ldots, n)$, and it follows from the above derivations that

$$\Delta \mathbf{x}_{p+1} \approx \mu \, \Delta \mathbf{x}_p^A \tag{7.37}$$
It means that the T-Secant approximate $\mathbf{x}_{p+1}^B$ will always be in the vicinity of the classic secant approximate $\mathbf{x}_{p+1}^A$ and the approximate errors of the new approximates will be of similar order, providing that the solution $\mathbf{x}^*$ will be evenly surrounded by the $n+1$ new trial approximates $\mathbf{x}_{p+1}^A$ and $\mathbf{x}_{k,p+1}^B$ $(k = 1, \ldots, n)$ and that the matrix $\mathbf{S}_{p+1}$ will be well-conditioned.
Figure 10. Geometrical representation of the T-secant method convergence in the multi-variable case (analogous to the convergence proof figure of Dennis and Schnabel [7], p. 180).

8. Algorithm

Let $p$ be the iteration counter, $\varepsilon^*$ the error bound for the termination criterion, and

$$\mathbf{e}_p^A = \mathbf{x}_p^A - \mathbf{x}^* \tag{8.1}$$

the approximate error vector of approximate $\mathbf{x}^A$ in the $p$-th iteration with elements $e_{p,i}^A$ $(i = 1, \ldots, n)$. Let the scalar approximate error

$$\varepsilon_p = \frac{\left\| \mathbf{e}_p^A \right\|_2}{n} = \frac{\sqrt{\sum_{i=1}^n \left( e_{p,i}^A \right)^2}}{n} \tag{8.2}$$

be defined, where $\left\| . \right\|_2$ is the Euclidean norm, and let the iteration be terminated when

$$\varepsilon_p < \varepsilon^* \tag{8.3}$$

holds. Let $\mathbf{x}_p^A$ be the initial trial and $\Delta \mathbf{x}_p$ the trial increment in the $p$-th iteration. Choose $T_{\min}$ and $T_{\max}$ as lower and upper bounds for $\left| t_{p,j}^F \right|$ $(j = 1, \ldots, m)$, and let $f_{\min}$ and $q_{\min}$ be lower bounds for $\left| f_{p,j}^A \right|$ $(j = 1, \ldots, m)$ and $\left| q_{p,i}^B \right|$ $(i = 1, \ldots, n)$ respectively.
  • Initial step
    Let $p = 0$ and let the initial trial $\mathbf{x}_p^A = \left[ x_{p,1}^A, \ldots, x_{p,n}^A \right]$ and the initial trial increment $\Delta \mathbf{x}_p = \left[ \Delta x_{p,1}, \ldots, \Delta x_{p,n} \right]$ be given. Calculate the corresponding function values $\mathbf{f}_p^A$ and assume that $f_{\min} < \left| f_{p,j}^A \right|$ $(j = 1, \ldots, m)$.
  • Step 1: Generate a set of $n$ additional initial trials (interpolation base points)

    $$\mathbf{x}_{p,k}^B = \mathbf{x}_p^A + \Delta x_{p,k} \cdot \mathbf{d}_k \tag{8.4}$$

    and evaluate the function values $\mathbf{f}_{p,k}^B$ $(k = 1, \ldots, n)$.
  • Step 2 (secant): Construct the matrix

    $$\Delta \mathbf{F}_p = \left[ \Delta \mathbf{f}_{p,k} \right] = \left[ \mathbf{f}_{p,k}^B - \mathbf{f}_p^A \right] \tag{8.5}$$

    then calculate $\mathbf{q}_p^A$ from Equation (3.15). Let $q_{\min} < \left| q_{p,i}^A \right|$, and determine $\mathbf{x}_{p+1}^A$ from Equation (3.19) and $\varepsilon_p$ from Equation (8.2).
  • Step 3: If $\varepsilon_p < \varepsilon^*$, then terminate the iteration; else continue with Step 4.
  • Step 4 (T-secant): Calculate $\mathbf{f}_{p+1}^A$ and $\mathbf{T}_p^F$ from Equation (4.19). Let $T_{\min} < \left| t_{p,j}^F \right| < T_{\max}$ and determine $\mathbf{q}_p^B$ from Equation (4.20) ($\Delta \mathbf{F}_p^{+}$ has already been calculated when $\mathbf{q}_p^A$ was determined from Equation (3.15)). Let $q_{\min} < \left| q_{p,i}^B \right|$, and calculate $\mathbf{x}_{p+1}^B$ from Equation (4.22).
  • Step 5: Let the new initial trial be

    $$\mathbf{v}_{p+1}^A = \left[ \mathbf{x}_{p+1}^A, \mathbf{f}_{p+1}^A \right] \tag{8.6}$$

    and the new initial trial increment be

    $$\Delta \mathbf{x}_{p+1} = \mathbf{x}_{p+1}^B - \mathbf{x}_{p+1}^A \tag{8.7}$$

    and continue the iteration with Step 1. A compact sketch of this loop is given after the remarks below.
The iteration constants $\delta_{\min}$, $f_{\min}$, $q_{\min}$, $T_{\min}$, $T_{\max}$ are necessary to avoid division by zero and to avoid computed values near the numerical precision. If $p_{\max}$ is the number of iterations necessary to satisfy the termination criterion $\varepsilon_p < \varepsilon^*$ and $n$ is the number of unknowns to be determined, then the T-Secant method needs $n+1$ function evaluations in each iteration and altogether

$$N_f = p_{\max} \left( n + 1 \right) \tag{8.8}$$

function evaluations to reach the desired termination criterion. $p_{\max}$ depends on many circumstances, such as the nature of the function $\mathbf{f}\left( \mathbf{x} \right)$, the termination criterion ($\varepsilon^*$ or others), the distance of the initial trial $\mathbf{x}^A$ from the solution $\mathbf{x}^*$, and the iteration constants $T_{\min}$, $q_{\min}^A$, etc.
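A compact driver loop for Steps 1-5 follows. It is a sketch only: the clipping used to enforce the bounds $T_{\min}$, $T_{\max}$, $q_{\min}$ is one possible safeguard (the paper only requires such bounds to exist), and the residual-based stopping test replaces the error test (8.2), which presumes a known solution $\mathbf{x}^*$:

```python
import numpy as np

def t_secant_solve(f, x0, dx0, eps=1e-14, p_max=100,
                   T_min=1e-2, T_max=1e2, q_min=1e-12):
    """Driver loop for Steps 1-5 of Section 8 (illustrative sketch)."""
    x_A, dx = np.asarray(x0, float), np.asarray(dx0, float)
    n = x_A.size
    f_A = f(x_A)
    for p in range(p_max):
        # Step 1: n additional base points, Eqs. (8.4)-(8.5)
        dF = np.column_stack([f(x_A + dx[k] * np.eye(n)[:, k]) - f_A
                              for k in range(n)])
        dF_pinv = np.linalg.pinv(dF)
        # Step 2: secant approximate, Eqs. (3.15), (3.19)
        q_A = -dF_pinv @ f_A
        q_A = np.sign(q_A) * np.maximum(np.abs(q_A), q_min)
        dx_A = dx * q_A
        x_A_new = x_A + dx_A
        f_A_new = f(x_A_new)
        # Step 3: termination on the residual (x* unknown in practice)
        if np.linalg.norm(f_A_new) < eps:
            return x_A_new
        # Step 4: T-secant approximate, Eqs. (4.19), (4.20), (4.22)
        t_F = f_A_new / f_A
        t_F = np.where(t_F == 0.0, T_min, t_F)     # guard: f may vanish
        t_F = np.sign(t_F) * np.clip(np.abs(t_F), T_min, T_max)
        q_B = -dF_pinv @ (f_A / t_F)
        q_B = np.sign(q_B) * np.maximum(np.abs(q_B), q_min)
        x_B_new = x_A_new + dx_A**2 / (dx * q_B)
        # Step 5: new trial and increment, Eq. (8.7)
        x_A, f_A, dx = x_A_new, f_A_new, x_B_new - x_A_new
    return x_A
```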

9. Numerical tests results

9.1. Rosenbrock test function

A variant of the Rosenbrock function [8] has been used for testing the numerical performance of the new method. Determine the global minimum of the function

$$R\left( \mathbf{x} \right) = \sum_{i=1}^{N-1} \left[ 100 \cdot \left( x_{i+1} - x_i^2 \right)^2 + \left( 1 - x_i \right)^2 \right] \tag{9.1}$$

where $\mathbf{x} = \left[ x_1, \ldots, x_N \right] \in \mathbb{R}^N$ and $N \geq 2$. $R\left( \mathbf{x} \right)$ has exactly one minimum for $N = 3$ (at $\mathbf{x}^* = \left[ 1, 1, 1 \right]$) and exactly two minima for $4 \leq N \leq 7$: a global minimum of all ones and a local minimum near $\hat{\mathbf{x}} = \left[ -1, 1, \ldots, 1 \right]$. The sum of squares $R\left( \mathbf{x} \right)$ is minimal when all terms are zero, so the minimization of the function $R\left( \mathbf{x} \right)$ is equivalent to finding the zero of a function $\mathbf{x} \mapsto \mathbf{f}(\mathbf{x})$, where $\mathbf{x} \in \mathbb{R}^N$, $\mathbf{f}: \mathbb{R}^N \to \mathbb{R}^{2(N-1)}$, and

$$\mathbf{f}(\mathbf{x}) = \begin{bmatrix} f_{2i-1}(\mathbf{x}) \\ f_{2i}(\mathbf{x}) \end{bmatrix} = \begin{bmatrix} 10 \cdot \left( x_{i+1} - x_i^2 \right) \\ 1 - x_i \end{bmatrix} \tag{9.2}$$

$(i = 1, \ldots, N-1)$. For $N > 7$, the function $R(\mathbf{x})$ has exactly one global minimum $\mathbf{x}^* = \left[ 1, \ldots, 1 \right]$ and some local minima with some $x_j^* = -1$ and with $x_i^* = 1$ for all other unknowns. The results were obtained by least-squares solving of the simultaneous system of nonlinear equations $\mathbf{f}(\mathbf{x}) = \mathbf{0}$ by the T-Secant method.
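The residual (9.2) is straightforward to code; the sketch below (illustrative, with an arbitrary initial trial) feeds it to the driver loop sketched in Section 8:

```python
import numpy as np

def rosenbrock_residual(x):
    """Residual vector of Eq. (9.2), f: R^N -> R^(2(N-1));
    components are grouped rather than interleaved, which does not
    affect the least-squares solution."""
    return np.concatenate([10.0 * (x[1:] - x[:-1]**2),   # f_{2i-1}
                           1.0 - x[:-1]])                # f_{2i}

x0 = np.array([2.0, 1.5, 2.5])                 # illustrative trial, N = 3
x = t_secant_solve(rosenbrock_residual, x0, dx0=0.05 * x0)   # cf. Eq. (9.3)
```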

9.2. $N = 2$, $N = 3$ and $N = 10$ examples

In the case $N = n = m = 2$, the iterations terminated after $N_f = 6$ function evaluations ($p_{\max} = 2$ iterations) in most cases. $f_2\left( \mathbf{x} \right) = 1 - x_1$ is a linear function, and the first T-Secant iteration ($p = 0$) finds the exact value of $x_1$ in one step; then $f_1\left( \mathbf{x} \right) = 10 \left( x_2 - x_1^2 \right)$ also becomes linear. The exact value of $x_2$ was then determined in one additional step.
Let $N = n = 3$ and $m = 4$, $T_{\min} = 0.01$ and $\varepsilon^* = 10^{-14}$. Let $p = 0$ and

$$\Delta x_{0,i} = 0.05 \cdot x_{0,i}^A \tag{9.3}$$

$(i = 1, \ldots, 3)$. The number of necessary function evaluations $N_f$ varied between $20$ and $36$ within $p_{\max} = 5-9$ iterations for different initial trials $\mathbf{x}_p^A$. The iteration results are summarized in Table 6 and in Figure 11 with initial trial $\mathbf{x}_0^A = \left[ x_{0,i}^A \right] = \left[ 2.0, 1.5, 2.5 \right]$. The termination criterion $\varepsilon_p < \varepsilon^*$ was satisfied after $p_{\max} = 5$ iterations with $N_f = 20$ function evaluations.
Let $N = n = 10$ and $m = 18$. Calculations were made with different, manually constructed initial trials $\mathbf{x}_0^A = \left[ x_{0,i}^A \right]$. Figure 12 (left) shows the variation of $x_{p,i}^A$ for the initial trial $\mathbf{x}_0^A = \left[ 2.0, 1.5, 2.5, 1.5, 1.2, 3.0, 3.5, 2.5, 2.0, 3.5 \right]$. The iteration terminated after $N_f = 154$ function evaluations ($p_{\max} = 14$ iterations) for the $\varepsilon_p < \varepsilon^* = 10^{-14}$ condition. Table 7 shows a set of further initial trials for numerical tests. Test "3" failed, probably due to the large distance from the global optimal solution. Test "4" found a local zero $\mathbf{x}^* = \left[ -1, 1, 1, 1, 1, 1, 1, 1, 1, 1 \right]$. Figure 12 (right) summarizes the results of numerical tests "1-6". The graphs show the iteration paths in the $\left( \lg \left\| \mathbf{e}_p^A \right\|, \lg R_p\left( \mathbf{x}_p^A \right) \right)$ plane. They have an initial part where the variation of $R_p\left( \mathbf{x}_p^A \right)$ seems "chaotic", while below $\left\| \mathbf{e}_p^A \right\| \approx 0.01$ and $R_p\left( \mathbf{x}_p^A \right) \approx 0.001$ the iterations run on similar paths.

9.3. Large $N$ ($200$, $500$, $1000$) examples

A series of numerical tests has been performed with large numbers of unknown variables. The values of the initial trials $\mathbf{x}_0^A = \left[ x_{0,i}^A \right]$ $(i = 1, \ldots, N)$ were generated as

$$x_{0,i}^A = x_i^* + L_1 \cdot \frac{\mathrm{Random} - \frac{1}{2}}{5} + L_2 \tag{9.4}$$

where "$\mathrm{Random}$" is a random real number ($0 \leq \mathrm{Random} < 1$) and $L_i$ $(i = 1, 2)$ are parameters regulating the size and location of the interval in which the initial trial values are expected to vary. $\mathbf{x}^* = \left[ x_i^* \right] = \left[ 1, \ldots, 1 \right]$ $(i = 1, \ldots, N)$ is the known global optimal solution. Table 8 shows the results of T-Secant iterations with $N = 200$ and with initial trials $\mathbf{x}_0^A$: $0.1 \leq x_{0,i}^A \leq 19.9$ ($L_1 = 99$, $L_2 = 9$). Figure 13 (left) shows the variation of the variables $\mathbf{x}_p^A$ through the T-Secant iterations; the iteration counter $p$ is indicated below the graphs. Figure 13 (right) shows the decrease of the approximate error $\mathbf{e}_p^A = \left[ e_{p,i}^A \right]$ $(i = 1, \ldots, 200)$, with the iteration counter $p$ indicated below the graphs. Table 9 shows the results of iterations with $N = 1000$ and initial trials $\mathbf{x}_0^A$: $0.5 \leq x_{0,i}^A \leq 1.5$ ($L_1 = 5$, $L_2 = 0$). Figure 14 summarizes the results of the numerical tests with large numbers of unknowns ($N = 200, 500, 1000$). The decrease of the norm $\varepsilon_p$ of the approximate error $\mathbf{e}_p^A$ is shown, and the number of function value evaluations $N_f$ is indicated for $N = 200$ (blue), $500$ (red) and $1000$ (green), and for initial trials $\mathbf{x}_0^A$: $0.5 \leq x_{0,i}^A \leq 1.5$ (solid line) and $\mathbf{x}_0^A$: $0.1 \leq x_{0,i}^A \leq 19.9$ (dashed line).

10. Efficiency

Very limited data are available to compare the performance of the T-Secant method with other methods, especially for large numbers of unknowns. Broyden [9] suggested the mean convergence rate

$$L = \frac{1}{N_f} \ln \frac{R\left( \mathbf{x}_0^A \right)}{R\left( \mathbf{x}_{p_{\max}}^A \right)} \tag{10.1}$$

as a measure of the efficiency of a method for solving a particular problem, where $N_f$ is the total number of function evaluations, $\mathbf{x}_0^A$ is the initial trial and $\mathbf{x}_{p_{\max}}^A$ is the last trial for the solution $\mathbf{x}^*$ when the termination criterion is satisfied after $p_{\max}$ iterations. $R\left( \mathbf{x} \right)$ is the Euclidean norm of $\mathbf{f}\left( \mathbf{x} \right)$. Efficiency results were given by Broyden [9] for the Rosenbrock function for $N = 2$ and for $\mathbf{x}_0^A = \left[ -1.2, 1.0 \right]$. The calculated convergence rates for the two Broyden method variants [9], for Powell's method [10], for the adaptive coordinate descent method [11] and for the Nelder-Mead simplex method [12] are compared with the calculated values for the T-Secant method in Table 10. Rows 1-5 are data from the referenced papers, rows 6-8 are T-Secant results with the referenced initial trials, and rows 9-15 are calculated data for $N > 2$. The results show that the mean convergence rate $L$ (Equation (10.1)) for $N = 2$ is much higher for the T-Secant method ($5.5-6.9$) than for the other listed methods ($0.1-0.6$); however, it is obvious that the convergence rate values decrease rapidly with increasing $N$ (more unknowns need more function evaluations). A modified convergence rate

$$L_N = N L = \frac{N}{N_f} \ln \frac{R\left( \mathbf{x}_0^A \right)}{R\left( \mathbf{x}_{p_{\max}}^A \right)} \tag{10.2}$$

can be used as a measure of efficiency that is more independent of $N$ (see Table 10) than the quantity $L$. The values of $L$ and $L_N$ are at least 10 times larger for the T-Secant method than for the other listed methods for $N = 2$. Note that the efficiency measures ($L$ and $L_N$) also depend on the initial conditions (distance of the initial trial set from the optimal solution, termination criterion). Results from a large number of numerical tests indicate an average $L_N$ value around 7.4 with standard deviation 3.7 for the T-Secant method, even for large $N$ values. It has to be noted that if the value of $R\left( \mathbf{x}_{p_{\max}}^A \right)$ is zero, then the mean convergence rates ($L$ and $L_N$) cannot be computed (zero in the denominator of the logarithm's argument). A substitute value of $10^{-25}$ was used when the iteration ended with $R\left( \mathbf{x}_{p_{\max}}^A \right) = 0$ in the sample examples.
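Both measures are trivial to compute; a sketch with illustrative names, using the same $10^{-25}$ floor as in the tests above:

```python
import numpy as np

def broyden_efficiency(f, x0, x_final, N_f, N):
    """Mean convergence rates L and L_N of Eqs. (10.1)-(10.2).

    R(x) is the Euclidean norm of the residual f(x); the 1e-25
    floor guards against a zero final residual.
    """
    R0 = np.linalg.norm(f(x0))
    Rp = max(np.linalg.norm(f(x_final)), 1e-25)
    L = np.log(R0 / Rp) / N_f
    return L, N * L
```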

11. Discussions

11.1. General

The suggested new procedure needs the usual approximate $\mathbf{x}_{p+1}^A$ to be determined by any classic quasi-Newton iterative method (Wolfe-Popper secant, Broyden, etc.). By using the "information" $\mathbf{f}\left( \mathbf{x}_{p+1}^A \right) = \mathbf{f}_{p+1}^A$, an additional and independent approximate $\mathbf{x}_{p+1}^B$ is determined, which provides the possibility of a full-rank update of the exact or approximate derivatives ($\mathbf{S}_p$ for secant or $\mathbf{B}_p$ for Broyden). Results and experience show that the new procedure considerably accelerates the convergence and the efficiency of the classic methods, and the full-rank update technique increases the stability of the iterative procedure. In the multi-variable case, it follows from Equation (6.8) that

$$\left( \mathbf{T}_p^X \right)^{-1} \Delta \mathbf{x}_p^A = -\mathbf{S}_p^{+} \left( \mathbf{T}_p^F \right)^{-1} \mathbf{f}_p^A \tag{11.1}$$

and in explicit form after re-arrangement

$$\frac{\left( \Delta x_{p,i}^A \right)^2}{x_{p+1,i}^B - x_{p+1,i}^A} = -\sum_{j=1}^m S_{p,i,j}^{+} \frac{f_{p,j}^A}{t_{p,j}^F} \tag{11.2}$$

Then the $i$-th element of the new approximate $\mathbf{x}_{p+1}^B$ can be expressed from the $i$-th row of the above equation as

$$x_{p+1,i}^B = x_{p+1,i}^A - \frac{\left( \Delta x_{p,i}^A \right)^2}{\sum_{j=1}^m S_{p,i,j}^{+} \frac{f_{p,j}^A}{t_{p,j}^F}} \tag{11.3}$$

The mechanism of the procedure can be likened to the mechanism of an engine's turbocharger, which is powered by the flow of exhaust gases (analogous to $\mathbf{f}_{p+1}^A$ or $t_{p,j}^F$).

11.2. Newton method

Matrix $\mathbf{S}$ in the general formula (6.5) gives a direct connection between the secant and Newton methods: as differences go to differentials,

$$\mathbf{S} = \left[ \frac{\Delta f_{k,j}}{\Delta x_i} \right] = \left[ S_{i,j} \right] \quad \longrightarrow \quad \mathbf{J} = \left[ \frac{\partial f_{k,j}}{\partial x_i} \right] = \left[ J_{i,j} \right] \tag{11.4}$$

where $\mathbf{J}$ is the Jacobian matrix of the function $\mathbf{f}: \mathbb{R}^n \to \mathbb{R}^m$ ($m \geq n$), with $k$ and $i$ as column indices and $j$ as row index. It follows from formula (6.12) of the matrix $\mathbf{S}_T$ that the proposed full-rank update procedure can also be applied to the Newton method as

$$\mathbf{S}_T = \left[ \frac{t_j^F}{t_i^X} \cdot \frac{\Delta f_{k,j}}{\Delta x_i} \right] \quad \longrightarrow \quad \mathbf{J}_T = \left[ \frac{t_j^F}{t_i^X} \cdot \frac{\partial f_{k,j}}{\partial x_i} \right] \tag{11.5}$$

where $\mathbf{J}_T$ is the modified Jacobian matrix of the "T-Newton" method. In the single-variable case, with approximate $x_p^A$ in the $p$-th iteration, with function value $f_p^A = f\left( x_p^A \right)$ and with derivative function value $f_p'^A = f'\left( x_p^A \right)$, the new Newton-Raphson approximate can be expressed as

$$x_{p+1}^A = x_p^A - \frac{\Delta x_p}{\Delta f_p} f_p^A = x_p^A - \frac{f_p^A}{f_p'^A} \tag{11.6}$$

and the iteration step size is

$$\Delta x_p^A = x_{p+1}^A - x_p^A \tag{11.7}$$

With the hyperbolic function (Equation (5.15))

$$z_p(x) = \frac{a_p}{x - x_{p+1}^A} + f_p^A \tag{11.8}$$

where

$$a_p = \frac{\left( \Delta x_p^A \right)^2 f_p'^A f_{p+1}^A}{f_p^A} \tag{11.9}$$

($\Delta f_p / \Delta x_p$ is replaced by $f_p'^A$), the new "T-Newton" approximate is

$$x_{p+1}^B = x_{p+1}^A - \frac{\left( \Delta x_p^A \right)^2 f_p'^A f_{p+1}^A}{\left( f_p^A \right)^2} \tag{11.10}$$
(with $\Delta f_p / \Delta x_p$ again replaced by $f_p'^A$), similarly to Equation (4.11) in the case of the T-Secant method. It can be seen from Tables 11 and 12 that the convergence rate is improved from $\alpha_N = 2$ to $\alpha_{TN} = 3$. In the multi-variable case, it follows from Equation (6.8) (with $\mathbf{S}_p^{+}$ replaced by $\mathbf{J}_p^{+}$) that

$$\left( \mathbf{T}_p^X \right)^{-1} \Delta \mathbf{x}_p^A = -\mathbf{J}_p^{+} \left( \mathbf{T}_p^F \right)^{-1} \mathbf{f}_p^A \tag{11.11}$$

and in explicit form after re-arrangement

$$\frac{\left( \Delta x_{p,i}^A \right)^2}{x_{p+1,i}^B - x_{p+1,i}^A} = -\sum_{j=1}^m J_{p,i,j}^{+} \frac{f_{p,j}^A}{t_{p,j}^F} \tag{11.12}$$

Then the $i$-th element of the new "T-Newton" approximate $\mathbf{x}_{p+1}^B$ can be expressed from the $i$-th row of the above equation as

$$x_{p+1,i}^B = x_{p+1,i}^A - \frac{\left( \Delta x_{p,i}^A \right)^2}{\sum_{j=1}^m J_{p,i,j}^{+} \frac{f_{p,j}^A}{t_{p,j}^F}} \tag{11.13}$$

similarly to Equation (4.22) in the case of the T-Secant method. Thus the "hyperbolic" approximation accelerates the convergence of the Newton-Raphson method at the cost of only one additional function evaluation per iteration.
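In the single-variable case the T-Newton step is a two-line modification of the Newton iteration. A sketch under the assumption that the next Newton step starts from the corrected point $x_{p+1}^B$ (names illustrative):

```python
def t_newton_1d(f, df, x0, tol=1e-14, max_iter=50):
    """Single-variable 'T-Newton' sketch: Eqs. (11.6) and (11.10).

    Cubic instead of quadratic convergence, at the cost of one
    extra f-evaluation per iteration.
    """
    xA, fA = x0, f(x0)
    for _ in range(max_iter):
        xA_new = xA - fA / df(xA)            # Newton step, Eq. (11.6)
        fA_new = f(xA_new)
        if abs(fA_new) < tol:
            return xA_new
        # hyperbolic correction, Eq. (11.10)
        xB = xA_new - (xA_new - xA)**2 * df(xA) * fA_new / fA**2
        xA, fA = xB, f(xB)
    return xA
```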

11.3. Broyden’s method

Broyden's method is a special case of the secant method. In the single-variable case, the derivative of the function is approximated as

$$f_p' \approx B_p = B_{p-1} + \frac{\Delta f_p - B_{p-1} \Delta x_p}{\left\| \Delta x_p \right\|^2} \Delta x_p \tag{11.14}$$

in the $p$-th iteration step, and with

$$\frac{\Delta x_p}{\left\| \Delta x_p \right\|^2} = \frac{1}{\Delta x_p} \tag{11.15}$$

it simplifies to

$$B_p = B_{p-1} + \frac{\Delta f_p - B_{p-1} \Delta x_p}{\Delta x_p} \tag{11.16}$$

The next Broyden approximate is then determined as

$$x_{p+1}^A = x_p^A - \frac{f_p^A}{B_p} \tag{11.17}$$

The convergence can be improved by the new hyperbolic approximation procedure in a similar way as in the cases of the secant and Newton methods. An additional new approximate

$$x_{p+1}^B = x_{p+1}^A - \frac{\left( \Delta x_p^A \right)^2 B_p f_{p+1}^A}{\left( f_p^A \right)^2} \tag{11.18}$$

can be determined, and the iteration continues with this value. Figure 16 demonstrates the effect of the hyperbolic approximation applied to the classic Broyden method. Not surprisingly, the convergence rate is improved from $\alpha_B = \varphi \approx 1.618$ to $\alpha_{TB} = \varphi^2 \approx 2.618$, as in the case of the secant method. In the multi-variable case, the $i$-th element of the new "T-Broyden" approximate $\mathbf{x}_{p+1}^B$ can be expressed as

$$x_{p+1,i}^B = x_{p+1,i}^A - \frac{\left( \Delta x_{p,i}^A \right)^2}{\sum_{j=1}^m B_{p,i,j}^{+} \frac{f_{p,j}^A}{t_{p,j}^F}} \tag{11.19}$$

similarly to Equation (11.13) in the case of the T-Newton method, with $J_{p,i,j}^{+}$ replaced by $B_{p,i,j}^{+}$. The new approximate $\mathbf{B}_{p+1}$ to the Jacobian matrix can then be fully updated in a similar way as was shown in the case of the T-Secant method.
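A single-variable sketch of the corrected Broyden iteration, under the same assumption that the iteration continues from $x_{p+1}^B$; names and the ordering of the slope refresh are illustrative choices, not prescribed by the paper:

```python
def t_broyden_1d(f, x0, B0, tol=1e-14, max_iter=50):
    """Single-variable 'T-Broyden' sketch: Eqs. (11.16)-(11.18).

    B approximates f'; the hyperbolic correction reuses it in
    place of the exact derivative of the T-Newton variant.
    """
    xA, fA, B = x0, f(x0), B0
    for _ in range(max_iter):
        xA_new = xA - fA / B                 # Broyden step, Eq. (11.17)
        fA_new = f(xA_new)
        if abs(fA_new) < tol:
            return xA_new
        B = (fA_new - fA) / (xA_new - xA)    # Eq. (11.16), single variable
        # hyperbolic correction, Eq. (11.18)
        xB = xA_new - (xA_new - xA)**2 * B * fA_new / fA**2
        xA, fA = xB, f(xB)
    return xA
```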

12. Conclusions

A completely new iteration strategy has been worked out for solving systems of simultaneous nonlinear equations

$$\mathbf{f}\left( \mathbf{x} \right) = \mathbf{0} \tag{12.1}$$

($\mathbf{x} \in \mathbb{R}^n$ and $\mathbf{f}: \mathbb{R}^n \to \mathbb{R}^m$, $m \geq n$). It replaces the Jacobian matrix with finite-difference approximations. The step size $\Delta \mathbf{x}_{p+1}$ is determined as the difference between two new approximates

$$\mathbf{x}_{p+1}^A = \mathbf{x}_p^A + \Delta \mathbf{x}_p^A \tag{12.2}$$

and $\mathbf{x}_{p+1}^B$ with elements

$$x_{p+1,i}^B = x_{p+1,i}^A + \Delta x_{p+1,i} = x_{p+1,i}^A - \frac{\left( \Delta x_{p,i}^A \right)^2}{\sum_{j=1}^m S_{p,i,j}^{+} \frac{f_{p,j}^A}{t_{p,j}^F}} \tag{12.3}$$

$(i = 1, \ldots, n)$ as

$$\Delta \mathbf{x}_{p+1} = \mathbf{x}_{p+1}^B - \mathbf{x}_{p+1}^A \tag{12.4}$$

The first one is a classic quasi-Newton approximate with step size $\Delta \mathbf{x}_p^A$, while the second one is determined from a hyperbolic approximation governed by $\mathbf{x}_{p+1}^A$ and $\mathbf{f}_{p+1}^A$, such that the classic secant equation

$$\mathbf{S} \Delta \mathbf{x}^A = -\mathbf{f}^A \tag{12.5}$$

is modified by a non-uniform scaling transformation

$$\mathbf{T} = \begin{bmatrix} \mathbf{T}^X & \mathbf{0} \\ \mathbf{0} & \mathbf{T}^F \end{bmatrix} \tag{12.6}$$

with diagonal elements $t_j^F$ $(j = 1, \ldots, m)$ and $t_i^X$ $(i = 1, \ldots, n)$ as

$$\mathbf{S}_T \Delta \mathbf{x}^A = -\mathbf{f}^A \tag{12.7}$$

where

$$\mathbf{S} = \left[ \frac{\Delta f_{k,j}}{\Delta x_i} \right] \quad \text{and} \quad \mathbf{S}_T = \left[ \frac{t_j^F}{t_i^X} \cdot \frac{\Delta f_{k,j}}{\Delta x_i} \right] \tag{12.8}$$

$(k = 1, \ldots, n)$. It was shown that the new step size $\Delta \mathbf{x}_{p+1}$ is much smaller than the step size $\Delta \mathbf{x}_p^A$ of the classic quasi-Newton approximate, providing that $\mathbf{x}_{p+1}^B$ will always be in the vicinity of $\mathbf{x}_{p+1}^A$. Having two new approximates, a set of $n+1$ new independent trial approximates $\mathbf{x}_{p+1}^A$ and $\mathbf{x}_{k,p+1}^B$ $(k = 1, \ldots, n)$ was constructed (see Equation (3.5)), providing that the new trial approximates are always in general position, which ensures stable behavior of the iteration. According to the geometrical representation in the single-variable case, the suggested procedure corresponds to finding the root of a hyperbolic function with vertical and horizontal asymptotes $x_{p+1}^A$ and $f_p^A$. It was shown in Section 7 that the proposed method has super-quadratic convergence with rate $\alpha_{TS} = \varphi^2 = 2.618\ldots$ (where $\varphi = 1.618\ldots$ is the well-known golden ratio) in the single-variable case. The proposed method needs two function evaluations in each iteration in the single-variable case and $n+1$ evaluations in the multi-variable case. The efficiency of the proposed method was studied in Section 10 in the multi-variable case and compared with other classic rank-one update and line-search methods on the basis of available test data. The results show that the efficiency of the proposed full-rank update procedure is considerably better than the efficiency of other classic low-rank update methods. A Rosenbrock test function (Equations (9.1) and (9.2)) with up to $n = 1000$ variables was used to demonstrate the efficiency in Section 9.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

A considerable part of the research work was done between 1988 and 1992 at the Technical University of Budapest (Hungary), at the TNO-BOUW Structural Division (The Netherlands) and at the Technical High-school of Lulea (Sweden). The work was sponsored by the Technical University of Budapest (Hungary), the Hungarian Academy of Sciences (Hungary), TNO-BOUW (The Netherlands), Sandvik Rock Tools (Sweden), CP Test a/s (Denmark) and Óbuda University (Hungary). Valuable discussions with and personal support from Géza Petrasovits, György Popper, Peter Middendorp, Rikard Skov, Bengt Lundberg and Csaba Hegedűs are greatly appreciated.

Conflicts of Interest

The author declares no conflict of interest.

References

1. Martínez, J.M. Practical quasi-Newton methods for solving nonlinear systems. Journal of Computational and Applied Mathematics 2000, 124, 97–121.
2. Wolfe, P. The Secant Method for Simultaneous Nonlinear Equations. Communications of the ACM 1959, 2, 12–13.
3. Popper, G. Numerical method for least square solving of nonlinear equations. Periodica Polytechnica 1985, 29, 67–69.
4. Ortega, J.M.; Rheinboldt, W.C. Iterative Solution of Nonlinear Equations in Several Variables; Academic Press: New York, 1970.
5. Berzi, P. Model investigation for pile bearing capacity prediction. Euromech (280) Symposium on Identification of Nonlinear Mechanical Systems from Dynamic Tests, Ecully, 1991.
6. Berzi, P.; Beccu, R.; Lundberg, B. Identification of a percussive drill rod joint from its response to stress wave loading. International Journal of Impact Engineering 1994, 18, 281–290.
7. Dennis, J.E., Jr.; Schnabel, R.B. Numerical Methods for Unconstrained Optimization and Nonlinear Equations; Prentice-Hall: Englewood Cliffs, NJ, 1983.
8. Rosenbrock, H.H. An automatic Method for finding the Greatest or Least Value of a Function. The Computer Journal 1960, 3, 175–184.
9. Broyden, C.G. A Class of Methods for Solving Nonlinear Simultaneous Equations. Mathematics of Computation 1965, 19, 577–593.
10. Powell, M.J.D. An efficient method for finding the minimum of a function of several variables without calculating derivatives. Computer Journal 1964, 7, 155–162.
11. Loshchilov, I.; Schoenauer, M.; Sebag, M. Adaptive Coordinate Descent. Genetic and Evolutionary Computation Conference (GECCO), ACM Press, 2011, 885–892.
12. Nelder, J.A.; Mead, R. A simplex method for function minimization. Computer Journal 1965, 7, 308–313.
Figure 1. Formulation of a new set of base vectors ($n = 3$): $\mathbf{x}^A, \mathbf{x}_1^B, \mathbf{x}_2^B, \mathbf{x}_3^B$, and interpolation base points $A, B_1, B_2, B_3$ from the new approximate $\mathbf{x}^A$ and the new trial increment $\Delta \mathbf{x} = \mathbf{x}^B - \mathbf{x}^A = \left[ \Delta x_1\ \Delta x_2\ \Delta x_3 \right]^{\mathrm{T}}$.
Figure 2. Geometrical representation of the secant method in the single-variable case (A: classic secant method, B: T-secant modification).
Figure 3. Vector-space description of the T-secant method in the multi-variable case ($k = 1, \ldots, n$).
Figure 4. T-secant iterations with test function (5.26) with initial approximates $x_0^A = 3.0$ and $x_0^B = 1.0$ (Left: $x_1^A$ is the root of $y_0(x)$, $x_1^B$ is the root of $z_0(x)$; Right: $x_2^A$ is the root of $y_1(x)$, $x_2^B$ is the root of $z_1(x)$).
Figure 5. Variation of the convergence rate $\alpha^*$ with decreasing $E_0^+$ (dashed red lines indicate the $\alpha = \alpha_S + 1 \approx 2.618$ level, where $\alpha_S \approx 1.618$ is the convergence rate of the traditional secant method).
Figure 6. Secant iteration with test function (5.26) with initial approximates $x_0^A = 3.5$ and $x_0^B = 2.5$ (Left: $p = 0, 1, 2$; Right: $p = 2, 3, 4$; see data in Table 3).
Figure 7. Newton iteration with test function (5.26) with initial approximate $x_0^A = 3.5$ (Left: $p = 0, 1$; Right: $p = 2$; see data in Table 4).
Figure 8. T-secant iteration with test function (5.26) with initial approximates $x_0^A = 3.5$ and $x_0^B = 2.5$ (Left: $p = 0$ (with interpolation base points $A_0$, $B_0$) and $p = 1$ ($A_1$, $B_1$); Right: $p = 2$ ($A_2$, $B_2$); see data in Table 5).
Figure 9. Decrease of the absolute approximate error $e_p^A$ (dashed lines) and computed convergence rates $\alpha$ (solid lines) of different methods (Broyden: brown line; secant: black lines; Newton–Raphson: blue lines; T-secant: red lines).
Figure 11. (Left) Variables $x_{p,i}^A$ and (Right) absolute approximate errors $\lg e_p^A \left( x_{p,i}^A \right)$ ($i = 1, \ldots, 3$) variation for initial trial $x_0^A = \left[ 2.0\ 1.5\ 2.5 \right]$.
Figure 12. (Left) Variation of $x_{p,i}^A$ for $x_0^A = \left[ 2.0\ 1.5\ 2.5\ 1.5\ 1.2\ 3.0\ 3.5\ 2.5\ 2.0\ 3.5 \right]$ through iterations ($N = n = 10$, $m = 18$) with $p_{\max} = 15$ and $N_f = 165$; (Right) absolute approximate errors $\lg e_p^A \left( x_{p,i}^A \right)$ ($i = 1, \ldots, 10$) and variation of the $R \left( x_p^A \right)$ function through iterations for different initial trials ($N = 10$, $n = 10$, $m = 18$; see Table 7).
Figure 13. (Left) Variation of variables $x_p^A$ through iterations and (Right) decrease of the approximate error $\lg e_p^A$ through iterations, $N = 200$ (with the iteration counter $p$ indicated below the graphs).
Figure 14. Number of function evaluations for $N = 200$ (blue), $N = 500$ (red) and $N = 1000$ (green) with initial trials $x_0^A$: $0.5 \leq x_{0,i}^A \leq 1.5$ (solid line) and $x_0^A$: $0.1 \leq x_{0,i}^A \leq 19.9$ (dashed line).
Figure 15. T-Newton iterations with test function (5.26) with initial approximate $x_0^A = 4.5$ (Left: $x_1^B$ is the root of the tangent line through $f_0^A$, $x_1^A$ is the root of $z_0(x)$; Right: $x_2^B$ is the root of the tangent line through $f_1^A$, $x_2^A$ is the root of $z_1(x)$; see data in Table 12).
Figure 16. Broyden (Left) and T-Broyden (Right) iterations with test function (5.26) with initial approximate $x_0^A = 4.5$.
Table 2. Summary of the basic equations of the multi-variable secant and T-secant methods.
# | Secant method | T-secant method | Equations
1 | $\Delta X_p = \left[ x_{p,k}^B - x_p^A \right] = \mathrm{diag} \left( \Delta x_{p,i} \right)$, $\Delta F_p = \left[ f_{p,k}^B - f_p^A \right] = \left[ \Delta f_{p,k,j} \right]$ | (common) | 3.9, 3.10
2 | – | $T_p = \begin{bmatrix} T_p^X & 0 \\ 0 & T_p^F \end{bmatrix}$ | 4.16
3 | – | $T_p^X = \mathrm{diag} \left( t_{p,i}^X \right) = \mathrm{diag} \left( \Delta x_{p+1,i} / \Delta x_{p,i}^A \right)$ | 4.17
4 | – | $T_p^F = \mathrm{diag} \left( t_{p,j}^F \right) = \mathrm{diag} \left( f_{p+1,j}^A / f_{p,j}^A \right)$ | 4.18, 4.19
5 | $\Delta F\, q^A = -f^A$ | $T^F \Delta F\, q^B = -f^A$ | 3.14, 4.13
6 | $\Delta F\, \Delta X^{-1} \Delta x^A = -f^A$ | $T^F \Delta F\, \Delta X^{-1} \left( T^X \right)^{-1} \Delta x^A = -f^A$ | 6.1, 6.9
7 | $S = \Delta F\, \Delta X^{-1} = \left[ \Delta f_{k,j} / \Delta x_i \right]$ | $S_T = T^F S \left( T^X \right)^{-1} = \left[ t_j^F \Delta f_{k,j} / \left( t_i^X \Delta x_i \right) \right]$ | 6.3, 6.12
8 | $S\, \Delta x^A = -f^A$ | $S_T\, \Delta x^A = -f^A$ | 6.5, 6.15
9 | $\Delta x^A = -S^{+} f^A$ | $\Delta x^A = -S_T^{+} f^A$ | 6.6, 6.14
Table 3. Secant method iteration and computed convergence rate $\alpha_S$ (see Figure 6).
$p$ | $x_p^A$ | $x_p^B$ | $x_{p+1}^A$ | $e_{p+1}^A$ | $\alpha_S$ | $N_f$
0 | 3.5 | 2.5 | 2.2772 | $1.8 \cdot 10^{-1}$ | – | 2
1 | 2.5 | 2.2772 | 2.1282 | $3.4 \cdot 10^{-2}$ | – | 3
2 | 2.2772 | 2.1282 | 2.0977 | $3.2 \cdot 10^{-3}$ | 0.64 | 4
3 | 2.1282 | 2.0977 | 2.094611 | $5.9 \cdot 10^{-5}$ | 2.12 | 5
4 | 2.0977 | 2.094611 | 2.094552 | $1.1 \cdot 10^{-7}$ | 1.39 | 6
5 | 2.094611 | 2.09455216 | 2.09455148 | $3.6 \cdot 10^{-12}$ | 1.69 | 7
6 | 2.0945516 | 2.09455148 | 2.09455148154233 | $2.7 \cdot 10^{-14}$ | 1.59 | 8
7 | 2.09455148 | 2.09455148154233 | 2.09455148154233 | $2.7 \cdot 10^{-14}$ | 1.63 | 9
Table 4. Newton method iteration and computed convergence rate $\alpha_N$ (see Figure 7).
$p$ | $x_p^A$ | $x_{p+1}^A$ | $e_{p+1}^A$ | $\alpha_N$ | $N_f$ | $N_{f'}$
0 | 3.5 | 2.61 | $5.2 \cdot 10^{-1}$ | – | 1 | 1
1 | 2.61 | 2.200 | $1.1 \cdot 10^{-1}$ | – | 2 | 2
2 | 2.200 | 2.10037 | $5.8 \cdot 10^{-3}$ | 1.58 | 3 | 3
3 | 2.10037 | 2.09457 | $1.9 \cdot 10^{-5}$ | 1.82 | 4 | 4
4 | 2.09457 | 2.09455148 | $2.0 \cdot 10^{-10}$ | 1.97 | 5 | 5
5 | 2.09455148 | 2.09455148154233 | $2.7 \cdot 10^{-14}$ | 2.00 | 6 | 6
Table 5. T-secant method iteration and computed convergence rate $\alpha_{TS}$ (see Figure 8).
$p$ | $x_p^A$ | $x_p^B$ | $x_{p+1}^A$ | $e_{p+1}^A$ | $\alpha_{TS}$ | $N_f$
0 | 3.5 | 2.5 | 2.28 | $1.8 \cdot 10^{-1}$ | – | 2
1 | 2.28 | 2.1879 | 2.1032 | $8.6 \cdot 10^{-3}$ | – | 4
2 | 2.1032 | 2.0957112 | 2.0945571 | $5.6 \cdot 10^{-6}$ | 1.50 | 6
3 | 2.0945571 | 2.09455151 | 2.09455148154242 | $1.2 \cdot 10^{-13}$ | 2.41 | 8
4 | 2.09455148154242 | 2.09455148154233 | 2.09455148154233 | $2.7 \cdot 10^{-14}$ | 2.40 | 10
Table 6. Iteration results, $x_0^A = \left[ 2.0\ 1.5\ 2.5 \right]$, $T_{\min} = 0.01$, $T_{\max} = 1.5$.
 | $p = 0$ | $p = 1$ | $p = 2$ | $p = 3$
$x_p^A$ | [2, 1.5, 2.5] | [1.253, 0.938, 5.248] | [1.026, 0.990, 0.980] | [1.00004, 0.99998, 0.99994]
$\Delta x_p$ | [0.1, 0.075, 0.125] | [0.046, 0.061, 0.026] | [0.0217, 0.0079, 0.063] | [$3 \cdot 10^{-4}$, $1 \cdot 10^{-4}$, $2 \cdot 10^{-4}$]
$f_p^A$ | [55, 1, 47.5, 2.5] | [6.320, 0.253, 61.28, 0.062] | [0.621, 0.026, 0.005, 0.010] | [0.00102, 0.00004, 0.00021, 0.00002]
$q_p^A$ | [7.47, 32.5, 22.0] | [4.915, 0.846, 243.9] | [1.184, 1.269, 0.307] | [0.160, 0.201, 0.327]
$x_{p+1}^A$ | [1.253, 0.938, 5.248] | [1.026, 0.990, 0.980] | [1.00004, 0.99998, 0.99994] | [0.9, 1.0, 1.0]
$e_{p+1}^A$ | [0.253, 0.062, 6.248] | [0.026, 0.010, 0.020] | [$4 \cdot 10^{-5}$, $2 \cdot 10^{-5}$, $6 \cdot 10^{-5}$] | [$3 \cdot 10^{-9}$, $2 \cdot 10^{-9}$, $5 \cdot 10^{-9}$]
$R \left( x_{p+1}^A \right)$ | $6.2 \cdot 10^{1}$ | $6.2 \cdot 10^{-1}$ | $1.0 \cdot 10^{-3}$ | $9.0 \cdot 10^{-8}$
$\varepsilon_p$ | $2.1 \cdot 10^{0}$ | $1.1 \cdot 10^{-2}$ | $2.6 \cdot 10^{-5}$ | $2.2 \cdot 10^{-9}$
$f_{p+1}^A$ | [6.32, 0.253, 61.3, 0.062] | [0.621, 0.026, 0.005, 0.010] | [0.00102, 0.00004, 0.00021, 0.00002] | [0.0, 0.0, 0.0, 0.0]
$t_p^F$ | [0.115, 0.253, 1.290, 0.025] | [0.098, 0.102, 0.01, 0.163] | [0.01, 0.01, 0.044, 0.01] | [0.01, 0.01, 0.01, 0.01]
$q_p^B$ | [120, 1298, 2365] | [51.6, 5.52, 240000] | [118, 127, 32] | [16.0, 20.1, 32.7]
$x_{p+1}^B$ | [1.299, 0.999, 5.273] | [1.004, 0.998, 0.917] | [0.99978, 1.00008, 1.00013] | [1.0, 0.9, 0.9]
$\Delta x_{p+1}$ | [0.046, 0.061, 0.026] | [0.0217, 0.0079, 0.063] | [$3 \cdot 10^{-4}$, $1 \cdot 10^{-4}$, $2 \cdot 10^{-4}$] | [$4 \cdot 10^{-7}$, $2 \cdot 10^{-7}$, $6 \cdot 10^{-7}$]
Table 7. Initial trial vectors ($N = 10$, $n = 10$, $m = 18$, $x^* = \left[ 1, \ldots, 1 \right]$).
# | $x_0^A$ | $p_{\max}$ | $N_f$
1 | [1.3 1.5 2.1 1.1 1.3 1.8 1.8 1.7 2.0 2.1] | 15 | 165
2 | [3.1 2.1 4.3 1.2 2.4 3.6 1.6 2.7 4.2 2.2] | 21 | 231
3 | [4.1 1.1 6.3 3.2 4.4 1.6 3.6 5.7 2.2 3.2] | – | –
4 | [3.0 3.1 2.3 4.2 2.4 1.6 3.6 2.7 2.2 4.2] | – | –
5 | [2.1 3.1 1.3 2.2 3.4 1.6 2.6 1.7 2.2 3.2] | 16 | 176
6 | [3.1 3.1 4.3 2.2 3.4 2.6 1.6 4.7 2.2 2.2] | 20 | 220
Table 8. Iteration results ($N = 200$, L1 = 99.9, L2 = 9) with initial trials $0.1 \leq x_{0,i}^A \leq 19.9$ (dashed blue line in Figure 14).
$p$ | $\varepsilon_p$ | $R \left( x_p^A \right)$ | $N_f$
0 | 10.6925833405791 | 24123.43773726327 | 1
1 | 5.45917411911925 | 6895.1103569982861 | 201
2 | 2.13338434746463 | 1247.4064173528971 | 402
3 | 0.71430571273689 | 220.36900527956962 | 603
4 | 0.163511639031299 | 32.621494717337107 | 804
5 | 0.0145616620270659 | 2.4077509738413969 | 1005
6 | 0.000197003511771894 | 0.026366233831030046 | 1206
7 | 0.000000084768909602 | 0.000007982826913871 | 1407
8 | 0.000000000032791210 | 0.000000003114429023 | 1608
9 | 0.000000000000013862 | 0.000000000001333830 | 1809
10 | 0.000000000000000546 | 0.000000000000104185 | 2010
Table 9. Iteration results ($N = 1000$, L1 = 5, L2 = 0) with initial trials $0.5 \leq x_{0,i}^A \leq 1.5$ (solid green line in Figure 14).
$p$ | $\varepsilon_p$ | $R \left( x_p^A \right)$ | $N_f$
0 | 0.287800987765134 | 212.38512786560364 | 1
1 | 0.121219403643695 | 57.87378211356512 | 1001
2 | 0.0396263348376487 | 13.743840511211417 | 2002
3 | 0.0298060844365720 | 9.6618077142097238 | 3003
4 | 0.0120370539008435 | 5.9465782106406841 | 4004
5 | 0.000705489922936629 | 0.42465246853444877 | 5005
6 | 0.000002762586723754 | 0.001324115254348589 | 6006
7 | 0.000000000990421380 | 0.000000388965253003 | 7007
8 | 0.000000000000433209 | 0.000000000155930410 | 8008
9 | 0.000000000000000860 | 0.000000000000363149 | 9009
Table 10. Calculated values of the mean convergence rates ($L$ and $L_N$) for the Rosenbrock function (¹ a substitute value $10^{-25}$ was used when $R \left( x_{p_{\max}}^A \right) = 0$).
# | $N$ | Method | $R \left( x_0^A \right)$ | $R \left( x_{p_{\max}}^A \right)$ | $p_{\max}$ | $N_f$ | $L$ | $L_N$
1 | 2 | Broyden 1. [9] | 4.9193 | 4.73E−10 | – | 59 | 0.391 | 0.78
2 | 2 | Broyden 2. [9] | 4.9193 | 2.55E−10 | – | 39 | 0.607 | 1.22
3 | 2 | Powell [10] | 4.9193 | 7.00E−10 | – | 151 | 0.150 | 0.30
4 | 2 | ACD [11] | 130.062 | 1.00E−10 | – | 325 | 0.086 | 0.17
5 | 2 | Nelder–Mead [12] | 2.0000 | 1.36E−10 | – | 185 | 0.127 | 0.25
6 | 2 | T-secant [9,10] | 4.9193 | 1.0E−25 ¹ | 3 | 9 | 6.573 ¹ | 13.15 ¹
7 | 2 | T-secant [11] | 130.06 | 1.0E−25 ¹ | 3 | 9 | 6.937 ¹ | 13.87 ¹
8 | 2 | T-secant [12] | 2.0000 | 6.66E−15 | 2 | 6 | 5.556 | 11.11
9 | 3 | T-secant | 72.722 | 1.41E−14 | 5 | 20 | 1.809 | 5.43
10 | 3 | T-secant | 32.466 | 1.0E−25 ¹ | 4 | 16 | 3.815 ¹ | 11.45 ¹
11 | 5 | T-secant | 93.528 | 1.34E−14 | 8 | 48 | 0.760 | 3.80
12 | 5 | T-secant | 7.193 | 5.90E−14 | 4 | 24 | 1.351 | 6.76
13 | 10 | T-secant | 202.62 | 1.0E−25 ¹ | 14 | 154 | 0.408 ¹ | 4.08 ¹
14 | 200 | T-secant | 92.778 | 9.00E−15 | 10 | 2010 | 0.042 | 8.44
15 | 1000 | T-secant | 212.39 | 3.63E−13 | 6 | 6006 | 0.006 | 5.66
Table 11. Newton method iteration and computed convergence rate $\alpha_N$.
$p$ | $x_p^A$ | $x_{p+1}^A$ | $e_{p+1}^A$ | $\alpha_N$ | $N_f$ | $N_{f'}$
0 | 4.5 | 3.187 | $1.1 \cdot 10^{0}$ | – | 1 | 1
1 | 3.187 | 2.44965 | $3.6 \cdot 10^{-1}$ | – | 2 | 2
2 | 2.44965 | 2.14996 | $5.5 \cdot 10^{-2}$ | 1.42 | 3 | 3
3 | 2.14996 | 2.096188 | $1.6 \cdot 10^{-3}$ | 1.66 | 4 | 4
4 | 2.096188 | 2.094552 | $1.5 \cdot 10^{-6}$ | 1.89 | 5 | 5
5 | 2.094552 | 2.09455148 | $1.3 \cdot 10^{-12}$ | 1.99 | 6 | 6
6 | 2.09455148 | 2.09455148154233 | $3.6 \cdot 10^{-15}$ | 2.00 | 7 | 7
Table 12. T-Newton method iteration and computed convergence rate $\alpha_{TN}$ (see Figure 15).
$p$ | $x_p^A$ | $x_{p+1}^B$ | $e_{p+1}^B$ | $\alpha_{TN}$ | $N_f$ | $N_{f'}$
0 | 4.5 | 2.830 | $7.4 \cdot 10^{-1}$ | – | 2 | 1
1 | 2.830 | 2.17760 | $8.3 \cdot 10^{-2}$ | – | 4 | 2
2 | 2.17760 | 2.09486 | $3.1 \cdot 10^{-4}$ | 1.84 | 6 | 3
3 | 2.09486 | 2.09455148 | $1.9 \cdot 10^{-11}$ | 2.56 | 8 | 4
4 | 2.09455148 | 2.09455148154233 | $3.6 \cdot 10^{-15}$ | 2.97 | 9 | 5