A NEW THREE-TERM CONJUGATE GRADIENT ALGORITHM FOR SOLVING MINIMIZATION PROBLEMS
Dlovan Haji Omar a, Salah Gazi Shareef b, Bayda Ghanim Fathi c
a,b,c Faculty of Science, University of Zakho, Zakho, Kurdistan Region, Iraq
Received: 13 Apr., 2023 / Accepted: 11 Aug., 2023 / Published: 7 Dec., 2023. https://doi.org/10.25271/sjuoz.2023.11.4.1152
ABSTRACT:
Optimization methods are used to determine the best value of a given function over a specified domain; they are studied and employed mainly in mathematics, computer science, and physics. This work presents a novel three-term conjugate gradient (CG) method for unconstrained optimization problems. The new method satisfies both the descent condition and the sufficient descent condition, and its global convergence is established. The outcomes of numerical experiments on several well-known test functions demonstrate that the new modified method is highly effective in terms of the number of iterations (NOI) and the number of function evaluations (NOF).
KEYWORDS: Optimization, Conjugate Gradient Methods, Three-Term Conjugate Gradient.
1. Introduction
The field of applied mathematics known as numerical optimization seeks the best answer to a mathematical problem involving the maximization or minimization of a particular function. The function can represent a wide range of things, including the financial success of an enterprise, the effectiveness of a manufacturing procedure, or the precision of a statistical model.
The objective of numerical optimization is to identify the input values that, within any constraints or limitations, result in the best output value. Various optimization algorithms can be used, depending on the properties of the function being optimized, the kinds of constraints involved, and the computational resources available.
Gradient descent, Newton's method, simulated annealing, genetic algorithms, and particle swarm optimization are a few of the frequently used optimization algorithms. These approaches employ iterative procedures to find the best solution, modifying the input values slightly at each iteration until the best solution is found or a stopping criterion is satisfied.
Iterative techniques such as conjugate gradient methods (CG) are used to solve linear systems of equations. They are particularly useful for solving large, sparse, symmetric positive definite (SPD) systems, which arise in many scientific and engineering applications.
CG methods can also be applied to nonlinear functions, but the approach differs slightly from the linear case. Nonlinear CG methods are used to find the minimum of a nonlinear function of several variables:
$\min_{x \in \mathbb{R}^n} f(x)$,  (1.1)
where $f : \mathbb{R}^n \rightarrow \mathbb{R}$ is a real-valued continuously differentiable function. Starting from an initial guess $x_0$, a nonlinear CG algorithm creates the sequence $\{x_k\}$ defined as
$x_{k+1} = x_k + \alpha_k d_k$,  (1.2)
where $\alpha_k > 0$ is the step length obtained by a line search and $d_k$ is the search direction.
The main idea of nonlinear CG (NCG) methods is to iteratively create a sequence of search directions that are conjugate to the previous ones. Each new search direction is a linear combination of the current negative gradient and the previous search directions, with coefficients chosen so that it remains a descent direction. The conjugacy condition ensures that the search directions are as independent as possible, which leads to faster convergence.
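For reference, two directions $d_i$ and $d_j$ are conjugate with respect to a symmetric positive-definite matrix $A$ (for a quadratic model, the Hessian) when
$d_i^T A\, d_j = 0, \qquad i \neq j.$
For a strictly convex quadratic in $\mathbb{R}^n$, a set of $n$ such directions makes the method terminate in at most $n$ iterations; for a general nonlinear $f$ the condition holds only approximately, which motivates the different choices of the parameter $\beta_k$ recalled below.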
The popular CG methods, Fletcher–Reeves (FR) [1], Hestenes–Stiefel (HS) [2], Polak–Ribière (PR) [3], and Dai–Yuan (DY) [4], are respectively determined as follows:
$\beta_k^{FR} = \dfrac{\|g_{k+1}\|^2}{\|g_k\|^2}$,  (1.3)
$\beta_k^{HS} = \dfrac{g_{k+1}^T y_k}{d_k^T y_k}$,  (1.4)
$\beta_k^{PR} = \dfrac{g_{k+1}^T y_k}{\|g_k\|^2}$,  (1.5)
$\beta_k^{DY} = \dfrac{\|g_{k+1}\|^2}{d_k^T y_k}$,  (1.6)
which are based on the following update rule for the search direction:
$d_{k+1} = -g_{k+1} + \beta_k d_k, \qquad d_0 = -g_0$,  (1.7)
where $g_k$ is the gradient of $f$ at $x_k$ and $y_k = g_{k+1} - g_k$. The Euclidean norm of vectors is denoted by $\|\cdot\|$. There are also many studies on this method; see [5, 6, 7, 8, 9].
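As a concrete illustration of formulas (1.3)–(1.7) only (the paper's experiments were coded in Fortran 95; the Python sketch below and the variable names g_new, g_old, d_old standing for $g_{k+1}$, $g_k$, $d_k$ are ours):

```python
import numpy as np

def classical_betas(g_new, g_old, d_old):
    """Return the FR, HS, PR and DY parameters of (1.3)-(1.6)."""
    y = g_new - g_old                                        # y_k = g_{k+1} - g_k
    beta_fr = np.dot(g_new, g_new) / np.dot(g_old, g_old)    # (1.3)
    beta_hs = np.dot(g_new, y) / np.dot(d_old, y)            # (1.4)
    beta_pr = np.dot(g_new, y) / np.dot(g_old, g_old)        # (1.5)
    beta_dy = np.dot(g_new, g_new) / np.dot(d_old, y)        # (1.6)
    return beta_fr, beta_hs, beta_pr, beta_dy

def cg_direction(g_new, d_old, beta):
    """Two-term update rule (1.7): d_{k+1} = -g_{k+1} + beta_k * d_k."""
    return -g_new + beta * d_old
```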
Three-term conjugate gradient (TT-CG) methods form another significant class of CG methods, which are themselves iterative algorithms characterized by their use of conjugate search directions in each iteration.
The three-term CG method improves on the classical CG method in terms of convergence rate and is particularly useful for solving symmetric, positive-definite systems of linear equations. It is based on the idea of generating a sequence of three-term conjugate directions, which are linearly independent and orthogonal with respect to a symmetric positive-definite matrix. In each iteration of the algorithm, three quantities are computed: the current solution vector, the residual vector, and the search direction vector.
The three-term CG framework is a more general class of CG methods that includes other variants, such as the HS and PR methods. All of these approaches generate a sequence of conjugate directions, but they differ in how they compute the search direction vectors. The three-term CG method is a powerful variation of the conjugate gradient method that is especially useful for symmetric, positive-definite linear systems, and it is used extensively in engineering, physics, computer science, and finance.
Three-term CG methods were presented by Zhang et al. [10, 11] by considering a descent-modified PRP and a descent-modified HS CG method, with search directions of the form
$d_{k+1} = -g_{k+1} + \beta_k^{PR} d_k - \dfrac{g_{k+1}^T d_k}{\|g_k\|^2}\, y_k$,  (1.8)
$d_{k+1} = -g_{k+1} + \beta_k^{HS} d_k - \dfrac{g_{k+1}^T d_k}{d_k^T y_k}\, y_k$.  (1.9)
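Taking (1.8) and (1.9) in the standard Zhang et al. form recalled above, the two descent-modified three-term directions can be sketched as follows (an illustrative sketch under that assumption, not the authors' code):

```python
import numpy as np

def ttprp_direction(g_new, g_old, d_old):
    """Descent-modified PRP three-term direction, as in (1.8)."""
    y = g_new - g_old
    beta_pr = np.dot(g_new, y) / np.dot(g_old, g_old)
    theta = np.dot(g_new, d_old) / np.dot(g_old, g_old)
    return -g_new + beta_pr * d_old - theta * y

def tths_direction(g_new, g_old, d_old):
    """Descent-modified HS three-term direction, as in (1.9)."""
    y = g_new - g_old
    beta_hs = np.dot(g_new, y) / np.dot(d_old, y)
    theta = np.dot(g_new, d_old) / np.dot(d_old, y)
    return -g_new + beta_hs * d_old - theta * y
```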
Similarly, Alaa and Salah (2019) [12] proposed a new class of three-term CG methods that is computationally efficient and is defined by the following search direction:
(1.10)
where the additional coefficient is defined in [12] and the parameter $\beta_k$ is taken from the classical CG methods (HS, PRP and FR).
This paper is organized as follows. In Section 2, we propose a new three-term CG method. In Section 3, we establish the descent and sufficient descent conditions of the new algorithm, together with its global convergence. Section 4 presents numerical evaluations of our three-term CG technique. Section 5 contains our concluding remarks.
2. The New Three-Term CG Algorithm
In this section, we derive a new three-term search direction. The main idea behind the new search direction is to replace the quantity appearing in the third term of the search direction (1.8) with a modified one, defined as
(2.1)
where the remaining quantities are as defined above and $\epsilon$ is the machine error.
More specifically, the search direction of our method, named the New1 TT-CG method, is defined by
(2.2)
or, equivalently,
(2.3)
where, again, $\epsilon$ is the machine error.
The steps of the New1 TT-CG algorithm are as follows.
Step 1: Given an initial point $x_0$, set $d_0 = -g_0$ and $k = 0$.
Step 2: If $\|g_k\| \le \epsilon$, then stop; else go to Step 3.
Step 3: Determine the step size $\alpha_k$ by using the cubic line search.
Step 4: Set $x_{k+1} = x_k + \alpha_k d_k$.
Step 5: Compute $g_{k+1}$; if $\|g_{k+1}\| \le \epsilon$, stop.
Otherwise, go to Step 6.
Step 6: Compute the new search direction $d_{k+1}$ from equation (2.3).
Step 7: If the restart condition is satisfied, go to Step 2;
otherwise, set $k = k + 1$ and go to Step 3.
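A minimal driver illustrating the structure of Steps 1–7 is sketched below; it is not the authors' Fortran 95 implementation. An Armijo backtracking rule stands in for the cubic line search, the restart test in Step 7 is only a generic safeguard, and the direction routine is passed as a parameter since equation (2.3) is specific to the proposed method.

```python
import numpy as np

def new_tt_cg(f, grad, x0, direction_update, eps=1e-6, max_iter=10000):
    """Generic driver mirroring Steps 1-7.  `direction_update(g_new, g_old, d_old)`
    must return the next search direction, e.g. equation (2.3) or one of (1.8)-(1.9)."""
    x = np.asarray(x0, dtype=float)          # Step 1: initial point x_0
    g = grad(x)
    d = -g                                   # d_0 = -g_0
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:         # Steps 2 / 5: stopping test
            break
        # Step 3: step size; a simple Armijo backtracking rule stands in for
        # the cubic line search used in the paper.
        alpha, fx, gTd = 1.0, f(x), float(np.dot(g, d))
        while alpha > 1e-12 and f(x + alpha * d) > fx + 1e-4 * alpha * gTd:
            alpha *= 0.5
        x_new = x + alpha * d                # Step 4: x_{k+1} = x_k + alpha_k d_k
        g_new = grad(x_new)                  # Step 5: new gradient
        d = direction_update(g_new, g, d)    # Step 6: new direction, e.g. from (2.3)
        if np.dot(d, g_new) >= 0.0:          # Step 7: illustrative restart safeguard
            d = -g_new
        x, g = x_new, g_new
    return x
```

For instance, `new_tt_cg(f, grad, x0, ttprp_direction)` runs the loop with the direction of (1.8); plugging in a routine implementing (2.3) reproduces the structure of the New1 TT-CG algorithm.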
3. Descent Property and Global Convergence
Theorem 3.1. Suppose that $\{x_k\}$ is a sequence created by (1.2). Then the search directions generated by (2.3) satisfy the descent condition $d_k^T g_k \le 0$.
Proof. Multiplying both sides of equation (2.3) by $g_{k+1}^T$, we obtain
(3.1)
If the line search is exact, i.e. $g_{k+1}^T d_k = 0$, then the descent condition follows immediately. If the line search is inexact, then $g_{k+1}^T d_k \neq 0$ and we proceed as follows.
By mathematical induction: for the first search direction we have $d_0 = -g_0$, so $d_0^T g_0 = -\|g_0\|^2 \le 0$, and we assume the result is true for the case $k$, that is, $d_k^T g_k \le 0$. To prove the case $k + 1$, note that, because the PR parameter satisfies the descent condition, the first two terms of (3.1) are less than or equal to zero. It remains to show that the third term is also less than or equal to zero.
From the Lipschitz condition we have $\|y_k\| \le L\|s_k\|$, where $s_k = x_{k+1} - x_k$, and hence
(3.2)
Then,
(3.3)
Finally, we obtain $d_{k+1}^T g_{k+1} \le 0$, which completes the proof.
Theorem 3.2. If $\{x_k\}$ is a sequence generated by (1.2), then the search direction in (2.3) satisfies the sufficient descent condition.
Proof. From equations (3.1) and (3.3), there exists a constant $c > 0$ such that $d_{k+1}^T g_{k+1} \le -c\|g_{k+1}\|^2$; that is, the sufficient descent condition holds.
The global convergence of the New1 TT-CG method will be established in Theorem 3.3 below; we first state the required assumptions and a useful lemma.
Assumption 3.1 [13]. The level set $S = \{x \in \mathbb{R}^n : f(x) \le f(x_0)\}$ is bounded; that is, there exists a constant $B > 0$ such that
$\|x\| \le B, \qquad \forall x \in S$.  (3.4)
Assumption 3.2. In a neighbourhood $N$ of $S$, $f$ is differentiable and its gradient is Lipschitz continuous; that is, there exists a constant $L > 0$ such that
$\|g(x) - g(y)\| \le L\|x - y\|, \qquad \forall x, y \in N$.  (3.5)
From Assumptions (3.1) and (3.2), there exists a constant $\Gamma > 0$ such that
$\|g(x)\| \le \Gamma, \qquad \forall x \in S$.  (3.6)
Lemma 3.1 [14]. Suppose that Assumptions (3.1) and (3.2) hold and that the sequence $\{x_k\}$ is produced by equations (1.2) and (1.7), where $d_k$ satisfies the descent condition and $\alpha_k$ is determined by the strong Wolfe conditions. If
$\sum_{k \ge 0} \dfrac{1}{\|d_k\|^2} = \infty$,  (3.7)
then
$\liminf_{k \to \infty} \|g_k\| = 0$.  (3.8)
Theorem 3.3. Suppose that Assumptions (3.1) and (3.2) hold. If $\{x_k\}$ is generated by equations (1.2) and (2.3), where $\alpha_k$ satisfies the strong Wolfe line search conditions, then $\liminf_{k \to \infty} \|g_k\| = 0$.
Proof. From equation (3.2), we have
(3.9)
and, by using (3.6) and (3.2),
(3.10)
Since $\|d_{k+1}\|$ is therefore bounded above, condition (3.7) is satisfied. Now, by using Lemma 3.1, we get $\liminf_{k \to \infty} \|g_k\| = 0$.
4. Numerical Results
In this section, the numerical results of the new three-term CG method and the classical PR method are compared. The comparative tests use six well-known nonlinear test functions with dimensions ranging from $n = 4$ to $n = 5000$. The code was written in the Fortran 95 programming language. The comparative results, in terms of NOI and NOF, are shown in Table 4.1. Table 4.2 contains additional experimental findings that support the superiority of the New TT-CG method over the classical CG method in terms of NOI and NOF.
Table 4.1: Comparison Between Two Algorithms (PR. and New1 TT-CG)
Test function. |
DIM. |
PR |
New TT-CG |
||
NOI |
NOF |
NOI |
NOF |
||
G-Cantrel
|
4 10 100 500 1000 5000 |
22 22 22 23 23 30 |
159 159 159 171 171 270 |
21 21 21 22 22 23 |
149 149 149 161 161 175 |
G-Wolfe |
4 10 100 500 1000 5000 |
11 32 49 58 64 99 |
24 65 99 117 129 214 |
16 34 46 45 55 58 |
33 76 98 97 120 127 |
Cubic |
4 10 100 500 1000 5000 |
15 16 16 16 16 16 |
45 47 47 47 47 47 |
14 15 15 15 15 16 |
39 43 43 43 43 45 |
Miele |
4 10 100 500 1000 5000 |
37 37 44 44 50 50 |
116 116 148 148 180 180 |
39 40 40 43 46 46 |
122 124 124 143 160 160 |
G-Wood
|
4 10 100 500 1000 5000 |
29 29 30 30 30 30 |
67 67 69 69 69 69 |
26 26 26 26 26 27 |
61 61 61 61 61 63 |
Rosen |
4 10 100 500 1000 5000 |
30 30 30 30 30 30 |
85 85 85 85 85 85 |
28 28 28 28 28 28 |
66 66 66 66 66 66 |
Total |
1170 |
3825 |
1053 |
3348 |
Table 4.2: The Percentage Improvement of the New TT-CG Algorithm Relative to the PR Algorithm

| Measure | PR   | New TT-CG |
| NOI     | 100% | 90%       |
| NOF     | 100% | 89.524%   |

The suggested method improves (reduces) the NOI and NOF by 10% and 10.476%, respectively. Overall, the modified new three-term CG method achieves an average improvement of 10.238% compared with the standard PR approach.
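Statistics such as NOI and NOF are typically collected by counting iterations of the main loop and wrapping the objective with a call counter. A minimal sketch (Python, purely illustrative; the paper's code is in Fortran 95, and `gen_rosenbrock` is only an assumed form of the "Rosen" test problem):

```python
import numpy as np

class CountedFunction:
    """Wrap an objective so that every call is counted (one way to collect NOF)."""
    def __init__(self, f):
        self.f = f
        self.nof = 0
    def __call__(self, x):
        self.nof += 1
        return self.f(x)

def gen_rosenbrock(x):
    """Generalized Rosenbrock function, assumed here as the 'Rosen' test problem."""
    return float(np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2))

f = CountedFunction(gen_rosenbrock)   # f.nof increases with every evaluation
x0 = np.full(100, -1.2)               # a run at dimension n = 100
f(x0)
print(f.nof)                          # -> 1
```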
5. Conclusions
In this study, we proposed a novel three-term CG method for unconstrained optimization problems. The new search direction satisfies the descent and the sufficient descent conditions under both exact and inexact line searches. We also showed that the suggested method converges globally for general functions. The numerical results clearly show that the new approach requires fewer iterations and fewer function evaluations than the classical method.
REFERENCES
[1] Fletcher, R. and Reeves, C. M., Function minimization by conjugate gradients, The Computer Journal, 7, 149-154, 1964.
[2] Hestenes, M. R. and Stiefel, E., Methods of conjugate gradients for solving linear systems, Journal of Research of the National Bureau of Standards, 49, 409-436, 1952.
[3] Polak, E. and Ribière, G., Note sur la convergence de méthodes de directions conjuguées, Revue Française d'Informatique et de Recherche Opérationnelle, 3(16), 35-43, 1969.
[4] Dai, Y. H. and Yuan, Y., A nonlinear conjugate gradient method with a strong global convergence property, SIAM Journal on Optimization, 10, 177-182, 1999.
[5] Ibrahim, A. L. and Shareef, S. G., Modified conjugate gradient method for training neural networks based on logistic mapping, Journal of Duhok University, 22(1), 45-51, 2019. https://doi.org/10.26682/sjuod.2019.22.1.7
[6] Dlovan, H., A new suggested conjugate gradient algorithm with logistic mapping, Journal of University of Zakho, 4(2), 244-247, 2016. https://doi.org/10.25271/2016.4.2.113
[7] Guo, J. and Wan, Z., A new three-term conjugate gradient algorithm with modified gradient-differences for solving unconstrained optimization problems, AIMS Mathematics, 8(2), 2473-2488, 2023. https://doi.org/10.3934/math.2023128
[8] Sulaiman, I. M., Malik, M., Awwal, A. M. et al., On three-term conjugate gradient method for optimization problems with applications on COVID-19 model and robotic motion control, Advances in Continuous and Discrete Models, 2022(1), 2022. https://doi.org/10.1186/s13662-021-03638-9
[9] Ibrahim, A. L. and Mohammed, M. G., A new three-term conjugate gradient method for training neural networks with global convergence, Indonesian Journal of Electrical Engineering and Computer Science, 28(1), 547-554, 2022. https://doi.org/10.11591/ijeecs.v28.i1.pp551-558
[10] Zhang, L., Zhou, W. and Li, D.-H., A descent modified Polak-Ribière-Polyak conjugate gradient method and its global convergence, IMA Journal of Numerical Analysis, 26(4), 629-640, 2006. https://doi.org/10.1093/imanum/drl016
[11] Hestenes, M. R. and Stiefel, E., Methods of conjugate gradients for solving linear systems, Journal of Research of the National Bureau of Standards, 49(6), 409-436, 1952.
[12] Ibrahim, A. L. and Shareef, S. G., A new class of three-term conjugate gradient methods for solving unconstrained minimization problems, General Letters in Mathematics, 7(2), 79-86, 2019. https://doi.org/10.31559/glm2019.7.2.4
[13] Jusoh, M. Mamat and M. Rivaie, A new edition of conjugate gradient methods for large-scale unconstrained optimization, International Journal of Mathematical Analysis, 8(46), 2277-2291, 2014.
[14] Sugiki, K., Narushima, Y. and Yabe, H., Globally convergent three-term conjugate gradient methods that use secant conditions and generate descent search directions for unconstrained optimization, Journal of Optimization Theory and Applications, 153, 733-757, 2012.