A NEW THREE-TERM CONJUGATE GRADIENT ALGORITHM FOR SOLVING MINIMIZATION PROBLEMS

Dlovan Haji Omar a, Salah Gazi Shareef b, Bayda Ghanim Fathi c

a,b,c Faculty of Science, University of Zakho, Zakho, Kurdistan Region, Iraq

Received: 13 Apr., 2023 / Accepted: 11 Aug., 2023 / Published: 7 Dec., 2023.               https://doi.org/10.25271/sjuoz.2023.11.4.1152

 

ABSTRACT:

Optimization methods are used to determine the best value of a given function within a certain domain; they are studied and employed mainly in mathematics, computer science, and physics. This work presents a novel three-term conjugate gradient (CG) method for unconstrained optimization problems. The new method satisfies both the descent condition and the sufficient descent condition, and its global convergence is established. The outcomes of numerical experiments on several well-known test functions show that the new modified method is highly successful in terms of the number of iterations (NOI) and the number of function evaluations (NOF).

KEYWORDS: Optimization, Conjugate Gradient Methods, Three-Term Conjugate Gradient.


1.     Introduction

The field of applied mathematics known as numerical optimization seeks the best answer to a mathematical problem involving the maximization or minimization of a particular function. The function can represent a wide range of things, including the financial success of an enterprise, the effectiveness of a manufacturing procedure, or the precision of a statistical model.

The objective of numerical optimization is to identify the input values that, within any constraints or limitations, result in the best output value. Various optimization algorithms can be used, depending on the properties of the function being optimized, the kinds of constraints involved, and the computational resources available.

Gradient descent, Newton's method, simulated annealing, genetic algorithms, and particle swarm optimization are a few of the frequently used optimization algorithms. These approaches employ iterative procedures to find the best solution, modifying the input values slightly at each iteration until the best solution is found or a stopping criterion is satisfied.

Iterative techniques such as conjugate gradient methods (CG) are used to solve linear systems of equations. They are particularly useful for solving large, sparse, symmetric positive definite (SPD) systems, which arise in many scientific and engineering applications.
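For reference, the classical linear CG iteration for an SPD system $Ax = b$ can be sketched as follows. This is a minimal Python illustration; the function name, tolerance, and iteration cap are ours and are not part of this paper.

import numpy as np

def linear_cg(A, b, x0=None, tol=1e-10, max_iter=1000):
    # Classical conjugate gradient for A x = b with A symmetric positive definite.
    x = np.zeros_like(b, dtype=float) if x0 is None else np.asarray(x0, dtype=float).copy()
    r = b - A @ x                # residual
    d = r.copy()                 # first search direction
    rs_old = r @ r
    for _ in range(max_iter):
        Ad = A @ d
        alpha = rs_old / (d @ Ad)          # exact step length along d
        x += alpha * d
        r -= alpha * Ad
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        d = r + (rs_new / rs_old) * d      # next conjugate direction
        rs_old = rs_new
    return x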

CG methods can also be applied to nonlinear functions, but the approach is slightly different from the linear case. Nonlinear CG methods are used to find the minimum of a nonlinear function of several variables,

$\min f(x), \quad x \in \mathbb{R}^n,$                                   (1.1)

where $f : \mathbb{R}^n \to \mathbb{R}$ is a real-valued continuously differentiable function. Starting from an initial guess $x_0 \in \mathbb{R}^n$, a nonlinear CG algorithm creates the sequence $\{x_k\}$, defined as

$x_{k+1} = x_k + \alpha_k d_k, \quad k = 0, 1, 2, \ldots,$               (1.2)

where $\alpha_k > 0$ is the step length and $d_k$ is the search direction.

The main idea of nonlinear CG methods is to iteratively create a sequence of search directions that are conjugate to the previous search directions. Each new search direction is a linear combination of the current negative gradient and the previous search direction, with coefficients chosen so that the new direction is a descent direction. The conjugacy condition ensures that the search directions are as independent as possible, which leads to faster convergence.

The popular CG methods are the Fletcher-Reeves (FR) [1], Hestenes-Stiefel (HS) [2], Polak-Ribière (PR) [3], and Dai-Yuan (DY) [4] methods, whose parameters $\beta_k$ are respectively determined as follows:

$\beta_k^{FR} = \dfrac{\|g_{k+1}\|^2}{\|g_k\|^2}$                        (1.3)

$\beta_k^{HS} = \dfrac{g_{k+1}^T y_k}{d_k^T y_k}$                        (1.4)

$\beta_k^{PR} = \dfrac{g_{k+1}^T y_k}{\|g_k\|^2}$                        (1.5)

$\beta_k^{DY} = \dfrac{\|g_{k+1}\|^2}{d_k^T y_k}$                        (1.6)

Each of these parameters is used in the following update rule for the search direction:

$d_{k+1} = -g_{k+1} + \beta_k d_k, \qquad d_0 = -g_0,$                   (1.7)

where $g_k$ is the gradient of $f$ at $x_k$, $y_k = g_{k+1} - g_k$, and $\|\cdot\|$ denotes the Euclidean norm of a vector. There are also many studies of these methods; see [5, 6, 7, 8, 9].

Three-term conjugate gradient (TT-CG) methods are another significant class of CG methods, which are themselves iterative algorithms for solving linear systems of equations. The three-term CG method is a specific class of conjugate gradient methods characterized by the use of conjugate search directions in each iteration.

The three-term CG method is an improvement over the classical CG method in terms of convergence rate, and is particularly useful for solving symmetric, positive-definite systems of linear equations. It is based on the idea of generating a sequence of three-term conjugate directions, which are linearly independent and orthogonal with respect to a symmetric positive-definite matrix. In each iteration of the algorithm, three terms are computed: the current solution vector, the residual vector, and the search direction vector.

The three-term CG method is a more general class of CG methods that includes other variants, such as the HS and the PR methods. All of these approaches generate a sequence of conjugate directions, but they differ in how they compute the search direction vectors. The three-term CG method is a powerful variation of the conjugate gradient method that is especially useful for solving symmetric, positive-definite linear systems of equations. Numerous disciplines, including engineering, physics, computer science, and finance, use the method extensively.

Three-term CG methods were presented by Zhang et al. [10, 11], who considered a descent-modified PRP and a descent-modified HS CG method with the directions

$d_{k+1} = -g_{k+1} + \beta_k^{PR} d_k - \theta_k y_k, \qquad \theta_k = \dfrac{g_{k+1}^T d_k}{\|g_k\|^2},$                  (1.8)

$d_{k+1} = -g_{k+1} + \beta_k^{HS} d_k - \theta_k y_k, \qquad \theta_k = \dfrac{g_{k+1}^T d_k}{d_k^T y_k}.$                  (1.9)
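As an illustration of how such a three-term direction is assembled, the following small Python helper computes a direction of the form (1.8) as written above; the function name, argument order, and coefficient formulas reflect our reading of Zhang et al. rather than anything prescribed later in this paper.

import numpy as np

def three_term_prp_direction(g_new, g, d, y):
    # Descent-modified PRP three-term direction in the form of (1.8):
    # d_{k+1} = -g_{k+1} + beta_PR * d_k - theta * y_k.
    gg = g @ g
    beta_pr = (g_new @ y) / gg      # PRP parameter (1.5)
    theta = (g_new @ d) / gg        # coefficient of the third term
    return -g_new + beta_pr * d - theta * y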

In the same way, Alaa and Salah (2019) [12] proposed a new class of three-term CG methods that is computationally efficient and is defined by the following search direction:

                                                                          (1.10)

where the parameter $\beta_k$ is taken from the standard CG methods (HS, PRP and FR).

This paper is organized as follows. In Section 2, we propose a new three-term CG method. We establish the descent and sufficient descent conditions of the new algorithm in Section 3. Section 4 presents the numerical evaluation of our three-term CG method. Section 5 contains our concluding remarks.

2.     Derivation of the New Three-Term Direction

In this section, we construct a new three-term direction. The main idea behind the new search direction is to replace the vector appearing in the third term of the search direction (1.8) by the quantity defined below:

                              (2.1)        

where $\varepsilon$ is the machine epsilon.

More specifically, the search direction of our method, named the New TT-CG method, is defined by:

                                    (2.2)

Where   

Or                                                                                (2.3)

where $\varepsilon$ is the machine epsilon.

2.1 Algorithm of (New TT-CG)

Step 1: Given an initial point $x_0 \in \mathbb{R}^n$, set $d_0 = -g_0$ and $k = 0$.

Step 2: If $\|g_k\| \le \varepsilon$, stop; otherwise, go to Step 3.

Step 3: Determine $\alpha_k$ by using the cubic line search.

Step 4: Set $x_{k+1} = x_k + \alpha_k d_k$.

Step 5: Compute $g_{k+1}$; if $\|g_{k+1}\| \le \varepsilon$, stop. Otherwise, go to Step 6.

Step 6: Compute $d_{k+1}$ from equation (2.3).

Step 7: If the restart criterion is satisfied, go to Step 2; otherwise, set $k = k + 1$ and go to Step 3.
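The following Python skeleton mirrors Steps 1-7 above, under explicit assumptions: the helper direction_2_3 is a placeholder that should implement the paper's direction (2.3) (here it falls back to the three-term PRP form (1.8) purely as a stand-in), SciPy's strong-Wolfe line_search replaces the cubic line search of Step 3, and the restart test of Step 7 is reduced to a plain index update.

import numpy as np
from scipy.optimize import line_search

def direction_2_3(g_new, g, d, y):
    # Placeholder for the paper's search direction (2.3); as an illustrative
    # stand-in we use the three-term PRP form (1.8).
    gg = g @ g
    return -g_new + ((g_new @ y) / gg) * d - ((g_new @ d) / gg) * y

def new_tt_cg(f, grad, x0, eps=1e-6, max_iter=10000):
    # Skeleton of the New TT-CG algorithm (Steps 1-7).
    x = np.asarray(x0, dtype=float).copy()
    g = grad(x)
    d = -g                                          # Step 1: d_0 = -g_0
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:                # Step 2 / Step 5 stopping test
            break
        alpha = line_search(f, grad, x, d, gfk=g)[0]   # Step 3 (Wolfe instead of cubic)
        if alpha is None:
            alpha = 1e-4                            # fall back to a small fixed step
        x_new = x + alpha * d                       # Step 4
        g_new = grad(x_new)                         # Step 5
        y = g_new - g
        d = direction_2_3(g_new, g, d, y)           # Step 6: direction from (2.3)
        x, g = x_new, g_new                         # Step 7 (restart test omitted)
    return x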

3. Global Convergence and the Descent and Sufficient Descent Properties of the New TT-CG Method

Theorem 3.1: Suppose that $\{x_k\}$ is a sequence generated by (1.2) with the search direction (2.3). Then the search directions satisfy the descent condition $g_{k+1}^T d_{k+1} \le 0$.

Proof: Multiplying both sides of equation (2.3) by $g_{k+1}$ from the right, we obtain

                                                                          (3.1)

If $g_{k+1}^T d_k = 0$ (which means that $\alpha_k$ is chosen by an exact line search), then we immediately get $g_{k+1}^T d_{k+1} \le 0$. If the line search is inexact, then $g_{k+1}^T d_k \ne 0$, and we proceed as follows.

By mathematical induction, from the first search direction we get $g_0^T d_0 = -\|g_0\|^2 \le 0$, and we assume that the result holds for the case $k$, that is, $g_k^T d_k \le 0$. We now prove the case $k+1$.

Because the PR parameter satisfies the descent condition, the first two terms of the previous formula are less than or equal to zero. We only need to show that the third term is also less than or equal to zero.

 

From the Lipschitz condition, $\|y_k\| = \|g_{k+1} - g_k\| \le L\|s_k\|$, where $s_k = x_{k+1} - x_k$, and

                                                                          (3.2)

Then,

 

          (3.3)   

Finally, we have $g_{k+1}^T d_{k+1} \le 0$, which completes the proof.

 

Theorem 3.2: If $\{x_k\}$ is a sequence generated by (1.2), then the search direction in (2.3) satisfies the sufficient descent condition $g_{k+1}^T d_{k+1} \le -c\|g_{k+1}\|^2$ for some constant $c > 0$.

Proof: From equations (3.1) and (3.3), we have

  

Let     

Then, we have $g_{k+1}^T d_{k+1} \le -c\|g_{k+1}\|^2$, which completes the proof.

The global convergence of the New TT-CG method is established by the following results.

Assumption 3.1 [13]. The level set $S = \{x \in \mathbb{R}^n : f(x) \le f(x_0)\}$ is bounded, i.e., there exists a constant $B > 0$ such that

$\|x\| \le B \quad \text{for all } x \in S.$                              (3.4)

Assumption 3.2. In a neighbourhood $N$ of $S$, $f$ is differentiable and its gradient is Lipschitz continuous, i.e., there exists a constant $L > 0$ such that

$\|g(x) - g(y)\| \le L\|x - y\| \quad \text{for all } x, y \in N.$        (3.5)

From Assumptions 3.1 and 3.2, there exists a constant $\gamma > 0$ such that

$\|g(x)\| \le \gamma \quad \text{for all } x \in S.$                      (3.6)

Lemma 3.1 [14]. Let the sequence $\{x_k\}$ be produced by equations (1.2) and (1.7), where $d_k$ satisfies the descent condition and $\alpha_k$ is determined by the strong Wolfe conditions, and assume that Assumptions 3.1 and 3.2 hold. If

$\sum_{k \ge 0} \dfrac{1}{\|d_k\|^2} = \infty,$                            (3.7)

then

$\liminf_{k \to \infty} \|g_k\| = 0.$                                      (3.8)

Theorem 3.3: Suppose that Assumptions 3.1 and 3.2 hold. If the iterates are generated by equations (1.2) and (2.3) and $\alpha_k$ satisfies the strong Wolfe line search conditions, then

$\liminf_{k \to \infty} \|g_k\| = 0.$

Proof: From equation (3.2), we have

,            (3.9)

by using (3.6) and (3.2)

                                                                    (3.10)                                                                                                

since,  and . So,

 

 

 

Now, by using Lemma 3.1, we get $\liminf_{k \to \infty} \|g_k\| = 0$, which completes the proof.

4. Numerical Results

 


In this section, the numerical outcomes of the new three-term CG method and the traditional PR method are compared, along with their performances. The comparative tests use nine well-known nonlinear test functions with dimensions ranging from 4 to 5000. The code was written in the Fortran 95 programming language. The number of iterations (NOI) and the number of function evaluations (NOF) are reported in Table 4.1. Table 4.2 contains additional experimental findings that support the superiority of the New TT-CG method over the traditional CG method in terms of the NOI and NOF.

Table 4.1: Comparison Between the Two Algorithms (PR and New TT-CG)

Test function    DIM      PR               New TT-CG
                          NOI     NOF      NOI     NOF
G-Cantrel        4        22      159      21      149
                 10       22      159      21      149
                 100      22      159      21      149
                 500      23      171      22      161
                 1000     23      171      22      161
                 5000     30      270      23      175
G-Wolfe          4        11      24       16      33
                 10       32      65       34      76
                 100      49      99       46      98
                 500      58      117      45      97
                 1000     64      129      55      120
                 5000     99      214      58      127
Cubic            4        15      45       14      39
                 10       16      47       15      43
                 100      16      47       15      43
                 500      16      47       15      43
                 1000     16      47       15      43
                 5000     16      47       16      45
Miele            4        37      116      39      122
                 10       37      116      40      124
                 100      44      148      40      124
                 500      44      148      43      143
                 1000     50      180      46      160
                 5000     50      180      46      160
G-Wood           4        29      67       26      61
                 10       29      67       26      61
                 100      30      69       26      61
                 500      30      69       26      61
                 1000     30      69       26      61
                 5000     30      69       27      63
Rosen            4        30      85       28      66
                 10       30      85       28      66
                 100      30      85       28      66
                 500      30      85       28      66
                 1000     30      85       28      66
                 5000     30      85       28      66
Total                     1170    3825     1053    3348

Table 4.2: The Percentage of Improvement of the New TT-CG Algorithm

Measure    PR        New TT-CG
NOI        100%      90%
NOF        100%      89.524%

Taking the PR method as 100%, the suggested strategy improves the NOI and NOF by 10% and 10.476%, respectively. Overall, the modified new three-term CG method improves on the standard PR approach by 10.238% on average.

5. Conclusions

In this study, we proposed a novel three-term CG method for unconstrained optimization problems. The new search direction satisfies the descent and sufficient descent conditions under both exact and inexact line searches. We showed that the suggested method converges globally for general functions. The numerical results clearly show that the new approach requires fewer iterations and fewer function evaluations than the standard method.

REFERENCES

[1] Fletcher, R. and Reeves, C.M., Function minimization by conjugate gradients, The Computer Journal, 7, 149-154, 1964.

[2] Hestenes, M. R. and Stiefel, E., Methods of conjugate gradients for solving linear systems, Journal of Research of the National Bureau of Standards, 49, 409-436, 1952.

[3] Polak, E. and Ribière, G., Note sur la convergence des méthodes de directions conjuguées, 3(16), 35-43, 1969.

[4] Dai, Y. H. and Yuan, Y., A nonlinear conjugate gradient method with a strong global convergence property, SIAM Journal on Optimization, 10, 177-182, 1999.

[5] Ibrahim, A. L. and Shareef, S. G., Modified conjugate gradient method for training neural networks based on logistic mapping, Journal of Duhok University, 22(1), 45-51, 2019. https://doi.org/10.26682/sjuod.2019.22.1.7

[6] Dlovan, H., A new suggested conjugate gradient algorithm with logistic mapping, Journal of University of Zakho, 4(2), 244-247, 2016. https://doi.org/10.25271/2016.4.2.113

[7] Guo, J. and Wan, Z., A new three-term conjugate gradient algorithm with modified gradient-differences for solving unconstrained optimization problems, AIMS Mathematics, 8(2), 2473-2488, 2023. https://doi.org/10.3934/math.2023128

[8] Sulaiman, I.M., Malik, M., Awwal, A.M. et al., On three-term conjugate gradient method for optimization problems with applications on COVID-19 model and robotic motion control, Advances in Continuous and Discrete Models, 2022(1), 2022. https://doi.org/10.1186/s13662-021-03638-9

[9] Ibrahim, A. L. and Mohammed, M. G., A new three-term conjugate gradient method for training neural networks with global convergence, Indonesian Journal of Electrical Engineering and Computer Science, 28(1), 547-554, 2022. https://doi.org/10.11591/ijeecs.v28.i1.pp551-558

[10] Zhang, L., Zhou, W. and Li, D.-H., A descent modified Polak-Ribière-Polyak conjugate gradient method and its global convergence, IMA Journal of Numerical Analysis, 26(4), 629-640, 2006. https://doi.org/10.1093/imanum/drl016

[11] Hestenes, M. R. and Stiefel, E., Methods of conjugate gradients for solving linear systems, Journal of Research of the National Bureau of Standards, 49(6), 409-436, 1952.

[12] Ibrahim, A. L. and Shareef, S. G., A new class of three-term conjugate gradient methods for solving unconstrained minimization problems, General Letters in Mathematics, 7(2), 79-86, 2019. https://doi.org/10.31559/glm2019.7.2.4

[13] Jusoh, M. Mamat and M. Rivaie, A new edition of conjugate gradient methods for large-scale unconstrained optimization, International Journal of Mathematical Analysis, 8(46), 2277-2291, 2014.

[14] Sugiki, K., Narushima, Y. and Yabe, H., Globally convergent three-term conjugate gradient methods that use secant conditions and generate descent search directions for unconstrained optimization, Journal of Optimization Theory and Applications, 153, 733-757, 2012.