Accelerating Symmetric Rank-1 Quasi-Newton Method with Nesterov’s Gradient for Training Neural Networks

S. Indrapriyadarsini; Shahrzad Mahboubi; Hiroshi Ninomiya; Takeshi Kamio; Hideki Asai

doi:10.20944/preprints202112.0097.v2

Submitted: 07 December 2021

Posted: 08 December 2021


Abstract

Gradient-based methods are widely used for training neural networks and can be broadly categorized into first-order and second-order methods. Second-order methods have been shown to converge better than first-order methods, especially on highly nonlinear problems. The BFGS quasi-Newton method is the most commonly studied second-order method for neural network training. Recent methods have been shown to speed up the convergence of the BFGS method using Nesterov’s accelerated gradient and momentum terms. The SR1 quasi-Newton method, though less commonly used in training neural networks, is known to have interesting properties and to provide good Hessian approximations when used with a trust-region approach. Thus, this paper investigates accelerating the Symmetric Rank-1 (SR1) quasi-Newton method with Nesterov’s gradient for training neural networks and briefly discusses its convergence. The performance of the proposed method is evaluated on a function approximation problem and an image classification problem.

Keywords: Neural networks; quasi-Newton; symmetric rank-1; Nesterov’s accelerated gradient; limited memory; trust-region

Subject: Computer Science and Mathematics - Mathematics

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
