Journal of the Operations Research Society of China ›› 2023, Vol. 11 ›› Issue (2): 245-275. doi: 10.1007/s40305-023-00453-9

• Special Issue: Machine Learning and Optimization Algorithm

An Overview of Stochastic Quasi-Newton Methods for Large-Scale Machine Learning

Tian-De Guo¹, Yan Liu², Cong-Ying Han¹

  ¹ School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 101408, China;
  ² School of Statistics and Data Science, KLMDASR, LEBPS, and LPMC, Nankai University, Tianjin 300071, China
  • Received: 2022-05-04 Revised: 2023-01-15 Online: 2023-06-30 Published: 2023-05-24
  • Contact: Yan Liu, Tian-De Guo, Cong-Ying Han. E-mail: liuyan23@nankai.edu.cn; tdguo@ucas.ac.cn; hancy@ucas.ac.cn
  • Supported by:
    the National Key R&D Program of China (No. 2021YFA1000403), the National Natural Science Foundation of China (Nos. 11731013, 12101334 and U19B2040), the Natural Science Foundation of Tianjin (No. 21JCQNJC00030) and the Fundamental Research Funds for the Central Universities

Abstract: The advancement of machine learning has given rise to numerous intriguing optimization problems. Stochastic first-order methods are the predominant choice for these problems because of their high efficiency. However, noisy gradient estimates and the high nonlinearity of the loss function result in a slow convergence rate. Second-order algorithms have well-known advantages in dealing with highly nonlinear and ill-conditioned problems. This paper reviews recent developments in stochastic variants of quasi-Newton methods, which construct Hessian approximations using only gradient information. We concentrate on BFGS-based methods in stochastic settings and highlight the algorithmic improvements that enable these methods to work in various scenarios. Future research on stochastic quasi-Newton methods should focus on enhancing their applicability, lowering their computational and storage costs, and improving their convergence rate.

Key words: Stochastic quasi-Newton methods, BFGS, Large-scale machine learning
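
To make concrete the abstract's remark that quasi-Newton methods build Hessian approximations from gradient information alone, the following is a minimal sketch of the classical (deterministic) BFGS update of an inverse-Hessian approximation. It is not taken from the paper: the function name `bfgs_inverse_update`, the list-of-lists matrix representation, and the threshold `eps` are assumptions made for illustration. Skipping the update when the curvature product is not sufficiently positive is one common safeguard, assumed here rather than attributed to any specific method in the review. In a stochastic setting, `y` would typically be a difference of mini-batch gradient estimates rather than exact gradients.

```python
def bfgs_inverse_update(H, s, y, eps=1e-10):
    """Return the BFGS update of a symmetric inverse-Hessian approximation H.

    s is the step x_new - x_old and y is the gradient difference
    grad_new - grad_old; only gradient information is used.
    H is an n x n list of lists, s and y are lists of length n.
    """
    n = len(s)
    ys = sum(yi * si for yi, si in zip(y, s))
    if ys <= eps:
        # Curvature condition y^T s > 0 fails: keep H unchanged (assumed safeguard).
        return [row[:] for row in H]
    rho = 1.0 / ys

    # Hy = H y and the scalar y^T H y.
    Hy = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
    yHy = sum(y[i] * Hy[i] for i in range(n))

    # Expanded form of (I - rho s y^T) H (I - rho y s^T) + rho s s^T,
    # which is valid because H is symmetric.
    coef = rho * rho * yHy + rho
    return [
        [
            H[i][j] - rho * (s[i] * Hy[j] + Hy[i] * s[j]) + coef * s[i] * s[j]
            for j in range(n)
        ]
        for i in range(n)
    ]
```

Starting from the identity matrix and calling the function with a step `s` and a gradient difference `y` returns a matrix that maps `y` back to `s` (the secant condition), which is the property that lets the approximation capture curvature without forming second derivatives.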
