Journal of the Operations Research Society of China, 2023, Vol. 11, Issue 2: 277-307. doi: 10.1007/s40305-022-00436-2

• Special Issue: Machine Learning and Optimization Algorithm

A Mini-Batch Proximal Stochastic Recursive Gradient Algorithm with Diagonal Barzilai–Borwein Stepsize

Teng-Teng Yu1,3, Xin-Wei Liu2, Yu-Hong Dai3, Jie Sun2,4   

    1 School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China;
    2 Institute of Mathematics, Hebei University of Technology, Tianjin 300401, China;
    3 LSEC, ICMSEC, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China;
    4 School of Business, National University of Singapore, Singapore 119245, Singapore
  • Received: 2021-11-13  Revised: 2022-05-06  Online: 2023-06-30  Published: 2023-05-24
  • Contact: Xin-Wei Liu, Teng-Teng Yu, Yu-Hong Dai, Jie Sun  E-mail: mathlxw@hebut.edu.cn; ytt2021@lsec.cc.ac.cn; dyh@lsec.cc.ac.cn; jsun@nus.edu.sg
  • Supported by:
    the National Natural Science Foundation of China (Nos. 11671116, 11701137, 12071108, 11991020, 11991021 and 12021001), the Major Research Plan of the NSFC (No. 91630202), the Strategic Priority Research Program of Chinese Academy of Sciences (No. XDA27000000), and the Natural Science Foundation of Hebei Province (No. A2021202010)

Abstract: Many machine learning problems can be formulated as minimizing the sum of a smooth function and a non-smooth regularization term. Proximal stochastic gradient methods are popular for solving such composite optimization problems. We propose a mini-batch proximal stochastic recursive gradient algorithm, SRG-DBB, which incorporates the diagonal Barzilai–Borwein (DBB) stepsize strategy to capture the local geometry of the problem. The linear convergence and complexity of SRG-DBB are analyzed for strongly convex functions. We further establish the linear convergence of SRG-DBB under a non-strong convexity condition. Moreover, it is proved that SRG-DBB converges sublinearly in the convex case. Numerical experiments on standard data sets indicate that the performance of SRG-DBB is better than or comparable to that of the proximal stochastic recursive gradient algorithm with best-tuned scalar stepsizes or BB stepsizes. Furthermore, SRG-DBB is superior to some advanced mini-batch proximal stochastic gradient methods.
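To illustrate the ingredients named in the abstract, the following is a minimal sketch, assuming an l1-regularized logistic-regression objective, a SARAH-style recursive mini-batch gradient estimator, and a simplified per-coordinate BB rule d_i = s_i / y_i with box safeguards. The function names (prox_sarah_dbb, soft_threshold, logistic_grad), hyperparameters, and safeguard constants are illustrative assumptions and not taken from the paper; this is not the authors' exact SRG-DBB method or DBB formula.

import numpy as np

def soft_threshold(z, tau):
    # Proximal operator of tau * ||.||_1 (elementwise soft-thresholding).
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def logistic_grad(w, A, b):
    # Gradient of the average logistic loss over rows of A with labels b in {-1, +1}.
    z = np.clip(-b * (A @ w), -50.0, 50.0)
    return A.T @ (-b / (1.0 + np.exp(-z))) / A.shape[0]

def prox_sarah_dbb(A, b, lam=1e-3, epochs=20, inner=100, batch=16, seed=0):
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)
    step = np.ones(d)                       # diagonal stepsize, one entry per coordinate
    w_anchor, g_anchor = None, None
    for _ in range(epochs):
        g_full = logistic_grad(w, A, b)     # full gradient at the epoch anchor
        if w_anchor is not None:
            # Simplified per-coordinate BB rule d_i = s_i / y_i with safeguards
            # (the paper's DBB formula and safeguards may differ).
            s, y = w - w_anchor, g_full - g_anchor
            with np.errstate(divide="ignore", invalid="ignore"):
                bb = np.where(np.abs(y) > 1e-10, s / y, step)
            step = np.clip(bb, 1e-4, 1e2)   # keep the diagonal positive and bounded
        w_anchor, g_anchor = w.copy(), g_full.copy()
        v, w_old = g_full, w.copy()
        w = soft_threshold(w - step * v, step * lam)
        for _ in range(inner):
            idx = rng.choice(n, size=batch, replace=False)
            # SARAH-style recursive mini-batch gradient estimator.
            v = logistic_grad(w, A[idx], b[idx]) - logistic_grad(w_old, A[idx], b[idx]) + v
            w_old = w.copy()
            w = soft_threshold(w - step * v, step * lam)
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((500, 50))
    w_true = np.zeros(50); w_true[:5] = 1.0
    b = np.sign(A @ w_true + 0.1 * rng.standard_normal(500))
    w_hat = prox_sarah_dbb(A, b)
    print("nonzero coefficients recovered:", int(np.count_nonzero(np.abs(w_hat) > 1e-3)))

In this sketch the diagonal stepsize is refreshed once per outer epoch from differences of consecutive full gradients, mirroring how BB-type stepsizes are typically computed in stochastic settings, while the inner loop performs proximal recursive-gradient updates with that fixed diagonal.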

Key words: Stochastic recursive gradient, Proximal gradient algorithm, Barzilai–Borwein method, Composite optimization
