Analyze Accelerated Mirror Descent via High-Resolution ODEs

  1 Institute of Computational Mathematics and Scientific/Engineering Computing, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China;
  2 University of Chinese Academy of Sciences, Beijing 100049, China

Received date: 2023-08-10

Revised date: 2024-03-10

Online published: 2025-12-19

Supported by

This work was partially supported by the National Natural Science Foundation of China (No. 12288201).

Abstract

Mirror descent, which can be viewed as a generalization of gradient descent for solving constrained optimization problems, has found a variety of applications in many fields. With the growing demand for solving high-dimensional constrained optimization problems, an accelerated form of mirror descent has been proposed, and its corresponding low-resolution ordinary differential equation (ODE) framework has been studied. However, the low-resolution ODEs are unable to distinguish between Polyak's heavy-ball method and Nesterov's accelerated gradient method, and the same problem arises with the low-resolution ODEs for accelerated mirror descent. To address this issue, we derive the high-resolution ODEs for accelerated mirror descent and propose a general Lyapunov function framework to analyze its convergence rates in both continuous and discrete time. Furthermore, using the high-resolution ODE framework, we show that accelerated mirror descent minimizes the squared gradient norm at an inverse cubic rate. Finally, we extend the high-resolution ODE framework for accelerated mirror descent to analyze accelerated higher-order mirror descent and obtain finer convergence results.
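As a concrete illustration of the algorithms the abstract refers to, below is a minimal Python sketch of entropic mirror descent on the probability simplex, together with a Nesterov-style accelerated variant in the spirit of Krichene et al. (2015). The mirror map, step sizes, coupling parameter r, and the toy quadratic objective are illustrative assumptions, not the paper's algorithms or parameter choices.

    import numpy as np

    def entropic_mirror_descent(grad, x0, step=0.1, iters=200):
        # Mirror descent on the probability simplex with the negative-entropy
        # mirror map; the update is the multiplicative-weights rule, which is
        # the closed-form Bregman step for this mirror map.
        x = np.asarray(x0, dtype=float)
        for _ in range(iters):
            x = x * np.exp(-step * grad(x))
            x /= x.sum()  # renormalize back onto the simplex
        return x

    def accelerated_mirror_descent(grad, x0, step=0.05, iters=200, r=3.0):
        # Illustrative Nesterov-style acceleration: an averaged primal
        # sequence x is coupled to a dual mirror sequence z through the
        # vanishing weight r / (r + k); step sizes and r are assumptions,
        # and no convergence guarantee is claimed for this sketch.
        x = np.asarray(x0, dtype=float)
        z = x.copy()
        for k in range(1, iters + 1):
            lam = r / (r + k)
            y = lam * z + (1.0 - lam) * x        # query point mixing both sequences
            g = grad(y)
            z = z * np.exp(-(k * step / r) * g)  # mirror step with growing weight
            z /= z.sum()
            x = lam * z + (1.0 - lam) * x        # primal averaging step
        return x

    # Toy problem: minimize f(x) = 0.5 * ||x - c||^2 over the simplex; since c
    # already lies on the simplex, the constrained minimizer is c itself.
    c = np.array([0.2, 0.5, 0.3])
    grad = lambda x: x - c
    print(entropic_mirror_descent(grad, np.ones(3) / 3))
    print(accelerated_mirror_descent(grad, np.ones(3) / 3))

The accelerated loop couples the primal average x with the mirror sequence z; it is the continuous-time limit of this coupling that yields the low-resolution ODE of Krichene et al., whereas a high-resolution derivation retains the O(sqrt(s)) correction terms that such a limit discards.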
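The distinguishability issue raised in the abstract can be recalled from the Euclidean, strongly convex setting of Shi et al. (2021); the display below is background from that literature, not this paper's mirror-descent derivation. With step size $s$ and strong-convexity parameter $\mu$, Polyak's heavy-ball method and Nesterov's method share one low-resolution ODE, while their high-resolution ODEs differ by a Hessian-driven damping term:

    \ddot{X}(t) + 2\sqrt{\mu}\,\dot{X}(t) + \nabla f(X(t)) = 0    (low-resolution, both methods)

    \ddot{X}(t) + 2\sqrt{\mu}\,\dot{X}(t) + (1 + \sqrt{\mu s})\,\nabla f(X(t)) = 0    (high-resolution, heavy ball)

    \ddot{X}(t) + 2\sqrt{\mu}\,\dot{X}(t) + \sqrt{s}\,\nabla^2 f(X(t))\,\dot{X}(t) + (1 + \sqrt{\mu s})\,\nabla f(X(t)) = 0    (high-resolution, Nesterov)

The gradient-correction term $\sqrt{s}\,\nabla^2 f(X)\dot{X}$ survives only in the high-resolution limit of Nesterov's method; it is the analogue of this term for the mirror-descent dynamics that the paper's Lyapunov analysis exploits.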

Cite this article

Ya-Xiang Yuan, Yi Zhang. Analyze Accelerated Mirror Descent via High-Resolution ODEs[J]. Journal of the Operations Research Society of China, 2025, 13(4): 873-899. DOI: 10.1007/s40305-024-00542-3
