Polar Decomposition-based Algorithms on the Product of Stiefel Manifolds with Applications in Tensor Approximation

1. Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen 518172, Guangdong, China
2. Department of Industrial and Systems Engineering, University of Minnesota, Minneapolis, MN 55455, USA

Received date: 2021-10-22
Revised date: 2023-01-10
Online published: 2024-12-12

Supported by

This work was partly supported by the National Natural Science Foundation of China (No. 11601371) and the Guangdong Basic and Applied Basic Research Foundation (No. 2021A1515010232).

Abstract

In this paper, we propose a general algorithmic framework, based on the matrix polar decomposition, to solve a class of optimization problems on the product of complex Stiefel manifolds. We establish the weak convergence, global convergence and linear convergence properties of this general algorithmic approach using the Łojasiewicz gradient inequality and the Morse-Bott property. This general algorithmic approach and its convergence results are applied to the simultaneous approximate tensor diagonalization problem and the simultaneous approximate tensor compression problem, which include as special cases the low rank orthogonal approximation, best rank-1 approximation and low multilinear rank approximation of higher order complex tensors. We also present a variant of this general algorithmic framework to solve a symmetric version of this class of optimization models, which essentially optimizes over a single Stiefel manifold, and we establish its weak convergence, global convergence and linear convergence properties in a similar way. This symmetric variant and its convergence results are applied to the simultaneous approximate symmetric tensor diagonalization problem, which includes as special cases the low rank symmetric orthogonal approximation and best symmetric rank-1 approximation of higher order complex symmetric tensors. It turns out that well-known algorithms such as LROAT, S-LROAT, HOPM and S-HOPM are all special cases of this general algorithmic framework and its symmetric variant, and our convergence results subsume the results found in the literature designed for those special cases. All the algorithms and convergence results in this paper are straightforwardly applicable to the real case.
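The computational primitive behind the framework described above is the orthogonal (unitary) polar factor: for a matrix A with polar decomposition A = QH, the factor Q maximizes Re tr(Xᴴ A) over the Stiefel manifold and is obtained from the SVD. Below is a minimal real-valued NumPy sketch illustrating this primitive on HOPM, one of the special cases named in the abstract; the function names `polar_factor` and `hopm`, the random initialization, and the iteration count are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def polar_factor(a):
    # Orthogonal factor Q of the polar decomposition A = Q H, computed
    # from the thin SVD A = U S V^T as Q = U V^T. Q maximizes tr(X^T A)
    # over matrices X with orthonormal columns (a Stiefel manifold).
    u, _, vh = np.linalg.svd(a, full_matrices=False)
    return u @ vh

def hopm(t, iters=200, seed=0):
    # Higher-order power method (HOPM) for the best rank-1 approximation
    # of a real third-order tensor t. Each unit vector is a one-column
    # Stiefel point, and each normalization step is a polar-factor update
    # applied to the corresponding contracted gradient block.
    rng = np.random.default_rng(seed)
    u = polar_factor(rng.standard_normal((t.shape[0], 1)))
    v = polar_factor(rng.standard_normal((t.shape[1], 1)))
    w = polar_factor(rng.standard_normal((t.shape[2], 1)))
    for _ in range(iters):
        u = polar_factor(np.einsum('ijk,jl,kl->il', t, v, w))
        v = polar_factor(np.einsum('ijk,il,kl->jl', t, u, w))
        w = polar_factor(np.einsum('ijk,il,jl->kl', t, u, v))
    # Scalar weight of the rank-1 approximant sigma * (u x v x w).
    sigma = np.einsum('ijk,il,jl,kl->l', t, u, v, w)[0]
    return sigma, u, v, w
```

For a one-column block the polar factor reduces to vector normalization, which is exactly the classical HOPM update; for wider blocks the same `polar_factor` call gives the block updates used for the low rank orthogonal approximation and tensor compression problems.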

Cite this article

Jian-Ze Li, Shu-Zhong Zhang. Polar Decomposition-based Algorithms on the Product of Stiefel Manifolds with Applications in Tensor Approximation [J]. Journal of the Operations Research Society of China, 2024, 12(4): 874-920. DOI: 10.1007/s40305-023-00462-8
