
Table of Contents

    30 June 2023, Volume 11 Issue 2
    Special Issue: Machine Learning and Optimization Algorithm
    Preface
    Tian-De Guo
    2023, 11(2):  243-244.  doi:10.1007/s40305-022-00452-2
    An Overview of Stochastic Quasi-Newton Methods for Large-Scale Machine Learning
    Tian-De Guo, Yan Liu, Cong-Ying Han
    2023, 11(2):  245-275.  doi:10.1007/s40305-023-00453-9
    Numerous intriguing optimization problems arise as a result of the advancement of machine learning. The stochastic first-order method is the predominant choice for those problems due to its high efficiency. However, the negative effects of noisy gradient estimates and the high nonlinearity of the loss function result in a slow convergence rate. Second-order algorithms have their typical advantages in dealing with highly nonlinear and ill-conditioned problems. This paper provides a review of recent developments in stochastic variants of quasi-Newton methods, which construct the Hessian approximations using only gradient information. We concentrate on BFGS-based methods in stochastic settings and highlight the algorithmic improvements that enable the algorithms to work in various scenarios. Future research on stochastic quasi-Newton methods should focus on enhancing their applicability, lowering the computational and storage costs, and improving the convergence rate.
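    The methods surveyed build BFGS-type Hessian approximations from mini-batch gradient differences. As a rough illustration of that idea (not any specific algorithm from the review), the sketch below applies an L-BFGS two-loop recursion to stochastic gradients and forms each curvature pair from gradients evaluated on the same mini-batch, a standard device in the stochastic quasi-Newton literature for keeping the pairs consistent despite sampling noise; grad_fn(x, idx) is an assumed user-supplied mini-batch gradient oracle.

```python
import numpy as np

def two_loop_recursion(grad, s_list, y_list):
    """L-BFGS two-loop recursion: apply the implicit inverse-Hessian
    approximation (built only from gradient differences) to `grad`."""
    q = grad.copy()
    alphas = []
    for s, y in zip(reversed(s_list), reversed(y_list)):
        rho = 1.0 / (y @ s)
        a = rho * (s @ q)
        alphas.append(a)
        q -= a * y
    if s_list:                                # initial scaling H0 = (s'y / y'y) I
        s, y = s_list[-1], y_list[-1]
        q *= (s @ y) / (y @ y)
    for (s, y), a in zip(zip(s_list, y_list), reversed(alphas)):
        rho = 1.0 / (y @ s)
        b = rho * (y @ q)
        q += (a - b) * s
    return q                                  # approximates H^{-1} grad

def stochastic_lbfgs(grad_fn, x0, n_samples, batch=32, lr=0.1, mem=10,
                     iters=200, seed=0):
    """Illustrative stochastic L-BFGS loop; grad_fn(x, idx) is a hypothetical
    mini-batch gradient oracle supplied by the user."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    s_list, y_list = [], []
    for _ in range(iters):
        idx = rng.choice(n_samples, batch, replace=False)
        g = grad_fn(x, idx)
        d = two_loop_recursion(g, s_list, y_list) if s_list else g
        x_new = x - lr * d
        g_new = grad_fn(x_new, idx)           # same batch for the curvature pair
        s, y = x_new - x, g_new - g
        if y @ s > 1e-10:                     # keep only curvature-positive pairs
            s_list.append(s); y_list.append(y)
            if len(s_list) > mem:
                s_list.pop(0); y_list.pop(0)
        x = x_new
    return x
```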
    A Mini-Batch Proximal Stochastic Recursive Gradient Algorithm with Diagonal Barzilai–Borwein Stepsize
    Teng-Teng Yu, Xin-Wei Liu, Yu-Hong Dai, Jie Sun
    2023, 11(2):  277-307.  doi:10.1007/s40305-022-00436-2
    Many machine learning problems can be formulated as minimizing the sum of a function and a non-smooth regularization term. Proximal stochastic gradient methods are popular for solving such composite optimization problems. We propose a mini-batch proximal stochastic recursive gradient algorithm, SRG-DBB, which incorporates the diagonal Barzilai–Borwein (DBB) stepsize strategy to capture the local geometry of the problem. The linear convergence and complexity of SRG-DBB are analyzed for strongly convex functions. We further establish the linear convergence of SRG-DBB under a non-strong convexity condition. Moreover, it is proved that SRG-DBB converges sublinearly in the convex case. Numerical experiments on standard data sets indicate that the performance of SRG-DBB is better than or comparable to that of the proximal stochastic recursive gradient algorithm with best-tuned scalar stepsizes or BB stepsizes. Furthermore, SRG-DBB is superior to some advanced mini-batch proximal stochastic gradient methods.
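    To make the two ingredients named in the abstract concrete, the following minimal sketch combines a recursive (SARAH-type) stochastic gradient estimator with a componentwise diagonal BB-style secant stepsize and an l1 proximal step. The gradient oracles grad_full and grad_batch, the safeguards, and the particular stepsize formula are illustrative assumptions, not the precise SRG-DBB rules from the paper.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t*||.||_1 (an example non-smooth regularizer)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_srg_diag_bb(grad_full, grad_batch, x0, n, lam, epochs=20, batch=64,
                     inner=100, seed=0):
    """Sketch of a proximal stochastic recursive gradient method with a
    diagonal BB-type stepsize; grad_full(x) and grad_batch(x, idx) are assumed
    user-supplied gradients of the smooth part."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    step = np.full_like(x, 1e-2)                 # per-coordinate stepsizes
    x_prev, g_prev = None, None
    for _ in range(epochs):
        g = grad_full(x)                         # full gradient once per epoch
        if x_prev is not None:
            s, y = x - x_prev, g - g_prev
            safe_s = np.where(np.abs(s) > 1e-12, s, 1e-12)
            d = np.clip(y / safe_s, 1e-4, 1e4)   # diagonal secant (BB-type) curvature
            step = 1.0 / d
        x_prev, g_prev = x.copy(), g.copy()
        v = g
        for _ in range(inner):
            x_new = soft_threshold(x - step * v, step * lam)      # proximal step
            idx = rng.choice(n, batch, replace=False)
            v = grad_batch(x_new, idx) - grad_batch(x, idx) + v   # recursive estimator
            x = x_new
    return x
```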
    An Automatic Fuzzy Clustering Algorithm for Discrete Elements
    Tai Vovan, Yen Nguyenhoang, Sang Danh
    2023, 11(2):  309-325.  doi:10.1007/s40305-021-00388-z
    This research proposes a measure called the cluster similar index (CSI) to evaluate the similarity of clusters of discrete elements. The CSI is used as the criterion to build an automatic fuzzy clustering algorithm. This algorithm can simultaneously determine the suitable number of clusters, find the elements in each cluster, give the probability of each element belonging to each cluster, and evaluate the quality of the established clusters. The proposed algorithm can be performed quickly and effectively via the accompanying MATLAB procedure. Several numerical examples illustrate the proposed algorithm and show its advantages over existing ones. Finally, an application of the proposed algorithm to image recognition demonstrates the practical potential of this research.
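    The abstract does not define the CSI itself, but the overall pattern it describes (fit fuzzy partitions for a range of candidate cluster counts, score each partition, and keep the best) can be sketched as below; the quality function is a simple partition coefficient standing in for the paper's CSI, and the fuzzy c-means routine is a generic one rather than the MATLAB procedure mentioned above.

```python
import numpy as np

def fuzzy_cmeans(X, k, m=2.0, iters=100, tol=1e-6, seed=0):
    """Basic fuzzy c-means: returns cluster centres C and the membership
    matrix U, where U[i, c] is the degree to which point i belongs to cluster c."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, k)); U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        W = U ** m
        C = (W.T @ X) / W.sum(axis=0)[:, None]               # weighted centres
        D = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2) + 1e-12
        U_new = 1.0 / (D ** (2.0 / (m - 1.0)))                # standard FCM memberships
        U_new /= U_new.sum(axis=1, keepdims=True)
        if np.abs(U_new - U).max() < tol:
            U = U_new; break
        U = U_new
    return C, U

def auto_fuzzy_clustering(X, k_max=8, quality=None):
    """Automatic choice of the cluster count: fit each candidate k and keep
    the partition with the best quality score.  The default score is the
    partition coefficient, a placeholder for the paper's CSI."""
    if quality is None:
        quality = lambda C, U: (U ** 2).sum() / U.shape[0]
    best_k, best_C, best_U, best_score = None, None, None, -np.inf
    for k in range(2, k_max + 1):
        C, U = fuzzy_cmeans(X, k)
        score = quality(C, U)
        if score > best_score:
            best_k, best_C, best_U, best_score = k, C, U, score
    return best_k, best_C, best_U
```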
    Randomized Algorithms for Orthogonal Nonnegative Matrix Factorization
    Yong-Yong Chen, Fang-Fang Xu
    2023, 11(2):  327-345.  doi:10.1007/s40305-020-00322-9
    Orthogonal nonnegative matrix factorization (ONMF) is widely used in blind image separation, document classification, and human face recognition. The ONMF model can be efficiently solved by the alternating direction method of multipliers and the hierarchical alternating least squares method. When the given matrix is huge, however, the cost of computation and communication is too high, so ONMF becomes challenging in the large-scale setting. Random projection is an efficient method of dimensionality reduction. In this paper, we apply random projection to ONMF and propose two randomized algorithms. Numerical experiments show that the proposed algorithms perform well on both simulated and real data.
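    The dimensionality-reduction step the abstract refers to can be sketched with a standard Gaussian random projection (randomized range finder). The helper names and the lifting convention below are illustrative assumptions; the actual ONMF updates on the compressed matrix, with the orthogonality and nonnegativity constraints, follow the paper's algorithms and are not reproduced here.

```python
import numpy as np

def randomized_range_finder(X, k, oversample=10, power_iters=1, seed=0):
    """Random projection step: find an orthonormal basis Q whose range
    approximately captures the range of X, so that the much smaller matrix
    Q.T @ X can be factorized instead of X itself."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    Omega = rng.standard_normal((n, k + oversample))   # Gaussian test matrix
    Y = X @ Omega
    for _ in range(power_iters):                       # optional power iterations
        Y = X @ (X.T @ Y)                              # sharpen the captured range
    Q, _ = np.linalg.qr(Y)
    return Q                                           # m x (k + oversample), orthonormal columns

def compress_for_onmf(X, rank):
    """Illustrative compression: replace X by B = Q.T @ X.  A factorization of
    B (computed, e.g., by an ONMF solver such as the ADMM or HALS methods
    named in the abstract) can then be lifted back through Q."""
    Q = randomized_range_finder(X, rank)
    return Q, Q.T @ X
```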
    A Framework of Convergence Analysis of Mini-batch Stochastic Projected Gradient Methods
    Jian Gu, Xian-Tao Xiao
    2023, 11(2):  347-369.  doi:10.1007/s40305-019-00276-7
    In this paper, we establish a unified framework to study the almost sure global convergence and the expected convergence rates of a class of mini-batch stochastic (projected) gradient (SG) methods, including two popular types of SG: stepsize-diminished SG and batch-size-increased SG. We also show that the standard assumption of uniformly bounded variance, which is frequently used in the literature to investigate the convergence of SG, is actually not required when the gradient of the objective function is Lipschitz continuous. Finally, we show that our framework can also be used to analyze the convergence of a mini-batch stochastic extragradient method for stochastic variational inequalities.
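    As a concrete, purely illustrative instance of the two SG variants named above, the sketch below runs a mini-batch stochastic projected gradient method with projection onto a Euclidean ball, either shrinking the stepsize at a fixed batch size or growing the batch size at a fixed stepsize; grad_batch(x, idx) is an assumed mini-batch gradient oracle and the particular schedules are placeholders, not the ones analyzed in the paper.

```python
import numpy as np

def project_ball(x, radius=1.0):
    """Euclidean projection onto the ball ||x|| <= radius (example convex set)."""
    nrm = np.linalg.norm(x)
    return x if nrm <= radius else x * (radius / nrm)

def minibatch_spg(grad_batch, x0, n, iters=500, mode="diminish", seed=0):
    """Mini-batch stochastic projected gradient sketch: 'diminish' keeps the
    batch size fixed and shrinks the stepsize; 'grow' keeps the stepsize fixed
    and increases the batch size."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for k in range(1, iters + 1):
        if mode == "diminish":
            batch, step = 32, 1.0 / k              # stepsize-diminished SG
        else:
            batch, step = min(n, 8 * k), 0.05      # batch-size-increased SG
        idx = rng.choice(n, min(batch, n), replace=False)
        g = grad_batch(x, idx)
        x = project_ball(x - step * g)             # projected gradient step
    return x
```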
    A Gradient Descent Method for Estimating the Markov Chain Choice Model
    Lei Fu, Dong-Dong Ge
    2023, 11(2):  371-381.  doi:10.1007/s40305-021-00365-6
    In this paper, we propose a gradient descent method to estimate the parameters in a Markov chain choice model. In particular, we derive a closed-form formula for the gradient of the log-likelihood function and show the convergence of the algorithm. Numerical experiments verify the efficiency of our approach by comparison with the expectation-maximization algorithm. We also show that the result extends to the more general case in which no-purchase data are not observed.
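    For orientation, the sketch below computes the Markov chain choice model's purchase probabilities by solving the absorbing-chain linear system and evaluates the resulting log-likelihood; the data layout and variable names are assumptions. The paper's contribution is a closed-form gradient of this log-likelihood, together with a convergence guarantee, which would take the place of numerical or automatic differentiation inside a (projected) gradient descent loop.

```python
import numpy as np

def choice_probs(lam, rho, offered):
    """Purchase probabilities in the Markov chain choice model: a customer
    first considers product i with probability lam[i]; if i is not offered,
    she moves to product j with probability rho[i, j] (rows of rho may sum to
    less than 1, the remainder being the probability of leaving).  Offered
    products absorb the chain."""
    n = len(lam)
    T = [i for i in range(n) if not offered[i]]   # transient (unoffered) products
    S = [i for i in range(n) if offered[i]]       # absorbing (offered) products
    if T:
        P_TT = rho[np.ix_(T, T)]
        v = np.linalg.solve(np.eye(len(T)) - P_TT.T, lam[T])  # expected visits to T
    probs = {}
    for j in S:
        probs[j] = lam[j] + (v @ rho[T, j] if T else 0.0)
    return probs

def log_likelihood(lam, rho, data):
    """data: iterable of (offered_mask, purchased_index) observations."""
    return sum(np.log(choice_probs(lam, rho, offered)[j] + 1e-12)
               for offered, j in data)
```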
    Approximate Customized Proximal Point Algorithms for Separable Convex Optimization
    Hong-Mei Chen, Xing-Ju Cai, Ling-Ling Xu
    2023, 11(2):  383-408.  doi:10.1007/s40305-022-00412-w
    The proximal point algorithm (PPA) is a useful algorithmic framework with good convergence properties. The main difficulty is that its subproblems usually admit only iterative solutions. In this paper, we propose an inexact customized PPA framework for the two-block separable convex optimization problem with linear constraints. We design two types of inexact error criteria for the subproblems. The first is an absolutely summable error criterion, under which both subproblems can be solved inexactly. When one of the two subproblems is easy to solve, we propose another, more easily implementable criterion that involves only a single parameter: a relative error criterion. We establish global convergence and a sublinear convergence rate in the ergodic sense for the proposed algorithms. Numerical experiments on LASSO regression problems and a total-variation-based image denoising problem illustrate that our new algorithms outperform the corresponding exact algorithms.
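    The flavor of the absolutely summable error criterion, on a LASSO-type test problem, can be sketched as follows: the smooth subproblem is solved only to a tolerance tol_k with a finite sum over k, while the l1 subproblem is solved exactly by soft-thresholding. This is a generic inexact splitting sketch under those assumptions, not the paper's customized PPA iteration or its relative error criterion.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t*||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def inexact_splitting_lasso(A, b, lam, beta=1.0, outer=200):
    """Illustrative inexact splitting method for
    min 0.5*||Ax - b||^2 + lam*||y||_1  s.t.  x - y = 0,
    with the x-subproblem solved approximately to a summable tolerance."""
    m, n = A.shape
    x = np.zeros(n); y = np.zeros(n); z = np.zeros(n)      # z: dual multiplier
    L = np.linalg.norm(A, 2) ** 2 + beta                   # Lipschitz constant of x-subproblem
    for k in range(1, outer + 1):
        tol = 1.0 / k ** 2                                 # absolutely summable tolerance
        # inexact x-subproblem: min_x 0.5*||Ax-b||^2 + z'(x-y) + (beta/2)*||x-y||^2
        for _ in range(100):
            grad = A.T @ (A @ x - b) + beta * (x - y) + z
            if np.linalg.norm(grad) <= tol:
                break
            x = x - grad / L
        # exact y-subproblem: soft-thresholding
        y = soft_threshold(x + z / beta, lam / beta)
        # dual update for the constraint x - y = 0
        z = z + beta * (x - y)
    return x, y
```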