Machine Learning

Set Function Optimization

Wei-Li Wu, Zhao Zhang, Ding-Zhu Du

Journal of the Operations Research Society of China 2019, 7 (2): 183-193. DOI: 10.1007/s40305-018-0233-3

Abstract

This article is an introduction to recent developments in optimization theory on set functions, namely nonsubmodular optimization. It covers two interesting results, the DS (difference of submodular) function decomposition and the sandwich theorem, together with the iterated sandwich method and data-dependent approximation. Some potential research problems are also mentioned.
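As a rough illustration of the submodular/nonsubmodular distinction this abstract relies on, the sketch below brute-forces the diminishing-returns test on a tiny ground set. The names `is_submodular`, `coverage`, `squared_size` and the `COVER` data are hypothetical and are not taken from the article; the check is exponential in the ground-set size and is only meant for toy inputs.

```python
from itertools import combinations

def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = list(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_submodular(f, ground):
    """Brute-force diminishing-returns check:
    f(A + e) - f(A) >= f(B + e) - f(B) for all A <= B and e outside B."""
    all_sets = list(subsets(ground))
    for a in all_sets:
        for b in all_sets:
            if not a <= b:
                continue
            for e in ground - b:
                gain_small = f(a | {e}) - f(a)
                gain_large = f(b | {e}) - f(b)
                if gain_small < gain_large - 1e-12:
                    return False
    return True

# Hypothetical toy data: each named set covers some elements.
COVER = {"s1": {1, 2}, "s2": {2, 3}, "s3": {3, 4, 5}}
GROUND = frozenset(COVER)

def coverage(selected):
    """Number of elements covered by the selected sets (a classic submodular function)."""
    covered = set()
    for name in selected:
        covered |= COVER[name]
    return len(covered)

def squared_size(selected):
    """|S|^2 has increasing marginal gains, so it is not submodular."""
    return len(selected) ** 2

print(is_submodular(coverage, GROUND))
print(is_submodular(squared_size, GROUND))
```

Functions such as `squared_size`, which fail this test, are the kind of objective for which tools like DS decomposition and sandwich bounds are intended.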

Layer-Wise Pre-Training Low-Rank NMF Model for Mammogram-Based Breast Tumor Classification

Wen-Ming Wu, Xiao-Hui Yang, Yun-Mei Chen, Juan Zhang, Dan Long, Li-Jun Yang, Chen-Xi Tian

Journal of the Operations Research Society of China 2019, 7 (4): 515-537. DOI: 10.1007/s40305-019-00262-z

Abstract

Image-based breast tumor classification is an active and challenging problem. In this paper, a robust breast tumor classification framework is presented based on deep feature representation learning and exploiting available information in existing samples. Feature representation learning of mammograms is fulfilled by a modified nonnegative matrix factorization model called LPML-LRNMF, which is motivated by hierarchical learning and layer-wise pre-training (LP) strategy in deep learning. Low-rank (LR) constraint is integrated into the feature representation learning model by considering the intrinsic characteristics of mammograms. Moreover, the proposed LPML-LRNMF model is optimized via alternating direction method of multipliers and the corresponding convergence is analyzed. For completing classification, an inverse projection sparse representation model is introduced to exploit information embedded in existing samples, especially in test ones. Experiments on the public dataset and actual clinical dataset show that the classification accuracy, specificity and sensitivity achieve the clinical acceptance level.

Quadratic Kernel-Free Least Square Twin Support Vector Machine for Binary Classification Problems

Qian-Qian Gao, Yan-Qin Bai, Ya-Ru Zhan

Journal of the Operations Research Society of China 2019, 7 (4): 539-559. DOI: 10.1007/s40305-018-00239-4

Abstract

In this paper, a new quadratic kernel-free least square twin support vector machine (QLSTSVM) is proposed for binary classification problems. The advantage of QLSTSVM is that there is no need to select the kernel function and related parameters for nonlinear classification problems. After applying a consensus technique, we adopt the alternating direction method of multipliers to solve the reformulated consensus QLSTSVM directly. To reduce CPU time, the Karush-Kuhn-Tucker (KKT) conditions are also used to solve the QLSTSVM. The performance of QLSTSVM is tested on two artificial datasets and several University of California Irvine (UCI) benchmark datasets. Numerical results indicate that the QLSTSVM may outperform several existing methods for solving twin support vector machine with Gaussian kernel in terms of the classification accuracy and operation time.

Truncated Fractional-Order Total Variation Model for Image Restoration

Raymond Honfu Chan, Hai-Xia Liang

Journal of the Operations Research Society of China 2019, 7 (4): 561-578. DOI: 10.1007/s40305-019-00250-3

Abstract

The fractional-order derivative is attracting more and more interest from researchers working on image processing because it helps to preserve more texture than total variation when noise is removed. In existing works, the Grünwald-Letnikov fractional-order derivative is usually used, where only the homogeneous Dirichlet boundary condition can be considered and therefore the full lower triangular Toeplitz matrix is generated as the discrete partial fractional-order derivative operator. In this paper, a modified truncation is considered in generating the discrete fractional-order partial derivative operator and a truncated fractional-order total variation (tFoTV) model is proposed for image restoration. The aim is twofold: first, any boundary condition can be used in the numerical experiments; second, the accuracy of the images reconstructed by the tFoTV model can be improved. The alternating direction method of multipliers is applied to solve the tFoTV model. Its convergence is also analyzed briefly. In the numerical experiments, we apply the tFoTV model to recover images that are corrupted by blur and noise. The numerical results show that the tFoTV model provides better reconstruction in peak signal-to-noise ratio (PSNR) than the full fractional-order variation and total variation models. From the numerical results, we can also see that the tFoTV model is comparable with the total generalized variation (TGV) model in accuracy. In addition, we can roughly fix a fractional order according to the structure of the image, and therefore, there is only one parameter left to determine in the tFoTV model, while there are always two parameters to be fixed in the TGV model.

Generative Adversarial Networks with Joint Distribution Moment Matching

Yi-Ying Zhang, Chao-Min Shen, Hao Feng, Preston Thomas Fletcher, Gui-Xu Zhang

Journal of the Operations Research Society of China 2019, 7 (4): 579-597. DOI: 10.1007/s40305-019-00248-x

Abstract

Generative adversarial networks (GANs) have shown impressive power in the field of machine learning. Traditional GANs have focused on unsupervised learning tasks. In recent years, conditional GANs that can generate data with labels have been proposed in semi-supervised learning and have achieved better image quality than traditional GANs. Conditional GANs, however, generally only minimize the difference between marginal distributions of real and generated data, neglecting the difference with respect to each class of the data. To address this challenge, we propose the GAN with joint distribution moment matching (JDMM-GAN) for matching the joint distribution based on maximum mean discrepancy, which minimizes the differences of both the marginal and conditional distributions. The learning procedure is iteratively conducted by the stochastic gradient descent and back-propagation. We evaluate JDMM-GAN on several benchmark datasets, including MNIST, CIFAR-10 and the Extended Yale Face. Compared with the state-of-the-art GANs, JDMM-GAN generates more realistic images and achieves the best inception score for CIFAR-10 dataset.
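The maximum mean discrepancy (MMD) named in this abstract can be sketched in a few lines. The example below is a minimal, biased estimate of squared MMD with a Gaussian kernel on small hand-made samples; the function names, the bandwidth and the sample points are assumptions for illustration, and the sketch does not reproduce how JDMM-GAN applies MMD to joint distributions of data and labels.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between two non-empty samples.

    MMD^2 = mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)
    """
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

# Hypothetical 2-D samples: `close` overlaps `real`, `far` does not.
real = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0)]
close = [(0.1, 0.0), (0.0, 0.2), (-0.2, 0.1)]
far = [(3.0, 3.1), (2.9, 3.0), (3.2, 2.8)]

print(mmd_squared(real, close))
print(mmd_squared(real, far))
```

A sample that overlaps the reference sample gives a small value and a distant sample gives a larger one, which is the property that makes MMD usable as a distribution-matching loss.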

Longitudinal Image Analysis via Path Regression on the Image Manifold

Shi-Hui Ying, Xiao-Fang Zhang, Ya-Xin Peng, Ding-Gang Shen

Journal of the Operations Research Society of China 2019, 7 (4): 599-614. DOI: 10.1007/s40305-019-00251-2

Abstract

Longitudinal image analysis plays an important role in depicting the development of the brain structure, where image regression and interpolation are two commonly used techniques. In this paper, we develop an efficient model and approach based on a path regression on the image manifold instead of the geodesic regression to avoid the complexity of the geodesic computation. Concretely, first, we model the deformation by a diffeomorphism; a large deformation is then represented by a path on the orbit of the diffeomorphism group action. This path is obtained by composing several small deformations, each of which can be well approximated by its linearization. Second, we introduce some intermediate images as constraints to the model, which guide the formation of the best-fitting path. Third, we propose an approximated quadratic model by a local linearization method, for which a closed-form solution is deduced; this speeds up the algorithm. Finally, we evaluate the proposed model and algorithm on synthetic data and real longitudinal MRI data. The results show that our proposed method outperforms several state-of-the-art methods.

Evolution Model Based on Prior Information for Narrow Joint Segmentation

Xin Wang, Shuai Xu, Zhen Ye, Chao-Zheng Zhou, Jing Qin

Journal of the Operations Research Society of China 2019, 7 (4): 629-642. DOI: 10.1007/s40305-019-00265-w

Abstract

Automated segmentation of hip joint computed tomography images is significantly important in the diagnosis and treatment of hip joint disease. In this paper, we propose an automatic hip joint segmentation method based on a variational model guided by prior information. In particular, we obtain prior features by automatic sample selection, get a discriminative function by training these selected samples and then integrate this prior information into our variational model. Numerical results demonstrate that the proposed method has high accuracy in segmenting narrow joint regions.

Latent Local Feature Extraction for Low-Resolution Virus Image Classification

Zhi-Jie Wen, Zhi-Hu Liu, Yi-Chen Zong, Bao-Jun Li

Journal of the Operations Research Society of China 2020, 8 (1): 117-132. DOI: 10.1007/s40305-018-0212-8

Abstract

Virus image classification is a significant and challenging issue in both clinical virology and medical image processing. Because the virus images in the original dataset are of low resolution, it is difficult to extract useful features from such poor-quality images with traditional feature extraction methods. In this paper, we propose an effective and robust method, which eliminates the drawbacks of traditional local feature extraction methods and conducts latent local texture feature extraction in order to improve the accuracy of virus image classification. First, multi-scale principal component analysis (PCA) filters are learned from all original images. Then, a scale space is established for each PCA-filtered image by a 2D Gaussian function. Finally, some typical feature descriptors are employed to extract texture features from all images, which include the original image and its images filtered by PCA and Gaussian filters. Aiming at the classification of low-resolution images, the proposed method solves the difficulty in extracting the essential feature from the original image and captures its latent and principal texture information from different perspectives in different filtered images. Experimental results show that the classification accuracy of the proposed method is much higher than that of state-of-the-art methods on the same low-resolution virus dataset, reaching 88.00%.

A Brief Introduction to Manifold Optimization

Jiang Hu, Xin Liu, Zai-Wen Wen, Ya-Xiang Yuan

Journal of the Operations Research Society of China 2020, 8 (2): 199-248. DOI: 10.1007/s40305-020-00295-9

Abstract

Manifold optimization is ubiquitous in computational and applied mathematics, statistics, engineering, machine learning, physics, chemistry, etc. One of the main challenges usually is the non-convexity of the manifold constraints. By utilizing the geometry of manifold, a large class of constrained optimization problems can be viewed as unconstrained optimization problems on manifold. From this perspective, intrinsic structures, optimality conditions and numerical algorithms for manifold optimization are investigated. Some recent progress on the theoretical results of manifold optimization is also presented.

Optimization for Deep Learning: An Overview

Ruo-Yu Sun

Journal of the Operations Research Society of China 2020, 8 (2): 249-294. DOI: 10.1007/s40305-020-00309-6

Abstract

Optimization is a critical component in deep learning. We think optimization for neural networks is an interesting topic for theoretical research due to various reasons. First, its tractability despite non-convexity is an intriguing question and may greatly expand our understanding of tractable problems. Second, classical optimization theory is far from enough to explain many phenomena. Therefore, we would like to understand the challenges and opportunities from a theoretical perspective and review the existing research in this field. First, we discuss the issue of gradient explosion/vanishing and the more general issue of undesirable spectrum and then discuss practical solutions including careful initialization, normalization methods and skip connections. Second, we review generic optimization methods used in training neural networks, such as stochastic gradient descent and adaptive gradient methods, and existing theoretical results. Third, we review existing research on the global issues of neural network training, including results on global landscape, mode connectivity, lottery ticket hypothesis and neural tangent kernel.

A Review on Deep Learning in Medical Image Reconstruction

Hai-Miao Zhang, Bin Dong

Journal of the Operations Research Society of China 2020, 8 (2): 311-340. DOI: 10.1007/s40305-019-00287-4

Abstract

Medical imaging is crucial in modern clinics to provide guidance to the diagnosis and treatment of diseases. Medical image reconstruction is one of the most fundamental and important components of medical imaging, whose major objective is to acquire high-quality medical images for clinical usage at the minimal cost and risk to the patients. Mathematical models in medical image reconstruction or, more generally, image restoration in computer vision have been playing a prominent role. Earlier mathematical models are mostly designed by human knowledge or hypothesis on the image to be reconstructed, and we shall call these models handcrafted models. Later, handcrafted plus data-driven modeling started to emerge which still mostly relies on human designs, while part of the model is learned from the observed data. More recently, as more data and computation resources are made available, deep learning based models (or deep models) pushed the data-driven modeling to the extreme where the models are mostly based on learning with minimal human designs. Both handcrafted and data-driven modeling have their own advantages and disadvantages. Typical handcrafted models are well interpretable with solid theoretical supports on the robustness, recoverability, complexity, etc., whereas they may not be flexible and sophisticated enough to fully leverage large data sets. Data-driven models, especially deep models, on the other hand, are generally much more flexible and effective in extracting useful information from large data sets, while they are currently still in lack of theoretical foundations. Therefore, one of the major research trends in medical imaging is to combine handcrafted modeling with deep modeling so that we can enjoy benefits from both approaches. The major part of this article is to provide a conceptual review of some recent works on deep modeling from the unrolling dynamics viewpoint. 
This viewpoint stimulates new designs of neural network architectures with inspirations from optimization algorithms and numerical differential equations. Given the popularity of deep modeling, there are still vast remaining challenges in the field, as well as opportunities which we shall discuss at the end of this article.

How Can Machine Learning and Optimization Help Each Other Better?

Zhou-Chen Lin

Journal of the Operations Research Society of China 2020, 8 (2): 341-351. DOI: 10.1007/s40305-019-00285-6

Abstract

Optimization is an indispensable part of machine learning as machine learning needs to solve mathematical models efficiently. On the other hand, machine learning can also provide new momenta and new ideas for optimization. This paper aims at investigating how to make the interactions between optimization and machine learning more effective.

Proposal of Japanese Vocabulary Difficulty Level Dictionaries for Automated Essay Scoring Support System Using Rubric

Megumi Yamamoto, Nobuo Umemura, Hiroyuki Kawano

Journal of the Operations Research Society of China 2020, 8 (4): 601-617. DOI: 10.1007/s40305-019-00270-z

Abstract

We are developing a Moodle plug-in, which is an AES (automated essay scoring) support system for the basic education of university students. Our system evaluates essays based on a rubric, which has five evaluation viewpoints: “Contents, Structure, Evidence, Style, and Skill”. Vocabulary level is one of the scoring items of Skill. It is calculated using Japanese Language Learners’ Dictionaries constructed by Sunakawa et al. Since these do not fully cover the words used in student-level essays, we found that there is a problem with the accuracy of the vocabulary level scoring. In this paper, we propose to construct comprehensive Japanese vocabulary difficulty level dictionaries using Japanese Wikipedia as the corpus. We apply Latent Dirichlet Allocation (LDA) to the Wikipedia corpus and find the word appearance probability as one of the indexes of word difficulty. We use the TF-IDF value instead of the LDA value for words that rarely appear. As a result, we constructed highly comprehensive Japanese vocabulary difficulty level dictionaries. We confirmed that the vocabulary level can be scored for all words in the test dataset by using the constructed dictionaries.
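For readers unfamiliar with the TF-IDF value mentioned in this abstract, the following minimal sketch computes it for a toy corpus. It assumes the plain variant (term frequency times log of inverse document frequency); the paper may use a different weighting, and the function name `tf_idf` and the English toy documents are hypothetical stand-ins for the Japanese Wikipedia corpus.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {word: tf-idf} dict per tokenized, non-empty document.

    tf  = count of the word in the document / document length
    idf = log(number of documents / number of documents containing the word)
    """
    n_docs = len(documents)
    doc_freq = Counter(word for doc in documents for word in set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        scores.append({
            word: (count / len(doc)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return scores

# Hypothetical toy corpus (already tokenized).
docs = [
    ["optimization", "is", "useful"],
    ["learning", "is", "useful"],
    ["optimization", "is", "hard"],
]
scores = tf_idf(docs)
print(scores[2])
```

A word that appears in every document receives a score of zero, while a word confined to a single document scores highest, which is why such a value can serve as a rough rarity signal.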

Forecasting Daily Electric Load by Applying Artificial Neural Network with Fourier Transformation and Principal Component Analysis Technique

Yuji Matsuo, Tatsuo Oyama

Journal of the Operations Research Society of China 2020, 8 (4): 655-667. DOI: 10.1007/s40305-019-00282-9

Abstract

In this paper, we propose a hybrid forecasting model (HFM) for the short-term electric load forecasting using artificial neural network (ANN), discrete Fourier transformation (DFT) and principal component analysis (PCA) techniques in order to attain higher prediction accuracy. Firstly, we estimate Fourier coefficients by the DFT for predicting the next-day load curve with an ANN and obtain approximate load curves by applying the inverse discrete Fourier transformation. Approximate curves, together with other input variables, are given to the ANN to predict the next-day hourly load curves. Furthermore, we predict PCA scores to obtain approximate load curves in the first step, which are then given to the ANN again in the second step. Both DFT and PCA models use input variables such as calendrical and meteorological data as well as past electric loads. Applying those models for forecasting hourly electric load in the metropolitan area of Japan for January and May in 2018, we train our models using historical data since January 2008. The forecast results show that the HFM consisting of “ANN with DFT” and “ANN with PCA” predicts next-day hourly loads more accurately than the conventional three-layered ANN approach. Their corresponding mean average absolute errors show 2.7% for ANN with DFT, 2.6% for ANN with PCA and 3.0% for the conventional ANN approach. We also find that in May, when electric demand is smaller with smaller fluctuations, forecasting errors are much smaller than January for all the models. Thus, we can conclude that the HFM would contribute to attaining significantly higher forecasting accuracy.
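The step of obtaining an approximate load curve from a few Fourier coefficients via the inverse DFT can be illustrated as follows. This is only a sketch under stated assumptions: the 24-point synthetic load curve, the function names and the choice of which harmonics to keep are hypothetical, and in the paper the coefficients are predicted by an ANN rather than computed from the known curve as done here.

```python
import cmath
import math

def dft(values):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(values)
    return [
        sum(values[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def inverse_dft(coeffs):
    """Naive inverse DFT; returns the real part of each reconstructed sample."""
    n = len(coeffs)
    return [
        (sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]

def low_frequency_approximation(values, num_harmonics):
    """Keep the mean term and the first `num_harmonics` harmonics
    (with their conjugate mirror coefficients), zero the rest, and invert."""
    n = len(values)
    coeffs = dft(values)
    kept = [c if min(k, n - k) <= num_harmonics else 0.0 for k, c in enumerate(coeffs)]
    return inverse_dft(kept)

# Hypothetical hourly load: a base level plus daily and half-daily cycles.
hourly_load = [
    100 + 20 * math.sin(2 * math.pi * h / 24) + 5 * math.cos(4 * math.pi * h / 24)
    for h in range(24)
]
approx_two = low_frequency_approximation(hourly_load, num_harmonics=2)
approx_one = low_frequency_approximation(hourly_load, num_harmonics=1)
print(max(abs(a - b) for a, b in zip(hourly_load, approx_two)))
print(max(abs(a - b) for a, b in zip(hourly_load, approx_one)))
```

Keeping two harmonics reproduces this synthetic curve up to rounding error, while keeping only one drops the half-daily component, so a small number of coefficients can summarize a smooth daily curve.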

An Overview of Stochastic Quasi-Newton Methods for Large-Scale Machine Learning

Tian-De Guo, Yan Liu, Cong-Ying Han

Journal of the Operations Research Society of China 2023, 11 (2): 245-275. DOI: 10.1007/s40305-023-00453-9

Abstract

Numerous intriguing optimization problems arise as a result of the advancement of machine learning. The stochastic first-order method is the predominant choice for those problems due to its high efficiency. However, the negative effects of noisy gradient estimates and high nonlinearity of the loss function result in a slow convergence rate. Second-order algorithms have their typical advantages in dealing with highly nonlinear and ill-conditioned problems. This paper provides a review of recent developments in stochastic variants of quasi-Newton methods, which construct the Hessian approximations using only gradient information. We concentrate on BFGS-based methods in stochastic settings and highlight the algorithmic improvements that enable the algorithm to work in various scenarios. Future research on stochastic quasi-Newton methods should focus on enhancing their applicability, lowering the computational and storage costs, and improving the convergence rate.
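To make "Hessian approximations using only gradient information" concrete, here is a minimal sketch of the classical BFGS update of the inverse-Hessian approximation, written with plain lists. The function name, the skip rule for small curvature and the sample vectors are assumptions for illustration; in a stochastic setting `s` and `y` would come from mini-batch quantities, and the specific variants surveyed in the paper are not reproduced here.

```python
def bfgs_inverse_update(H, s, y, curvature_tol=1e-12):
    """Classical BFGS update of an inverse-Hessian approximation H.

    s = x_new - x_old, y = grad_new - grad_old.
    H_new = (I - rho*s*y^T) H (I - rho*y*s^T) + rho*s*s^T, with rho = 1 / (y^T s).
    The update is skipped when y^T s is not safely positive (an assumed safeguard).
    """
    n = len(s)
    sy = sum(si * yi for si, yi in zip(s, y))
    if sy <= curvature_tol:
        return H
    rho = 1.0 / sy
    # A = I - rho * s * y^T
    A = [[(1.0 if i == j else 0.0) - rho * s[i] * y[j] for j in range(n)] for i in range(n)]
    # AH = A @ H
    AH = [[sum(A[i][k] * H[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    # H_new = AH @ A^T + rho * s * s^T
    return [
        [sum(AH[i][k] * A[j][k] for k in range(n)) + rho * s[i] * s[j] for j in range(n)]
        for i in range(n)
    ]

# Hypothetical step and gradient difference in two dimensions.
H0 = [[1.0, 0.0], [0.0, 1.0]]
s = [1.0, 0.5]
y = [2.0, 0.25]
H1 = bfgs_inverse_update(H0, s, y)

# Secant condition: the updated matrix maps y back to s.
H1_y = [sum(H1[i][j] * y[j] for j in range(2)) for i in range(2)]
print(H1_y)
```

The updated matrix satisfies the secant condition and stays symmetric, using nothing beyond the difference of two gradients and the step between the two iterates.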

An Automatic Fuzzy Clustering Algorithm for Discrete Elements

Tai Vovan, Yen Nguyenhoang, Sang Danh

Journal of the Operations Research Society of China 2023, 11 (2): 309-325. DOI: 10.1007/s40305-021-00388-z

Abstract

This research proposes a measure called cluster similar index (CSI) to evaluate the similarity of clusters for discrete elements. The CSI is used as a criterion to build the automatic fuzzy clustering algorithm. This algorithm can simultaneously determine a suitable number of clusters, find the elements in each cluster, give the probability that each element belongs to each cluster, and evaluate the quality of the established clusters. The proposed algorithm can be run quickly and effectively with the established MATLAB procedure. Several numerical examples illustrate the proposed algorithm and show its advantages in comparison with existing ones. Finally, applying the proposed algorithm to image recognition shows the practical potential of this research.

Special Topics