An Approximation Algorithm Based on Seeding Algorithm for Fuzzy k-Means Problem with Penalties

Expand
  • School of Mathematics and Statistics, Shandong Normal University, Jinan 250014, Shandong, China

Received date: 2020-06-26

  Revised date: 2021-12-19

  Online published: 2024-06-12

Supported by

This research was supported by Higher Educational Science and Technology Program of Shandong Province (No. J17KA171) and Natural Science Foundation of Shandong Province (No. ZR2020MA029).

Abstract

As a classic NP-hard problem in machine learning and computational geometry, the k-means problem aims to partition the given dataset into k clusters according to the minimal squared Euclidean distance. Different from k-means problem and most of its variants, fuzzy k-means problem belongs to the soft clustering problem, where each given data point has relationship to every center point. Compared to fuzzy k-means problem, fuzzy k-means problem with penalties allows that some data points need not be clustered instead of being paid penalties. In this paper, we propose an O(αk ln k)- approximation algorithm based on seeding algorithm for fuzzy k-means problem with penalties, where α involves the ratio of the maximal penalty value to the minimal one. Furthermore, we implement numerical experiments to show the effectiveness of our algorithm.

Cite this article

Wen-Zhao Liu, Min Li . An Approximation Algorithm Based on Seeding Algorithm for Fuzzy k-Means Problem with Penalties[J]. Journal of the Operations Research Society of China, 2024 , 12(2) : 387 -409 . DOI: 10.1007/s40305-022-00399-4

References

[1] Aloise, D., Deshpande, A., Hansen, P., Popat, P.:NP-hardness of Euclidean sum-of-squares clustering. Mach. Learn. 75(2), 245-248(2009)
[2] Ahmadian, S., Norouzi-Fard, A., Svensson, O., Ward, J.:Better guarantees for k-means and Euclidean k-median by primal-dual algorithms. In:Proceedings of Foundations of Computer Science. pp. 61-72(2017)
[3] Lloyd, S.P.:Least squares quantization in PCM's. IEEE Trans. Inf. Theory 28(2), 129-137(1982)
[4] Arthur, D., Vassilvitskii, S.:k-means++:the advantages of careful seeding. In:Proceedings of Symposium on Discrete Algorithms. pp. 1027-1035(2007)
[5] Ji, S., Xu, D.C., Guo, L.K., Li, M., Zhang, D.M.:The seeding algorithm for spherical k-means clustering with penalties. J. Comb. Opt.(2020). https://doi.org/10.1007/s10878-020-00569-1
[6] Ostrovsky, R., Rabani, Y., Schulman, L.J., Swamy, C.:The effectiveness of Lloyd-type methods for the k-means problem. J. ACM 59(6), 1-22(2012)
[7] Xu, D.C., Xu, Y.C., Zhang, D.M.:A survey on algorithm for k-means problem and its variants. Oper. Res. Trans. 21(2), 101-109(2017)
[8] Zhang, D., Li, M., Xu, D.C., Zhang, Z.N.:A survey on theory and algorithms for k-means problems. Sci. Sin. Math. 50(9), 1387-1404(2020)
[9] Zhang, D.M., Hao, C.L., Wu, C.C., Xu, D.C., Zhang, Z.N.:A local search approximation algorithm for the k-means problem with penalties, In:Proceedings of COCOON. pp. 568-574(2017). https://doi.org/10.1007/978-3-319-62389-4_47
[10] Chang, X.Y., Wang, Y., Li, R.J., Xu, Z.B.:Sparse k-means with l/l0 penalty for high-dimensional data clustering. Stat. Sin. 28, 1265-1284(2018)
[11] Tseng, G.:Penalized and weighted k-means for clustering with scattered objects and prior information in high-throughput biological data. Bioinformatics 23(17), 2247-2255(2007)
[12] Li, M., Xu, D.C., Yue, J., Zhang, D.M., Zhang, P.:The seeding algorithm for k-means problem with penalties. J. Comb. Opt. 39, 15-32(2020)
[13] Li, M.:The bi-criteria seeding algorithms for two variants of k-means problem. J. Comb. Opt.(2020). https://doi.org/10.1007/s10878-020-00537-9
[14] Wang, P..Z.:Pattern recognition with fuzzy objective function algorithms (James C. Bezdek). SIAM Rev. 25(3), 442-442(1983)
[15] Wang, S.N., Zhang, X.Y., Cheng, Y.J., Jiang, F., Yu, W.T., Peng, J.:A fast contentbased spam filtering algorithm with fuzzy-SVM and k-means. In:Proceedings of BigComp. pp. 301-307(2018)
[16] Stetco, A., Zeng, X.J., Keane, J.:Fuzzy C-means++:fuzzy C-means with effective seeding initialization. Exp. Syst. Appl. 42(21), 7541-7548(2015)
[17] Liu, Q., Liu, J.X., Li, M., Zhou, Y.:A novel initialization algorithm for fuzzy C-means problem. In:Proceedings of TAMC. pp. 215-225(2020). https://doi.org/10.1007/978-3-030-59267-7_19
[18] Har-Peled, S., Sadri, B.:How fast is the k-means method?Algorithmica 41(3), 877-885(2005)
[19] http://archive.ics.uci.edu/ml/datasets
Options
Outlines

/