Bulletin of Surveying and Mapping (测绘通报) ›› 2018, Vol. 0 ›› Issue (11): 78-82. doi: 10.13474/j.cnki.11-2246.2018.0354

A Large Scale Spectral Clustering Algorithm Using Sampling Improved Weighted Kernel K-means

JIN Hai1, ZHANG Jinsong2, WU Rui1,3   

  1. Shenzhen Polytechnic, Shenzhen 518055, China;
    2. Zhejiang University of Technology, Hangzhou 310014, China;
    3. Xi'an Jiaotong University, Xi'an 710061, China
  • Received:2018-06-11 Revised:2018-08-26 Online:2018-11-25 Published:2018-11-29

Abstract:

Classical spectral clustering transforms data clustering into a graph partitioning problem. Based on an analysis of the equivalence between its Normalized Cut objective function and the weighted kernel K-means objective function, a large-scale spectral clustering algorithm using sampling-improved weighted kernel K-means is designed. Initial clustering preprocessing with the Leaders algorithm controls both the size of the subsequent random sample and its coverage of the original data categories. Iterative optimization by weighted kernel K-means avoids the heavy resource consumption of the Laplacian matrix eigendecomposition in classical spectral clustering, and the use of partial kernel matrices avoids the time and space complexity of computing the full kernel matrix. Experimental results show that the improved algorithm greatly improves clustering efficiency while maintaining clustering accuracy similar to that of the classical algorithm.
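
The equivalence the abstract relies on can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes the standard construction in which Normalized Cut corresponds to weighted kernel K-means with node degrees as weights and kernel K = sigma * D^-1 + D^-1 A D^-1, where A is the affinity matrix and D the diagonal degree matrix. The function names `ncut_kernel` and `weighted_kernel_kmeans`, the Gaussian affinity, and the toy data are all hypothetical choices made for illustration. The sketch uses the full kernel matrix and omits the Leaders preprocessing and sampling steps, whose details the abstract does not give.

```python
import numpy as np


def ncut_kernel(A, sigma=0.0):
    """Kernel and weights under which weighted kernel K-means matches Normalized Cut.

    Assumption: K = sigma * D^-1 + D^-1 A D^-1 with weights equal to the degrees.
    A positive sigma is only needed when A is not positive semidefinite.
    """
    d = A.sum(axis=1)                                   # node degrees
    K = A / np.outer(d, d) + sigma * np.diag(1.0 / d)   # D^-1 A D^-1 plus diagonal shift
    return K, d


def weighted_kernel_kmeans(K, w, labels, n_clusters, max_iter=100):
    """Lloyd-style iterations on a precomputed kernel matrix; no eigendecomposition."""
    labels = np.asarray(labels).copy()
    diag = np.diag(K)
    for _ in range(max_iter):
        dist = np.full((K.shape[0], n_clusters), np.inf)
        for c in range(n_clusters):
            mask = labels == c
            if not mask.any():          # empty cluster keeps infinite distance
                continue
            wc = w[mask]
            s = wc.sum()
            cross = K[:, mask] @ wc / s                          # <phi(x_i), m_c>
            within = wc @ K[np.ix_(mask, mask)] @ wc / s ** 2    # <m_c, m_c>
            dist[:, c] = diag - 2.0 * cross + within
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):                   # converged
            break
        labels = new_labels
    return labels


# Toy data (hypothetical): two identical 4x5 grids of points, far apart.
gx, gy = np.meshgrid(np.arange(5) * 0.3, np.arange(4) * 0.3)
blob = np.column_stack([gx.ravel(), gy.ravel()])
X = np.vstack([blob, blob + 10.0])
truth = np.repeat([0, 1], 20)

# Gaussian affinity (positive semidefinite, so sigma = 0 is sufficient here).
sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
A = np.exp(-0.5 * sq_dist)

K, d = ncut_kernel(A)
init = truth.copy()
init[:3] = 1                            # deliberately mislabel three points
labels = weighted_kernel_kmeans(K, d, init, n_clusters=2)
```

Each iteration touches only kernel entries between points and current cluster members, which is why a scheme of this kind can work from part of the kernel matrix instead of decomposing the full Laplacian.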

Key words: big data spectral clustering, weighted kernel K-means, data sampling, kernel matrix

CLC Number: