Bulletin of Surveying and Mapping ›› 2021, Vol. 0 ›› Issue (5): 15-19,29. doi: 10.13474/j.cnki.11-2246.2021.0134

• Spatio-temporal big data analysis •

Fine-grained clustering based on spatio-temporal big data of shared bikes

ZHANG Qiang, BAI Zhengdong, XIN Haohao, CHENG Yuhang, GUO Jinping

  1. Department of Civil Engineering, Tsinghua University, Beijing 100084, China
  • Received: 2020-09-11 Published: 2021-05-28
  • Corresponding author: BAI Zhengdong. E-mail: zbai@tsinghua.edu.cn
  • About the author: ZHANG Qiang (1998-), male, master's student; main research interests: geodesy, satellite navigation and positioning. E-mail: zq_thu@qq.com
  • Funding:
    National Key Research and Development Program of China (2017YFB0504202)

Abstract: When K-means or DBSCAN is applied on its own to the location data of shared bikes, the resulting clusters often do not match the true cluster structure. To address this problem, a fine-grained clustering method (FGCM) based on spatio-temporal big data of shared bikes is proposed. The method first uses DBSCAN for initial clustering and, on that basis, applies the GMM-EM algorithm for detailed clustering, extracting hotspot areas at a fine-grained level. Experiments show that the method excludes noise and outliers according to a density threshold, does not require the number of detailed clusters to be specified, and allows flexible cluster shapes and sizes. When clustering the location features of shared-bike big data, FGCM achieves a finer level of detail than K-means or DBSCAN used alone and fully reflects the actual aggregation characteristics of shared bikes. It can be used to plan facilities such as electronic fences for shared bikes, helping to regulate shared-bike parking without reducing commuting efficiency.

Key words: shared bikes, spatio-temporal big data, fine-grained clustering, DBSCAN, K-means
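
The two-stage idea described in the abstract (DBSCAN for initial clustering, then GMM-EM inside each initial cluster) can be illustrated with a minimal sketch using scikit-learn. This is not the authors' implementation: the function name `fine_grained_clusters`, the parameter values, and the use of the Bayesian information criterion (BIC) to choose the number of Gaussian components are all assumptions made for illustration, since the abstract does not state how the number of detailed clusters is determined. The points are assumed to be planar coordinates (for example, projected positions in meters).

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.mixture import GaussianMixture


def fine_grained_clusters(points, eps=0.5, min_samples=10,
                          max_components=5, seed=0):
    """Hypothetical two-stage clustering: DBSCAN, then a GMM per DBSCAN cluster.

    Returns (coarse_labels, fine_labels). Noise points keep label -1 in both.
    """
    points = np.asarray(points, dtype=float)

    # Stage 1: density-based initial clustering; label -1 marks noise/outliers.
    coarse = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)

    fine = np.full(len(points), -1, dtype=int)
    next_label = 0
    for c in np.unique(coarse[coarse >= 0]):
        idx = np.flatnonzero(coarse == c)
        sub = points[idx]

        # Stage 2: fit GMMs by EM with 1..max_components components and keep
        # the model with the lowest BIC (an assumed selection rule).
        best_model, best_bic = None, np.inf
        for k in range(1, min(max_components, len(sub)) + 1):
            gmm = GaussianMixture(n_components=k, covariance_type="full",
                                  random_state=seed).fit(sub)
            bic = gmm.bic(sub)
            if bic < best_bic:
                best_model, best_bic = gmm, bic

        # Renumber so that fine labels are unique across DBSCAN clusters.
        _, compact = np.unique(best_model.predict(sub), return_inverse=True)
        fine[idx] = next_label + compact
        next_label += int(compact.max()) + 1

    return coarse, fine
```

Calling `fine_grained_clusters(xy, eps=..., min_samples=...)` on an `(n, 2)` array returns one coarse label and one fine label per point. The density threshold (`eps`, `min_samples`) must still be tuned for the data, while the number of fine clusters inside each coarse cluster is selected automatically under the BIC assumption.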

CLC number: