Bulletin of Surveying and Mapping, 2024, Vol. 0, Issue (6): 109-114, 138. doi: 10.13474/j.cnki.11-2246.2024.0619

• Academic Research •

A distributed storage and index method of trajectory big data based on the Hilbert curve

CHEN Kai1, SONG Weiwei1, JIN Baoxuan2, LI Yongning1, PU Hongxun1

  1. School of Land and Resource Engineering, Kunming University of Science and Technology, Kunming 650093, China;
  2. Information Center, Department of Natural Resources of Yunnan Province, Kunming 650224, China
  • Received: 2023-11-01 Published: 2024-06-27
  • Corresponding author: SONG Weiwei. E-mail: weiweisong@kust.edu.cn
  • About the author: CHEN Kai (1999-), male, master's student; main research interests are spatio-temporal big data, database storage, and indexing. E-mail: chenkai@stu.kust.edu.cn
  • Funding:
    Yunnan Provincial Major Science and Technology Special Program (202202AD080010)

Abstract: Trajectory big data with spatio-temporal characteristics is growing rapidly, and so is the demand for fast queries over it. Traditional relational databases are limited both in storing massive trajectory data and in serving its specific query patterns, while non-relational databases struggle to index massive data efficiently, so the storage and indexing efficiency of trajectory data still needs improvement. This paper designs and implements an HBase-based storage and retrieval framework for the efficient management of spatio-temporal trajectory data. First, a novel Rowkey structure is designed and combined with a spatio-temporal indexing tool to build the underlying GeoMesa-HBase storage model. Then, a Hilbert curve-based encoding technique is integrated to construct the spatial index, which improves the storage and retrieval efficiency of trajectory data. Finally, to evaluate the proposed method, its storage and query performance is compared with that of traditional storage databases (HBase and MySQL) and of a Geohash index. The experimental results show that the scheme effectively improves the storage and retrieval efficiency of trajectory data, which is of practical significance for addressing the challenges of trajectory big data management.

Key words: trajectory data, spatio-temporal index, spatio-temporal storage, HBase, Hilbert curve
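
The abstract names two building blocks, a Hilbert curve-based spatial encoding and a Rowkey that carries it, without giving their details. The following Python sketch illustrates the general idea only; it is not the authors' implementation and does not use the GeoMesa API. The Hilbert mapping is the standard iterative cell-to-distance algorithm. The grid resolution (`order`), the function names, and the Rowkey layout (shard, day bin, Hilbert index, trajectory id) are all assumptions made for illustration.

```python
import struct
import zlib


def _rotate(n, x, y, rx, ry):
    """Rotate/flip the quadrant so the sub-curve has the right orientation."""
    if ry == 0:
        if rx == 1:
            x, y = n - 1 - x, n - 1 - y
        x, y = y, x
    return x, y


def hilbert_index(order, x, y):
    """Distance along the Hilbert curve of cell (x, y) in a 2^order x 2^order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        x, y = _rotate(n, x, y, rx, ry)
        s >>= 1
    return d


def lonlat_to_cell(order, lon, lat):
    """Map WGS84 lon/lat to integer grid cell coordinates (assumed uniform grid)."""
    n = 1 << order
    x = min(n - 1, int((lon + 180.0) / 360.0 * n))
    y = min(n - 1, int((lat + 90.0) / 180.0 * n))
    return x, y


def make_rowkey(traj_id, epoch_seconds, lon, lat, order=16, num_shards=4):
    """Hypothetical Rowkey: shard (1 B) | day bin (4 B) | Hilbert index (8 B) | id.

    Big-endian packing keeps byte-wise (lexicographic) order consistent with
    numeric order, which is how HBase sorts row keys.
    """
    id_bytes = traj_id.encode("utf-8")
    shard = zlib.crc32(id_bytes) % num_shards  # deterministic, spreads writes
    day_bin = epoch_seconds // 86400           # coarse time partition (assumed)
    h = hilbert_index(order, *lonlat_to_cell(order, lon, lat))
    return struct.pack(">BIQ", shard, day_bin, h) + id_bytes


# Example call with a made-up trajectory point near Kunming.
key = make_rowkey("traj-001", 1700000000, 102.7, 25.0)
print(key.hex())
```

Because consecutive Hilbert indices always belong to neighbouring grid cells, points that are close in space tend to land in nearby row keys, so a spatial range query can be served by a small number of contiguous scans. A Geohash (Z-order) encoding has larger jumps between consecutive codes, which is the usual motivation for comparing the two.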

CLC Number: