Bulletin of Surveying and Mapping ›› 2018, Vol. 0 ›› Issue (9): 126-129, 155. doi: 10.13474/j.cnki.11-2246.2018.0294

• Technical Exchange •

Research and Implementation of Fuzzy Matching for K-tree Address

LI Xinfang1,2, SONG Zhuanling1, CHEN Xueye3, HE Biao3, LIU Haixing1,2

  1. The First Institute of Oceanography, State Oceanic Administration, Qingdao 266061, China;
  2. Laboratory for Regional Oceanography and Numerical Modeling, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, China;
  3. Shenzhen Research Center of Digital City Engineering, Shenzhen 518040, China
  • Received: 2017-12-26 Revised: 2018-01-29 Online: 2018-09-25 Published: 2018-09-29
  • Contact: LIU Haixing. E-mail: liuhx@fio.org.cn
  • About the author: LI Xinfang (born 1983), male, master's student and assistant engineer; main research interest: marine GIS information integration and visualization. E-mail: lixinfang@fio.org.cn
  • Supported by: National Key Research and Development Program of China (2016YFA0602200); Special Fund for Basic Scientific Research Operations of Central Public-Interest Research Institutes (2015G18; 2015P12)

Abstract:

Address matching is a key enabling technology for integrating and fusing digital city information resources. Because of the complexity of Chinese semantics and of the way place names and addresses are described, matching Chinese addresses is considerably harder than matching English ones. Accurately segmenting large volumes of Chinese address data and matching addresses quickly and efficiently is therefore a key problem in urban data integration. Building on a review of existing address coding and word segmentation techniques, this paper segments Chinese addresses with a method that combines rules and statistics, and stores the addresses in a K-ary tree (K-tree), which improves both the accuracy and the efficiency of Chinese address matching queries. A prototype system was developed and comparative tests were run on 10,000 preprocessed Shenzhen addresses, which verified the effectiveness of the method.

Key words: address matching, address segmentation, fuzzy matching, K-tree
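The abstract does not give the tree layout or the matching rule, so the following is only a minimal sketch of how already-segmented addresses might be stored in a K-ary tree and queried fuzzily. It is not the authors' implementation. The names `AddressNode`, `AddressTree` and `fuzzy_match`, the sample addresses, and the rule that a query matches when its elements appear in order along a stored address (with missing levels skipped) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AddressNode:
    """One address element (e.g. a district or a street) in the tree."""
    name: str
    # A node may have any number of children, hence "K-ary".
    children: Dict[str, "AddressNode"] = field(default_factory=dict)
    is_address_end: bool = False  # True if a stored address ends here


class AddressTree:
    """Hypothetical K-ary tree over pre-segmented address elements."""

    def __init__(self) -> None:
        self.root = AddressNode(name="")

    def insert(self, elements: List[str]) -> None:
        """Store one address given as an ordered list of elements."""
        node = self.root
        for element in elements:
            node = node.children.setdefault(element, AddressNode(element))
        node.is_address_end = True

    def fuzzy_match(self, query: List[str]) -> List[List[str]]:
        """Return stored addresses containing the query elements in order.

        Assumed rule: levels missing from the query are skipped, so the
        query only has to be an ordered subsequence of a stored address.
        """
        results: List[List[str]] = []

        def walk(node: AddressNode, matched: int, path: List[str]) -> None:
            if node.is_address_end and matched == len(query):
                results.append(path)
            for name, child in node.children.items():
                if matched < len(query) and name == query[matched]:
                    walk(child, matched + 1, path + [name])  # consume element
                else:
                    walk(child, matched, path + [name])      # skip this level

        walk(self.root, 0, [])
        return results


# Made-up sample addresses, already segmented into elements.
tree = AddressTree()
tree.insert(["Guangdong Province", "Shenzhen City", "Nanshan District",
             "Keyuan Road", "No. 15"])
tree.insert(["Guangdong Province", "Shenzhen City", "Futian District",
             "Shennan Avenue", "No. 6009"])
tree.insert(["Guangdong Province", "Shenzhen City", "Nanshan District",
             "Keyuan Road", "No. 18"])

# The query omits the province and district levels.
matches = tree.fuzzy_match(["Shenzhen City", "Keyuan Road"])
```

Here `matches` holds the two Keyuan Road addresses, while a query whose elements are out of order, such as `["Keyuan Road", "Shenzhen City"]`, returns an empty list. Addresses that share a prefix share the same nodes, so each administrative level is stored only once.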

CLC Number: