Bulletin of Surveying and Mapping ›› 2024, Vol. 0 ›› Issue (10): 25-31. doi: 10.13474/j.cnki.11-2246.2024.1005

Research on distributed processing framework for big data analysis of real-time water environment monitoring

CHEN Jianglong1, SONG Weiwei1, LI Jiahao1, WEI Qunlan1, WANG Jinxia1, DAI Bolan2   

  1. Faculty of Land Resource Engineering, Kunming University of Science and Technology, Kunming 650093, China;
  2. 722 Research Institute of China State Shipbuilding Corporation, Wuhan 430200, China
  • Received: 2024-01-29 Published: 2024-11-02

Abstract: The rapid accumulation of water quality monitoring data brings new opportunities and challenges to water quality research. Water quality monitoring and analysis currently suffers from poor timeliness, unintuitive data display, and inefficient data processing. To address these problems, this paper constructs WaterSpark, a distributed processing framework for real-time water environment monitoring and big data analysis. The framework is built on big data and data visualization technologies and uses an improved Canadian Council of Ministers of the Environment Water Quality Index (CCME-WQI) together with the Spark machine learning library (MLlib). Results on water quality monitoring data from the nine plateau lakes in Yunnan Province show that WaterSpark delivers excellent performance in real-time transmission, cleaning, and archiving of water quality data, as well as in efficient computation and analysis. It allows large-scale water quality data to be captured and analyzed in a timely and accurate manner, and its distributed datasets and clusters scale with growing data volumes, so that performance is maintained while supporting more water quality indicators and monitoring on a larger scale.
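For readers unfamiliar with the index named in the abstract, the following is a minimal sketch of the standard CCME-WQI calculation (scope F1, frequency F2, amplitude F3). It is not the improved variant used in WaterSpark, which the abstract does not specify. The function name `ccme_wqi`, the input layout, and the example variables and limits are assumptions made for illustration only.

```python
import math

def ccme_wqi(samples, objectives):
    """Standard CCME-WQI (illustrative sketch, not the paper's improved variant).

    samples:    dict mapping variable name -> list of measured values (assumed positive)
    objectives: dict mapping variable name -> (kind, limit), where kind is
                "max" (value must not exceed limit) or "min" (must not fall below it)
    Assumes at least one variable and at least one measurement.
    """
    total_tests = 0
    failed_tests = 0
    failed_vars = 0
    excursion_sum = 0.0

    for var, values in samples.items():
        kind, limit = objectives[var]
        var_failed = False
        for v in values:
            total_tests += 1
            if kind == "max" and v > limit:
                excursion = v / limit - 1.0
            elif kind == "min" and v < limit:
                excursion = limit / v - 1.0
            else:
                continue  # test meets its objective
            failed_tests += 1
            excursion_sum += excursion
            var_failed = True
        if var_failed:
            failed_vars += 1

    f1 = 100.0 * failed_vars / len(samples)   # scope: share of failed variables
    f2 = 100.0 * failed_tests / total_tests   # frequency: share of failed tests
    nse = excursion_sum / total_tests         # normalized sum of excursions
    f3 = nse / (0.01 * nse + 0.01)            # amplitude, scaled to 0..100
    return 100.0 - math.sqrt(f1 ** 2 + f2 ** 2 + f3 ** 2) / 1.732

# Hypothetical example: dissolved oxygen must stay above 5, ammonia nitrogen below 1.
example_samples = {"DO": [6.2, 4.5, 7.0], "NH3N": [0.4, 0.8, 1.5]}
example_objectives = {"DO": ("min", 5.0), "NH3N": ("max", 1.0)}
example_score = ccme_wqi(example_samples, example_objectives)
```

The index ranges from 0 (worst) to 100 (all tests meet their objectives). In a distributed setting the per-variable counts and excursion sums are simple aggregates, so a calculation of this kind could plausibly be computed per monitoring station over partitioned data, although how WaterSpark does this is not stated in the abstract.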

Key words: water quality monitoring, distributed processing framework, water quality index, nine plateau lakes in Yunnan province
