Citation: SHI Zheng, WU Zimin, SUN Jing, HU Jiarui, GUAN Xiaolin. 2023: Comparative research on thunderstorms identification based on three clustering methods. Torrential Rain and Disasters, 42(3): 334-345. DOI: 10.12406/byzh.2022-179

Comparative research on thunderstorms identification based on three clustering methods

  • Abstract: To evaluate how well different clustering algorithms identify thunderstorm systems, and to further improve lightning nowcasting, this study applies three clustering algorithms to cloud-to-ground (CG) lightning location data and radar reflectivity data: density-based spatial clustering of applications with noise (DBSCAN), clustering by fast search and find of density peaks (CFSFDP), and extended clustering by fast search and find of density peaks (E_CFSFDP). The algorithms are used to cluster a thunderstorm event that occurred from 19:15 to 20:57 BT (Beijing Time) on 21 September 2018 in the region 114°-117°E, 27°-30°N, and their differences in thunderstorm identification are discussed. The results are as follows. (1) DBSCAN classifies accurately when the CG lightning data are clearly distributed and the clusters are separated by distinct gaps, but its accuracy is low when the inter-cluster distances or the densities of the lightning data clusters differ greatly. (2) CFSFDP splits off spurious clusters when the CG lightning data have a "no density peak" distribution; accurate identification by CFSFDP requires that each lightning data cluster have exactly one density peak. (3) E_CFSFDP resolves the "no density peak" problem of CFSFDP and is less affected by the distribution of the CG lightning data, so thunderstorm identification based on E_CFSFDP is markedly more accurate than identification based on DBSCAN or CFSFDP.
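To make the density-peak idea behind CFSFDP concrete, the following is a minimal Python sketch. It is not the authors' implementation. The function name density_peaks, the cutoff distance dc, the fixed number of clusters, and the synthetic points are all assumptions chosen for illustration. Each point gets a local density rho (the number of neighbours within dc) and a distance delta to the nearest point of higher density; points with a large rho * delta are taken as cluster centres, and every other point inherits the label of its nearest higher-density neighbour.

```python
import math


def density_peaks(points, dc, n_clusters):
    """Basic density-peaks clustering (illustrative sketch, assumed parameters)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density: number of other points closer than the cutoff distance dc.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < dc)
           for i in range(n)]

    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    delta = [0.0] * n   # distance to the nearest point that ranks higher in density
    parent = [-1] * n   # index of that nearest higher-density point
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbour: use its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Cluster centres: the n_clusters points with the largest rho * delta.
    centers = sorted(order, key=lambda i: -(rho[i] * delta[i]))[:n_clusters]

    labels = [-1] * n
    for label, i in enumerate(centers):
        labels[i] = label
    # Every remaining point takes the label of its nearest denser neighbour,
    # which has already been labelled because we walk in density order.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, rho, delta


# Hypothetical data: two compact groups of points, each with one clear peak.
offsets = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1)]
points = [(x, y) for x, y in offsets] + [(5 + x, 5 + y) for x, y in offsets]

labels, rho, delta = density_peaks(points, dc=0.15, n_clusters=2)
print(labels)
```

In this toy case each group has exactly one point of maximal density, which is the condition the abstract names as the prerequisite for CFSFDP to identify clusters accurately. When a group has a flat density profile, several points tie on rho and the rho * delta ranking no longer singles out one centre per group.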

     
