Evaluation of Landslide Susceptibility by Optimization Integrated Machine Learning Algorithm Based on Gradient Boosting: Take Both Banks of Yarlung Zangbo River and Niyang River as Examples
Abstract: The banks of the Yarlung Zangbo River and the Niyang River are tectonically active, and landslides occur frequently. Landslide susceptibility assessment can effectively reduce the loss of life and property caused by such disasters. This paper compares the performance of three algorithms in landslide susceptibility assessment: a weighted random forest based on the Gini coefficient (Gini–RF), XGBoost, and LightGBM. A total of 188 landslide samples and 7 influencing factors were selected, and the models were trained with five-fold cross-validation. Feature selection was applied during training and the hyperparameters were optimized with a Bayesian method, after which precision, recall, F1 score, and accuracy were used to analyze the predictions for each susceptibility level. The results show that landslides are most likely to occur at elevations of 32~1 544 m and 2 722~3 752 m, on slopes of 30°~40°, and within 200 m of fault zones, rivers, and roads. Areas of extremely high and high susceptibility account for 12.14% and 12.41% of the study area, respectively, and areas of low and extremely low susceptibility account for 26.47% and 29.55%, so more than half of the study area in Nyingchi prefecture is not prone to landslide disasters. LightGBM performs best among the models, with an AUC of 0.843 2, an accuracy of 0.853 1, and an F1 score of 0.834 5. Damu township and Bangxin township in Motuo county, Danniang, Lilong, and Zhaxi Raodeng townships in Linzhi county, Long village in Lang county, and Jiangda township in Gongbujiangda county lie in the extremely high susceptibility zone, where landslides are very likely, and corresponding geological hazard prevention and control measures should be taken in these areas.
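The five-fold cross-validation mentioned in the abstract can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' pipeline: the function names, the random seed, and the choice of a plain shuffled (non-stratified) split are assumptions, and the 188 indices stand in for the landslide samples only.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # The first (n_samples % k) folds receive one extra sample.
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(indices[start:start + size])
        start += size
    return folds


def train_validation_splits(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices) for each of the k rounds."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# 188 is the number of landslide samples reported in the abstract.
splits = list(train_validation_splits(188, k=5))
fold_sizes = [len(validation) for _, validation in splits]
print(fold_sizes)  # [38, 38, 38, 37, 37]
```

Each sample is used for validation exactly once and for training in the other four rounds, so the reported five-fold scores are averages over five held-out folds.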
Key words:
- gradient boosting
- XGBoost
- LightGBM
- machine learning
- landslide susceptibility
Table 1. Pearson correlation coefficients between factors

| Factor | Elevation | Road | River | Slope | Fault zones and faults | Lithology | Land use type |
|---|---|---|---|---|---|---|---|
| Elevation | 1.000 0 | −0.162 4 | 0.155 4 | −0.170 8 | 0.231 7 | −0.256 4 | −0.029 8 |
| Road | −0.162 4 | 1.000 0 | 0.140 5 | 0.349 3 | −0.207 6 | −0.093 0 | 0.002 5 |
| River | 0.155 4 | 0.140 5 | 1.000 0 | 0.126 9 | −0.067 2 | 0.301 1 | 0.012 2 |
| Slope | −0.170 8 | 0.349 3 | 0.126 9 | 1.000 0 | −0.237 1 | −0.051 0 | −0.064 9 |
| Fault zones and faults | 0.231 7 | −0.207 6 | −0.067 2 | −0.237 1 | 1.000 0 | −0.196 0 | −0.265 4 |
| Lithology | −0.256 4 | −0.093 0 | 0.301 1 | −0.051 0 | −0.196 0 | 1.000 0 | 0.072 5 |
| Land use type | −0.029 8 | 0.002 5 | 0.012 2 | −0.064 9 | −0.265 4 | 0.072 5 | 1.000 0 |

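A correlation matrix such as the one in Table 1 can be computed along the following lines. This is a sketch under stated assumptions: the factor values below are hypothetical toy numbers (the source does not publish the raw factor data), and the function and variable names are illustrative. For reference, the largest off-diagonal magnitude in Table 1 is 0.349 3 (road versus slope).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return a nested dict {factor_a: {factor_b: r}} for all factor pairs."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical toy values, NOT data from the study.
factors = {
    "elevation_m": [3100.0, 2950.0, 3400.0, 2800.0, 3600.0],
    "slope_deg": [35.0, 32.0, 28.0, 38.0, 25.0],
    "road_distance_m": [150.0, 300.0, 900.0, 120.0, 1500.0],
}
matrix = correlation_matrix(factors)
for name, row in matrix.items():
    print(name, {other: round(r, 4) for other, r in row.items()})
```

The matrix is symmetric with ones on the diagonal, which matches the layout of Table 1.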
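Table 2. Comparison of susceptibility zones produced by the machine learning models

| Zone | Gini–RF grid cells | Gini–RF grid share | Gini–RF landslide points | Gini–RF landslide share | XGBoost grid cells | XGBoost grid share | XGBoost landslide points | XGBoost landslide share | LightGBM grid cells | LightGBM grid share | LightGBM landslide points | LightGBM landslide share |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Extremely high | 14766439 | 11.99% | 44 | 23.40% | 14840333 | 12.05% | 52 | 27.66% | 14951174 | 12.14% | 56 | 29.79% |
| High | 15554640 | 12.63% | 68 | 36.17% | 15394537 | 12.50% | 72 | 38.30% | 15283696 | 12.41% | 75 | 39.89% |
| Medium | 24114003 | 19.58% | 38 | 20.21% | 24163265 | 19.62% | 40 | 21.28% | 23929268 | 19.43% | 42 | 22.34% |
| Low | 32968940 | 26.77% | 22 | 11.70% | 32981256 | 26.78% | 10 | 5.32% | 32599471 | 26.47% | 8 | 4.26% |
| Extremely low | 35752274 | 29.03% | 16 | 8.51% | 35776905 | 29.05% | 14 | 7.45% | 36392714 | 29.55% | 7 | 3.72% |
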
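The shares in Table 2 can be rechecked from the raw counts, and dividing the landslide share by the grid share gives a simple density ratio per zone. The sketch below uses the LightGBM counts from Table 2; the density ratio itself and all names in the code are illustrative additions, not quantities reported in the source.

```python
# LightGBM columns of Table 2: (zone, grid cells, landslide points).
LIGHTGBM_ZONES = [
    ("extremely high", 14951174, 56),
    ("high", 15283696, 75),
    ("medium", 23929268, 42),
    ("low", 32599471, 8),
    ("extremely low", 36392714, 7),
]


def zone_statistics(zones):
    """Compute grid share, landslide share, and their ratio for each zone."""
    total_cells = sum(cells for _, cells, _ in zones)
    total_points = sum(points for _, _, points in zones)
    stats = {}
    for name, cells, points in zones:
        area_share = cells / total_cells
        point_share = points / total_points
        stats[name] = {
            "area_share": area_share,
            "point_share": point_share,
            # Illustrative ratio: > 1 means landslides are over-represented.
            "density_ratio": point_share / area_share,
        }
    return stats


stats = zone_statistics(LIGHTGBM_ZONES)
for name, row in stats.items():
    print(f"{name:>14}: area {row['area_share']:.2%}, "
          f"landslides {row['point_share']:.2%}, "
          f"ratio {row['density_ratio']:.2f}")
```

A well-behaved susceptibility map should concentrate landslide points in the higher zones, so the ratio is expected to exceed 1 in the extremely high and high zones and fall below 1 in the low and extremely low zones.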
Table 3. Evaluation metrics of each machine learning model

| Metric | Gini–RF | XGBoost | LightGBM |
|---|---|---|---|
| AUC | 0.752 4 | 0.803 5 | 0.825 6 |
| AUC (5-fold) | 0.822 5 | 0.835 8 | 0.843 2 |
| ACC | 0.723 4 | 0.814 8 | 0.825 6 |
| ACC (5-fold) | 0.753 4 | 0.835 9 | 0.853 1 |
| F1-score | 0.775 2 | 0.786 7 | 0.802 1 |
| F1-score (5-fold) | 0.802 6 | 0.825 6 | 0.834 5 |
| Precision | 0.783 4 | 0.796 8 | 0.804 5 |
| Precision (5-fold) | 0.802 6 | 0.813 2 | 0.825 1 |

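The metrics listed in Table 3 follow standard definitions, which can be sketched as below. The labels and scores are hypothetical toy values chosen for illustration, the 0.5 decision threshold is an assumption, and the AUC is computed with the pairwise-ranking formulation rather than whatever routine the authors used.

```python
def classification_metrics(y_true, y_pred):
    """Precision, recall, F1, and accuracy for binary labels (1 = landslide)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and predicted probabilities, NOT data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.6, 0.1, 0.85, 0.35]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print({name: round(value, 4) for name, value in metrics.items()}, round(auc, 4))
```

Precision, recall, F1, and accuracy depend on the chosen threshold, whereas AUC depends only on how the scores rank landslide samples against non-landslide samples.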
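Table 4. Landslide events in recent years

| Area | Location | Date | Source | Susceptibility zone |
|---|---|---|---|---|
| Jiala Village, Nyingchi City | E 94°54′04″, N 29°41′45″ | 2018.10.29 | Xinhua News Agency | Medium |
| 7 km downstream of Jiala Village, Nyingchi City | E 94°54′24″, N 29°41′27″ | 2022.01.22 | China Youth Network | Medium |
| Qiangna natural village, Suotong Village, Guxiang Township, Bomi County, Nyingchi City | E 95°27′41″, N 30°00′21″ | 2017.8.24 | China Military TV Network | Medium |
| K80 on National Highway 560, Lang County, Nyingchi City | E 92°49′24″, N 29°04′03″ | 2022.7.22 | Lang County Public Security Bureau | High |
| Jiala Village, Pai Town, Milin County, Nyingchi City | E 94°54′04″, N 29°41′45″ | 2018.10.17 | Voice of Tibet | High |
| Lang County, Nyingchi City | E 93°00′48″, N 29°04′42″ | 2022.7.23 | Lang County Housing and Urban-Rural Development Bureau | High |
| Damu Township, Motuo County, Nyingchi City | E 95°27′46″, N 29°29′35″ | 2021.7.4 | China Natural Resources News | Extremely high |
| Bomi to Motuo section of National Highway 559 | E 97°02′03″, N 29°19′14″ | 2019.5.16 | Department of Transport of Tibet Autonomous Region | Extremely high |
| Damu Luoba Ethnic Township Primary School, Motuo County, Nyingchi City | E 95°27′52″, N 29°29′46″ | 2020.8.26 | The Beijing News | Extremely high |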
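To overlay the events in Table 4 on a susceptibility raster, the degree-minute-second coordinates first need to be converted to decimal degrees. The sketch below shows one way to do this; the function names and the regular expression are illustrative assumptions, and the parser accepts either an ASCII or a full-width comma between longitude and latitude.

```python
import re

# Matches strings such as "E 94°54′04″" (hemisphere, degrees, minutes, seconds).
DMS_PATTERN = re.compile(r"([EWNS])\s*(\d+)°(\d+)′(\d+)″")


def dms_to_decimal(text):
    """Convert a hemisphere-prefixed DMS string to signed decimal degrees."""
    match = DMS_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized coordinate: {text!r}")
    hemisphere, degrees, minutes, seconds = match.groups()
    value = int(degrees) + int(minutes) / 60 + int(seconds) / 3600
    # West and south are negative by convention.
    return -value if hemisphere in "WS" else value


def parse_location(cell):
    """Split a Table 4 location cell into (longitude, latitude)."""
    east, north = re.split(r"[,,]\s*", cell.strip())
    return dms_to_decimal(east), dms_to_decimal(north)


# First row of Table 4 (Jiala Village).
lon, lat = parse_location("E 94°54′04″, N 29°41′45″")
print(round(lon, 4), round(lat, 4))  # 94.9011 29.6958
```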