A comparison of tree-based ensemble algorithms on the main element content of monoclinal pyroxene in mafic-ultramafic rocks

SUN Jiankun, DU Xueliang, ZHANG Baoyue, WANG Long, JIN Weijun, ZHANG Qi, LUO Xiong, ZHU Yueqin. A comparison of tree-based ensemble algorithms on the main element content of monoclinal pyroxene in mafic-ultramafic rocks[J]. Geological Bulletin of China, 2019, 38(12): 1981-1991.

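  • Funding:
    National Key Research and Development Program of China, "Knowledge mining for deep mineral exploration based on the 'Geological Cloud' platform" (No. 2016YFC0600510); National Natural Science Foundation of China, "Research on methods for building landslide hazard assessment models in a big data environment" (Grant No. 41872253); Key Laboratory of Geological Information Technology of the Ministry of Land and Resources, "Application research on intelligent retrieval services for geological big data based on deep optimization of knowledge graphs" (No. 2017320); China Geological Survey projects "Aggregation and management of national geological big data" (No. DD20190381A) and "Comprehensive zoning of major resource and environmental issues and research on development and protection strategies" (No. DD20190463)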
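    About the first author: SUN Jiankun (born 1996), male, doctoral student, engaged in research on machine learning and geological information technology. E-mail: sunjk@xs.ustb.edu.cn
    Corresponding author: LUO Xiong (born 1976), male, professor and doctoral supervisor, engaged in research on machine learning and geological information technology. E-mail: xluo@ustb.edu.cn
  • CLC number: P578.594; P628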

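  • Using the geochemical composition of magmas from different tectonic settings to understand how magmas form is an important application of petrogeochemistry. Work on discriminating tectonic settings from the geochemical composition of rocks is not yet sufficiently developed. In this study, four decision-tree-based machine learning methods were applied to a dataset of 13 major elements in clinopyroxene from mafic-ultramafic rocks, including global Cenozoic ocean island basalt (OIB), island arc basalt (IAB), and mid-ocean ridge basalt (MORB), to discriminate the magmatic tectonic setting and to rank the main features. Comparing the four methods confirmed that tree-based algorithms are effective for geochemical composition recognition, and showed their relative strengths and weaknesses for tectonic setting discrimination: the discrimination process of the decision tree algorithm is easier to understand, but its accuracy is lower; the boosting algorithms AdaBoost and GBDT discriminate the tectonic setting more accurately, but their construction process is complex; the bagging ensemble algorithm random forest is a good choice when performance must be balanced against model interpretability. In addition, the feature importance rankings of the four algorithms indicate that Cr2O3, TFeO, TiO2, FeO, and Al2O3 are important components for discriminating the magmatic tectonic setting.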

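  • Figure 1.  Several classic tectonic setting discrimination diagrams for basalts commonly used in the literature[15]

    Figure 2.  CART example

    Figure 3.  Distribution of clinopyroxene in mafic-ultramafic rocks

    Figure 4.  Scatter plot based on t-SNE

    Figure 5.  Decision tree based on major elements (partial)

    Figure 6.  Confusion matrices based on major elements (abbreviations as in Table 1)

    Figure 7.  Feature importance ranking

    Figure 8.  Scatter plots based on the main components (abbreviations as in Table 1)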

    Table 1.  Major element content of clinopyroxene in mafic-ultramafic rocks in the dataset

    Major element   IAB (island arc basalt)       OIB (ocean island basalt)     MORB (mid-ocean ridge basalt)
                    Count   Mean/%   Median/%     Count   Mean/%   Median/%     Count   Mean/%   Median/%
    SiO2            329     52.09    52.30        198     48.95    49.92        795     51.87    51.73
    TiO2            324     0.36     0.20         198     1.72     1.24         784     0.16     0.10
    Al2O3           329     3.63     3.62         198     4.45     3.67         795     4.34     4.55
    Cr2O3           296     0.68     0.69         135     0.46     0.42         790     1.20     1.23
    Fe2O3           52      1.27     0.99         1       3.34     3.34         10      1.46     1.65
    TFeO            254     3.60     2.81         184     6.13     6.78         225     2.55     2.49
    FeO             75      3.28     3.11         14      5.99     5.85         570     2.80     2.69
    CaO             329     22.49    22.56        198     22.22    22.33        795     22.16    22.32
    MgO             329     16.37    16.52        198     14.38    15.34        795     17.07    17.18
    MnO             307     0.10     0.10         192     0.12     0.13         789     0.09     0.09
    NiO             171     0.05     0.03         12      0.03     0.01         601     0.05     0.05
    K2O             213     0.01     0.00         70      0.01     0.00         180     0.01     0.01
    Na2O            320     0.43     0.39         198     0.55     0.34         778     0.31     0.18

    Table 2.  Parameter settings for the major element data of clinopyroxene in mafic-ultramafic rocks

    Parameter           Decision tree   Random forest   AdaBoost   GBDT
    max features        0.55            0.36            0.08       0.03
    max depth           21              6               39         27
    min samples split   2               5               3          4
    min samples leaf    4               3               2          2
    n estimators        -               90              210        710
    learning rate       -               -               0.123      0.008
    subsample           -               -               -          0.52
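    The settings in Table 2 could be expressed in code roughly as follows. This is a minimal sketch, not the authors' implementation: it assumes the parameter names map onto scikit-learn keyword arguments (the reference list cites scikit-learn [38], but the mapping is not stated), and it assumes the tree-level settings listed for AdaBoost configure its base decision tree. The names `TABLE2_PARAMS`, `base_tree`, and `build_models` are hypothetical.

```python
# Values transcribed from Table 2. The keyword names follow scikit-learn
# conventions; that mapping is an assumption, not something the table states.
TABLE2_PARAMS = {
    "decision_tree": {
        "max_features": 0.55, "max_depth": 21,
        "min_samples_split": 2, "min_samples_leaf": 4,
    },
    "random_forest": {
        "max_features": 0.36, "max_depth": 6,
        "min_samples_split": 5, "min_samples_leaf": 3,
        "n_estimators": 90,
    },
    "adaboost": {
        # Assumption: the tree-level settings configure AdaBoost's base tree.
        "base_tree": {
            "max_features": 0.08, "max_depth": 39,
            "min_samples_split": 3, "min_samples_leaf": 2,
        },
        "n_estimators": 210, "learning_rate": 0.123,
    },
    "gbdt": {
        "max_features": 0.03, "max_depth": 27,
        "min_samples_split": 4, "min_samples_leaf": 2,
        "n_estimators": 710, "learning_rate": 0.008, "subsample": 0.52,
    },
}


def build_models(random_state=0):
    """Instantiate the four classifiers; requires scikit-learn to be installed."""
    from sklearn.ensemble import (AdaBoostClassifier,
                                  GradientBoostingClassifier,
                                  RandomForestClassifier)
    from sklearn.tree import DecisionTreeClassifier

    ada = TABLE2_PARAMS["adaboost"]
    base_tree = DecisionTreeClassifier(random_state=random_state,
                                       **ada["base_tree"])
    return {
        "decision_tree": DecisionTreeClassifier(
            random_state=random_state, **TABLE2_PARAMS["decision_tree"]),
        "random_forest": RandomForestClassifier(
            random_state=random_state, **TABLE2_PARAMS["random_forest"]),
        # The base estimator is passed positionally because its keyword
        # name differs between scikit-learn versions.
        "adaboost": AdaBoostClassifier(
            base_tree, n_estimators=ada["n_estimators"],
            learning_rate=ada["learning_rate"], random_state=random_state),
        "gbdt": GradientBoostingClassifier(
            random_state=random_state, **TABLE2_PARAMS["gbdt"]),
    }
```

    Keeping the table as a plain dictionary separates the published values from the library calls, so the values can be checked against Table 2 without installing scikit-learn.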

    Table 3.  Performance indexes on major element data

    Metric     Decision tree        Random forest        AdaBoost             GBDT
    MaP        0.8393(+/-0.0357)    0.9120(+/-0.0268)    0.9219(+/-0.0179)    0.9224(+/-0.0292)
    MaR        0.8416(+/-0.0347)    0.8904(+/-0.0398)    0.9023(+/-0.0345)    0.9057(+/-0.0302)
    MaF        0.8389(+/-0.0294)    0.8997(+/-0.0305)    0.9108(+/-0.0241)    0.9130(+/-0.0243)
    Accuracy   0.8715(+/-0.0199)    0.9212(+/-0.0280)    0.9302(+/-0.0193)    0.9315(+/-0.0227)
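    The indexes in Table 3 can be illustrated with a short calculation from a confusion matrix. This is a sketch under stated assumptions: it takes MaP, MaR, and MaF to mean macro-averaged precision, recall, and F1, and it computes MaF as the mean of the per-class F1 scores, which may differ from the definition used for the table. The function name `macro_metrics` and the 3x3 matrix are hypothetical illustration values, not results from the dataset.

```python
def macro_metrics(cm):
    """Macro-averaged metrics from a square confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    precisions, recalls, f1s = [], [], []
    for c in range(k):
        tp = cm[c][c]
        predicted = sum(cm[r][c] for r in range(k))  # column sum
        actual = sum(cm[c])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return {
        "MaP": sum(precisions) / k,
        "MaR": sum(recalls) / k,
        # Assumption: MaF is the mean of the per-class F1 scores.
        "MaF": sum(f1s) / k,
        "Accuracy": sum(cm[c][c] for c in range(k)) / total,
    }


# Made-up confusion matrix for three classes (rows: true, columns: predicted).
toy_cm = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]
toy_scores = macro_metrics(toy_cm)
```

    Macro averaging weights each class equally, which matters here because the three classes in Table 1 have very different sample counts.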
  • [1] Leterrier J, Maury R C, Thonon P, et al. Clinopyroxene composition as a method of identification of the magmatic affinities of paleo-volcanic series[J]. Earth and Planetary Science Letters, 1982, 59(1): 139-154. doi: 10.1016/0012-821X(82)90122-4
    [2] Asthana D. Relict clinopyroxenes from within-plate metadolerites of the Petroi Metabasalt, the New England Fold Belt, Australia[J]. Mineralogical Magazine, 1991, 55(381): 549-561. doi: 10.1180/minmag.1991.055.381.08
    [3] Nisbet E G, Pearce J A. Clinopyroxene composition in mafic lavas from different tectonic settings[J]. Contributions to Mineralogy and Petrology, 1977, 63(2): 149-160. doi: 10.1007/BF00398776
    [4] Helmy H M, El Mahallawi M M. Gabbro Akarem mafic-ultramafic complex, Eastern Desert, Egypt: A Late Precambrian analogue of Alaskan-type complexes[J]. Mineralogy and Petrology, 2003, 77(1): 85-108. http://cn.bing.com/academic/profile?id=632e30f2ba51775b814eef9eeac032bf&encoded=0&v=paper_preview&mkt=zh-cn
    [5] Khedr M Z, Arai S. Chemical variations of mineral inclusions in Neoproterozoic high-Cr chromitites from Egypt: Evidence of fluids during chromitite genesis[J]. Lithos, 2016, 240: 309-326. http://cn.bing.com/academic/profile?id=cc6430c832b24f5fc56ad8e3e50d1cc4&encoded=0&v=paper_preview&mkt=zh-cn
    [6] Hanson R E, Roberts J M, Dickerson P W, et al. Cryogenian intraplate magmatism along the buried southern Laurentian margin: Evidence from volcanic clasts in Ordovician strata, Marathon uplift, west Texas[J]. Geology, 2016, 44(7): 539-542. doi: 10.1130/G37889.1
    [7] Menand T, Annen C, Blanquat M de S. Rates of magma transfer in the crust: Insights into magma reservoir recharge and pluton growth[J]. Geology, 2015, 43(3): 199-202. http://cn.bing.com/academic/profile?id=6a98a13366f56b820cbddf206a768c8d&encoded=0&v=paper_preview&mkt=zh-cn
    [8] Pearce J A, Cann J R. Tectonic setting of basic volcanic rocks determined using trace element analyses[J]. Earth and Planetary Science Letters, 1973, 19(2): 290-300. doi: 10.1016/0012-821X(73)90129-5
    [9] Glassley W. Geochemistry and tectonics of the Crescent volcanic rocks, Olympic Peninsula, Washington[J]. GSA Bulletin, 1974, 85(5): 785-794. doi: 10.1130/0016-7606(1974)85<785:GATOTC>2.0.CO;2
    [10] Pearce J A, Lippard S J, Roberts S. Characteristics and tectonic significance of supra-subduction zone ophiolites[J]. Geological Society, London, Special Publications, 1984, 16(1): 77-94. doi: 10.1144/GSL.SP.1984.016.01.06
    [11] Pearce J A. Trace element characteristics of lavas from destructive plate boundaries[J]. Andesites, 1982, 8: 528-548. http://cn.bing.com/academic/profile?id=07172bff16d4303c212691c016029b47&encoded=0&v=paper_preview&mkt=zh-cn
    [12] Shervais J W. Ti-V plots and the petrogenesis of modern and ophiolitic lavas[J]. Earth and Planetary Science Letters, 1982, 59(1): 101-118. doi: 10.1016/0012-821X(82)90120-0
    [13] Wood D A. The application of a Th-Hf-Ta diagram to problems of tectonomagmatic classification and to establishing the nature of crustal contamination of basaltic lavas of the British Tertiary volcanic Province[J]. Earth and Planetary Science Letters, 1980, 50(1): 11-30. doi: 10.1016/0012-821X(80)90116-8
    [14] Mullen E D. MnO/TiO2/P2O5: a minor element discriminant for basaltic rocks of oceanic environments and its implications for petrogenesis[J]. Earth and Planetary Science Letters, 1983, 62(1): 53-62. doi: 10.1016/0012-821X(83)90070-5
    [15] Zhang Q, Sun W, Zhao Y, et al. New discrimination diagrams for basalts based on big data research[J]. Big Earth Data, 2019, 3(1): 45-55. doi: 10.1080/20964471.2019.1576262
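    [16] Wang J R, Chen W F, Zhang Q, et al. Data mining of N-MORB and E-MORB: discussion on basalt discrimination diagrams and the mantle nature of mid-ocean ridge sources[J]. Acta Petrologica Sinica, 2017, 33(3): 993-1005 (in Chinese). http://www.wanfangdata.com.cn/details/detail.do?_type=perio&id=ysxb98201703023
    [17] Wang J R, Pan Z J, Zhang Q, et al. Data mining of continental intraplate basalts: compositional diversity and performance in discrimination diagrams[J]. Acta Petrologica Sinica, 2016, 32(7): 1919-1933 (in Chinese). http://d.old.wanfangdata.com.cn/Periodical/ysxb98201607001
    [18] Yang J, Wang J R, Zhang Q, et al. Data mining of global island arc basalts: performance in basalt discrimination diagrams and preliminary interpretation[J]. Geological Bulletin of China, 2016, 35(12): 1937-1949 (in Chinese). doi: 10.3969/j.issn.1671-2552.2016.12.001 http://dzhtb.cgs.cn/gbc/ch/reader/view_abstract.aspx?file_no=20161201&flag=1
    [19] Wang Y L, Zhang C J. Discrimination of the tectonic setting of basalt formation using the Th/Hf-Ta/Hf diagram[J]. Acta Petrologica Sinica, 2001, 17(3): 413-421 (in Chinese).
    [20] Li Y Q, Du X L, Jin W J, et al. A comparative study of olivine in mid-ocean ridge, ocean island, and island arc basalts[J]. Chinese Journal of Geology, 2018, 53(4): 1228-1239 (in Chinese). http://d.old.wanfangdata.com.cn/Periodical/dzkx201804005
    [21] Han S, Li M C, Ren Q B, et al. Intelligent mining, discrimination, and analysis of the tectonic setting of basalts based on big data methods[J]. Acta Petrologica Sinica, 2018, 34(11): 3207-3216 (in Chinese). http://d.old.wanfangdata.com.cn/Periodical/ysxb98201811006
    [22] Jiao S T, Zhou Y Z, Zhang Q, et al. Intelligent discrimination of the tectonic setting from global gabbro big data based on the GEOROC database[J]. Acta Petrologica Sinica, 2018, 34(11): 3189-3194 (in Chinese). http://d.old.wanfangdata.com.cn/Periodical/ysxb98201811004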
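    [23] Vermeesch P. Tectonic discrimination of basalts with classification trees[J]. Geochimica et Cosmochimica Acta, 2006, 70(7): 1839-1848. doi: 10.1016/j.gca.2005.12.016
    [24] Zhu L Q, Zhang C. Application of the spectral clustering-AdaBoost ensemble data mining algorithm to lithology identification[J]. China Sciencepaper, 2016, 11(5): 545-550 (in Chinese). doi: 10.3969/j.issn.2095-2783.2016.05.014
    [25] Han Q D, Zhang X T, Shen W. Lithology identification technology based on the gradient boosting decision tree (GBDT) algorithm[J]. Bulletin of Mineralogy, Petrology and Geochemistry, 2018, 37(6): 1173-1180 (in Chinese). http://d.old.wanfangdata.com.cn/Periodical/kwysdqhxtb201806016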
    [26] Lehnert K, Su Y, Langmuir C H, et al. A global geochemical database structure for rocks[EB/OL]. Geochemistry, Geophysics, Geosystems, 2000. [2019-04-10]. https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/1999GC000026
    [27] Phan A V, Nguyen M L, Bui L T. Feature weighting and SVM parameters optimization based on genetic algorithms for classification problems[J]. Applied Intelligence, 2017, 46(2): 455-469. doi: 10.1007/s10489-016-0843-6
    [28] Luo X, Xu Y, Wang W, et al. Towards enhancing stacked extreme learning machine with sparse autoencoder by correntropy[J]. Journal of the Franklin Institute, 2018, 355(4): 1945-1966. doi: 10.1016/j.jfranklin.2017.08.014
    [29] Luo X, Sun J, Wang L, et al. Short-term wind speed forecasting via stacked extreme learning machine with generalized correntropy[J]. IEEE Transactions on Industrial Informatics, 2018, 14(11): 4963-4971. doi: 10.1109/TII.2018.2854549
    [30] Gorissen D, Couckuyt I, Demeester P, et al. A surrogate modeling and adaptive sampling toolbox for computer based design[J]. Journal of Machine Learning Research, 2010, 11(Jul): 2051-2055. http://cn.bing.com/academic/profile?id=f45f6cb72f866c5c22f2e6ab92c9a635&encoded=0&v=paper_preview&mkt=zh-cn
    [31] Yan R, Ma Z, Zhao Y, et al. A decision tree based data-driven diagnostic strategy for air handling units[J]. Energy and Buildings, 2016, 133: 37-45. doi: 10.1016/j.enbuild.2016.09.039
    [32] Lu D B. Research and application of data mining algorithms based on decision trees[D]. Master's thesis, Wuhan University of Technology, 2008 (in Chinese).
    [33] Mantovani R G, Horváth T, Cerri R, et al. Hyper-parameter tuning of a decision tree induction algorithm[C]//2016 5th Brazilian Conference on Intelligent Systems (BRACIS). IEEE, 2016: 37-42.
    [34] Rodriguez-Galiano V, Sanchez-Castillo M, Chica-Olmo M, et al. Machine learning predictive models for mineral prospectivity: An evaluation of neural networks, random forest, regression trees and support vector machines[J]. Ore Geology Reviews, 2015, 71: 804-818. doi: 10.1016/j.oregeorev.2015.01.001
    [35] Hastie T, Rosset S, Zhu J, et al. Multi-class AdaBoost[J]. Statistics and Its Interface, 2009, 2(3): 349-360. doi: 10.4310/SII.2009.v2.n3.a8
    [36] Du X L, Li Y Q, Jin W J, et al. Intelligent analysis of clinopyroxene in mafic-ultramafic magmatic rocks[J]. Chinese Journal of Geology, 2018, 53(4): 1215-1227 (in Chinese). http://www.wanfangdata.com.cn/details/detail.do?_type=perio&id=dzkx201804004
    [37] Snoek J, Larochelle H, Adams R P. Practical Bayesian optimization of machine learning algorithms[C]//Advances in Neural Information Processing Systems. 2012: 2951-2959.
    [38] Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: Machine learning in Python[J]. Journal of Machine Learning Research, 2011, 12(Oct): 2825-2830. http://d.old.wanfangdata.com.cn/OAPaper/oai_arXiv.org_1309.0238
    [39] Altmann A, Toloşi L, Sander O, et al. Permutation importance: a corrected feature importance measure[J]. Bioinformatics, 2010, 26(10): 1340-1347. doi: 10.1093/bioinformatics/btq134

Publication history
Received:  2019-04-16
Revised:  2019-07-23
Published:  2019-12-15
