Community structure models are improved by exploiting taxonomic rank with predictive clustering trees

Cited by: 8
Authors
Levatic, Jurica [1, 2]
Kocev, Dragi [1]
Debeljak, Marko [1, 2]
Dzeroski, Saso [1, 2]
Affiliations
[1] Jozef Stefan Inst, Dept Knowledge Technol, Ljubljana 1000, Slovenia
[2] Jozef Stefan Int Postgrad Sch, Ljubljana 1000, Slovenia
Keywords
Community structure modelling; Taxonomic rank; Predictive clustering trees; Classification; Hierarchical multi-label classification; RIVER WATER-QUALITY; REGRESSION TREES; CHEMICAL-PARAMETERS; CLASSIFICATION; ENSEMBLES; ECOLOGY; RULES
DOI
10.1016/j.ecolmodel.2014.10.023
Chinese Library Classification number
Q14 [Ecology (Bioecology)]
Discipline classification code
071012; 0713
Abstract
Community structure modelling studies the influence of biotic and abiotic factors on the abundance and composition of a given taxonomic group of organisms. With the advancement of measurement and sensor technology, the availability, precision and complexity of environmental data constantly increase. Nowadays, measurements of ecosystems provide a complete snapshot of the state of the system, including information about the community structure of the organisms present in a given sample. These measurements include multi-species data, which are typically analysed by constructing community models as collections of models built for each species separately (local models), without considering the possible (taxonomic) relationships among species. In this work, we propose to construct a single community structure model for all the species (global model) that is able to exploit these relationships. Namely, we investigate whether including additional information, in the form of taxonomic rank or multiple species, helps to build better community structure models. More specifically, we use predictive clustering trees (a generalized form of decision trees) to build models for three practically relevant community structure modelling datasets: the microarthropod community living in the agricultural soils of Denmark, organisms living in Slovenian rivers, and vegetation found in the State of Victoria, Australia. On each dataset, we compare the performance of four types of community structure models, which correspond to four machine learning tasks: single-species models without taxonomic rank correspond to single-label classification; single-species models with taxonomic rank correspond to hierarchical single-label classification; multi-species models without taxonomic rank correspond to multi-label classification; and multi-species models with taxonomic rank correspond to hierarchical multi-label classification.
The results of the experimental evaluation reveal that by using the taxonomic rank and the multi-species aspect of the data, we are able to learn better community structure models. © 2014 Elsevier B.V. All rights reserved.
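The four task types named in the abstract differ mainly in how the prediction target is encoded. The following is a minimal sketch of that difference only, not the authors' implementation (which relies on predictive clustering trees). The toy taxonomy, the label names and the helper functions `with_ancestors` and `target_sets` are all hypothetical and are not taken from the paper's datasets.

```python
# Hypothetical toy taxonomy (child -> parent); names are illustrative only.
PARENT = {
    "species_a": "genus_x",
    "species_b": "genus_x",
    "species_c": "genus_y",
    "genus_x": "family_f",
    "genus_y": "family_f",
}


def with_ancestors(label):
    """Return the label followed by all of its ancestors in PARENT."""
    path = [label]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def target_sets(present, multi_species, use_taxonomy):
    """Build the target label sets for one sample.

    present       -- set of species observed in the sample
    multi_species -- False: one target per species (local models);
                     True: one joint target for all species (global model)
    use_taxonomy  -- True: add each species' higher taxonomic ranks
    """
    if use_taxonomy:
        per_species = [set(with_ancestors(s)) for s in sorted(present)]
    else:
        per_species = [{s} for s in sorted(present)]
    if multi_species:
        # A single global model sees the union of all labels.
        return [set().union(*per_species)]
    # Local models: one separate target per species.
    return per_species


sample = {"species_a", "species_c"}
single_label = target_sets(sample, multi_species=False, use_taxonomy=False)
hier_single_label = target_sets(sample, multi_species=False, use_taxonomy=True)
multi_label = target_sets(sample, multi_species=True, use_taxonomy=False)
hier_multi_label = target_sets(sample, multi_species=True, use_taxonomy=True)
```

Under these assumptions, the hierarchical variants carry the shared higher ranks (here `genus_x`, `genus_y`, `family_f`) in the target, which is the kind of relationship among species that a global model can exploit.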
Pages: 294-304
Number of pages: 11