Automated synthesis of biodiversity knowledge requires better tools and standardised research output

Cited by: 2
Authors
Cornford, Richard [1, 2, 3]
Millard, Joseph [4, 5]
Gonzalez-Suarez, Manuela [6]
Freeman, Robin [2]
Johnson, Thomas Frederick [7]
Affiliations
[1] Imperial Coll London, Dept Life Sci, London, England
[2] Zool Soc London, Inst Zool, London, England
[3] Nat Hist Museum, Dept Life Sci, London, England
[4] UCL, Dept Genet Evolut & Environm, London, England
[5] Univ Oxford, Leverhulme Ctr Demog Sci, Oxford, England
[6] Univ Reading, Sch Biol Sci, Reading, Berks, England
[7] Univ Sheffield, Dept Anim & Plant Sci, Sheffield, S Yorkshire, England
Keywords
data extraction; ecology; literature synthesis; machine learning; population trends; text mining
DOI
10.1111/ecog.06068
Chinese Library Classification
X176 [Biodiversity conservation]
Discipline classification code
090705
Abstract
As the impact of anthropogenic activity on the environment has grown, research into biodiversity change and associated threats has also accelerated. Synthesising this vast literature is important for understanding the drivers of biodiversity change and identifying those actions that will mitigate further ecological losses. However, keeping pace with an ever-increasing publication rate presents a substantial challenge to efficient syntheses, an issue which could be partly addressed by increasing levels of automation in the synthesis pipeline. Here, we evaluate the potential for automated tools to extract ecologically important information from the abstracts of articles compiled in the Living Planet Database. Specifically, we focused on extracting key information on taxonomy (studied species names), geographic location and estimated population trend, assessing the accuracy of automated versus manual information extraction, the potential for automated tools to introduce biases into syntheses, and evaluating if synthesising abstracts was enough to capture the key information from the full article. Taxonomic and geographic extraction tools performed reasonably well, although information on studied species was sometimes limited in the abstract (compared to the main text) preventing fast extraction. In contrast, extraction of trends was less successful, highlighting the challenges involved in automating information extraction from abstracts, such as deficiencies in the algorithms, linguistic complexity associated with ecological findings, and limited information when compared to the main text. In light of these results, we cautiously advocate for a wider use of automated taxonomic and geographic parsing tools for ecological synthesis. 
Additionally, to further the use of automated synthesis within ecology, we recommend a dual approach: development of improved computational tools to reduce biases; and enhanced protocols for abstracts (and associated metadata) to ensure key information is included in a format that facilitates machine-readability.
Pages: 9