Evaluating Reference String Extraction Using Line-Based Conditional Random Fields: A Case Study with German Language Publications

Cited by: 2
Authors
Koerner, Martin [1]
Ghavimi, Behnam [2]
Mayr, Philipp [2]
Hartmann, Heinrich
Staab, Steffen [1]
Affiliations
[1] Univ Koblenz Landau, Inst Web Sci & Technol, Koblenz, Germany
[2] GESIS Leibniz Inst Social Sci, Cologne, Germany
Keywords
Reference extraction; Citations; Conditional random fields; German language papers; Information extraction
DOI
10.1007/978-3-319-67162-8_15
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The extraction of individual reference strings from the reference section of scientific publications is an important step in the citation extraction pipeline. Current approaches divide this task into two steps by first detecting the reference section areas and then grouping the text lines in such areas into reference strings. We propose a classification model that considers every line in a publication as a potential part of a reference string. By applying line-based conditional random fields rather than constructing the graphical model based on individual words, dependencies and patterns that are typical in reference sections provide strong features while the overall complexity of the model is reduced. We evaluated our novel approach RefExt against various state-of-the-art tools (CERMINE, GROBID, and ParsCit) and a gold standard which consists of 100 German language full text publications from the social sciences. The evaluation demonstrates that we are able to outperform state-of-the-art tools which rely on the identification of reference section areas.
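The line-based idea in the abstract can be illustrated with a minimal sketch. It is not the RefExt implementation: the features in `line_features` and the `B-REF` / `I-REF` / `O` label scheme are assumptions chosen for illustration, and the labels are supplied by hand rather than predicted by a trained conditional random field. The sketch shows only the two pieces that surround such a model: turning each text line into a feature dictionary, and grouping labelled lines into reference strings.

```python
import re
from typing import Dict, List

# Four-digit year such as 1998 or 2012 (a typical cue inside reference lines).
YEAR = re.compile(r"\b(19|20)\d{2}\b")


def line_features(line: str) -> Dict[str, object]:
    """Hypothetical per-line features; not the feature set used by RefExt."""
    stripped = line.strip()
    return {
        "has_year": bool(YEAR.search(stripped)),
        "starts_with_capital": stripped[:1].isupper(),
        "ends_with_period": stripped.endswith("."),
        "comma_count": stripped.count(","),
        "length": len(stripped),
    }


def group_reference_strings(lines: List[str], labels: List[str]) -> List[str]:
    """Join lines into reference strings using assumed B-REF / I-REF / O labels.

    B-REF opens a new reference string, I-REF continues the open one, and any
    other label (or an I-REF with no open reference) closes the open one.
    """
    references: List[str] = []
    current: List[str] = []
    for line, label in zip(lines, labels):
        if label == "B-REF":
            if current:
                references.append(" ".join(current))
            current = [line.strip()]
        elif label == "I-REF" and current:
            current.append(line.strip())
        else:
            if current:
                references.append(" ".join(current))
            current = []
    if current:
        references.append(" ".join(current))
    return references


# Hand-labelled example lines (invented for illustration).
example_lines = [
    "Some body text of the publication.",
    "Mueller, A. (2010). Ein Titel.",
    "Zeitschrift, 3, 1-10.",
    "Schmidt, B. (2012). Noch ein Titel.",
    "Anhang",
]
example_labels = ["O", "B-REF", "I-REF", "B-REF", "O"]
example_refs = group_reference_strings(example_lines, example_labels)
```

In a full pipeline, the feature dictionaries for all lines of a publication would form one input sequence for a sequence labeller, so that every line in the document is a candidate part of a reference string and no separate reference-section detection step is needed.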
Pages: 137-145
Number of pages: 9