easIE: Easy-to-Use Information Extraction for Constructing CSR Databases From the Web

Cited by: 5
Authors
Gkatziaki, Vasiliki [1 ]
Papadopoulos, Symeon [1 ]
Mills, Richard [2 ]
Diplaris, Sotiris [1 ]
Tsampoulatidis, Ioannis [1 ]
Kompatsiaris, Ioannis [1 ]
Affiliations
[1] ITI, CERTH ITI, Thessaloniki, Greece
[2] Univ Cambridge, Cambridge, England
Funding
EU Horizon 2020
Keywords
Information extraction; Web wrapper; corporate social responsibility (CSR); environmental, social, and governance (ESG); disclosure; performance
DOI
10.1145/3155807
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Public awareness of and concerns about companies' social and environmental impacts have seen a marked increase over recent decades. In parallel, the quantity of relevant information has increased, as states pass laws requiring certain forms of reporting, researchers investigate companies' performance, and companies themselves seek to gain a competitive advantage by being seen to operate fairly and transparently. However, this information is typically dispersed and non-standardized, making it complicated to collect and analyze. To address this challenge, the WikiRate platform aims to collect this information and store it in a standardized format within a centralized public repository, making it much more amenable to analysis. In the context of WikiRate, this article introduces easIE, an easy-to-use information extraction (IE) framework that leverages general Web IE principles for building datasets with environmental, social, and governance information from the Web. To demonstrate the flexibility and value of easIE, we built a large-scale corporate social responsibility database comprising 654,491 metrics related to 49,009 companies, spending less than 16 hours on data engineering, collection, and indexing. Finally, a data collection exercise involving 12 subjects was performed to showcase the ease of use of the developed framework.
Pages: 21