Synthesizing N-ary Relations from Web Tables

Cited by: 6
Authors
Lehmberg, Oliver [1]
Bizer, Christian [1]
Affiliation
[1] Univ Mannheim, Data & Web Sci Grp, Mannheim, Germany
Source
PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE, MINING AND SEMANTICS (WIMS 2019) | 2019
Keywords
Web Tables; Schema Matching; Schema Extension; FUNCTIONAL-DEPENDENCIES; LARGE-SCALE; EXTRACTION; DISCOVERY;
DOI
10.1145/3326467.3326480
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The Web contains a large number of relational HTML tables, which cover a multitude of different, often very specific topics. This rich pool of data has motivated a growing body of research on methods that use web table data to extend local tables with additional attributes or to add missing facts to knowledge bases. Nearly all existing approaches for these tasks are limited to extracting binary relations from web tables, e.g., they assume that an unemployment number depends only on the state. Inspecting randomly chosen tables on the Web quickly reveals that many relations in the tables are non-binary, e.g., unemployment numbers also depend on the point in time and the profession. Treating such n-ary relations as binary leads to data that cannot be interpreted correctly. The extraction of n-ary relations from web tables is complicated by two factors: (1) important attributes might be stated outside of the table; (2) relational web tables are usually too small for functional dependency discovery. This paper presents a method to synthesize n-ary relations from web tables for the use case of knowledge base extension. The method exploits information from the page around the table and stitches (combines) multiple tables from the same website. We apply the method to a corpus of 5 million web tables originating from 80 thousand different websites and find that 38% of the synthesized relations are non-binary. We find different relations for the same dependent attribute, e.g., relations providing unemployment numbers based on time, location, or profession. By identifying groups of websites that provide these relations, we lay the foundation for applications in knowledge base augmentation and data search, which allow for a specific selection of relations that determine an attribute according to the applications' data requirements.
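The following minimal sketch illustrates the two ideas named in the abstract: stitching small tables from the same website together with attributes taken from the surrounding page, and then checking which functional dependency holds. It is not the authors' implementation. The function names `stitch` and `holds`, the column names, and all values are hypothetical toy data chosen only for illustration.

```python
def stitch(tables):
    """Combine tables that share a schema into one relation.

    Each input is a (context, rows) pair; the context holds attributes
    found outside the table (here an assumed page-level "year"), which
    are added to every row as extra columns.
    """
    stitched = []
    for context, rows in tables:
        for row in rows:
            stitched.append({**context, **row})
    return stitched


def holds(rows, determinant, dependent):
    """Check whether the functional dependency determinant -> dependent holds."""
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in determinant)
        value = row[dependent]
        # A second, different value for the same key violates the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Toy data (invented): two small tables from the same hypothetical website,
# each too small on its own to reveal the dependency on the year.
tables = [
    ({"year": 2017}, [{"state": "Ohio", "rate": 5.0}, {"state": "Texas", "rate": 4.3}]),
    ({"year": 2018}, [{"state": "Ohio", "rate": 4.6}, {"state": "Texas", "rate": 3.9}]),
]

stitched = stitch(tables)
binary_ok = holds(stitched, ["state"], "rate")         # binary relation: state -> rate
nary_ok = holds(stitched, ["state", "year"], "rate")   # n-ary relation: (state, year) -> rate
```

In this toy example the binary dependency fails on the stitched relation while the dependency that includes the page-level attribute holds, which is the situation the abstract describes as an n-ary relation being misread as binary.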
Pages: 12