A Hybrid Probabilistic Approach for Table Understanding

Cited by: 0
Authors
Sun, Kexuan [1]
Rayudu, Harsha [1]
Pujara, Jay [1]
Affiliations
[1] Univ Southern Calif, Informat Sci Inst, Los Angeles, CA 90089 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Tables of data are used to record vast amounts of socioeconomic, scientific, and governmental information. Although humans create tables using underlying organizational principles, AI systems struggle to understand the contents of these tables. This paper introduces an end-to-end system for table understanding, the process of capturing the relational structure of data in tables. We introduce models that identify cell types, group these cells into blocks of data that serve a similar functional role, and predict the relationships between these blocks. We introduce a hybrid, neuro-symbolic approach, combining embedded representations learned from thousands of tables with probabilistic constraints that capture regularities in how humans organize tables. Our neuro-symbolic model is better able to capture positional invariants of headers and enforce homogeneity of data types. One limitation in this research area is the lack of rich datasets for evaluating end-to-end table understanding, so we introduce a new benchmark dataset comprising 431 diverse tables from data.gov. The evaluation results show that our system achieves state-of-the-art performance on cell type classification, block identification, and relationship prediction, improving over prior efforts by up to 7% in macro F1 score.
Pages: 4366-4374
Number of pages: 9