A Hybrid Probabilistic Approach for Table Understanding

Cited by: 0
Authors
Sun, Kexuan [1]
Rayudu, Harsha [1]
Pujara, Jay [1]
Affiliations
[1] University of Southern California, Information Sciences Institute, Los Angeles, CA 90089, USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Tables of data are used to record vast amounts of socioeconomic, scientific, and governmental information. Although humans create tables using underlying organizational principles, AI systems struggle to understand the contents of these tables. This paper introduces an end-to-end system for table understanding, the process of capturing the relational structure of data in tables. We introduce models that identify cell types, group these cells into blocks of data that serve a similar functional role, and predict the relationships between these blocks. We introduce a hybrid, neuro-symbolic approach, combining embedded representations learned from thousands of tables with probabilistic constraints that capture regularities in how humans organize tables. Our neuro-symbolic model is better able to capture positional invariants of headers and enforce homogeneity of data types. One limitation in this research area is the lack of rich datasets for evaluating end-to-end table understanding, so we introduce a new benchmark dataset comprising 431 diverse tables from data.gov. The evaluation results show that our system achieves state-of-the-art performance on cell type classification, block identification, and relationship prediction, improving over prior efforts by up to 7% in macro F1 score.
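For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of how a macro F1 score is computed: per-class F1 is calculated separately and then averaged without weighting by class frequency. The function name `macro_f1` and the label names ("header", "data", "note") are hypothetical and chosen only for illustration; they are not taken from the paper or its dataset.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all classes seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    if not labels:
        return 0.0
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the true class but was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical cell-type labels, for illustration only.
gold = ["header", "header", "data", "data", "data", "note"]
pred = ["header", "data", "data", "data", "data", "note"]
score = macro_f1(gold, pred)
print(round(score, 4))
```

Because every class contributes equally to the average, a rare class that is predicted poorly lowers macro F1 as much as a frequent one would, which is why the metric is commonly reported for imbalanced label sets.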
Pages: 4366-4374
Page count: 9