Exploiting Unlabeled Data for Question Classification

Cited by: 0
Authors
Tomas, David [1]
Giuliano, Claudio [2]
Affiliations
[1] Univ Alicante, Dept Software & Comp Syst, Alicante, Spain
[2] FBK Irst, Human Language Technol Grp, Trento, Italy
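Keywords
question classification; semi-supervised learning; kernel methods
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract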
In this paper, we introduce a kernel-based approach to question classification. We employ a kernel function based on latent semantic information acquired from Wikipedia. This kernel allows external semantic knowledge to be incorporated into the supervised learning process. By combining this knowledge with a bag-of-words approach by means of composite kernels, we obtain a highly effective question classifier. As the semantic information is acquired from unlabeled text, our system can be easily adapted to different languages and domains. We tested it on a parallel corpus of English and Spanish questions.
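The idea of combining a bag-of-words kernel with a latent semantic kernel can be illustrated with a minimal sketch. It is not the authors' implementation: the four-sentence corpus stands in for Wikipedia, the latent space comes from a plain truncated SVD of a term-by-document matrix, and the composite kernel is assumed to be the sum of two cosine-normalized kernels. All function and variable names are hypothetical.

```python
import numpy as np

# Assumption: a toy unlabeled corpus standing in for Wikipedia text.
UNLABELED_DOCS = [
    "the capital city of france is paris",
    "madrid is the capital city of spain",
    "the river nile flows through egypt",
    "the amazon river flows through brazil",
]

# Assumption: a few example questions (labels are not needed for the kernel).
QUESTIONS = [
    "what is the capital of italy",
    "which city is the capital of spain",
    "what river flows through egypt",
]


def tokenize(text):
    """Lowercase whitespace tokenization (deliberately simplistic)."""
    return text.lower().split()


def build_vocab(texts):
    """Map every distinct token to a column index."""
    tokens = sorted({tok for t in texts for tok in tokenize(t)})
    return {tok: i for i, tok in enumerate(tokens)}


def bow_vector(text, vocab):
    """Term-frequency vector of a text over the vocabulary."""
    vec = np.zeros(len(vocab))
    for tok in tokenize(text):
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    return vec


def latent_projection(docs, vocab, k):
    """Top-k left singular vectors of the term-by-document matrix."""
    term_doc = np.stack([bow_vector(d, vocab) for d in docs], axis=1)
    u, _, _ = np.linalg.svd(term_doc, full_matrices=False)
    return u[:, :k]  # shape: (vocabulary size, k)


def cosine(x, y):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    return float(x @ y / denom) if denom > 0 else 0.0


def composite_kernel(q1, q2, vocab, proj):
    """Sum of a bag-of-words kernel and a latent semantic kernel."""
    x1, x2 = bow_vector(q1, vocab), bow_vector(q2, vocab)
    k_bow = cosine(x1, x2)                   # surface word overlap
    k_ls = cosine(proj.T @ x1, proj.T @ x2)  # similarity in latent space
    return k_bow + k_ls


VOCAB = build_vocab(UNLABELED_DOCS + QUESTIONS)
PROJ = latent_projection(UNLABELED_DOCS, VOCAB, k=2)
```

Evaluating `composite_kernel` on every pair of questions gives a Gram matrix that could be handed to any kernel classifier accepting precomputed kernels. Because the projection is built from unlabeled text only, swapping the corpus for text in another language or domain changes the latent kernel without requiring new labeled questions.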
Pages: 137-144
Page count: 8