Multi-engine collaborative bootstrapping for word sense disambiguation

Cited by: 1
Authors
Duan, Jianyong [1]
Lu, Ruzhan
Li, Xuening
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[2] So Yangtze Univ, Wuxi 214036, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
bootstrapping algorithms; machine learning; word sense disambiguation;
DOI
10.1142/S0218213007003369
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper we propose a new word sense disambiguation method called Multi-engine Collaborative Bootstrapping (MCB), which combines different types of corpora and uses two languages for bootstrapping. MCB uses bilingual bootstrapping as its core algorithm, which leads to incremental knowledge acquisition. The EM model is applied to train the parameters of a base learner. The feature translation model is improved by semantic correlation estimation. In addition, we use multi-engine selection to produce qualified starting seeds from parallel corpora and monolingual corpora. Seeds generated through unsupervised machine learning approaches can also ensure bootstrapping effectiveness, as manually selected seeds do, despite their different selection mechanisms. Experimental results demonstrate the effectiveness of MCB. Because the EM algorithm is sensitive to starting values, our experiments also examine factors such as the feature space and the number of starting seeds. Resource limitations are also a concern.
Pages: 465-482
Number of pages: 18