Cross-lingual document clustering

Cited by: 0
Authors
Wu, Ke [1]
Lu, Bao-Liang [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, 800 Dong Chuan Rd, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A growing number of Web-accessible documents are available in languages other than English, and managing these heterogeneous document collections poses a challenge. This paper proposes a novel model, called the domain alignment translation model, for cross-lingual document clustering. While most existing cross-lingual document clustering methods rely on an expensive machine translation system to bridge the gap between two languages, our model handles cross-lingual document clustering by learning a cross-lingual domain alignment model and a domain-specific term translation model collaboratively. Experimental results show that our method, C-TLS, using no resources other than a bilingual dictionary, achieves performance comparable to the direct machine translation approach, which relies on a machine translation system such as the Google language tool. Our method is also more efficient.
Pages: 956 / +
Page count: 2