Intention-guided deep semi-supervised document clustering via metric learning

Cited by: 2
Authors
Li, Jingnan [1, 2]
Lin, Chuan [1, 2, 3]
Huang, Ruizhang [1, 2]
Qin, Yongbin [1, 2]
Chen, Yanping [1, 2]
Affiliations
[1] Guizhou Univ, State Key Lab Publ Big Data, Guiyang 550025, Peoples R China
[2] Guizhou Univ, Coll Comp Sci & Technol, Guiyang 550025, Peoples R China
[3] Guizhou Univ, Guiyang 550025, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Intention; Semi-supervised; Clustering; Metric learning; Networks
DOI
10.1016/j.jksuci.2022.12.010
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
An intention expresses a user's preference for how a document collection should be structured. Intention-guided document structure division is an important task in text mining, and deep semi-supervised document clustering offers a promising route to such personalized clustering. However, traditional deep semi-supervised clustering models suffer from a limited number of constraints, which is insufficient for intention-guided document clustering. Moreover, documents normally place different emphases in their representations to reflect different structural opinions. In this paper, we propose an intention-guided deep semi-supervised document clustering model, IGSC, which divides document structure based on a small amount of user-provided supervised information. IGSC uses a deep metric learning network to address both problems. The deep metric learner explores the user's global intention and outputs an intention matrix. The intention is learned from a small number of user-provided pairwise constraints and is used to guide representation learning. IGSC also uses the intention matrix to guide the clustering process, so that the clustering results best match the user's intention. We compare IGSC with a number of document clustering models on four real-world text datasets: Reu-10k, BBC, ACM, and Abstract. The results show that IGSC improves clustering performance and outperforms the best benchmark result by 7% on average. The comparison with other models and the visualization results demonstrate that IGSC is effective.
© 2022 The Author(s). Published by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
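To make the idea of learning a metric from pairwise constraints concrete, the following is a minimal sketch and not the IGSC network itself. It assumes a much simpler setting: documents are small dense vectors, and the "intention" is reduced to a diagonal weight vector learned by gradient descent on a contrastive loss over must-link and cannot-link pairs. The names `metric_distance_sq` and `learn_diagonal_metric`, the toy data, and all hyperparameters are hypothetical and chosen only for illustration.

```python
def metric_distance_sq(x, y, weights):
    """Squared distance under a diagonal metric (one non-negative weight per feature)."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y))


def learn_diagonal_metric(points, must_link, cannot_link,
                          margin=1.0, lr=0.1, epochs=100):
    """Learn feature weights from pairwise constraints (illustrative only).

    Loss: sum of squared distances over must-link pairs, plus a hinge term
    max(0, margin - squared distance) over cannot-link pairs.
    """
    dim = len(points[0])
    weights = [1.0] * dim
    for _ in range(epochs):
        grad = [0.0] * dim
        # Must-link pairs: shrink weights on features where the pair differs.
        for i, j in must_link:
            for k in range(dim):
                grad[k] += (points[i][k] - points[j][k]) ** 2
        # Cannot-link pairs: only active while the pair is closer than the margin.
        for i, j in cannot_link:
            if metric_distance_sq(points[i], points[j], weights) < margin:
                for k in range(dim):
                    grad[k] -= (points[i][k] - points[j][k]) ** 2
        # Gradient step, projected so that weights stay non-negative.
        weights = [max(0.0, w - lr * g) for w, g in zip(weights, grad)]
    return weights


# Toy data: feature 0 separates the groups the user cares about,
# feature 1 varies inside each group and should be down-weighted.
points = [(0.0, 0.0), (0.1, 2.0), (2.0, 0.1), (2.1, 2.0)]
must_link = [(0, 1), (2, 3)]
cannot_link = [(0, 2), (1, 3)]

weights = learn_diagonal_metric(points, must_link, cannot_link)
same_group = metric_distance_sq(points[0], points[1], weights)
other_group = metric_distance_sq(points[0], points[2], weights)
print(weights, same_group, other_group)
```

In this toy setting the learned weights emphasize the feature that agrees with the constraints, so must-link pairs end up closer than cannot-link pairs. A full intention matrix, as described in the abstract, would play a similar role on learned deep representations rather than on raw features.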
Pages: 416-425
Page count: 10