Explainable natural language processing with matrix product states

Cited by: 3
Authors
Tangpanitanon, Jirawat [1 ,2 ]
Mangkang, Chanatip [3 ]
Bhadola, Pradeep [4 ]
Minato, Yuichiro [5 ]
Angelakis, Dimitris G. [6 ,7 ]
Chotibut, Thiparat [3 ]
Affiliations
[1] Quantum Technol Fdn Thailand, Bangkok, Thailand
[2] Minist Higher Educ Sci Res & Innovat, Thailand Ctr Excellence Phys, Bangkok, Thailand
[3] Chulalongkorn Univ, Fac Sci, Dept Phys, Chula Intelligent & Complex Syst, Bangkok, Thailand
[4] Mahidol Univ, Ctr Theoret Phys & Nat Philosophy, Nakhonsawan Studiorum Adv Studies, Nakhonsawan Campus, Khao Thong, Thailand
[5] Blueqat Inc, Tokyo, Japan
[6] Tech Univ Crete, Sch Elect & Comp Engn, Khania, Greece
[7] Natl Univ Singapore, Ctr Quantum Technol, Singapore, Singapore
Source
NEW JOURNAL OF PHYSICS | 2022, Vol. 24, No. 05
Keywords
matrix product state; entanglement entropy; entanglement spectrum; quantum machine learning; natural language processing; recurrent neural networks; TENSOR NETWORKS; QUANTUM;
DOI
10.1088/1367-2630/ac6232
Chinese Library Classification
O4 [Physics];
Discipline classification
0702;
Abstract
Despite empirical successes of recurrent neural networks (RNNs) in natural language processing (NLP), theoretical understanding of RNNs is still limited due to intrinsically complex non-linear computations. We systematically analyze RNNs' behaviors in a ubiquitous NLP task, the sentiment analysis of movie reviews, via the mapping between a class of RNNs called recurrent arithmetic circuits (RACs) and a matrix product state. Using the von Neumann entanglement entropy (EE) as a proxy for information propagation, we show that single-layer RACs possess a maximum information propagation capacity, reflected by the saturation of the EE. Enlarging the bond dimension beyond the EE saturation threshold does not increase model prediction accuracies, so a minimal model that best estimates the data statistics can be inferred. Although the saturated EE is smaller than the maximum EE allowed by the area law, our minimal model still achieves ~99% training accuracies in realistic sentiment analysis data sets. Thus, low EE is not a warrant against the adoption of single-layer RACs for NLP. Contrary to a common belief that long-range information propagation is the main source of RNNs' successes, we show that single-layer RACs harness high expressiveness from the subtle interplay between the information propagation and the word vector embeddings. Our work sheds light on the phenomenology of learning in RACs, and more generally on the explainability of RNNs for NLP, using tools from many-body quantum physics.
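As an illustration of the entanglement-entropy bound the abstract refers to, the following is a minimal sketch (not the authors' code) that contracts random open-boundary MPS tensors into a dense state vector and computes the half-chain von Neumann entanglement entropy via a Schmidt (SVD) decomposition. All function names, tensor shapes, and parameter values below are illustrative assumptions chosen for a small toy system.

```python
# Minimal sketch (illustrative only): half-chain von Neumann entanglement
# entropy of a state built from random MPS tensors, showing that the EE is
# bounded by log(chi), where chi is the bond dimension.
import numpy as np

def entanglement_entropy(state, cut):
    """EE across a left/right bipartition of a normalized state vector.

    `state` has shape (d,) * n for n sites of local dimension d; `cut` is the
    number of sites kept on the left of the bipartition.
    """
    n = state.ndim
    d = state.shape[0]
    # Reshape into a matrix: rows index the left block, columns the right.
    m = state.reshape(d**cut, d**(n - cut))
    # Schmidt coefficients are the singular values of this matrix.
    s = np.linalg.svd(m, compute_uv=False)
    p = s**2 / np.sum(s**2)          # Schmidt probabilities
    p = p[p > 1e-12]                 # drop numerical zeros
    return -np.sum(p * np.log(p))    # von Neumann entropy (in nats)

def random_mps_state(n_sites=8, d=2, chi=4, seed=0):
    """Contract random open-boundary MPS tensors into a dense state vector."""
    rng = np.random.default_rng(seed)
    # Edge tensors carry bond dimension 1 at the open boundaries.
    tensors = [rng.normal(size=(1 if i == 0 else chi, d,
                                1 if i == n_sites - 1 else chi))
               for i in range(n_sites)]
    psi = tensors[0]
    for t in tensors[1:]:
        psi = np.tensordot(psi, t, axes=([-1], [0]))
    psi = psi.reshape((d,) * n_sites)
    return psi / np.linalg.norm(psi)

if __name__ == "__main__":
    for chi in (2, 4, 8):
        psi = random_mps_state(chi=chi)
        ee = entanglement_entropy(psi, cut=4)
        # The half-chain EE of an MPS with bond dimension chi cannot exceed log(chi).
        print(f"chi={chi}: EE={ee:.3f}  (bound log(chi)={np.log(chi):.3f})")
```

Running the sketch prints half-chain entropies that stay below log(chi) for each bond dimension, which is the kind of saturation-with-bond-dimension behavior the abstract describes for single-layer RACs.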
Pages: 16