Time Period Categorization in Fiction: A Comparative Analysis of Machine Learning Techniques

Authors
Westin, Fereshta [1, 2]
Affiliations
[1] Univ Boras, Boras, Sweden
[2] Univ Boras, Allegatan 1, Boras, Sweden
Keywords
Cataloging for digital resources; time period categorization; machine learning; text analysis; fiction; LDA; SBERT; TF-IDF; CLASSIFICATION
DOI
10.1080/01639374.2024.2315548
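Chinese Library Classification (CLC) number
G25 [Library science, librarianship]; G35 [Information science, information work]
Discipline classification code
1205; 120501
Abstract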
This study investigates the automatic categorization of time period metadata in fiction, a critical but often overlooked aspect of cataloging. Using a comparative analysis approach, the performance of three machine learning techniques, namely Latent Dirichlet Allocation (LDA), Sentence-BERT (SBERT), and Term Frequency-Inverse Document Frequency (TF-IDF), was assessed by examining their precision, recall, F1 scores, and confusion matrix results. LDA identifies underlying topics within the text, TF-IDF measures word importance, and SBERT measures sentence semantic similarity. Based on F1-score analysis and confusion matrix outcomes, TF-IDF and LDA effectively categorized text data by time period, while SBERT performed poorly across all time period categories.
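As a rough illustration of the evaluation measures named in the abstract, the sketch below computes a confusion matrix and per-class precision, recall, and F1 for a multi-class time period classifier. It is not the study's code: the labels (`medieval`, `1800s`, `1900s`), the toy predictions, and the function names are all hypothetical, chosen only to show how the scores are derived.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count how often each (true_label, predicted_label) pair occurs."""
    return Counter(zip(y_true, y_pred))


def per_class_scores(y_true, y_pred):
    """Return precision, recall, and F1 for every label (one-vs-rest)."""
    labels = sorted(set(y_true) | set(y_pred))
    counts = confusion_counts(y_true, y_pred)
    scores = {}
    for label in labels:
        tp = counts[(label, label)]
        # Predicted as `label` although the true label was something else.
        fp = sum(counts[(other, label)] for other in labels if other != label)
        # Truly `label` but predicted as something else.
        fn = sum(counts[(label, other)] for other in labels if other != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


# Hypothetical toy data, NOT results from the study.
y_true = ["medieval", "medieval", "1800s", "1800s", "1900s", "1900s"]
y_pred = ["medieval", "1800s", "1800s", "1800s", "1900s", "medieval"]

scores = per_class_scores(y_true, y_pred)
for label, s in scores.items():
    print(f"{label}: P={s['precision']:.2f} R={s['recall']:.2f} F1={s['f1']:.2f}")
```

Reporting scores per class, rather than a single accuracy figure, is what makes it possible to say that a method performs poorly "across all time period categories" and not merely on average.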
Pages: 124-153
Page count: 30