Classification of Exaggerated News Headlines

Authors
Rangata, Mapitsi Roseline [1]
Sefara, Tshephisho Joseph [1]
Affiliations
[1] CSIR, Pretoria, South Africa
Keywords
Classification; News headlines; Machine learning; Natural language processing; Exaggerated news
DOI
10.1007/978-3-031-53731-8_20
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The amount of data online is increasing as companies publish news articles daily. The headlines of these articles often carry a degree of exaggeration intended to win readers. Because these companies compete with one another, writing appealing, exaggerated headlines is one way to attract readers, and some exaggerated headlines contain misleading information. This paper therefore applies machine learning methods and natural language processing to detect exaggerated news headlines in the South African context. Logistic regression, decision trees, support vector machines (SVM), and XGBoost are trained on labelled news headlines as a binary classification task. The models produced good results, with XGBoost and SVM obtaining an accuracy of 70%. The models were also evaluated with the F-measure, on which decision trees obtained 56%, followed by SVM with 53%. Classifying exaggerated news headlines is a difficult task. We therefore oversampled the data to obtain balanced labels, which improved the performance of the models: SVM obtained 84%, followed by logistic regression, XGBoost, and decision trees with accuracies of 78%, 72%, and 71%, respectively.
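The abstract reports both accuracy and the F-measure, and describes oversampling to balance the labels. The following is a minimal sketch of those two ideas using only the Python standard library. It is not the authors' pipeline: the function names, the toy headlines, and the 8-to-2 label split are all hypothetical, and simple random duplication is assumed as the oversampling method because the abstract does not say which one was used.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority-class items until every class
    has as many items as the largest class (assumed method, see lead-in)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_texts, out_labels = list(texts), list(labels)
    for cls, n in counts.items():
        pool = [t for t, y in zip(texts, labels) if y == cls]
        for _ in range(target - n):
            out_texts.append(rng.choice(pool))
            out_labels.append(cls)
    return out_texts, out_labels


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_measure(y_true, y_pred, positive=1):
    """F1 score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: label 1 = exaggerated, label 0 = not exaggerated.
headlines = [f"placeholder headline {i}" for i in range(10)]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# A trivial predictor that always outputs the majority class looks fine
# on accuracy but finds no exaggerated headline at all.
majority_pred = [0] * len(labels)
print(accuracy(labels, majority_pred))   # 0.8
print(f_measure(labels, majority_pred))  # 0.0

# After oversampling, both classes have the same number of examples.
balanced_x, balanced_y = random_oversample(headlines, labels)
print(Counter(balanced_y))
```

The toy predictor illustrates why accuracy alone can be misleading on imbalanced labels and why an F-measure is reported alongside it. A common precaution, not stated in the abstract, is to oversample only the training split so that duplicated headlines do not leak into the evaluation data.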
Pages: 248 - 260
Number of pages: 13