Identifying Image Related Sentences in News Articles

Authors
Ilter, Melike Esma [1]
Akarun, Lale [1]
Ozgur, Arzucan [1]
Affiliations
[1] Bogazici Univ, Bilgisayar Muhendisligi Bolumu, Istanbul, Turkey
Keywords
news images; news image sentences; news image captioning
DOI
10.1109/siu.2019.8806258
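Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809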
Abstract
With the increasing availability of images on the web, identifying image-related sentences has become an important problem. The task also matters to the news publishing community, where it supports automatic captioning of news images and summarization. Although a large body of research has been devoted to image captioning, it remains a challenging problem, and previous work mostly focuses on generating new captions for images. This study addresses, for the first time, the problem of identifying image-related sentences in news articles: rather than generating a caption from scratch, we select the most appropriate set of sentences for the image from the news text itself. We use the CNN news dataset, which contains only the text of the news articles, as a basis and augment it by collecting the images of those articles. We generated two-class ground truth for the images and sentences of the news articles using cosine similarity over TF-IDF and Word2Vec vectors and the SEMILAR sentence-to-sentence similarity methods. The experimental results show that a Naive Bayes classifier with HOG feature selection gives better results.
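The TF-IDF cosine labeling mentioned in the abstract could look roughly like the sketch below. It is not the authors' implementation: the abstract does not say which text each sentence is compared against, so the sketch assumes a reference text such as the image caption, and the function names (`tokenize`, `tfidf_vectors`, `cosine`, `label_sentences`) and the `threshold` value are hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF, so a term present in every document keeps a nonzero weight.
    idf = {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}
    return [{term: tf * idf[term] for term, tf in Counter(doc).items()} for doc in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def label_sentences(reference, sentences, threshold=0.2):
    """Label each sentence 1 (image related) or 0 (not related).

    `reference` is assumed to be text describing the image, e.g. its caption.
    `threshold` is an illustrative cutoff, not a value from the paper.
    """
    docs = [tokenize(reference)] + [tokenize(s) for s in sentences]
    vectors = tfidf_vectors(docs)
    ref_vec = vectors[0]
    return [
        (sentence, 1 if cosine(ref_vec, vec) >= threshold else 0)
        for sentence, vec in zip(sentences, vectors[1:])
    ]


if __name__ == "__main__":
    caption = "Firefighters battle a wildfire near the town"
    article = [
        "Firefighters battle the wildfire overnight.",
        "The stock market closed higher on Friday.",
    ]
    for sentence, label in label_sentences(caption, article):
        print(label, sentence)
```

The same labeling step would work with Word2Vec sentence vectors or a SEMILAR similarity score in place of `cosine` over TF-IDF vectors; only the similarity function changes, while the thresholding into two classes stays the same.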
Pages: 4