Text Reuse Detection by Keyword Extraction for Telegram Channels

Cited by: 0
Authors
Saki, Misam [1]
Faili, Heshaam [1]
Asadpour, Masoud [1]
Affiliations
[1] Univ Tehran, Sch Elect & Comp Engn, Tehran, Iran
Keywords
Text Reuse; Text Similarity; Text Clustering; Keyword Extraction; Telegram;
DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline classification codes
0808; 0809;
Abstract
Text reuse detection is the task of finding similar texts. It has many applications, for example in plagiarism detection and in the analysis of information diffusion. The general approach is to detect text similarity and combine it with other features, such as timestamps, which can establish the precedence of publishers, for example to find the first publisher. This article proposes a method for finding similar texts based on keyword extraction; like the LSH (locality-sensitive hashing) method, it runs in linear time. In addition, it supports dynamic inputs and does not depend on the dimensionality of text vectors. Our evaluations show that it has better performance in both clustering quality measures and run time.
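The abstract does not describe the algorithm itself, so the following is only a minimal sketch of the general idea it names: bucketing texts by a keyword-based signature and using timestamps to pick the earliest publisher. The stopword list, the frequency-based keyword ranking, the `TOP_K` parameter, and all function and class names are assumptions made for illustration, not the authors' method.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stopword list and keyword count, chosen only for illustration.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is", "for"}
TOP_K = 3


def keyword_signature(text, k=TOP_K):
    """Return the k highest-ranked tokens of a text as a sorted tuple."""
    tokens = [t for t in re.findall(r"\w+", text.lower()) if t not in STOPWORDS]
    counts = Counter(tokens)
    # Rank by frequency, then length, then alphabetically, so ties are deterministic.
    ranked = sorted(counts, key=lambda t: (-counts[t], -len(t), t))
    return tuple(sorted(ranked[:k]))


class KeywordClusterer:
    """Groups posts whose keyword signatures match exactly.

    Posts are added one at a time, so new input never requires
    reprocessing earlier posts, and no fixed-size text vector is built.
    """

    def __init__(self):
        self.clusters = defaultdict(list)

    def add(self, post_id, text, timestamp):
        sig = keyword_signature(text)
        self.clusters[sig].append((timestamp, post_id))
        return sig

    def first_publisher(self, sig):
        # The earliest timestamp in a cluster marks the first publisher.
        return min(self.clusters[sig])[1]


clusterer = KeywordClusterer()
sig_a = clusterer.add(
    "chan_a/1",
    "Breaking: earthquake hits Tehran, earthquake magnitude 5.1 reported in Tehran",
    100,
)
sig_b = clusterer.add(
    "chan_b/7",
    "earthquake hits Tehran!! earthquake magnitude 5.1 reported in Tehran today",
    90,
)
sig_c = clusterer.add(
    "chan_c/3",
    "Football results: Persepolis wins the derby, Persepolis fans celebrate derby",
    95,
)
```

In this toy run the two earthquake posts receive the same signature and land in one cluster, and `clusterer.first_publisher(sig_a)` returns the post with the earlier timestamp. Exact signature matching is a deliberate simplification; a real system would need a more robust keyword extractor and a tolerance for partially overlapping keyword sets.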
Pages: 1481-1484
Number of pages: 4