Text Reuse Detection by Keyword Extraction for Telegram Channels

Cited by: 0
Authors
Saki, Misam [1]
Faili, Heshaam [1]
Asadpour, Masoud [1]
Affiliations
[1] Univ Tehran, Sch Elect & Comp Engn, Tehran, Iran
Keywords
Text Reuse; Text Similarity; Text Clustering; Keyword Extraction; Telegram;
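DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification codes
0808; 0809;
Abstract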
Text reuse detection is the task of finding similar texts, with applications in areas such as plagiarism detection and the analysis of information diffusion. The general approach is to detect text similarity and combine it with other features, such as timestamps, which can establish the precedence of publishers, e.g. to find the first publisher. This article proposes a method for finding similar texts based on keyword extraction which, like the LSH method, runs in linear time. In addition, it supports dynamic inputs and does not depend on the dimensionality of the text vectors. Our evaluations show that it performs better in both clustering quality measures and run time.
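The abstract does not spell out the algorithm, so the following is only a minimal illustrative sketch of the general idea (grouping texts by extracted keywords and using timestamps to find the first publisher), not the authors' method. The stopword list, the frequency-then-length keyword ranking, and the names `keyword_signature` and `ReuseIndex` are all assumptions made for this example.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, tiny stopword list; a real system would use a language-specific one.
STOPWORDS = {"the", "a", "an", "of", "in", "is", "and", "to", "for", "on"}


def keyword_signature(text, k=3):
    """Return the k top-ranked tokens of a text as a sorted tuple (assumed ranking)."""
    tokens = [t for t in re.findall(r"\w+", text.lower()) if t not in STOPWORDS]
    counts = Counter(tokens)
    # Rank by frequency, then by length, then alphabetically so ties are deterministic.
    ranked = sorted(counts, key=lambda t: (-counts[t], -len(t), t))
    return tuple(sorted(ranked[:k]))


class ReuseIndex:
    """Buckets posts by keyword signature; posts can be added at any time."""

    def __init__(self, k=3):
        self.k = k
        self.buckets = defaultdict(list)  # signature -> [(timestamp, post_id), ...]

    def add(self, post_id, text, timestamp):
        # One hash lookup per post, so indexing is a single pass over the input.
        sig = keyword_signature(text, self.k)
        self.buckets[sig].append((timestamp, post_id))
        return sig

    def first_publisher(self, sig):
        # The earliest timestamp in a bucket identifies the first publisher.
        return min(self.buckets[sig])[1]

    def clusters(self):
        # Only buckets with more than one post indicate possible reuse.
        return [sorted(pid for _, pid in posts)
                for posts in self.buckets.values() if len(posts) > 1]


index = ReuseIndex(k=3)
sig = index.add("chanB/7", "Earthquake hits Tehran: earthquake damage reported in Tehran", 200)
index.add("chanA/3", "Reported: earthquake in Tehran, earthquake shakes Tehran", 100)
index.add("chanC/1", "Football league results announced today", 150)
print(index.clusters(), index.first_publisher(sig))
```

Because each post is assigned to a bucket independently of the others, new posts can be indexed as they arrive without recomputing earlier assignments, and no fixed-dimension text vector is needed. Exact signature matching is a deliberate simplification here; near-matching signatures would need an additional merging step.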
Pages: 1481 - 1484
Number of pages: 4
Related papers
50 in total
  • [41] Variance-based features for keyword extraction in Persian and English text documents
    Veisi, H.
    Aflaki, N.
    Parsafard, P.
    SCIENTIA IRANICA, 2020, 27 (03) : 1301 - 1315
  • [42] Research on Cross Language Text Keyword Extraction Based on Information Entropy and TextRank
    Zhang, Xiaoyu
    Wang, Yongbin
    Wu, Lin
    PROCEEDINGS OF 2019 IEEE 3RD INFORMATION TECHNOLOGY, NETWORKING, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (ITNEC 2019), 2019, : 16 - 19
  • [43] Variance-based features for keyword extraction in Persian and English text documents
    Veisi H.
    Aflaki N.
    Parsafard P.
    Scientia Iranica, 2020, 27 (3 D) : 1301 - 1315
  • [44] Chinese Text Keyword Extraction Based on Doc2vec And TextRank
    Wang, Wei
    Li, Xiangshun
    Yu, Sheng
    PROCEEDINGS OF THE 32ND 2020 CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2020), 2020, : 369 - 373
  • [45] SEMANTIC KEYWORD EXTRACTION VIA ADAPTIVE TEXT BINARIZATION OF UNSTRUCTURED UNSOURCED VIDEO
    Merler, Michele
    Kender, John R.
    2009 16TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOLS 1-6, 2009, : 261 - 264
  • [46] "COMMUNICATIVE PARASITISM" IN THE RUSSIAN SEGMENT OF TELEGRAM CHANNELS
    Troshchenkova, Ekaterina, V
    VESTNIK VOLGOGRADSKOGO GOSUDARSTVENNOGO UNIVERSITETA-SERIYA 2-YAZYKOZNANIE, 2023, 22 (05) : 53 - 71
  • [47] Joint Learning with Keyword Extraction for Event Detection in Social Media
    Chen, Guandan
    Mao, Wenji
    Kong, Qingchao
    Han, Han
    2018 IEEE INTERNATIONAL CONFERENCE ON INTELLIGENCE AND SECURITY INFORMATICS (ISI), 2018, : 214 - 219
  • [48] Quadrilateral signboard detection and text extraction
    Tam, A
    Shen, H
    Liu, JZ
    Tang, XO
    CISST'03: PROCEEDING OF THE INTERNATIONAL CONFERENCE ON IMAGING SCIENCE, SYSTEMS AND TECHNOLOGY, VOLS 1 AND 2, 2003, : 708 - 713
  • [49] Source Retrieval for Web-Scale Text Reuse Detection
    Hagen, Matthias
    Potthast, Martin
    Adineh, Payam
    Fatehifar, Ehsan
    Stein, Benno
    CIKM'17: PROCEEDINGS OF THE 2017 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2017, : 2091 - 2094
  • [50] Evaluation of Fingerprint Selection Algorithms for Local Text Reuse Detection
    Jekabsons, Gints
    APPLIED COMPUTER SYSTEMS, 2020, 25 (01) : 11 - 18