Plagiarism Detection System for Indonesia Text Based Document by Fingerprint Method and Natural Language Processing Approach

Authors
Winarti, Titin [1]
Kerami, Djati [2]
Etp, Lussiana [3]
Sekarwati, Kemal Ade [4]
Affiliations
[1] Semarang Univ, Fac Informat Technol & Commun, Semarang 50196, Indonesia
[2] Indonesia Univ, Fac Math & Nat Sci, Depok 16424, Indonesia
[3] Sch Informat Management & Comp Jakarta, Comp Syst, Jakarta 12140, Indonesia
[4] Gunadarma Univ, Fac Comp Sci & Informat Technol, Jakarta 16424, Indonesia
Keywords
Plagiarism; Fingerprint; Natural Language Processing
DOI
10.1166/asl.2016.7993
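Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09
Abstract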
Plagiarism is widespread in many communities, academia among them. It is a major concern in the academic environment in particular, where it can affect both the credibility of an institution and its ability to ensure the quality of its students. In other words, plagiarism may reduce creativity in the community. This research combines the fingerprint method with a natural language processing (NLP) approach. Plagiarism detection can be carried out with various methods, such as Manber's algorithm, with similarity computed using the Jaccard coefficient, and the K-gram method as an alternative for detecting document similarity. The system is expected to let a user run the application without choosing the gram value and its window and still obtain an accurate similarity value. Although NLP techniques have been shown to improve the accuracy of detection tasks, other challenges remain. Current plagiarism detection tools are mostly limited to string-level comparisons between suspected plagiarised texts and potential original texts. Applying stemming increased the document similarity measurement by 31% on the documents that were tested.
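The abstract names three ingredients: K-gram fingerprints, a Manber-style selection of hashes, and the Jaccard coefficient. The Python sketch below illustrates how they might fit together; it is not the authors' implementation. The function names, the text normalization, the use of MD5 as a deterministic hash, and the parameters `k` and `p` are all assumptions made for illustration, and the stemming step reported in the abstract is omitted because it would need an Indonesian stemmer.

```python
import hashlib
import re


def kgrams(text: str, k: int) -> list[str]:
    """Return all overlapping character k-grams of the normalized text."""
    # Assumed normalization: lowercase and drop everything except a-z and 0-9.
    cleaned = re.sub(r"[^a-z0-9]", "", text.lower())
    if len(cleaned) < k:
        return []
    return [cleaned[i:i + k] for i in range(len(cleaned) - k + 1)]


def stable_hash(gram: str) -> int:
    """Deterministic 64-bit hash of a k-gram (MD5 is an arbitrary choice here)."""
    digest = hashlib.md5(gram.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def fingerprint(text: str, k: int = 5, p: int = 1) -> set[int]:
    """Manber-style selection: keep only k-gram hashes h with h % p == 0.

    With p = 1 every hash is kept; a larger p keeps roughly 1/p of them.
    """
    return {h for h in map(stable_hash, kgrams(text, k)) if h % p == 0}


def jaccard(a: set[int], b: set[int]) -> float:
    """Jaccard coefficient |A ∩ B| / |A ∪ B|; defined as 0.0 for two empty sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity(doc_a: str, doc_b: str, k: int = 5, p: int = 1) -> float:
    """Similarity of two documents as the Jaccard coefficient of their fingerprints."""
    return jaccard(fingerprint(doc_a, k, p), fingerprint(doc_b, k, p))
```

Calling `similarity(suspect_text, source_text)` returns a value between 0.0 and 1.0. In this sketch `k` and `p` still have to be supplied or left at their defaults; how the system described above avoids asking the user for the gram value and window is not stated in the abstract.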
Pages: 3128-3131
Number of pages: 4