Plagiarism Detection System for Indonesia Text Based Document by Fingerprint Method and Natural Language Processing Approach

Cited by: 0
Authors
Winarti, Titin [1]
Kerami, Djati [2]
Etp, Lussiana [3]
Sekarwati, Kemal Ade [4]
Affiliations
[1] Semarang Univ, Fac Informat Technol & Commun, Semarang 50196, Indonesia
[2] Indonesia Univ, Fac Math & Nat Sci, Depok 16424, Indonesia
[3] Sch Informat Management & Comp Jakarta, Comp Syst, Jakarta 12140, Indonesia
[4] Gunadarma Univ, Fac Comp Sci & Informat Technol, Jakarta 16424, Indonesia
Keywords
Plagiarism; Fingerprint; Natural Language Processing
DOI
10.1166/asl.2016.7993
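Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09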
Abstract
Plagiarism is common in many communities, academia among them. It is a major concern in the academic environment in particular, where it can affect both the credibility of an institution and its ability to ensure the quality of its students. In other words, plagiarism may reduce creativity in the community. This research combines the fingerprint method with a natural language processing (NLP) approach. Plagiarism detection can be carried out by various methods, such as computing similarity with Manber's algorithm and the Jaccard coefficient, with the k-gram method as an alternative for detecting document similarity. The system is expected to let a user run the application without having to choose the gram value and its window, while still producing an accurate similarity value. Although NLP techniques have been shown to improve the accuracy of detection tasks, other challenges remain. Current plagiarism detection tools are mostly limited to string-level comparisons between suspected plagiarised texts and potential original texts. With stemming applied, the document similarity measurement increased by 31% on the documents tested.
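The fingerprint comparison named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the normalization step, the CRC32 hash, the "0 mod p" fingerprint selection (in the style of Manber's sampling), the default values of `k` and `p`, and all function names are assumptions made for illustration. The stemming step is omitted; in a fuller pipeline it would presumably run before the k-grams are generated.

```python
import re
import zlib


def kgrams(text, k):
    """Return the character k-grams of the normalized text, in order."""
    # Assumed normalization: lowercase, keep only ASCII letters and digits.
    cleaned = re.sub(r"[^a-z0-9]", "", text.lower())
    return [cleaned[i:i + k] for i in range(len(cleaned) - k + 1)]


def fingerprint(text, k=5, p=4):
    """Hash every k-gram and keep the hashes that are 0 mod p.

    CRC32 is used only because it is deterministic across runs;
    the hash function is an assumption, not taken from the paper.
    """
    hashes = (zlib.crc32(g.encode("utf-8")) for g in kgrams(text, k))
    return {h for h in hashes if h % p == 0}


def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(doc_a, doc_b, k=5, p=4):
    """Jaccard similarity of the two documents' fingerprint sets."""
    return jaccard(fingerprint(doc_a, k, p), fingerprint(doc_b, k, p))
```

Calling `similarity(doc_a, doc_b)` returns a value between 0.0 and 1.0. Setting `p=1` keeps every k-gram hash, which trades a larger fingerprint for a comparison over all k-grams.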
Pages: 3128-3131
Page count: 4