Maintaining Academic Integrity in Programming: Locality-Sensitive Hashing and Recommendations

Cited by: 5
Authors
Karnalim, Oscar [1]
Affiliations
[1] Maranatha Christian Univ, Fac Informat Technol, Bandung 40164, Indonesia
Source
EDUCATION SCIENCES | 2023 / Vol. 13 / Issue 01
Keywords
programming; plagiarism; collusion; similarity detection; recommendations; higher education; CODE PLAGIARISM DETECTION;
DOI
10.3390/educsci13010054
Chinese Library Classification (CLC) number
G40 [Education];
Discipline classification code
040101; 120403;
Abstract
Not many efficient similarity detectors are employed in practice to maintain academic integrity. Perhaps it is because they lack intuitive reports for investigation, they only have a command line interface, and/or they are not publicly accessible. This paper presents SSTRANGE, an efficient similarity detector with locality-sensitive hashing (MinHash and Super-Bit). The tool features intuitive reports for investigation and a graphical user interface. Further, it is accessible on GitHub. SSTRANGE was evaluated on the SOCO dataset under two performance metrics: f-score and processing time. The evaluation shows that both MinHash and Super-Bit are more efficient than their predecessors (Cosine and Jaccard with 60% less processing time) and a common similarity measurement (running Karp-Rabin greedy string tiling with 99% less processing time). Further, the effectiveness trade-off is still reasonable (no more than 24%). Higher effectiveness can be obtained by tuning the number of clusters and stages. To encourage the use of automated similarity detectors, we provide ten recommendations for instructors interested in employing such detectors for the first time. These include consideration of assessment design, irregular patterns of similarity, multiple similarity measurements, and effectiveness-efficiency trade-off. The recommendations are based on our 2.5-year experience employing similarity detectors (SSTRANGE's predecessors) in 13 course offerings with various assessment designs.
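The abstract names MinHash as one of the two locality-sensitive hashing schemes. As a rough illustration of the idea only, and not SSTRANGE's actual implementation, the following Python sketch estimates Jaccard similarity between two token sequences from fixed-length MinHash signatures. The function names, the shingle length `k`, the number of hash functions, and the use of seeded BLAKE2b hashes are all assumptions made for this example.

```python
import hashlib


def shingles(tokens, k=3):
    """Return the set of overlapping k-token shingles of a token list."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def _seeded_hash(item, seed):
    """Deterministic 64-bit hash of an item under a given seed (illustrative choice)."""
    data = f"{seed}:{item!r}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=128):
    """For each seeded hash function, keep the minimum hash value over the set."""
    if not items:
        raise ValueError("cannot build a MinHash signature of an empty set")
    return [min(_seeded_hash(x, seed) for x in items) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree; approximates Jaccard similarity."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def jaccard(a, b):
    """Exact Jaccard similarity, for comparison with the estimate."""
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Two hypothetical token streams that differ only in identifier names.
    left = "int sum = 0 ; for ( int i = 0 ; i < n ; i ++ ) sum += i ;".split()
    right = "int total = 0 ; for ( int j = 0 ; j < n ; j ++ ) total += j ;".split()
    a, b = shingles(left), shingles(right)
    print("exact    :", jaccard(a, b))
    print("estimated:", estimated_jaccard(minhash_signature(a), minhash_signature(b)))
```

Comparing two signatures costs time proportional to the number of hash functions rather than to the size of the submissions, which is the general source of the efficiency gain; the estimate becomes noisier as the number of hash functions shrinks, which mirrors the effectiveness-efficiency trade-off the abstract mentions.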
Pages: 23
Related papers
50 in total
  • [21] Frequent-Itemset Mining Using Locality-Sensitive Hashing
    Bera, Debajyoti
    Pratap, Rameshwar
    COMPUTING AND COMBINATORICS, COCOON 2016, 2016, 9797 : 143 - 155
  • [22] Fast Access for Star Catalog Based on Locality-Sensitive Hashing
    Zhu H.
    Liang B.
    Zhang T.
    Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University, 2018, 36 (05) : 988 - 994
  • [23] Locality-Sensitive Hashing for Finding Nearest Neighbors in Probability Distributions
    Tang, Yi-Kun
    Mao, Xian-Ling
    Hao, Yi-Jing
    Xu, Cheng
    Huang, Heyan
    SOCIAL MEDIA PROCESSING, SMP 2017, 2017, 774 : 3 - 15
  • [24] On the Problem of $p_1^{-1}$ in Locality-Sensitive Hashing
    Ahle, Thomas Dybdahl
    SIMILARITY SEARCH AND APPLICATIONS, SISAP 2020, 2020, 12440 : 85 - 93
  • [25] An improved method of locality-sensitive hashing for scalable instance matching
    Mehmet Aydar
    Serkan Ayvaz
    Knowledge and Information Systems, 2019, 58 : 275 - 294
  • [26] Fast hierarchical clustering algorithm using locality-sensitive hashing
    Koga, H
    Ishibashi, T
    Watanabe, T
    DISCOVERY SCIENCE, PROCEEDINGS, 2004, 3245 : 114 - 128
  • [27] A Scalable ECG Identification System Based on Locality-Sensitive Hashing
    Chu, Hui-Yu
    Lin, Tzu-Yun
    Lee, Song-Hong
    Chiu, Jui-Kun
    Nien, Cing-Ping
    Wu, Shun-Chi
    2023 45TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE & BIOLOGY SOCIETY, EMBC, 2023,
  • [28] Similar Pair Identification using Locality-Sensitive Hashing Technique
    Lee, Kyung Mi
    Lee, Keon Myung
    6TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS, AND THE 13TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS, 2012, : 2117 - 2119
  • [29] Can LSH (locality-sensitive hashing) be replaced by neural network?
    Liu, Renyang
    Zhao, Jun
    Chu, Xing
    Liang, Yu
    Zhou, Wei
    He, Jing
    SOFT COMPUTING, 2024, 28 (02) : 887 - 902
  • [30] Fast anomaly detection with locality-sensitive hashing and hyperparameter autotuning
    Meira, Jorge
    Eiras-Franco, Carlos
    Bolon-Canedo, Veronica
    Marreiros, Goreti
    Alonso-Betanzos, Amparo
    INFORMATION SCIENCES, 2022, 607 : 1245 - 1264