Synthetic Error Dataset Generation Mimicking Bengali Writing Pattern

Cited by: 0
Authors
Sifat, Md Habibur Rahman [1]
Rahman, Chowdhury Rafeed [1]
Rafsan, Mohammad [1]
Rahman, Hasibur [1]
Affiliations
[1] United Int Univ, Dhaka, Bangladesh
Keywords
Bengali error dataset; Phonetically similar; Consonant cluster; Spell checker
DOI
Not available
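Chinese Library Classification (CLC) codes
TM [Electrical engineering]; TN [Electronics and communication technology]
Subject classification codes
0808; 0809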
Abstract
When writing Bengali on an English keyboard, users often make spelling mistakes. The accuracy of any Bengali spell checker or paragraph correction module depends largely on the error dataset it is built on. Generating such an error dataset manually is a cumbersome process. In this research, we present an algorithm that automatically generates misspelled Bengali words from correct words by analyzing Bengali writing patterns on a QWERTY-layout English keyboard. As part of our analysis, we have compiled a list of the most commonly used Bengali words, phonetically similar replaceable clusters, frequently mispressed replaceable clusters, frequently mispressed insertion-prone clusters, and a set of rules for handling Juktakkhar (consonant letter clusters) when generating errors.
Pages: 1363-1366
Page count: 4
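The abstract names three error sources: phonetically similar replacements, mispressed-key replacements, and mispressed-key insertions. The following is a minimal sketch of how such variants might be generated, not the authors' algorithm. It assumes the word is given as the Latin key sequence typed on a phonetic QWERTY layout. The tables `PHONETIC_SWAPS` and `ADJACENT_KEYS`, and every function name, are hypothetical placeholders, since this record does not list the paper's actual clusters. Juktakkhar handling is omitted because the rules are not stated here.

```python
import random

# Hypothetical cluster tables; the paper's real lists are not in this record.
# Clusters that sound alike when typing Bengali phonetically.
PHONETIC_SWAPS = {"sh": ["s"], "ee": ["i"], "i": ["ee"], "o": ["u"], "u": ["o"]}
# A small illustrative subset of neighbouring keys on a QWERTY keyboard.
ADJACENT_KEYS = {"a": "sq", "s": "ad", "o": "ip", "i": "uo", "n": "bm", "r": "et"}


def phonetic_replacements(word):
    """Yield variants with one cluster swapped for a similar-sounding one."""
    for i in range(len(word)):
        for cluster, alternatives in PHONETIC_SWAPS.items():
            if word.startswith(cluster, i):
                for alt in alternatives:
                    yield word[:i] + alt + word[i + len(cluster):]


def mispress_replacements(word):
    """Yield variants where one key is replaced by a neighbouring key."""
    for i, ch in enumerate(word):
        for neighbour in ADJACENT_KEYS.get(ch, ""):
            yield word[:i] + neighbour + word[i + 1:]


def mispress_insertions(word):
    """Yield variants where a neighbouring key is struck right after a key."""
    for i, ch in enumerate(word):
        for neighbour in ADJACENT_KEYS.get(ch, ""):
            yield word[:i + 1] + neighbour + word[i + 1:]


def generate_errors(word, limit=None, seed=0):
    """Return a sorted list of distinct misspellings, optionally sampled."""
    variants = set(phonetic_replacements(word))
    variants.update(mispress_replacements(word))
    variants.update(mispress_insertions(word))
    variants.discard(word)  # never emit the correct spelling as an error
    ordered = sorted(variants)
    if limit is not None and limit < len(ordered):
        ordered = random.Random(seed).sample(ordered, limit)
    return ordered
```

Calling `generate_errors("asho")` on a hypothetical romanized word yields single-edit variants from all three sources; passing `limit` draws a reproducible sample for a fixed `seed`, which keeps a generated dataset repeatable.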