Development of an effective character segmentation and efficient feature extraction technique for Malayalam character recognition from palm leaf manuscripts

Cited by: 3

Authors:

Sudarsan, Dhanya [1]

Sankar, Deepa [1]

Affiliations:

[1] Cochin Univ Sci & Technol, Sch Engn Elect & Commun, Kochi, India

Source:

SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES | 2023 / Vol. 48 / Issue 03

Keywords:

Character segmentation; character recognition; base classifiers; KNN; Bayesian; decision tree; feature extraction; Malayalam Palm Leaf manuscripts; HANDWRITTEN BANGLA CHARACTER; NEURAL-NETWORK; CLASSIFICATION;

DOI:

10.1007/s12046-023-02181-5

Chinese Library Classification (CLC) number:

T [Industrial Technology];

Subject classification code:

08 ;

Abstract:

This paper develops a novel character segmentation and feature extraction technique for old Malayalam palm leaf manuscripts. The generic segmentation algorithm is fine-tuned to address the language-specific properties of Malayalam characters written in old palm leaf manuscripts. Since no major work has been reported on character recognition from old Malayalam palm leaf manuscripts, the paper gives a clear insight into how various feature extractors perform in recognizing Malayalam characters, which is necessary when analyzing the performance of deep learning neural networks on this task. To this end, it presents an in-depth analysis of existing feature extraction techniques combined with the base classifiers, and it aims to identify the best feature extractor and classifier pair for character recognition from old Malayalam palm leaf manuscript images. The color palm leaf manuscript is first preprocessed using a linear block-by-block transformation, Niblack's technique, and morphological operations for noise removal and binarization. The proposed feature extraction technique combines Log-Gabor with uniform rotation-invariant LBP. Log-Gabor encodes a natural image in the best possible way and efficiently handles the properties of handwritten characters (similarity, overlapping characters, uneven background color, and foreground-background contrast), while uniform rotation-invariant LBP compensates for Log-Gabor's lack of invariance in text analysis. This combination proved to be the best feature extractor for the purpose, with an accuracy of 95.57%. A stacked ResNet (convolutional neural network) architecture with a Long Short-Term Memory (LSTM) architecture is used to classify the different characters present in the manuscript.
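
To make the "uniform rotation-invariant LBP" part of the abstract concrete, the following is a minimal sketch of the standard riu2 local binary pattern descriptor. It is not the authors' implementation: the function names, the use of the 8 immediate grid neighbours instead of interpolated circular sampling, and the plain list-of-lists image format are all assumptions made for illustration.

```python
# Offsets of the 8 immediate neighbours, listed in circular order.
# Assumption: a square 3x3 neighbourhood stands in for circular sampling.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_riu2_code(img, r, c):
    """Return the riu2 LBP code (0..9) of the interior pixel at (r, c)."""
    center = img[r][c]
    # Threshold each neighbour against the centre pixel.
    bits = [1 if img[r + dr][c + dc] >= center else 0
            for dr, dc in NEIGHBOURS]
    # Count 0/1 transitions around the circle (including the wrap-around).
    transitions = sum(bits[i] != bits[(i + 1) % 8] for i in range(8))
    # Uniform patterns (at most 2 transitions) are labelled by their number
    # of set bits; all non-uniform patterns share the single label 9.
    return sum(bits) if transitions <= 2 else 9


def lbp_riu2_histogram(img):
    """Return the normalised 10-bin riu2 histogram over interior pixels."""
    hist = [0] * 10
    rows, cols = len(img), len(img[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            hist[lbp_riu2_code(img, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * 10


if __name__ == "__main__":
    edge = [[0, 0, 9],
            [0, 5, 9],
            [0, 0, 9]]
    rotated = [[9, 9, 9],
               [0, 5, 0],
               [0, 0, 0]]
    # The same edge at two orientations receives the same code.
    print(lbp_riu2_code(edge, 1, 1), lbp_riu2_code(rotated, 1, 1))
```

Because the code depends only on how many neighbours exceed the centre and on how many transitions occur around the circle, rotating a stroke by a multiple of 45 degrees leaves its code unchanged. In a pipeline like the one described, a histogram of this kind would presumably be concatenated with Log-Gabor filter responses before classification, though the abstract does not specify how the two are combined.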

Pages: 21
