Word and sentence extraction using irregular pyramid

Authors
Loo, PK [1]
Tan, CL
Affiliations
[1] Singapore Polytech, Sch Built Environm & Design, Singapore 139651, Singapore
[2] Natl Univ Singapore, Sch Comp, Singapore 117543, Singapore
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This paper presents our continued work on enhancing our previously proposed algorithm. Moving beyond the extraction of word groups, and based on the same irregular pyramid structure, the new algorithm groups the extracted words into sentences. Its distinguishing feature is the ability to process text that varies widely in size, font, orientation and layout within the same document image. No assumption is made about the document type. The algorithm is based on the irregular pyramid structure and applies four fundamental concepts. The first is the inclusion of background information. The second is closeness: text within a group is closer together, in terms of spatial distance, than it is to other text areas. The third is the "majority win" strategy, which is better suited to a greatly varying environment than a constant threshold value. The fourth is uniformity and continuity among words belonging to the same sentence.
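The abstract does not give implementation details, so the following is only a minimal Python sketch of two of the named concepts, not the authors' irregular pyramid algorithm. It assumes text components are reduced to center points. The function names `nearest_neighbor_groups` and `majority_win` are hypothetical. The first groups components by linking each one to its spatially nearest neighbor, so no fixed distance threshold is needed. The second picks the value held by most members of a group.

```python
from collections import Counter
from math import hypot


def nearest_neighbor_groups(centers):
    """Link each point to its nearest neighbor; return one group label per point.

    Illustrative assumption: closeness is plain Euclidean distance between
    component centers, and groups are the connected components of the links.
    """
    n = len(centers)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (xi, yi) in enumerate(centers):
        others = [j for j in range(n) if j != i]
        if not others:
            continue
        nearest = min(
            others,
            key=lambda k: hypot(centers[k][0] - xi, centers[k][1] - yi),
        )
        # Merge the group of point i with the group of its nearest neighbor.
        parent[find(i)] = find(nearest)

    return [find(i) for i in range(n)]


def majority_win(votes):
    """Return the value that occurs most often among the votes."""
    return Counter(votes).most_common(1)[0][0]
```

For example, three centers spaced one unit apart and two centers far away from them would fall into two groups, regardless of the absolute scale of the image. A relative notion of closeness like this is one way to avoid a constant threshold when text size varies across a page.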
Pages: 307-318
Number of pages: 12