Unsupervised extraction of phonetic units in sign language videos for natural language processing

Cited by: 1
|
Authors
Martinez-Guevara, Niels [1 ]
Rojano-Caceres, Jose-Rafael [1 ]
Curiel, Arturo [2 ]
Institutions
[1] Univ Veracruzana, Fac Estadist & Informat, Xalapa, Veracruz, Mexico
[2] Univ Veracruzana CONACyT, Xalapa, Veracruz, Mexico
Keywords
Sign language; Machine learning; Natural language processing; Image thresholding; FRAMEWORK;
DOI
10.1007/s10209-022-00888-6
CLC classification
TP3 [Computing technology; computer technology];
Discipline code
0812 ;
Abstract
Sign languages (SL) are the natural languages used by Deaf communities to communicate with each other. Signers use visible parts of their bodies, such as their hands, to convey messages without sound. Because of this modality change, SLs have to be represented differently in natural language processing (NLP) tasks: inputs are usually presented as video data rather than text or sound, which makes even simple tasks computationally intensive. Moreover, the applicability of NLP techniques to SL processing is limited by their linguistic characteristics. For instance, current research in SL recognition has centered on lexical sign identification. However, SLs tend to exhibit smaller vocabularies than vocal languages, as signers encode part of their message through highly iconic signs that are not lexicalized. Thus, much potentially relevant information is lost to most NLP algorithms. Furthermore, most documented SL corpora contain fewer than a hundred hours of video, far from enough to train most non-symbolic NLP approaches. This article proposes a method for the unsupervised identification of phonetic units in SL videos, based on image thresholding and the Liddell and Johnson Movement-Hold model [13]. The procedure strives to identify the smallest possible linguistic units that may carry relevant information, in an effort to preserve sub-lexical data that would otherwise be missed by most NLP algorithms. Furthermore, the process enables the elimination of noisy or redundant video frames from the input, decreasing overall computation costs. The algorithm was tested on a collection of Mexican Sign Language videos. The relevance of the extracted segments was assessed by human judges. Further comparisons were carried out against French Sign Language (LSF) resources, so as to explore how well the algorithm performs across different SLs.
The results show that the frames selected by the algorithm contained enough information to remain comprehensible to human signers. In some cases, as much as 80% of the available frames could be discarded without loss of comprehensibility, which may have direct repercussions on how SLs are represented, transmitted and processed electronically in the future.
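The frame-selection idea the abstract describes, keeping frames where little inter-frame change occurs (holds, in Liddell and Johnson's terms) and discarding the redundant ones, can be illustrated with a minimal sketch. This is a hypothetical illustration using plain mean-absolute-difference thresholding, not the authors' actual procedure; the function names and the threshold value are assumptions:

```python
import numpy as np

def label_transitions(frames, threshold=5.0):
    """Label each consecutive frame pair as a 'hold' (low inter-frame
    change) or a 'movement' (high change) by thresholding the mean
    absolute pixel difference between the two frames."""
    labels = []
    for prev, curr in zip(frames, frames[1:]):
        diff = np.mean(np.abs(curr.astype(np.int16) - prev.astype(np.int16)))
        labels.append("hold" if diff < threshold else "movement")
    return labels

def keep_informative_frames(frames, labels):
    """Keep the first frame plus every frame that ends a 'hold'
    transition; frames inside movements are treated as redundant."""
    kept = [frames[0]]
    kept.extend(f for f, lab in zip(frames[1:], labels) if lab == "hold")
    return kept

# Synthetic example: three near-identical frames (a hold), then an
# abruptly shifted frame and a return to the original pose.
rng = np.random.default_rng(0)
base = rng.integers(0, 255, size=(8, 8), dtype=np.uint8)
shifted = ((base.astype(np.int16) + 100) % 256).astype(np.uint8)
frames = [base, base.copy(), base.copy(), shifted, base]

labels = label_transitions(frames)   # ['hold', 'hold', 'movement', 'movement']
kept = keep_informative_frames(frames, labels)
print(len(kept), "of", len(frames), "frames kept")
```

In a real pipeline the per-pixel difference would be computed on segmented hand and body regions rather than raw frames, but the principle is the same: transitions below the threshold mark stable postures worth keeping, and the rest can be dropped.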
Pages: 1143-1151
Page count: 9
Related papers
50 records in total
  • [21] Feature Extraction and Analysis of Natural Language Processing for Deep Learning English Language
    Wang, Dongyang
    Su, Junli
    Yu, Hongbin
    IEEE ACCESS, 2020, 8: 46335-46345
  • [22] Evaluation of language feedback methods for student videos of American Sign Language
    Huenerfauth, M.
    Gale, E.
    Penly, B.
    Pillutla, S.
    Willard, M.
    Hariharan, D.
    ACM TRANSACTIONS ON ACCESSIBLE COMPUTING, 2017, 10 (1)
  • [23] Detecting Reduplication in Videos of American Sign Language
    Gavrilov, Zoya
    Sclaroff, Stan
    Neidle, Carol
    Dickinson, Sven
    LREC 2012 - EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2012: 3767-3773
  • [24] Body Posture Estimation in Sign Language Videos
    Lefebvre-Albaret, Francois
    Dalle, Patrice
    GESTURE IN EMBODIED COMMUNICATION AND HUMAN-COMPUTER INTERACTION, 2010, 5934: 289-300
  • [25] Biomolecular Event Extraction using Natural Language Processing
    Bali, Manish
    Anandaraj, S. P.
    INTERNATIONAL JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING SYSTEMS, 2023, 14 (5): 601-612
  • [26] Customizable Natural Language Processing Biomarker Extraction Tool
    Holmes, Benjamin
    Chitale, Dhananjay
    Loving, Joshua
    Tran, Mary
    Subramanian, Vinod
    Berry, Anna
    Rioth, Matthew
    Warrier, Raghu
    Brown, Thomas
    JCO CLINICAL CANCER INFORMATICS, 2021, 5: 833-841
  • [27] Natural Language Processing and Automatic Knowledge Extraction for Lexicography
    Krek, Simon
    INTERNATIONAL JOURNAL OF LEXICOGRAPHY, 2019, 32 (2): 115-118
  • [28] Data Extraction by Using Natural Language Processing Tool
    More, Sujata D.
    Madankar, Mangala S.
    Chandak, M. B.
    HELIX, 2018, 8 (5): 3846-3848
  • [29] The Acquisition of Sign Language: The Impact of Phonetic Complexity on Phonology
    Mann, Wolfgang
    Marshall, Chloe
    Mason, Kathryn
    Morgan, Gary
    LANGUAGE LEARNING AND DEVELOPMENT, 2010, 6 (1): 60-86
  • [30] SLPAnnotator: Tools for implementing Sign Language Phonetic Annotation
    Hall, Kathleen Currie
    Mackie, Scott
    Fry, Michael
    Tkachman, Oksana
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017: 2083-2087