Statistical word segmentation succeeds given the minimal amount of exposure

Authors
Hao Wang, Felix [1]
Luo, Meili [1]
Wang, Suiping [2]
Affiliations
[1] Nanjing Normal Univ, Sch Psychol, Nanjing, Jiangsu, Peoples R China
[2] South China Normal Univ, Philosophy & Social Sci Lab Reading & Dev Children, Minist Educ, Guangzhou, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Statistical learning; Word segmentation; Exposure amount; Nonadjacent dependencies; Probability; Performance; Frequency; Adjacent; Infants; Impact
DOI
10.3758/s13423-023-02386-z
Chinese Library Classification number
B841 [Research methods in psychology]
Discipline classification code
040201
Abstract
One of the first tasks in language acquisition is word segmentation, the process of extracting word forms from continuous speech streams. Statistical approaches to word segmentation have been shown to be a powerful mechanism, in which word boundaries are inferred from sequence statistics. This approach requires the learner to represent the frequency of units from syllable sequences, though accounts differ on how much statistical exposure is required. In this study, we examined the computational limit at which words can be extracted from continuous sequences. First, we discussed why two occurrences of a word in a continuous sequence are the computational lower limit for that word to be statistically defined. Next, we created short syllable sequences that contained certain words either two or four times. Learners were presented with these syllable sequences one at a time, each immediately followed by a test of the novel words from that sequence. We found that, with the computationally minimal amount of two exposures, words were successfully segmented from continuous sequences. Moreover, longer syllable sequences providing four exposures to words generated more robust learning results. The implications of these results are discussed in terms of how learners segment and store word candidates from continuous sequences.
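The lower-limit argument in the abstract can be illustrated with a small frequency count. The sketch below is not the authors' model or stimuli: the syllables, the three-syllable word length, and the helper names `ngram_counts` and `repeated_ngrams` are all assumptions chosen for illustration. It only shows that a syllable run cannot stand out by frequency until it recurs at least once.

```python
from collections import Counter


def ngram_counts(syllables, n):
    """Count every run of n adjacent syllables in the sequence."""
    return Counter(
        tuple(syllables[i:i + n]) for i in range(len(syllables) - n + 1)
    )


def repeated_ngrams(syllables, n, min_count=2):
    """Return the n-syllable runs that occur at least min_count times."""
    counts = ngram_counts(syllables, n)
    return {gram: c for gram, c in counts.items() if c >= min_count}


# Hypothetical sequences: the made-up word "tu pi ro" occurs once, then twice,
# surrounded by filler syllables that never repeat.
once = ["ga", "tu", "pi", "ro", "be", "ku"]
twice = ["ga", "tu", "pi", "ro", "be", "ku", "tu", "pi", "ro", "da"]

print(repeated_ngrams(once, 3))   # {}
print(repeated_ngrams(twice, 3))  # {('tu', 'pi', 'ro'): 2}
```

With a single occurrence, every three-syllable run has a count of one, so no run is distinguishable from any other. With two occurrences in different contexts, only the word itself recurs, which is the sense in which two exposures are the minimum for a purely frequency-based learner.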
Pages: 1172-1180
Page count: 9