Statistical word segmentation succeeds given the minimal amount of exposure

Authors
Hao Wang, Felix [1]
Luo, Meili [1]
Wang, Suiping [2]
Affiliations
[1] Nanjing Normal Univ, Sch Psychol, Nanjing, Jiangsu, Peoples R China
[2] South China Normal Univ, Philosophy & Social Sci Lab Reading & Dev Children, Minist Educ, Guangzhou, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Statistical learning; Word segmentation; Exposure amount; NONADJACENT DEPENDENCIES; PROBABILITY; PERFORMANCE; FREQUENCY; ADJACENT; INFANTS; IMPACT;
DOI
10.3758/s13423-023-02386-z
Chinese Library Classification number
B841 [Research methods in psychology]
Discipline classification code
040201
Abstract
One of the first tasks in language acquisition is word segmentation, a process to extract word forms from continuous speech streams. Statistical approaches to word segmentation have been shown to be a powerful mechanism, in which word boundaries are inferred from sequence statistics. This approach requires the learner to represent the frequency of units from syllable sequences, though accounts differ on how much statistical exposure is required. In this study, we examined the computational limit with which words can be extracted from continuous sequences. First, we discussed why two occurrences of a word in a continuous sequence is the computational lower limit for this word to be statistically defined. Next, we created short syllable sequences that contained certain words either two or four times. Learners were presented with these syllable sequences one at a time, immediately followed by a test of the novel words from these sequences. We found that, with the computationally minimal amount of two exposures, words were successfully segmented from continuous sequences. Moreover, longer syllable sequences providing four exposures to words generated more robust learning results. The implications of these results are discussed in terms of how learners segment and store the word candidates from continuous sequences.
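The abstract's point that two occurrences form the computational lower limit can be illustrated with a toy count of recurring syllable sequences. This is a minimal sketch under stated assumptions, not the authors' model or stimuli: the function name `recurring_ngrams`, the fixed word length of three syllables, and the syllable streams are all hypothetical.

```python
from collections import Counter

def recurring_ngrams(syllables, n=3, min_count=2):
    """Return every n-syllable sequence that occurs at least min_count times.

    Assumption for illustration: candidate words have a fixed length n.
    """
    grams = Counter(
        tuple(syllables[i:i + n]) for i in range(len(syllables) - n + 1)
    )
    return {gram: count for gram, count in grams.items() if count >= min_count}

# Hypothetical stream in which "tu-pi-ro" occurs only once:
# no trigram recurs, so nothing sets the word apart statistically.
once = ["ka", "tu", "pi", "ro", "be", "mo", "da"]

# Hypothetical stream in which "tu-pi-ro" occurs twice:
# it is the only trigram that recurs.
twice = ["ka", "tu", "pi", "ro", "be", "mo", "tu", "pi", "ro", "da"]

print(recurring_ngrams(once))
print(recurring_ngrams(twice))
```

With a single occurrence, every trigram in the stream is equally frequent, so counts alone give no reason to prefer one candidate over another; a second occurrence is the first point at which the word's syllable sequence stands out from its neighbours.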
Pages: 1172-1180
Page count: 9
Related papers
50 in total
  • [31] Isarn Dharma Word Segmentation Using a Statistical Approach with Named Entity Recognition
    Somsap, Sittichai
    Seresangtakul, Pusadee
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2020, 19 (02)
  • [32] Chinese to Braille translation based on Braille word segmentation using statistical model
    Wang X.
    Yang Y.
    Zhang J.
    Jiang W.
    Liu H.
    Qian Y.
    Wang, Xiangdong (xdwang@ict.ac.cn), 1600, Shanghai Jiaotong University (22): : 82 - 86
  • [33] MINIMAL AMOUNT OF X-RAY EXPOSURE CAUSING LENS OPACITIES IN THE HUMAN EYE
    COGAN, DG
    DREISLER, KK
    AMA ARCHIVES OF OPHTHALMOLOGY, 1953, 50 (01): : 30 - 34
  • [34] Application of a nonparametric procedure for testing the hypothesis about the independence of random variables given a large amount of statistical data
    Lapko, A. V.
    Lapko, V. A.
    Bakhtina, A. V.
    MEASUREMENT TECHNIQUES, 2024, 66 (10) : 744 - 754
  • [35] Beyond Transitional Probabilities: Human Learners Impose a Parsimony Bias in Statistical Word Segmentation
    Frank, Michael C.
    Arnon, Inbal
    Tily, Harry
    Goldwater, Sharon
    COGNITION IN FLUX, 2010, : 760 - 765
  • [36] Can infants map meaning to newly segmented words? Statistical segmentation and word learning
    Estes, Katharine Graf
    Evans, Julia L.
    Alibali, Martha W.
    Saffran, Jenny R.
    PSYCHOLOGICAL SCIENCE, 2007, 18 (03) : 254 - 260
  • [37] Learning across languages: bilingual experience supports dual language statistical word segmentation
    Antovich, Dylan M.
    Estes, Katharine Graf
    DEVELOPMENTAL SCIENCE, 2018, 21 (02)
  • [38] One language or two? Navigating cross-language conflict in statistical word segmentation
    Antovich, Dylan M.
    Graf Estes, Katharine
    DEVELOPMENTAL SCIENCE, 2020, 23 (06)
  • [39] Modeling the Statistical Behavior of Lexical Chains to Capture Word Cohesiveness for Automatic Story Segmentation
    Chan, Shing-kai
    Xie, Lei
    Meng, Helen Mei-ling
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 2408 - 2411
  • [40] An Improved Statistical Machine Translation Method for United Chinese-Japanese Word Segmentation
    Wang, Xiaowei
    Wang, Jinke
    PROCEEDINGS OF THE 2016 4TH INTERNATIONAL CONFERENCE ON ELECTRICAL & ELECTRONICS ENGINEERING AND COMPUTER SCIENCE (ICEEECS 2016), 2016, 50 : 1 - 4