Selection of breast features for young women in northwestern China based on the random forest algorithm

Cited by: 12
Authors
Zhou, Jie [1 ]
Mao, Qian [1 ]
Zhang, Jun [2 ]
Lau, Newman M. L. [2 ]
Chen, Jianming [3 ]
Affiliations
[1] Xian Polytech Univ, Sch Apparel & Art Design, 19 Jinhua South Rd, Xian 710048, Shaanxi, Peoples R China
[2] Hong Kong Polytech Univ, Sch Design, Hong Kong, Peoples R China
[3] Chinese Univ Hong Kong, Dept Biomed Engn, Hong Kong, Peoples R China
Keywords
Breast shape classification; random forest algorithm; feature selection; breast shape recognition; K-MEANS; BRA; SHAPE; DIMENSIONS; SUPPORT
DOI
10.1177/00405175211040869
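Chinese Library Classification (CLC) number
TB3 [Engineering Materials Science]; TS1 [Textile Industry, Dyeing and Finishing Industry]
Discipline classification code
0805; 080502; 0821
Abstract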
In breast morphology research, numerous breast features are measured, but only a few parameters are adopted for classification. How to extract the key variables from the multi-dimensional features in a rational way is therefore a central issue. This study aimed to reduce the complexity of dimensionality reduction and to improve the objectivity and interpretability of the selected breast features. Because the random forest (RF) algorithm can quantify feature importance during training, it was adopted to determine the optimal breast features for classification and recognition. First, the anthropometric data of 360 females from northwestern China aged 19 to 27 years were measured by non-contact three-dimensional body scanning and by contact manual measurement. Then, k-means clustering was applied to categorize breast shapes, and the RF algorithm was used to quantify and rank the importance of 25 breast features. Finally, to verify the effectiveness of the RF algorithm for breast feature selection, the t-distributed stochastic neighbor embedding method was adopted to visualize the distribution of breast shape clusters in two dimensions, and four neural networks were used to recognize the breast morphology. The results demonstrate that using fewer breast features can increase the accuracy of breast shape classification and recognition. The best performance is obtained when the number of breast features is 13; in this case, the average Hamming loss of the four neural networks is the smallest (0.1136). The bust circumference and the horizontal curve of the breasts across the bust points are found to be the most important of the 25 breast features. The importance of the breast curve features is higher than that of the breast cross-sectional features, while the breast positioning features have the lowest importance. The RF algorithm is also verified to be more effective than traditional dimensionality reduction methods, such as principal component analysis, hierarchical clustering, and recursive feature elimination. The approach developed in this paper can be generalized to the dimensionality reduction of other body morphology.
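The abstract compares configurations by the average Hamming loss of four neural networks. The record does not say how the authors computed it, so the following is only a minimal sketch under one assumption: each sample carries exactly one breast-shape label, in which case the Hamming loss is the fraction of misclassified samples. The function name, the labels, and the four classifier names are all made up for illustration and are not data from the paper.

```python
from statistics import mean


def hamming_loss(y_true, y_pred):
    """Fraction of samples whose predicted label differs from the true label.

    Assumes single-label classification (one cluster label per sample).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    mismatches = sum(t != p for t, p in zip(y_true, y_pred))
    return mismatches / len(y_true)


# Made-up cluster labels for eight samples (illustrative only).
y_true = [0, 1, 2, 1, 0, 2, 1, 0]

# Made-up predictions from four hypothetical classifiers.
predictions = {
    "net_a": [0, 1, 2, 1, 0, 2, 1, 0],  # no errors
    "net_b": [0, 1, 2, 1, 0, 2, 1, 1],  # one error
    "net_c": [0, 1, 1, 1, 0, 2, 0, 0],  # two errors
    "net_d": [0, 2, 2, 1, 0, 2, 1, 0],  # one error
}

# Per-classifier loss, then the average used to compare feature subsets.
losses = {name: hamming_loss(y_true, y_pred) for name, y_pred in predictions.items()}
average_loss = mean(losses.values())
```

Repeating this calculation for each candidate number of features and keeping the subset with the lowest average loss would mirror the kind of comparison the abstract describes, where 13 features gave the smallest value.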
Pages: 957-973
Page count: 17
Related papers
50 in total
  • [1] Selection of hand features based on Random Forest algorithm and hand shape recognition
    Li, Xin
    Ding, Xiao-jun
    Peng, Zhou-yan
    Lin, Xi-yan
    Zou, Feng-yuan
    INDUSTRIA TEXTILA, 2024, 75 (03): : 319 - 326
  • [2] Feature selection algorithm based on random forest
    Yao, Deng-Ju
    Yang, Jing
    Zhan, Xiao-Juan
    Jilin Daxue Xuebao (Gongxueban)/Journal of Jilin University (Engineering and Technology Edition), 2014, 44 (01): : 137 - 141
  • [3] A Breast Cancer Diagnosis Method Based on VIM Feature Selection and Hierarchical Clustering Random Forest Algorithm
    Huang, Zexian
    Chen, Daqi
    IEEE ACCESS, 2022, 10 : 3284 - 3293
  • [4] Human Activity Recognition Based on Evolution of Features Selection and Random Forest
    Dewi, Christine
    Chen, Rung-Ching
    2019 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC), 2019, : 2496 - 2501
  • [5] Predictive modeling of Pan Evaporation using Random Forest Algorithm along with Features Selection
    Rakhee
    Singh, Archana
    Mittal, Mamta
    Kumar, Amrender
    PROCEEDINGS OF THE CONFLUENCE 2020: 10TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, DATA SCIENCE & ENGINEERING, 2020, : 380 - 384
  • [6] Cotton Classification Method at the County Scale Based on Multi-Features and Random Forest Feature Selection Algorithm and Classifier
    Fei, Hao
    Fan, Zehua
    Wang, Chengkun
    Zhang, Nannan
    Wang, Tao
    Chen, Rengu
    Bai, Tiecheng
    REMOTE SENSING, 2022, 14 (04)
  • [7] Features Selection in Character Recognition with Random Forest Classifier
    Homenda, Wladyslaw
    Lesinski, Wojciech
    COMPUTATIONAL COLLECTIVE INTELLIGENCE: TECHNOLOGIES AND APPLICATIONS, PT I, 2011, 6922 : 93 - +
  • [8] FEATURES OF BREAST CANCER IN YOUNG WOMEN
    Khachaturyan, I.
    ANNALS OF ONCOLOGY, 2010, 21 : 38 - 39
  • [9] Behavioral Analysis of Urban Travel Mode Selection Based on Random Forest Algorithm
    Zhang, Hai
    Liu, Na
    INTERNATIONAL JOURNAL OF MULTIPHYSICS, 2024, 18 (03) : 792 - 802
  • [10] Feature Selection Algorithm based on Random Forest applied to Sleep Apnea Detection
    Deviaene, Margot
    Testelmans, Dries
    Borzee, Pascal
    Buyse, Bertien
    Van Huffel, Sabine
    Varon, Carolina
    2019 41ST ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2019, : 2580 - 2583