Feature Extraction and Analysis for Lung Nodule Classification using Random Forest

Cited by: 7
Authors
El-Askary, Nada S. [1]
Salem, Mohammed A.-M. [1, 2]
Roushdy, Mohamed I. [1]
Affiliations
[1] Ain Shams Univ, Fac Comp & Informat Sci, Cairo, Egypt
[2] German Univ Cairo, Fac Media Engn & Technol, Cairo, Egypt
Keywords
Random Forest; Classification; Computed tomography; Machine Learning; Feature Extraction; Lung Nodule; Medical Images; Wavelet; IMAGE DATABASE CONSORTIUM; PULMONARY NODULES; SEGMENTATION
DOI
10.1145/3328833.3328872
Chinese Library Classification
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Early detection of lung nodules decreases the risk of progression to advanced stages of lung cancer. Random forest (RF), a machine learning classifier, is used to detect lung nodules by classifying soft tissue into nodules and non-nodules. A lung nodule classification approach is proposed to improve early nodule detection. A five-stage model was built and tested using 165 cases from the LIDC database. Stage 1 is image acquisition and preprocessing. Stage 2 extracts 119 features from the CT image. Stage 3 refines the feature vectors by removing all duplicate instances and undersampling the non-nodule class. Stage 4 tunes the RF parameters. Stage 5 examines different collections of the extracted feature sets to select those that score best for classification. The accuracy achieved by RF is the highest among the machine learning classifiers compared, which include KNN, SVM, and DT. The proposed method aims to analyze and select the features that maximize classification results. The pixel-based and wavelet-based feature sets yielded the highest accuracy. RF was tuned with 170 trees and an in-bag fraction of 0.007. The best results achieved by the proposed model are 90.67%, 90.8%, and 90.73% for sensitivity, specificity, and accuracy, respectively.
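Stage 3 and the reported metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `refine` and `scores`, the toy data layout (a feature tuple paired with a 0/1 label), and the 1:1 undersampling ratio are all assumptions made for illustration, since the abstract does not state the ratio used.

```python
import random


def refine(samples, seed=0):
    """Deduplicate instances, then undersample the non-nodule class.

    samples: list of (feature_tuple, label), label 1 = nodule, 0 = non-nodule.
    The 1:1 class ratio is an assumption; the abstract gives no ratio.
    """
    # dict.fromkeys drops exact duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(samples))
    nodules = [s for s in unique if s[1] == 1]
    non_nodules = [s for s in unique if s[1] == 0]
    # Randomly keep at most as many non-nodules as there are nodules.
    rng = random.Random(seed)
    kept = rng.sample(non_nodules, min(len(nodules), len(non_nodules)))
    return nodules + kept


def scores(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return sensitivity, specificity, accuracy
```

Balancing the classes before training is one common way to keep a majority non-nodule class from dominating the classifier, which is why sensitivity and specificity are reported alongside accuracy.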
Pages: 248-252
Number of pages: 5
Related papers
50 in total
  • [21] A time series forest for classification and feature extraction
    Deng, Houtao
    Runger, George
    Tuv, Eugene
    Vladimir, Martyanov
    INFORMATION SCIENCES, 2013, 239 : 142 - 153
  • [22] Features processing for random forest optimization in lung nodule localization
    El-Askary, Nada S.
    Salem, Mohammed A. -M.
    Roushdy, Mohamed I.
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 193
  • [23] A Clustering Approach for Feature Selection in Microarray Data Classification Using Random forest
    Aydadenta, Husna
    Adiwijaya
    JOURNAL OF INFORMATION PROCESSING SYSTEMS, 2018, 14 (05): : 1167 - 1175
  • [24] Performance Analysis of Lung Cancer Classification using Multiple Feature Extraction with SVM and KNN Classifiers
    Ashwini, S. S.
    Kurain, M. Z.
    Nagaraja, M.
    2021 IEEE INTERNATIONAL CONFERENCE ON MOBILE NETWORKS AND WIRELESS COMMUNICATIONS (ICMNWC), 2021,
  • [25] Deep Feature Learning for Pulmonary Nodule Classification in a Lung CT
    Kim, Bum-Chae
    Sung, Yu Sub
    Suk, Heung-Il
    2016 4TH INTERNATIONAL WINTER CONFERENCE ON BRAIN-COMPUTER INTERFACE (BCI), 2016,
  • [26] Classification of Lung Nodules with Feature Extraction using CT scan Images
    Jayalaxmi, M.
    Dhanaselvam, J.
    Swathi, R.
    Babu, M.
    2017 IEEE INTERNATIONAL CONFERENCE ON POWER, CONTROL, SIGNALS AND INSTRUMENTATION ENGINEERING (ICPCSI), 2017, : 2146 - 2151
  • [27] CORN STALK DISEASE CLASSIFICATION USING RANDOM FOREST COMBINATION OF EXTRACTION FEATURES
    Ansori, Nachnul
    Rachmad, Aeri
    BIN Fauzan, Hermawan
    Pancaasmara, Yuli
    COMMUNICATIONS IN MATHEMATICAL BIOLOGY AND NEUROSCIENCE, 2024,
  • [28] FEATURE EXTRACTION USING RANDOM WALKS
    Deng, Yue
    Dai, Qionghai
    Zhang, Zengke
    2009 IEEE YOUTH CONFERENCE ON INFORMATION, COMPUTING AND TELECOMMUNICATION, PROCEEDINGS, 2009, : 498 - 501
  • [29] Automatic hip geometric feature extraction in DXA imaging using regional random forest
    Hussain, Dildar
    Han, Seung-Moo
    Kim, Tae-Seong
    JOURNAL OF X-RAY SCIENCE AND TECHNOLOGY, 2019, 27 (02) : 207 - 236
  • [30] Automatic habitat classification using image analysis and random forest
    Torres, Mercedes
    Qiu, Guoping
    ECOLOGICAL INFORMATICS, 2014, 23 : 126 - 136