Volatile Organic Compounds for the Prediction of Lung Cancer by Using Ensembled Machine Learning Model and Feature Selection

Cited by: 1
Authors
Khanna, Divya [1]
Kumar, Arun [2]
Bhat, Shahid Ahmad [3]
Affiliations
[1] Chitkara Univ, Inst Engn & Technol, Rajpura 140401, Punjab, India
[2] Madhav Inst Sci & Technol, Ctr Artificial Intelligence, Gwalior 474005, Madhya Pradesh, India
[3] LUT Univ, LUT Business Sch, Lappeenranta 53851, Finland
Source
IEEE ACCESS | 2025 / Vol. 13
Keywords
Lung cancer; Cancer; Predictive models; Volatile organic compounds; Machine learning; Lungs; Feature extraction; Analytical models; Support vector machines; Biomarkers; VOCs; lung cancer; biomarkers; machine learning models; ensemble model; ensemble feature selection approach; B-CELL EPITOPES; ALLERGENIC PROTEINS; CLASSIFICATION; BIOMARKERS; LOCATION; DISEASE; SCENT;
DOI
10.1109/ACCESS.2025.3527027
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Advancing biomarkers is critically important at present, as lung cancer is a leading cause of death. In the present study, volatile organic compounds (VOCs) are considered as biomarkers to predict lung cancer. VOCs from seven different sources (breath, blood, urine, cell line, pleural fluid, cancer tissue, and lung tissue) are targeted to enhance prediction reliability. The study focuses on feature selection and model fusion. Five built-in machine learning models and one proposed ensemble model are used to investigate the different types of VOCs. The idea behind the ensemble model is to combine multiple individual models, using optimal feature sets, for better performance; this reasoning led to the design of an ensemble model to predict breath VOCs. The AvNNet model outperforms the four other models in predicting blood VOCs, cancer tissue VOCs, cell line VOCs, and urine VOCs, achieving accuracies of 70%, 80%, 70%, and 90%, respectively, on the validation dataset. The Blackboost model achieves 90% accuracy on the validation dataset in predicting lung tissue VOCs. The random forest model predicts pleural fluid VOCs with 90% accuracy on the validation dataset. Compared with the individual models, the proposed ensemble model predicts breath VOCs more effectively and achieves 100% accuracy on the validation dataset.
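The abstract does not say how the individual models are fused or how the feature sets are chosen, so the following is only an illustrative sketch, not the authors' method. It assumes majority voting as the fusion rule and mean-rank aggregation as the ensemble feature selection step; the function names, class labels, and feature names (`voc_a` and so on) are hypothetical placeholders.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample by majority vote."""
    combined = []
    # zip(*predictions) groups the votes of all models for each sample
    for sample_votes in zip(*predictions):
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined


def ensemble_feature_ranking(rankings, top_k):
    """Aggregate several feature rankings by mean rank position; keep the top_k best."""
    features = rankings[0]
    mean_rank = {
        f: sum(r.index(f) for r in rankings) / len(rankings) for f in features
    }
    # A lower mean rank means the selectors agree the feature is more useful
    return sorted(features, key=lambda f: mean_rank[f])[:top_k]


# Hypothetical predictions of three base models on four validation samples
model_predictions = [
    ["cancer", "control", "cancer", "control"],
    ["cancer", "cancer", "cancer", "control"],
    ["control", "control", "cancer", "control"],
]
ensemble_labels = majority_vote(model_predictions)

# Hypothetical rankings of the same features from three feature selectors
feature_rankings = [
    ["voc_a", "voc_b", "voc_c", "voc_d"],
    ["voc_b", "voc_a", "voc_d", "voc_c"],
    ["voc_a", "voc_c", "voc_b", "voc_d"],
]
selected_features = ensemble_feature_ranking(feature_rankings, top_k=2)

print(ensemble_labels)
print(selected_features)
```

With an odd number of base models and two classes, the vote cannot tie; other fusion rules (for example, averaging predicted probabilities) would be equally plausible readings of the abstract.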
Pages: 9809 - 9820
Page count: 12