Volatile Organic Compounds for the Prediction of Lung Cancer by Using Ensembled Machine Learning Model and Feature Selection

Cited by: 1
Authors
Khanna, Divya [1]
Kumar, Arun [2]
Bhat, Shahid Ahmad [3]
Affiliations
[1] Chitkara Univ, Inst Engn & Technol, Rajpura 140401, Punjab, India
[2] Madhav Inst Sci & Technol, Ctr Artificial Intelligence, Gwalior 474005, Madhya Pradesh, India
[3] LUT Univ, LUT Business Sch, Lappeenranta 53851, Finland
Source
IEEE ACCESS | 2025 / Vol. 13
Keywords
Lung cancer; Cancer; Predictive models; Volatile organic compounds; Machine learning; Lungs; Feature extraction; Analytical models; Support vector machines; Biomarkers; VOCs; lung cancer; biomarkers; machine learning models; ensemble model; ensemble feature selection approach; B-CELL EPITOPES; ALLERGENIC PROTEINS; CLASSIFICATION; BIOMARKERS; LOCATION; DISEASE; SCENT;
DOI
10.1109/ACCESS.2025.3527027
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Advancing biomarkers is critically important, as lung cancer is a leading cause of death. In the present study, volatile organic compounds (VOCs) are considered as biomarkers to predict lung cancer. VOCs from seven different sources (breath, blood, urine, cell line, pleural fluid, cancer tissue, and lung tissue) are targeted to enhance prediction reliability. The study focuses on feature selection and model fusion. Five built-in machine learning models and one proposed ensemble model are used to investigate the different types of VOCs. The idea behind the ensemble model is to combine multiple individual models, using optimal feature sets, for better performance; this reasoning led to the design of an ensemble model to predict breath VOCs. The AvNNet model outperforms four other models in predicting blood VOCs, cancer tissue VOCs, cell line VOCs, and urine VOCs, achieving accuracies of 70%, 80%, 70%, and 90%, respectively, on the validation dataset. The Blackboost model achieved 90% accuracy on the validation dataset in predicting lung tissue VOCs. The random forest model predicts pleural fluid VOCs with 90% accuracy on the validation dataset. Compared with the individual models, the proposed ensemble model predicts breath VOCs more effectively, achieving 100% accuracy on the validation dataset.
Pages: 9809-9820
Number of pages: 12
Related papers
50 in total
  • [21] A Survey of Feature Selection for Vulnerability Prediction Using Feature-based Machine Learning
    Li, ZhanJun
    Shao, Yan
    ICMLC 2019: 2019 11TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING, 2019, : 30 - 36
  • [22] Solar Flare Prediction Using Advanced Feature Extraction, Machine Learning, and Feature Selection
    Ahmed, Omar W.
    Qahwaji, Rami
    Colak, Tufan
    Higgins, Paul A.
    Gallagher, Peter T.
    Bloomfield, D. Shaun
    SOLAR PHYSICS, 2013, 283 (01) : 157 - 175
  • [23] A Gas Emission Prediction Model Based on Feature Selection and Improved Machine Learning
    Shao, Liangshan
    Zhang, Kun
    PROCESSES, 2023, 11 (03)
  • [24] Machine Learning Model for Heart Failure Prediction with Feature Selection and Data Expansion
    Shen, Ziyang
    2024 7TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND BIG DATA, ICAIBD 2024, 2024, : 6 - 11
  • [25] Machine Learning-Assisted Cervical Cancer Prediction Using Particle Swarm Optimization for Improved Feature Selection and Prediction
    Ileberi, Emmanuel
    Sun, Yanxia
    IEEE ACCESS, 2024, 12 : 152684 - 152695
  • [26] Machine Learning Model for Breast Cancer Data Analysis Using Triplet Feature Selection Algorithm
    Dhivya, P.
    Bazilabanu, A.
    Ponniah, Thirumalaikolundusubramanian
    IETE JOURNAL OF RESEARCH, 2023, 69 (04) : 1789 - 1799
  • [27] Antiprotozoal peptide prediction using machine learning with effective feature selection techniques
    Periwal, Neha
    Arora, Pooja
    Thakur, Ananya
    Agrawal, Lakshay
    Goyal, Yash
    Rathore, Anand S.
    Anand, Harsimrat Singh
    Kaur, Baljeet
    Sood, Vikas
    HELIYON, 2024, 10 (16)
  • [28] Sarcopenia risk prediction and feature selection by using quantum machine learning algorithms
    Ullah, Ubaid
    Maheshwari, Danyal
    Castillo Olea, Cristian
    Zapirain, Begonya Garcia
    QUANTUM MACHINE INTELLIGENCE, 2024, 6 (02)
  • [29] Exploring Volatile Organic Compounds in Breath for High-Accuracy Prediction of Lung Cancer
    Tsou, Ping-Hsien
    Lin, Zong-Lin
    Pan, Yu-Chiang
    Yang, Hui-Chen
    Chang, Chien-Jen
    Liang, Sheng-Kai
    Wen, Yueh-Feng
    Chang, Chia-Hao
    Chang, Lih-Yu
    Yu, Kai-Lun
    Liu, Chia-Jung
    Keng, Li-Ta
    Lee, Meng-Rui
    Ko, Jen-Chung
    Huang, Guan-Hua
    Li, Yaw-Kuen
    CANCERS, 2021, 13 (06) : 1 - 14
  • [30] Prediction and feature selection of low birth weight using machine learning algorithms
    Reza, Tasneem Binte
    Salma, Nahid
    JOURNAL OF HEALTH POPULATION AND NUTRITION, 2024, 43 (01)