A New Evolutionary Rough Fuzzy Integrated Machine Learning Technique for microRNA selection using Next-Generation Sequencing data of Breast Cancer

Cited by: 2
Authors
Sarkar, Jnanendra Prasad [1, 6]
Saha, Indrajit [2 ]
Rakshit, Somnath [3 ]
Pal, Monalisa [4 ]
Wlasnowolski, Michal [3, 5]
Sarkar, Anasua [6 ]
Maulik, Ujjwal [6 ]
Plewczynski, Dariusz [3, 5]
Affiliations
[1] Larsen & Toubro Infotech Ltd, Pune, Maharashtra, India
[2] Natl Inst Tech Teachers Training & Res, Dept Comp Sci & Engn, Kolkata, India
[3] Univ Warsaw, Ctr New Technol, Warsaw, Poland
[4] Indian Stat Inst, Machine Intelligence Unit, Kolkata, India
[5] Warsaw Univ Technol, Fac Math & Informat Sci, Warsaw, Poland
[6] Jadavpur Univ, Dept Comp Sci & Engn, Kolkata, India
Funding
EU Horizon 2020
Keywords
Breast Cancer; Clustering; Fuzzy Set; Feature Selection; Particle Swarm Optimization; Random Forest; Rough Set; GENES; MIRNAS;
DOI
10.1145/3319619.3326836
Chinese Library Classification
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
MicroRNAs (miRNAs) play an important role in various biological processes by regulating gene expression, and their abnormal expression may lead to cancer. Analysis of miRNA expression data may therefore yield biological insight for cancer diagnosis. Many feature selection methods have recently been developed to identify such miRNAs, each with its own merits and demerits, since the task is challenging. In this article, we propose a novel wrapper-based feature selection technique that integrates Rough and Fuzzy sets, Random Forest, and Particle Swarm Optimization to identify putative miRNAs that address the underlying biological problem effectively, i.e., separating tumour and control samples. Rough and Fuzzy sets help to address the vagueness and overlapping characteristics of the dataset during clustering. Random Forest is then applied to perform classification on the clustering results to yield better solutions. The integrated clustering and classification tasks form the underlying optimization problem for Particle Swarm Optimization, in which particles encode features, in this case miRNAs. The performance of the proposed wrapper-based method is demonstrated quantitatively and visually on next-generation sequencing data of breast cancer from The Cancer Genome Atlas (TCGA). Finally, the selected miRNAs are validated through biological significance tests. The code and dataset used in this paper are available online.
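The wrapper idea in the abstract, where each particle encodes a feature subset and is scored by how well that subset separates the two classes, can be illustrated with a minimal binary PSO sketch. This is not the authors' implementation: a simple nearest-centroid classifier is swapped in for the rough-fuzzy clustering and Random Forest stages, the data are synthetic rather than TCGA miRNA profiles, and every name and parameter below (`make_toy_data`, `fitness`, `binary_pso`, `penalty`, the PSO coefficients) is a hypothetical choice made for illustration.

```python
import math
import random


def make_toy_data(n_samples=60, n_features=12, informative=(0, 1, 2), seed=0):
    """Synthetic two-class data; only the `informative` columns differ by class."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 3.0 * label  # shift informative features for class 1
        X.append(row)
        y.append(label)
    return X, y


def fitness(X, y, mask, penalty=0.01):
    """Stand-in fitness (assumption): held-out nearest-centroid accuracy on the
    selected columns, minus a small penalty for the fraction of features used."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    half = len(X) // 2
    centroids = {}
    for label in (0, 1):
        rows = [X[i] for i in range(half) if y[i] == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for i in range(half, len(X)):
        dist = {
            label: sum((X[i][j] - c) ** 2 for j, c in zip(cols, centroids[label]))
            for label in (0, 1)
        }
        correct += min(dist, key=dist.get) == y[i]
    return correct / (len(X) - half) - penalty * len(cols) / len(mask)


def binary_pso(X, y, n_particles=15, n_iters=30, w=0.7, c1=1.5, c2=1.5, seed=1):
    """Binary PSO: each particle is a 0/1 mask over features."""
    rng = random.Random(seed)
    d = len(X[0])
    pos = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_particles)]
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(X, y, p) for p in pos]
    g = max(range(n_particles), key=lambda k: pbest_fit[k])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iters):
        for k in range(n_particles):
            for j in range(d):
                v = (w * vel[k][j]
                     + c1 * rng.random() * (pbest[k][j] - pos[k][j])
                     + c2 * rng.random() * (gbest[j] - pos[k][j]))
                vel[k][j] = max(-4.0, min(4.0, v))  # clamp velocity
                # Sigmoid of the velocity gives the probability that the bit is 1.
                prob = 1.0 / (1.0 + math.exp(-vel[k][j]))
                pos[k][j] = 1 if rng.random() < prob else 0
            f = fitness(X, y, pos[k])
            if f > pbest_fit[k]:
                pbest[k], pbest_fit[k] = pos[k][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[k][:], f
    return gbest, gbest_fit


X, y = make_toy_data()
mask, score = binary_pso(X, y)
print("selected feature indices:", [j for j, bit in enumerate(mask) if bit])
print("fitness:", round(score, 3))
```

In a closer analogue of the described method, `fitness` would be replaced by a score derived from clustering the samples on the selected miRNAs and classifying the result with a Random Forest; the PSO loop itself would not need to change.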
Pages: 1846-1854
Number of pages: 9