A New Evolutionary Rough Fuzzy Integrated Machine Learning Technique for microRNA selection using Next-Generation Sequencing data of Breast Cancer

Cited by: 2
Authors
Sarkar, Jnanendra Prasad [1 ,6 ]
Saha, Indrajit [2 ]
Rakshit, Somnath [3 ]
Pal, Monalisa [4 ]
Wlasnowolski, Michal [3 ,5 ]
Sarkar, Anasua [6 ]
Maulik, Ujjwal [6 ]
Plewczynski, Dariusz [3 ,5 ]
Affiliations
[1] Larsen & Toubro Infotech Ltd, Pune, Maharashtra, India
[2] Natl Inst Tech Teachers Training & Res, Dept Comp Sci & Engn, Kolkata, India
[3] Univ Warsaw, Ctr New Technol, Warsaw, Poland
[4] Indian Stat Inst, Machine Intelligence Unit, Kolkata, India
[5] Warsaw Univ Technol, Fac Math & Informat Sci, Warsaw, Poland
[6] Jadavpur Univ, Dept Comp Sci & Engn, Kolkata, India
Funding
EU Horizon 2020;
Keywords
Breast Cancer; Clustering; Fuzzy Set; Feature Selection; Particle Swarm Optimization; Random Forest; Rough Set; GENES; MIRNAS;
DOI
10.1145/3319619.3326836
Chinese Library Classification
O1 [Mathematics];
Discipline classification codes
0701 ; 070101 ;
Abstract
MicroRNAs (miRNAs) play an important role in various biological processes by regulating gene expression, and their abnormal expression may lead to cancer. Analysis of miRNA expression data may therefore yield biological insight for cancer diagnosis. Many feature selection methods have recently been developed to identify such miRNAs. Each has its own merits and demerits, as the task is challenging. In this article, we propose a novel wrapper-based feature selection technique that integrates Rough and Fuzzy sets, Random Forest and Particle Swarm Optimization to identify putative miRNAs that separate tumour and control samples. Rough and Fuzzy sets address the vagueness and overlapping characteristics of the dataset during clustering, while Random Forest performs the classification task on the clustering results to yield better solutions. The integrated clustering and classification tasks form the underlying optimization problem for Particle Swarm Optimization, in which particles encode features, in this case miRNAs. The performance of the proposed wrapper-based method is demonstrated quantitatively and visually on next-generation sequencing data of breast cancer from The Cancer Genome Atlas (TCGA). Finally, the selected miRNAs are validated through biological significance tests. The code and dataset used in this paper are available online (1).
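The wrapper idea in the abstract, where each particle encodes a subset of features and is scored by how well a classifier separates the samples, can be illustrated with a minimal binary Particle Swarm Optimization sketch. This is not the authors' implementation: it leaves out the Rough-Fuzzy clustering step entirely and, to stay dependency-free, swaps Random Forest for a leave-one-out nearest-centroid classifier. All names (`binary_pso`, `make_fitness`), the parameter values, the per-feature penalty, and the synthetic data are assumptions made for illustration only.

```python
import math
import random


def binary_pso(fitness, n_features, n_particles=12, n_iters=30, seed=0,
               w=0.7, c1=1.5, c2=1.5):
    """Binary PSO: each particle is a 0/1 mask over features (here, miRNAs)."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-1, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                # Standard velocity update: inertia + cognitive + social terms.
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-6.0, min(6.0, vel[i][d]))
                # Sigmoid transfer: velocity -> probability that the bit is 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


def make_fitness(X, y, penalty=0.01):
    """Stand-in wrapper fitness (assumption): leave-one-out nearest-centroid
    accuracy on the selected columns, minus a small penalty per feature."""
    def fitness(mask):
        cols = [d for d, bit in enumerate(mask) if bit]
        if not cols:
            return 0.0  # an empty feature subset cannot classify anything
        correct = 0
        for i in range(len(X)):
            centroids = {}
            for label in set(y):
                rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
                centroids[label] = [sum(r[d] for r in rows) / len(rows)
                                    for d in cols]
            pred = min(centroids, key=lambda lab: sum(
                (X[i][d] - c) ** 2 for d, c in zip(cols, centroids[lab])))
            correct += pred == y[i]
        return correct / len(X) - penalty * len(cols)
    return fitness


# Synthetic data (assumption): feature 0 separates the classes, the rest is noise.
data_rng = random.Random(1)
y = [0] * 10 + [1] * 10
X = [[label * 5.0 + data_rng.gauss(0, 0.5)]
     + [data_rng.gauss(0, 1) for _ in range(5)]
     for label in y]

best_mask, best_fit = binary_pso(make_fitness(X, y), n_features=6)
print(best_mask, round(best_fit, 3))
```

In a setting closer to the one described, the fitness callable would instead wrap the clustering and Random Forest stages, while the swarm loop itself could stay unchanged because it only sees a mask-to-score function.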
Pages: 1846-1854
Number of pages: 9