A high performance prediction of HPV genotypes by Chaos game representation and singular value decomposition

Cited by: 20
Authors
Tanchotsrinon, Watcharaporn [1]
Lursinsap, Chidchanok [1]
Poovorawan, Yong [2]
Affiliations
[1] Chulalongkorn Univ, Adv Virtual & Intelligent Comp Res Ctr AVIC, Dept Math & Comp Sci, Fac Sci, Phayathai Rd, Bangkok, Thailand
[2] Chulalongkorn Univ, Fac Med, Dept Pediat, Ctr Excellence Clin Virol, Bangkok 10330, Thailand
Source
BMC BIOINFORMATICS | 2015 / Vol. 16
Keywords
HPV; Genotype; Chaos game representation; Singular value decomposition; Prediction; AMINO-ACID-COMPOSITION; HUMAN-PAPILLOMAVIRUS INFECTION; PROTEIN SUBCELLULAR LOCATION; GENE-EXPRESSION; STRUCTURAL CLASSES; CELLULAR-AUTOMATA; RISK TYPES; CLASSIFICATION; SEQUENCES; WOMEN;
DOI
10.1186/s12859-015-0493-4
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Human Papillomavirus (HPV) genotyping is an important tool in the fight against cervical cancer, because it provides information for risk stratification in diagnosis and improves understanding of the relationship between HPV and carcinogenesis. This paper proposes two new feature extraction techniques, ChaosCentroid and ChaosFrequency, for predicting HPV genotypes associated with cancer. Twelve diverse HPV genotypes, i.e. types 6, 11, 16, 18, 31, 33, 35, 45, 52, 53, 58, and 66, were studied. In the proposed techniques, a partitioned Chaos Game Representation (CGR) is used to represent HPV genomes. ChaosCentroid captures the structure of a sequence as the centroid of each sub-region, with the Euclidean distances among the centroids and the center of the CGR describing the relations among all sub-regions. ChaosFrequency extracts the statistical distribution of mono-, di-, or higher-order nucleotides along HPV genomes and forms a matrix of the frequency of dots in each sub-region. For performance evaluation, four different types of classifiers, i.e. Multi-layer Perceptron, Radial Basis Function, K-Nearest Neighbor, and Fuzzy K-Nearest Neighbor, were used, and the best results from each classifier were compared with the NCBI genotyping tool.
Results: The experimental results obtained with the four classifiers follow the same trend. ChaosCentroid performed considerably better than ChaosFrequency when the input length was one, but moderately worse than ChaosFrequency when the input length was two. Both proposed techniques yielded the best or nearly the best performance when the input length was more than three. However, there was no significant difference between the proposed techniques and the alignment-based comparison method.
Conclusions: The proposed alignment-free and scale-independent method can successfully transform HPV genomes of 7,000 - 10,000 base pairs into features of 1 - 11 dimensions. This indicates that ChaosCentroid and ChaosFrequency can serve as effective feature extraction techniques for predicting HPV genotypes.
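The partitioned CGR idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the corner layout in `CORNERS`, the uniform 2^k x 2^k grid, and the function names `cgr_points`, `partition`, `frequency_matrix`, and `centroid_distances` are all assumptions chosen for illustration, and the singular value decomposition step named in the title is not reproduced. The sketch only shows how a sequence might be mapped to CGR dots, how dots per sub-region could be counted (in the spirit of ChaosFrequency), and how the distance from each sub-region centroid to the CGR center could be computed (in the spirit of ChaosCentroid).

```python
from math import hypot

# Assumed corner layout on the unit square; the record does not specify one.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Map a nucleotide string to CGR dots.

    Each dot is the midpoint between the previous dot and the corner
    assigned to the current nucleotide, starting from the center.
    """
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def partition(points, k):
    """Group dots by cell (row, col) of a 2^k x 2^k grid (assumed partitioning)."""
    n = 2 ** k
    cells = {}
    for x, y in points:
        col = min(int(x * n), n - 1)  # clamp dots that land exactly on 1.0
        row = min(int(y * n), n - 1)
        cells.setdefault((row, col), []).append((x, y))
    return cells


def frequency_matrix(points, k):
    """Count the dots in each grid cell."""
    n = 2 ** k
    cells = partition(points, k)
    return [[len(cells.get((r, c), [])) for c in range(n)] for r in range(n)]


def centroid_distances(points, k):
    """Euclidean distance from each non-empty cell's centroid to the CGR center."""
    distances = {}
    for cell, pts in partition(points, k).items():
        mx = sum(p[0] for p in pts) / len(pts)
        my = sum(p[1] for p in pts) / len(pts)
        distances[cell] = hypot(mx - 0.5, my - 0.5)
    return distances


if __name__ == "__main__":
    dots = cgr_points("ACGTTGCA")
    print(frequency_matrix(dots, 1))
    print(centroid_distances(dots, 1))
```

With `k = 1` each cell roughly corresponds to the most recent nucleotide, and with larger `k` to the most recent `k` nucleotides, which is one way to read the abstract's "mono-, di-, or higher-order" description; the first few dots of a sequence are an approximation because they still depend on the starting point.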
Pages: 1 - 13
Page count: 13
Related papers
50 in total
  • [1] A high performance prediction of HPV genotypes by Chaos game representation and singular value decomposition
    Watcharaporn Tanchotsrinon
    Chidchanok Lursinsap
    Yong Poovorawan
    BMC Bioinformatics, 16
  • [2] High-performance singular value decomposition
    Skillicorn, DB
    Yang, XL
    DATA MINING FOR SCIENTIFIC AND ENGINEERING APPLICATIONS, 2001, 2 : 401 - 424
  • [3] An Efficient Prediction of HPV Genotypes from Partial Coding Sequences by Chaos Game Representation and Fuzzy k-Nearest Neighbor Technique
    Tanchotsrinon, Watcharaporn
    Lursinsap, Chidchanok
    Poovorawan, Yong
    CURRENT BIOINFORMATICS, 2017, 12 (05) : 431 - 440
  • [4] Symmetrical singular value decomposition representation for pattern recognition
    Chen, Yuhui
    Tong, Shuiguang
    Cong, Feiyun
    Xu, Jian
    NEUROCOMPUTING, 2016, 214 : 143 - 154
  • [5] Dominant singular value decomposition representation for face recognition
    Lu, Jiwen
    Zhao, Yongwei
    SIGNAL PROCESSING, 2010, 90 (06) : 2087 - 2093
  • [6] High-performance Hardware Architecture for Tensor Singular Value Decomposition
    Deng, Chunhua
    Yin, Miao
    Liu, Xiao-Yang
    Wang, Xiaodong
    Yuan, Bo
    2019 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER-AIDED DESIGN (ICCAD), 2019,
  • [7] Subcellular Locations Prediction of Proteins Based on Chaos Game Representation
    Li Nana
    Niu Xiaohui
    Shi Feng
    Hu Xuehai
    2009 3RD INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICAL ENGINEERING, VOLS 1-11, 2009, : 328 - 331
  • [8] Singular value decomposition based virtual representation for face recognition
    Guiying Zhang
    Wenbin Zou
    Xianjie Zhang
    Yong Zhao
    Multimedia Tools and Applications, 2018, 77 : 7171 - 7186
  • [9] Learning discriminative singular value decomposition representation for face recognition
    Tai, Ying
    Yang, Jian
    Luo, Lei
    Zhang, Fanlong
    Qian, Jianjun
    PATTERN RECOGNITION, 2016, 50 : 1 - 16
  • [10] Singular value decomposition based virtual representation for face recognition
    Zhang, Guiying
    Zou, Wenbin
    Zhang, Xianjie
    Zhao, Yong
    MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (06) : 7171 - 7186