ESA: An efficient sequence alignment algorithm for biological database search on Sunway TaihuLight

Authors
Zhang, Hao [1]
Huang, Zhiyi [1]
Chen, Yawen [1]
Liang, Jianguo [2]
Gao, Xiran [3, 4]
Affiliations
[1] Univ Otago, Dept Comp Sci, Dunedin 9054, New Zealand
[2] Shandong Univ Sci & Technol, Coll Comp Sci & Engn, Qingdao 266590, Peoples R China
[3] Chinese Acad Sci, ICT, State Key Lab Proc, Beijing, Peoples R China
[4] Univ Chinese Acad Sci, Beijing, Peoples R China
Keywords
Hybrid sequence alignment; Biological database search; Sunway TaihuLight; SW26010; Heterogeneous architecture; SMITH-WATERMAN; PERFORMANCE; PROCESSOR
DOI
10.1016/j.parco.2023.103043
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Biological database search plays an important role in computational biology. Since the COVID-19 outbreak, it has provided significant help in identifying common characteristics of viruses and in developing vaccines and drugs. Sequence alignment, a method for finding similarity, homology and other information between gene or protein sequences, is the usual tool for database search. With the explosive growth of biological databases, the search process has become extremely time-consuming. However, existing parallel sequence alignment algorithms cannot deliver efficient database search because of low utilization of resources such as cache memory and performance issues such as load imbalance and high communication overhead. In this paper, we propose ESA, an efficient sequence alignment algorithm on Sunway TaihuLight for biological database search. ESA adopts a novel hybrid alignment algorithm that combines local and global alignments and has higher accuracy than other sequence alignment algorithms. ESA also includes several optimizations: cache-aware sequence alignment, capacity-aware load balancing and bandwidth-aware data transfer. They are implemented on SW26010, the heterogeneous processor used in Sunway TaihuLight, the world's 6th fastest supercomputer. The implementation of ESA is evaluated with the Swiss-Prot database on Sunway TaihuLight and other platforms. Our experimental results show that ESA has a speedup of 34.5 on a single core group (with 65 cores) of Sunway TaihuLight. The strong and weak scalabilities of ESA are tested with 1 to 1024 core groups of Sunway TaihuLight. The results show that ESA has linear weak scalability and very impressive strong scalability. For strong scalability, ESA achieves a speedup of 338.04 with 1024 core groups compared with a single core group. We also show that our proposed optimizations are applicable to GPUs, Intel multicore processors, and heterogeneous computing platforms.
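For orientation, the local-alignment building block named in the keywords (Smith-Waterman) can be sketched as below. This is a minimal illustration only, not ESA's hybrid algorithm or any of its optimizations. The function name `smith_waterman_score` and the scoring values (match 2, mismatch -1, linear gap -1) are assumptions chosen for the example and do not come from the paper.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of strings a and b.

    Illustrative sketch: linear gap penalty and a simple match/mismatch
    score (assumed values), rather than a substitution matrix.
    """
    # Only the previous row of the dynamic-programming matrix is kept,
    # so memory use is O(len(b)) instead of O(len(a) * len(b)).
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the alignment diagonally (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Or open a gap in either sequence; scores never drop below 0,
            # which is what makes the alignment local rather than global.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

In a database search, a score of this kind would be computed between one query and every database sequence, which is why the cost grows with database size.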
Pages: 11