A PHASE TRANSITION FOR THE SCORE IN MATCHING RANDOM SEQUENCES ALLOWING DELETIONS

Cited by: 74
Authors
Arratia, Richard [1]
Waterman, Michael S.
Affiliations
[1] University of Southern California, Department of Mathematics, Los Angeles, CA 90089, USA
Keywords
Sequence matching; longest common subsequence; large deviations; Azuma-Hoeffding; phase transition; percolation
DOI
10.1214/aoap/1177005208
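Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract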
We consider a sequence matching problem involving the optimal alignment score for contiguous subsequences, rewarding matches and penalizing deletions and mismatches. This score is used by biologists comparing pairs of DNA or protein sequences. We prove that for two sequences of length n, as n → ∞, there is a phase transition between linear growth in n, when the penalty parameters are small, and logarithmic growth in n, when the penalties are large. The results are valid for independent sequences with iid or Markov letters. The crucial step in proving this is to derive a large deviation result for matching with deletions. The longest common subsequence problem of Chvátal and Sankoff is a special case of our setup. The proof of the large deviation result exploits the Azuma-Hoeffding lemma. The phase transition is also established for more general scoring schemes allowing general letter-to-letter alignment penalties and block deletion penalties. We give a general method for applying the bounded increments martingale method to Lipschitz functionals of Markov processes. The phase transition holds for matching Markov chains and for nonoverlapping repeats in a single sequence.
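To make the score described in the abstract concrete, the following is a minimal sketch of a standard local-alignment dynamic program with a linear deletion penalty. It is not taken from the paper: the function name `local_alignment_score`, the parameter names `mu` (mismatch penalty) and `delta` (per-letter deletion penalty), and the match reward of 1 are all assumptions made for illustration.

```python
def local_alignment_score(a, b, mu, delta):
    """Best alignment score over all pairs of contiguous substrings of a and b.

    Assumed scoring (illustrative, not quoted from the paper):
    score = (#matches) - mu * (#mismatches) - delta * (#deleted letters).
    """
    prev = [0.0] * (len(b) + 1)   # DP row for the previous letter of a
    best = 0.0                    # the empty alignment scores 0
    for x in a:
        curr = [0.0]
        for j, y in enumerate(b, start=1):
            # Align x with y: reward a match, penalize a mismatch.
            diag = prev[j - 1] + (1.0 if x == y else -mu)
            # Delete x (move down) or delete y (move right).
            cell = max(0.0, diag, prev[j] - delta, curr[j - 1] - delta)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    # Small penalty: bridging the deleted "C" pays off (3 matches - 0.5).
    print(local_alignment_score("ACGT", "AGT", mu=1.0, delta=0.5))  # 2.5
    # Large penalty: the best alignment is the exact match "GT".
    print(local_alignment_score("ACGT", "AGT", mu=1.0, delta=2.0))  # 2.0
```

Resetting each cell to 0 is what restricts the optimum to contiguous subsequences: an alignment may start and end anywhere in either sequence. With small penalties the best alignment can afford to bridge gaps, while with large penalties it falls back to short exact matches, which is the qualitative contrast behind the two growth regimes named in the abstract.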
Pages: 200-225
Number of pages: 26