Determining significance of pairwise co-occurrences of events in bursty sequences

Cited by: 14
Authors
Haiminen, Niina [1]
Mannila, Heikki [1, 2]
Terzi, Evimaria [3]
Affiliations
[1] Univ Helsinki, Dept Comp Sci, HIIT, FIN-00014 Helsinki, Finland
[2] Aalto Univ, Lab Comp & Informat Sci, HIIT, FI-02015 Helsinki, Finland
[3] IBM Corp, Almaden Res Ctr, San Jose, CA 95120 USA
Funding
Academy of Finland
DOI
10.1186/1471-2105-9-336
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Event sequences in which different types of events often occur close together arise, e.g., when studying potential transcription factor binding sites (TFBS, events) of certain transcription factors (TF, types) in a DNA sequence. These events tend to occur in bursts: some genomic regions contain more genes and therefore potentially more binding sites, while in other, possibly very long, regions hardly any events occur. Also, some types of events may occur in the sequence more often than others. Tendencies of co-occurrence of binding sites of two or more TFs are interesting, as they may imply a co-operative role between the TFs in regulatory processes. A numerical value summarizing the tendency for co-occurrence between two TFs can be determined in a number of ways. However, the significance of such values should be tested with respect to a relevant null model that takes the global sequence structure into account.
Results: We extend the existing techniques that have been considered for determining the significance of co-occurrence patterns between a pair of event types under different null models. These models range from very simple ones to more complex models that take the burstiness of sequences into account. We evaluate the models and techniques on synthetic event sequences, and on real data consisting of potential transcription factor binding sites.
Conclusion: We show that simple null models are poorly suited for bursty data, and that they yield many false positives. More sophisticated models give better results in our experiments. We also demonstrate the effect of the window size, i.e., the maximum co-occurrence distance, on the significance results.
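As a rough illustration of the kind of test the abstract describes, the sketch below computes a co-occurrence score for two event types within a window and estimates an empirical p-value by resampling. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the score (the number of type-A events with at least one type-B event within `window` positions), and the null model (type-B events placed uniformly at random). That uniform null is deliberately one of the simple models that ignore burstiness, which the abstract reports are prone to false positives on bursty data.

```python
import bisect
import random


def cooccurrence_count(a_positions, b_positions, window):
    """Count type-A events with at least one type-B event within `window` positions.

    Hypothetical score chosen for illustration; the paper may define it differently.
    """
    b_sorted = sorted(b_positions)
    count = 0
    for a in a_positions:
        # Index of the first B position that is >= a - window.
        i = bisect.bisect_left(b_sorted, a - window)
        # That B event co-occurs with a if it is also <= a + window.
        if i < len(b_sorted) and b_sorted[i] <= a + window:
            count += 1
    return count


def empirical_p_value(a_positions, b_positions, seq_length, window,
                      n_samples=1000, seed=0):
    """Estimate a one-sided empirical p-value under a uniform null model.

    Assumed null: A events stay fixed, and the same number of B events is
    placed uniformly at random on positions 0..seq_length-1. This ignores
    any bursty structure in the sequence.
    """
    rng = random.Random(seed)
    observed = cooccurrence_count(a_positions, b_positions, window)
    at_least_as_extreme = 0
    for _ in range(n_samples):
        random_b = [rng.randrange(seq_length) for _ in b_positions]
        if cooccurrence_count(a_positions, random_b, window) >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least_as_extreme + 1) / (n_samples + 1)


if __name__ == "__main__":
    a = [10, 50, 90]
    b = [12, 95, 300]
    print(empirical_p_value(a, b, seq_length=1000, window=5))
```

Rerunning with different `window` values gives a feel for how the maximum co-occurrence distance changes the result. A burstiness-aware null model would replace only the line that generates `random_b`.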
Pages: 10
Related papers
50 in total
  • [21] Corpus of Syntactic Co-Occurrences: A Delayed Promise
    Klyshinsky, Eduard S.
    Lukashevich, Natalia Y.
    ARTIFICIAL INTELLIGENCE AND NATURAL LANGUAGE, 2018, 789 : 121 - 127
  • [22] Analyzing Relatedness by Toponym Co-Occurrences on Web Pages
    Liu, Yu
    Wang, Fahui
    Kang, Chaogui
    Gao, Yong
    Lu, Yongmei
    TRANSACTIONS IN GIS, 2014, 18 (01) : 89 - 107
  • [23] Discovering Significant Co-Occurrences to Characterize Network Behaviors
    Arthur-Durett, Kristine
    Carroll, Thomas E.
    Chikkagoudar, Satish
    HUMAN INTERFACE AND THE MANAGEMENT OF INFORMATION: INTERACTION, VISUALIZATION, AND ANALYTICS, HIMI 2018 HELD AS PART OF HCI 2018, PART I, 2018, 10904 : 609 - 623
  • [24] Disentangling categorical relationships through a graph of co-occurrences
    Martinez-Romo, Juan
    Araujo, Lourdes
    Borge-Holthoefer, Javier
    Arenas, Alex
    Capitan, Jose A.
    Cuesta, Jose A.
    PHYSICAL REVIEW E, 2011, 84 (04)
  • [25] Diversification Improvements Through News Article Co-occurrences
    Yaros, John Robert
    Imielinski, Tomasz
    2014 IEEE CONFERENCE ON COMPUTATIONAL INTELLIGENCE FOR FINANCIAL ENGINEERING & ECONOMICS (CIFER), 2014, : 130 - 137
  • [26] Image estimation of words based on adjective Co-occurrences
    Shimizu, Kouhei
    Hagiwara, Masafumi
    Systems and Computers in Japan, 2007, 38 (12) : 14 - 24
  • [27] Attitudes From Mere Co-Occurrences Are Guided by Differentiation
    Alves, Hans
    Hoegden, Fabia
    Gast, Anne
    Aust, Frederik
    Unkelbach, Christian
    JOURNAL OF PERSONALITY AND SOCIAL PSYCHOLOGY, 2020, 119 (03) : 560 - 581
  • [28] Shared and unique mutational gene co-occurrences in cancers
    Liu, Junqi
    Zhao, Di
    Fan, Ruitai
    BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS, 2015, 465 (04) : 777 - 783
  • [29] Word co-occurrences as a principle of an algorithm for extraction of terminology
    Zunker, G
    Rapp, R
    COGNITIVE ASPECTS OF LANGUAGE, 1996, 360 : 293 - 298
  • [30] Random Projections of Residuals as an Alternative to Co-occurrences in Steganalysis
    Holub, Vojtech
    Fridrich, Jessica
    Denemark, Tomas
    MEDIA WATERMARKING, SECURITY, AND FORENSICS 2013, 2013, 8665