Determining significance of pairwise co-occurrences of events in bursty sequences

Cited by: 14
Authors
Haiminen, Niina [1]
Mannila, Heikki [1, 2]
Terzi, Evimaria [3]
Affiliations
[1] Univ Helsinki, Dept Comp Sci, HIIT, FIN-00014 Helsinki, Finland
[2] Aalto Univ, Lab Comp & Informat Sci, HIIT, FI-02015 Helsinki, Finland
[3] IBM Corp, Almaden Res Ctr, San Jose, CA 95120 USA
Funding
Academy of Finland
DOI
10.1186/1471-2105-9-336
Chinese Library Classification
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: Event sequences where different types of events often occur close together arise, e.g., when studying potential transcription factor binding sites (TFBS, events) of certain transcription factors (TF, types) in a DNA sequence. These events tend to occur in bursts: in some genomic regions there are more genes and therefore potentially more binding sites, while in some, possibly very long regions, hardly any events occur. Also, some types of events may occur in the sequence more often than others. Tendencies of co-occurrence of binding sites of two or more TFs are interesting, as they may imply a co-operative role between the TFs in regulatory processes. Determining a numerical value to summarize the tendency for co-occurrence between two TFs can be done in a number of ways. However, testing for the significance of such values should be done with respect to a relevant null model that takes into account the global sequence structure.

Results: We extend the existing techniques that have been considered for determining the significance of co-occurrence patterns between a pair of event types under different null models. These models range from very simple ones to more complex models that take the burstiness of sequences into account. We evaluate the models and techniques on synthetic event sequences, and on real data consisting of potential transcription factor binding sites.

Conclusion: We show that simple null models are poorly suited for bursty data, and they yield many false positives. More sophisticated models give better results in our experiments. We also demonstrate the effect of the window size, i.e., maximum co-occurrence distance, on the significance results.
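The contrast the abstract draws between simple and burst-aware null models can be illustrated with a small Monte Carlo sketch. This is not the authors' method: the co-occurrence score, the two null models (uniform placement as a "simple" model, label shuffling over fixed positions as a stand-in for a burst-preserving one), and all function names below are assumptions chosen for illustration. The parameter `w` plays the role of the window size, i.e., the maximum co-occurrence distance.

```python
import random
from bisect import bisect_left, bisect_right


def cooccurrence_count(pos_a, pos_b, w):
    """Count type-A events that have at least one type-B event within distance w."""
    sorted_b = sorted(pos_b)
    count = 0
    for p in pos_a:
        lo = bisect_left(sorted_b, p - w)
        hi = bisect_right(sorted_b, p + w)
        if hi > lo:  # at least one B position lies in [p - w, p + w]
            count += 1
    return count


def uniform_null(n_a, n_b, length, rng):
    """Simple null (assumed): place every event uniformly at random, ignoring bursts."""
    a = [rng.randrange(length) for _ in range(n_a)]
    b = [rng.randrange(length) for _ in range(n_b)]
    return a, b


def label_shuffle_null(pos_a, pos_b, rng):
    """Burst-preserving null (assumed): keep all positions, permute the type labels."""
    all_pos = list(pos_a) + list(pos_b)
    rng.shuffle(all_pos)
    return all_pos[:len(pos_a)], all_pos[len(pos_a):]


def empirical_p_value(observed, null_scores):
    """One-sided empirical p-value with the usual add-one correction."""
    at_least = sum(1 for s in null_scores if s >= observed)
    return (1 + at_least) / (1 + len(null_scores))


def significance(pos_a, pos_b, length, w, n_samples=999, seed=0):
    """Return (observed score, p under uniform null, p under label-shuffle null)."""
    rng = random.Random(seed)
    observed = cooccurrence_count(pos_a, pos_b, w)
    uniform_scores = [
        cooccurrence_count(*uniform_null(len(pos_a), len(pos_b), length, rng), w)
        for _ in range(n_samples)
    ]
    shuffle_scores = [
        cooccurrence_count(*label_shuffle_null(pos_a, pos_b, rng), w)
        for _ in range(n_samples)
    ]
    return (
        observed,
        empirical_p_value(observed, uniform_scores),
        empirical_p_value(observed, shuffle_scores),
    )
```

When all events of both types fall inside a single burst narrower than `w`, the uniform null tends to report the pair as highly significant, whereas the label-shuffle null does not, because any relabeling of the same positions yields the same score. This mirrors, in a toy setting, the false positives the abstract attributes to simple null models.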
Pages: 10