Comparison of large networks with sub-sampling strategies

Cited by: 10
Authors
Ali, Waqar [1]
Wegner, Anatol E. [1]
Gaunt, Robert E. [1]
Deane, Charlotte M. [1]
Reinert, Gesine [1]
Affiliations
[1] Univ Oxford, Dept Stat, 24-29 St Giles, Oxford OX1 3LB, England
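Source
SCIENTIFIC REPORTS | 2016 / Vol. 6
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK; Biotechnology and Biological Sciences Research Council (BBSRC), UK;
Keywords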
PROTEIN-INTERACTION NETWORKS; GLOBAL ALIGNMENT; RANDOM GRAPHS; EVOLUTION; DATABASE; MOTIFS;
DOI
10.1038/srep28955
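Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;
Abstract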
Networks are routinely used to represent large data sets, making the comparison of networks a tantalizing research question in many areas. Techniques for such analysis vary from simply comparing network summary statistics to sophisticated but computationally expensive alignment-based approaches. Most existing methods either do not generalize well to different types of networks or do not provide a quantitative similarity score between networks. In contrast, alignment-free topology based network similarity scores empower us to analyse large sets of networks containing different types and sizes of data. Netdis is such a score that defines network similarity through the counts of small sub-graphs in the local neighbourhood of all nodes. Here, we introduce a sub-sampling procedure based on neighbourhoods which links naturally with the framework of network comparisons through local neighbourhood comparisons. Our theoretical arguments justify basing the Netdis statistic on a sample of similar-sized neighbourhoods. Our tests on empirical and synthetic datasets indicate that often only 10% of the neighbourhoods of a network suffice for optimal performance, leading to a drastic reduction in computational requirements. The sampling procedure is applicable even when only a small sample of the network is known, and thus provides a novel tool for network comparison of very large and potentially incomplete datasets.
Pages: 8
Related papers
50 in total
  • [41] Windowing as a Sub-Sampling Method for Distributed Data Mining
    Martinez-Galicia, David
    Guerra-Hernandez, Alejandro
    Cruz-Ramirez, Nicandro
    Limon, Xavier
    Grimaldo, Francisco
    MATHEMATICAL AND COMPUTATIONAL APPLICATIONS, 2020, 25 (03)
  • [42] ACTIVE COVARIANCE ESTIMATION BY RANDOM SUB-SAMPLING OF VARIABLES
    Pavez, Eduardo
    Ortega, Antonio
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 4034 - 4038
  • [43] Automotive Radar Sub-Sampling via Object Detection Networks: Leveraging Prior Signal Information
    Sakthi, Madhumitha
    Arvinte, Marius
    Vikalo, Haris
    IEEE OPEN JOURNAL OF INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 4 : 858 - 869
  • [44] Sub-Sampling PLL For Millimeter Wave Applications: An Overview
    Gao, Xiang
    2019 IEEE MTT-S INTERNATIONAL MICROWAVE CONFERENCE ON HARDWARE AND SYSTEMS FOR 5G AND BEYOND (IMC-5G), 2019,
  • [45] Temporal Frame Sub-Sampling for Video Object Tracking
    Xuan Wang
    Yu Hen Hu
    Robert G. Radwin
    John D. Lee
    Journal of Signal Processing Systems, 2020, 92 : 569 - 581
  • [46] Estimation of Population Mean In Successive Sampling by Sub-Sampling Non-Respondents
    Singh, Housila P.
    Kumar, Sunil
    Bhougal, Sandeep
    JOURNAL OF MODERN APPLIED STATISTICAL METHODS, 2011, 10 (01) : 51 - 60
  • [47] River condition assessment may depend on the sub-sampling method: field live-sort versus laboratory sub-sampling of invertebrates for bioassessment
    Nichols, Susan J.
    Norris, Richard H.
    HYDROBIOLOGIA, 2006, 572 (1) : 195 - 213
  • [48] Sampling Jitter Estimation and Mitigation in Direct RF Sub-Sampling Receiver Architecture
    Syrjala, Ville
    Valkama, Mikko
    2009 6TH INTERNATIONAL SYMPOSIUM ON WIRELESS COMMUNICATION SYSTEMS (ISWCS 2009), 2009, : 323 - 327
  • [49] REGIONAL SUB-SAMPLING AND STATISTICAL-INFERENCE IN FORESTED HABITATS
    NANCE, JD
    AMERICAN ANTIQUITY, 1979, 44 (01) : 172 - 176
  • [50] STBC-OFDM Communication Systems with Sub-Sampling Support
    Petrellis, Nikos
    2016 5TH INTERNATIONAL CONFERENCE ON MODERN CIRCUITS AND SYSTEMS TECHNOLOGIES (MOCAST), 2016,