Scalable I/O and analytics

Cited by: 14
Authors
Choudhary, Alok [1]
Liao, Wei-keng [1]
Gao, Kui [1]
Nisar, Arifa [1]
Ross, Robert [2]
Thakur, Rajeev [2]
Latham, Robert [2]
Affiliations
[1] Northwestern Univ, Dept Elect Engn & Comp Sci, Evanston, IL 60208 USA
[2] Argonne Natl Lab, Div Math & Comp Sci, Argonne, IL 60439 USA
Funding
U.S. National Science Foundation
DOI
10.1088/1742-6596/180/1/012048
Chinese Library Classification
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
High-performance computing systems have already approached peta-scale with hundreds of thousands of processors/cores in many deployments. These systems promise a new level of predictive and knowledge discovery ability as researchers gain the capability to model dependencies between phenomena at scales not seen earlier. These applications are highly I/O and data intensive, leading scientists to observe that performing I/O and subsequent analyses are major bottlenecks in effectively utilizing peta-scale systems and a major hurdle in accelerating discoveries. Although significant progress has been made in performance, interfaces, and middleware runtime systems for I/O in the recent past, significantly more research and development needs to be carried out to scale the performance to the desired levels for systems containing tens to hundreds of thousands of cores. In this work we outline our recent achievements and current research for designing scalable I/O software and enabling data analytics in storage systems. We also enumerate key challenges for the I/O systems and discuss ongoing efforts that address these challenges.
Pages: 10