Towards an Intelligent Framework for Scientific Computational Steering in Big Data Systems

Authors
Zhang, Yijie [1]
Wu, Chase Q. [1]
Affiliations
[1] New Jersey Inst Technol, Dept Data Sci, Newark, NJ 07102 USA
Keywords
Big Data; Computational Steering; Parameter Tuning; Machine Learning; Scientific Innovation; FILE; HDFS
DOI
10.1109/CCGrid59990.2024.00085
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Scientific applications of the next generation are undergoing a paradigm shift, transitioning from traditional experiment-centric methodologies to extreme-scale simulation-centric computations. These simulations, characterized by intricate numerical modeling with numerous adjustable parameters, generate vast datasets that necessitate meticulous processing and analysis against experimental or observational data for parameter calibration and model validation. However, manual parameter adjustment by domain experts in complex and distributed environments proves impractical. To address this challenge, we propose an online computational steering service facilitating real-time multi-user interaction. Towards this end, we design a versatile steering framework and conduct a theoretical performance evaluation of the steering service empowered by machine learning techniques. Furthermore, we present a case study involving the Weather Research and Forecasting (WRF) model, comparing the performance of our steering solution with alternative heuristic methods and default settings to demonstrate its efficacy.

The processing of big data generated by scientific simulations typically requires the use of big data systems as exemplified by Hadoop with Hadoop Distributed File System (HDFS) serving as a foundational technology layer. HDFS supports parallel computing in upper layers, offering fault tolerance and high throughput in data storage through block replication and cluster-wide distribution. However, the default block distribution strategy in HDFS overlooks the diverse capacities and data access patterns of nodes in heterogeneous Hadoop clusters, rendering it suboptimal for such environments. To address this issue, we formulate a class of block distribution problems in heterogeneous clusters, establishing its NP-completeness, and design an approximate algorithm, LPIR-BD, which leverages linear programming-based iterative rounding with a rigorous performance guarantee. Extensive experimental evaluations demonstrate the superior performance of LPIR-BD over several existing algorithms, corroborating our theoretical analyses and underscoring its efficacy in heterogeneous clusters.
Pages: 671-675
Page count: 5