A BigData MapReduce Hadoop Distribution Architecture for Processing Input Splits to solve the Small Data Problem

Cited by: 0
Authors
Manjunath, R. [1]
Tejus [1]
Channabasava, R. K. [1]
Balaji, S. [2]
Affiliations
[1] City Engn Coll, Dept CSE, Hyderabad, Andhra Pradesh, India
[2] Jain Univ, Bengaluru, India
Source
PROCEEDINGS OF THE 2016 2ND INTERNATIONAL CONFERENCE ON APPLIED AND THEORETICAL COMPUTING AND COMMUNICATION TECHNOLOGY (ICATCCT) | 2016
Keywords
Hadoop; MapReduce; input splits
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Hadoop is an open-source Java framework for processing big data. It has two core components: HDFS (Hadoop Distributed File System), which stores huge volumes of data on inexpensive hardware and is fault tolerant, that is, able to continue normal operation despite hardware or software faults; and MapReduce, a processing technique and programming model that runs in a parallel and distributed manner. Hadoop does not perform well on small data, because a huge number of small files places a heavy load on the HDFS NameNode, which in turn prolongs the execution time of MapReduce jobs. Because Hadoop is designed specifically to handle large volumes of data, it incurs a performance cost when dealing with large numbers of small files. This paper gives a detailed description of HDFS, existing ways of dealing with the small-file problem, and a proposed approach for handling small data files. In the proposed approach, small files are merged using MapReduce, the programming model on Hadoop. This improves Hadoop's performance in handling small files by merging them into files larger than the block size. We also propose a traffic analyzer that combines Hadoop with the MapReduce paradigm. Combining Hadoop and MapReduce programming tools makes it possible to provide batch analysis with minimal response time and in-memory computing capacity, so that logs are processed in a highly available, efficient, and stable way.
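The abstract does not say how the small files are grouped before merging, so the following is only a minimal sketch of one plausible planning step, not the authors' method. It assumes a 128 MiB block size and a greedy first-fit-decreasing heuristic; the function name `plan_merges` and the constant `BLOCK_SIZE` are hypothetical. The idea it illustrates is that packing many small files into a few block-sized groups reduces the number of file entries the NameNode has to track.

```python
from typing import Dict, List

# Assumed HDFS block size in bytes (128 MiB); the abstract does not state one.
BLOCK_SIZE = 128 * 1024 * 1024


def plan_merges(file_sizes: Dict[str, int],
                block_size: int = BLOCK_SIZE) -> List[List[str]]:
    """Group small files so each group's total size fits in one block.

    Files at or above the block size are not merged; each is returned
    as its own single-file group, listed before the merged groups.
    """
    large: List[List[str]] = []   # files that already fill a block
    groups: List[List[str]] = []  # merge groups of small files
    free: List[int] = []          # bytes still available in each merge group

    # Largest files first, ties broken by name, so the plan is deterministic.
    ordered = sorted(file_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, size in ordered:
        if size >= block_size:
            large.append([name])
            continue
        # First fit: place the file in the first group with enough room.
        for i, room in enumerate(free):
            if size <= room:
                groups[i].append(name)
                free[i] -= size
                break
        else:
            # No existing group has room, so open a new one.
            groups.append([name])
            free.append(block_size - size)
    return large + groups
```

With a toy block size of 100 bytes, `plan_merges({"a.log": 40, "b.log": 30, "c.log": 30, "d.log": 120}, block_size=100)` leaves `d.log` alone and packs the other three files into a single group, so four files become two. In an actual Hadoop job, each group would then have to be concatenated into one HDFS file, or handed to a single map task, by code that this sketch does not cover.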
Pages: 480-487
Page count: 8