An Automated Metadata Generation Method for Data Lake of Industrial WoT Applications

Cited by: 5
Authors
Yu, Han [1]
Cai, Hongming [1]
Liu, Zhiyuan [1]
Xu, Boyi [2]
Jiang, Lihong [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Software, Shanghai 200240, Peoples R China
[2] Shanghai Jiao Tong Univ, Coll Econ & Management, Shanghai 200052, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Metadata; Semantics; Runtime; Data mining; Ontologies; Text recognition; Conferences; Data lake (DL); data modeling; entity recognition; metadata generation; stream processing; Web of Things (WoT); ACQUISITION; EXTRACTION;
DOI
10.1109/TSMC.2021.3119871
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Recent trends in the Web of Things (WoT) have led to a data explosion. The data lake (DL), a flexible, on-demand architecture for managing heterogeneous data, has become a feasible solution for data management. Metadata modeling for DLs is the key basis for smart analysis and processing. However, the variety in the structures and semantics of industrial WoT data hinders metadata modeling and maintenance. Moreover, the lack of textual descriptions and the semantics hidden in value streams make it hard to construct semantic metadata automatically. The dynamic nature of WoT also requires timely evolution of the metadata. To overcome these challenges, we propose an automated bottom-up metadata generation approach for DLs of WoT applications. Within a data-driven framework, raw data are notated as linked data, and self-organizing map-based online clustering is applied to extract data characteristics in real time. To recognize entities, concepts, and relations, a semantics-based approach to entity discovery from short texts is proposed according to the features of WoT data. Numerical analysis is performed to find hidden relations in raw values. Full-dimensional metadata with rich semantic knowledge are finally built. Experiments on a real-world dataset verify the effectiveness of the methods, and a case study on an energy WoT system demonstrates the feasibility of the approach.
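The abstract names self-organizing map (SOM) based online clustering without giving details. The following is a minimal generic sketch of an online SOM, not the authors' implementation: the class name `OnlineSOM`, the 1-D map layout, the linear weight initialization, the learning rate, the neighborhood width, and the sample readings are all assumptions chosen for illustration.

```python
import math


class OnlineSOM:
    """Tiny 1-D self-organizing map updated one sample at a time (illustrative only)."""

    def __init__(self, n_units, dim, lr=0.3, sigma=0.5):
        # Assumption: weights start evenly spaced on the diagonal of [0, 1]^dim.
        self.weights = [[j / (n_units - 1)] * dim for j in range(n_units)]
        self.lr = lr        # learning rate (hypothetical value)
        self.sigma = sigma  # neighborhood width on the map (hypothetical value)

    def best_matching_unit(self, x):
        # Index of the unit whose weight vector is closest to x (squared Euclidean).
        dists = [sum((w_k - x_k) ** 2 for w_k, x_k in zip(w, x)) for w in self.weights]
        return dists.index(min(dists))

    def update(self, x):
        # Move the winning unit and its map neighbors toward the new sample.
        bmu = self.best_matching_unit(x)
        for j, w in enumerate(self.weights):
            h = math.exp(-((j - bmu) ** 2) / (2 * self.sigma ** 2))
            for k in range(len(w)):
                w[k] += self.lr * h * (x[k] - w[k])
        return bmu


# Hypothetical normalized sensor readings drawn from two operating regimes.
stream = [(0.10, 0.12), (0.90, 0.88), (0.11, 0.10), (0.88, 0.90)] * 20

som = OnlineSOM(n_units=4, dim=2)
for reading in stream:
    som.update(reading)  # one incremental update per arriving reading

low = som.best_matching_unit((0.1, 0.1))
high = som.best_matching_unit((0.9, 0.9))
```

Each arriving reading triggers a single weight update, so no batch of historical data needs to be stored. In this sketch, `low` and `high` index the map units that come to represent the two regimes.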
Pages: 5235-5248
Number of pages: 14