Variational Autoencoder Model Combining Deep Learning and Probability Statistics and Its Application in Large-scale Data Analysis

Cited by: 0
Authors
Zou, Lingguo [1]
Zhang, Meihua [2]
Affiliations
[1] School of Public Education, Xiamen Ocean Vocational College, Xiamen 361009, China
[2] College of General Education, Xiamen Huatian International Vocation Institute, Xiamen 361102, China
Source
Informatica (Slovenia) | 2024, Vol. 48, No. 22
Keywords
Large datasets
DOI
10.31449/inf.v48i22.6921
Abstract
A multi-layer generative model is proposed to improve the accuracy of large-scale data analysis, addressing the limited feature-extraction capability of existing topic models and their weak coupling with label information. The model consists of three main modules: text encoding, autoencoder inference, and layer-by-layer learning. It combines a hierarchical Bayesian model with a deterministic-upward, stochastic-downward network structure and uses a Poisson Gamma Belief Network as the decoder to capture hierarchical latent features in text data. Stochastic gradient Monte Carlo sampling is used for posterior inference to improve efficiency, and the Fisher information matrix adaptively adjusts the learning rates of topic parameters at different layers. A layer-by-layer learning strategy is then introduced to construct the learning network, so that text data and label information are jointly exploited for feature extraction. Experiments showed that the test error rates of the proposed model on the 20News, RCV1, and IMDB datasets were 16.52%, 18.72%, and 11.67%, respectively, the lowest among the compared models, while its testing times of 0.020 s, 0.017 s, and 0.015 s were the shortest, indicating both high accuracy and high efficiency. Its perplexity on the 20News, RCV1, and Wiki datasets was 590.23, 953.12, and 982.67, respectively, significantly lower than that of the comparison models. The proposed model therefore offers strong data analysis and interpretation capabilities together with comparatively high computational efficiency, providing a scientific tool for accurately analyzing large-scale data in batches. © 2024 Slovene Society Informatika. All rights reserved.
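For illustration only, and not the authors' implementation: the short Python sketch below shows the generative (stochastic-downward) side of a Poisson Gamma Belief Network, the decoder named in the abstract, in which each hidden layer is gamma distributed with a shape given by the topic-weighted layer above and the observed word counts are Poisson distributed. The layer sizes, hyperparameters, and names (sample_pgbn, layer_sizes, c, r) are assumptions chosen for the example.

import numpy as np

rng = np.random.default_rng(0)

def sample_pgbn(doc_count, vocab_size, layer_sizes, c=1.0, r=1.0):
    """Draw bag-of-words count vectors from a T-layer PGBN prior.

    theta^(T) ~ Gamma(r, 1/c)                       (top hidden layer)
    theta^(t) ~ Gamma(Phi^(t+1) theta^(t+1), 1/c)   (lower hidden layers)
    x         ~ Poisson(Phi^(1) theta^(1))          (observed word counts)
    """
    T = len(layer_sizes)
    dims = [vocab_size] + list(layer_sizes)
    # Topic matrices Phi^(t): each column is a Dirichlet-distributed topic
    # over the units of the layer below, so columns sum to one.
    Phi = [rng.dirichlet(np.ones(dims[t]), size=dims[t + 1]).T for t in range(T)]

    # Top layer: gamma-distributed topic weights with shared shape r.
    theta = rng.gamma(shape=r, scale=1.0 / c, size=(layer_sizes[-1], doc_count))
    # Stochastic downward pass: each layer's gamma shape is the
    # topic-weighted activation of the layer above.
    for t in reversed(range(T - 1)):
        theta = rng.gamma(shape=Phi[t + 1] @ theta, scale=1.0 / c)

    # Observed counts: Poisson rates given by the first-layer topics.
    x = rng.poisson(Phi[0] @ theta)
    return x, Phi

x, Phi = sample_pgbn(doc_count=4, vocab_size=100, layer_sizes=[32, 16, 8])
print(x.shape)  # (100, 4): simulated word counts for four documents

In the paper's full model, an upward deterministic encoder and stochastic gradient Monte Carlo posterior inference would be fitted on top of such a decoder; the sketch covers only the sampling direction.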
Pages: 31-46